Comparing ANOVA Approaches to Detect Significantly Different IR Systems
Conference proceedings contribution
Publication date:
2022
Abstract:
The ultimate goal of the evaluation is to understand when two IR systems are (significantly) different. To this end, many comparison procedures have been developed over time. However, to date, most reproducibility efforts focused just on reproducing systems and algorithms, almost fully neglecting to investigate the reproducibility of the methods we use to compare our systems. In this paper, we focus on methods based on ANalysis Of VAriance (ANOVA), which explicitly model the data in terms of different contributing effects, allowing us to obtain a more accurate estimate of significant differences. In this context, we compare statistical analysis methods based on “traditional” ANOVA (tANOVA) to those based on a bootstrapped version of ANOVA (bANOVA) and those performing multiple comparisons relying on a more conservative Family-wise Error Rate (FWER) controlling approach to those relying on a more lenient False Discovery Rate (FDR) controlling approach. Our findings highlight that, compared to the tANOVA approaches, bANOVA presents greater statistical power, at the cost of lower stability.
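The contrast the abstract draws between a conservative FWER-controlling correction and a more lenient FDR-controlling one can be illustrated with a minimal sketch. It assumes Holm's step-down procedure as a stand-in for the FWER side and Benjamini-Hochberg for the FDR side; the abstract does not say which specific procedures the paper uses, and the p-values below are made up for illustration only.

```python
def holm(pvalues, alpha=0.05):
    """Holm step-down procedure (controls the family-wise error rate)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # Compare the (rank+1)-th smallest p-value against alpha / (m - rank).
        if pvalues[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # stop at the first non-rejection
    return reject


def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the false discovery rate)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Track the largest rank k with p_(k) <= (k / m) * alpha.
        if pvalues[i] <= rank / m * alpha:
            cutoff = rank
    reject = [False] * m
    for i in order[:cutoff]:
        reject[i] = True
    return reject


# Hypothetical p-values for five pairwise system comparisons (illustrative only).
pvalues = [0.001, 0.012, 0.021, 0.035, 0.20]
fwer_decisions = holm(pvalues)
fdr_decisions = benjamini_hochberg(pvalues)
print("Holm (FWER):", sum(fwer_decisions), "significant pairs")
print("BH (FDR):   ", sum(fdr_decisions), "significant pairs")
```

On these made-up inputs the FDR-controlling procedure flags more pairs as significantly different than the FWER-controlling one, which is the sense in which it is described as more lenient.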
CRIS type:
04.01 - Conference proceedings contribution
Authors:
Faggioli, G.; Ferro, N.
Book title:
Proc. 12th Italian Information Retrieval Workshop (IIR 2022)
Published in: