Improving Conversational Evaluation via a Dependency-Aware Permutation Strategy

Conference proceedings contribution
Publication date:
2022
Abstract:
The rapid growth in number and complexity of conversational agents has highlighted the need for suitable evaluation tools to describe their performance. Current offline conversational evaluation approaches rely on collections composed of multi-turn conversations, each including a sequence of utterances. Such sequences represent a snapshot of reality: a single dialog between the user and a hypothetical system on a specific topic. We argue that this paradigm is not realistic enough: multiple users will ask diverse questions in variable order, even for a conversation on the same topic. In this work, we propose a dependency-aware utterance sampling strategy to augment the data available in conversational collections while maintaining temporal dependencies within conversations. Using the sampled conversations, we show that the current evaluation framework favours specific systems while penalizing others, leading to biased evaluation. We further show how to exploit dependency-aware utterance permutations in the current evaluation framework to increase the power of statistical evaluation tools such as ANOVA.
CRIS type:
04.01 - Conference proceedings contribution
Author list:
Faggioli, G.; Ferrante, M.; Ferro, N.; Perego, R.; Tonellotto, N.
University authors:
FAGGIOLI GUGLIELMO
FERRANTE MARCO
FERRO NICOLA
Link to the full record:
https://www.research.unipd.it/handle/11577/3464723
Book title:
Proc. 30th Italian Symposium on Advanced Database Systems (SEBD 2022)
Published in:
CEUR WORKSHOP PROCEEDINGS