Publication date:
2020
Abstract:
Semantic change detection in a relatively low-resource language like Italian is challenging. Using contextualized word embeddings, we formalize the task as measuring the distance between two flexible-size sets of vectors. We use several distance metrics: average Euclidean distance, average Canberra distance, Hausdorff distance, and Jensen-Shannon divergence between cluster distributions based on K-means clustering and Gaussian mixture models. The final prediction is given by an ensemble of the top-ranked words under each distance metric. The proposed method achieved better performance than frequency- and collocation-based baselines.
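The set-to-set distances named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "average" means the mean over all cross-period pairs of vectors, that the Jensen-Shannon divergence uses base-2 logarithms, and that cluster labels have already been produced by some clustering step (K-means or a Gaussian mixture model in the abstract). All function names are hypothetical.

```python
import math
from itertools import product


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def canberra(u, v):
    """Canberra distance; coordinates where both values are 0 contribute 0."""
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def average_pairwise(set_a, set_b, dist):
    """Mean of dist(u, v) over every pair (u in set_a, v in set_b).

    Assumption: this is one plausible reading of "average distance"
    between two sets of different sizes.
    """
    pairs = list(product(set_a, set_b))
    return sum(dist(u, v) for u, v in pairs) / len(pairs)


def hausdorff(set_a, set_b, dist=euclidean):
    """Symmetric Hausdorff distance between two finite sets of vectors."""
    def directed(xs, ys):
        # Largest distance from a point in xs to its nearest point in ys.
        return max(min(dist(x, y) for y in ys) for x in xs)
    return max(directed(set_a, set_b), directed(set_b, set_a))


def cluster_distribution(labels, n_clusters):
    """Relative frequency of each cluster id (0..n_clusters-1) in labels."""
    counts = [0] * n_clusters
    for label in labels:
        counts[label] += 1
    return [c / len(labels) for c in counts]


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute 0 by convention.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy usage: embeddings of one word in two time periods (made-up values).
period_1 = [[0.0, 0.0], [1.0, 0.0]]
period_2 = [[0.0, 3.0], [4.0, 0.0]]
scores = {
    "avg_euclidean": average_pairwise(period_1, period_2, euclidean),
    "avg_canberra": average_pairwise(period_1, period_2, canberra),
    "hausdorff": hausdorff(period_1, period_2),
}
```

Because each function accepts lists of any length, the two periods may contain different numbers of occurrences of the word, which is the "flexible-size" property the abstract relies on.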
CRIS type:
04.01 - Contribution in conference proceedings
Authors:
Wang, B.; Di Buccio, E.; Melucci, M.
Book title:
CEUR Workshop Proceedings
Published in: