Publication date:
2021
Abstract:
The semantic mismatch between query and document terms – i.e., the semantic gap – is a long-standing
problem in Information Retrieval (IR). Two main linguistic features related to the semantic gap that can
be exploited to improve retrieval are synonymy and polysemy. Recent work integrates knowledge from
curated external resources into the learning process of neural language models to reduce the effect of
the semantic gap. However, these knowledge-enhanced language models have been used in IR mostly
for re-ranking. We propose the Semantic-Aware Neural Framework for IR (SAFIR), an unsupervised
knowledge-enhanced neural framework explicitly tailored for IR. SAFIR jointly learns word, concept,
and document representations from scratch. The learned representations encode both polysemy and
synonymy to address the semantic gap. We investigate the application of SAFIR in the medical domain, where
the semantic gap is prominent and there are many specialized and manually curated knowledge resources.
The evaluation on shared test collections for medical retrieval shows that SAFIR is effective in addressing
the semantic gap.
CRIS type:
04.01 - Contribution in conference proceedings
Keywords:
Knowledge-enhanced retrieval, representation learning, semantic gap, medical literature
Authors:
Agosti, M.; Marchesin, S.; Silvello, G.
Book title:
Proceedings of the 11th Italian Information Retrieval Workshop, September 13–15, 2021, Bari, Italy
Published in: