SEUPD@CLEF Team RISE at LongEval: Improving Search by Crafting Titles and Matching URLs
Conference Paper
Publication Date:
2025
abstract:
This report outlines the development of the RISE group’s Information Retrieval (IR) system for the LongEval-WebRetrieval CLEF 2025 Lab. The objective was to design an efficient, scalable search engine capable of handling large-scale French collections, with a focus on consistent performance. The proposed system has a modular architecture comprising a parser, an analyzer, an indexer, and a searcher, complemented by query translation and expansion using the Gemini LLM and by a non-neural reranking component to enhance retrieval quality. Emphasis was placed on speeding up indexing and searching through multi-threading, and on improving relevance by crafting a title for each document and by boosting documents according to the alignment between the user query and the document’s URL. The evaluation followed a stepwise enhancement approach, beginning with a Lucene-based baseline.
Iris type:
04.01 - Contribution in conference proceedings
Keywords:
CLEF 2025; Document Parsing; Information Retrieval; LongEval-WebRetrieval; Query Expansion; Query Translation; Search Engine; Temporal Evolution; URL Manipulation
List of contributors:
Furlan, D.; Gibellato, G.; Nazirialhashem, S. S.; Pase, E.; Pasqualetto, A.; Tiberio, F.; Ferro, N.
Book title:
26th Working Notes of the Conference and Labs of the Evaluation Forum, CLEF 2025
Published in: