Last Updated: 17 July 2024
The Neural Cross-Language Information Retrieval (NeuCLIR) track is a TREC shared task that studies the impact of neural approaches in cross-language information retrieval and generation.
You can participate in the shared task by submitting a retrieval or generation system for evaluation. Once the evaluation is complete, the track releases reusable test collections for future investigations.
This year, we continue the Cross-Language (CLIR) and Multilingual (MLIR) news and technical document retrieval tasks. We also introduce a new Report Generation task.
Details on each of the 2024 tasks are provided below:
- Cross-Language Retrieval
- Cross-Language Technical Documents
- Multilingual Retrieval
- 🆕 Cross-Language Report Generation
See the CLIR & MLIR Guidelines for full task details.
In the Cross-Language Retrieval (CLIR) task, systems receive queries in one language (English) and retrieve from a corpus of news articles written in another language (Chinese, Persian, or Russian).
Data is available via `ir_datasets` or the TREC website. See the CLIR & MLIR Guidelines for full task details.
Technical language poses a particular challenge for cross-language retrieval systems, so the Cross-Language Technical Documents task focuses on this phenomenon. Systems receive queries in one language (English) and retrieve from a corpus of technical abstracts written in another language (Chinese).
Data is available via `ir_datasets` or the TREC website. See the CLIR & MLIR Guidelines for full task details.
The Multilingual Retrieval (MLIR) task provides systems with queries in one language (English) and a corpus of documents written in multiple languages (Chinese, Persian, and Russian). The task is to retrieve and produce a ranked list from all three languages. The queries are written in a way that there should be relevant documents in more than one language.
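Producing a single ranked list across three languages means merging per-language results somehow. A minimal score-fusion sketch is below; it is purely illustrative and not prescribed by the track (the function name and document IDs are hypothetical, and in practice scores from independently tuned per-language systems may need normalization before they are comparable):

```python
# Illustrative MLIR baseline: pool per-language ranked lists and sort by
# score. NOT part of the track guidelines -- just one simple way to merge
# results from separate Chinese, Persian, and Russian retrieval runs.

def fuse_by_score(per_language_runs, depth=1000):
    """Merge {lang: [(doc_id, score), ...]} into one ranked list.

    Assumes scores are comparable across languages, which typically
    requires normalization for real systems.
    """
    pooled = []
    for ranking in per_language_runs.values():
        pooled.extend(ranking)
    # Sort all pooled documents by descending score.
    pooled.sort(key=lambda pair: pair[1], reverse=True)
    return pooled[:depth]

# Hypothetical per-language runs for one query.
runs = {
    "zho": [("zh_doc1", 14.2), ("zh_doc2", 9.8)],
    "fas": [("fa_doc1", 12.5)],
    "rus": [("ru_doc1", 13.1), ("ru_doc2", 7.0)],
}
merged = fuse_by_score(runs)
# Highest-scoring document across all three languages comes first.
```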
Data is available via `ir_datasets` or the TREC website. See the Report Generation Guidelines for full task details.
The Cross-Language Report Generation task asks systems to generate an English report, with citations to documents in one of the news collections used in the CLIR task (see the guidelines for details on length and citation requirements), based on a report request (example report request). Reports will be evaluated on the information included in the text and the appropriateness of the citations (example evaluation data).
Data is available via `ir_datasets` or the TREC website.

Important Dates:

- March 2022: Document Collections Released. MLIR: `neuclir/1/multi` (Persian, Russian, and Chinese); Technical: `csl` (Chinese Technical Abstracts)
- 25 March 2024: Track Guidelines Released
- June 2024: CLIR/MLIR Topics and Report Requests released on NIST website
- July 2024: CLIR Technical Topics released on NIST website
- 6 August 2024: CLIR Technical Task Submissions due to NIST
- 13 August 2024: CLIR/MLIR News Task and Report Generation Task Submissions due to NIST
- October 2024: Results distributed to participants
- November 2024: TREC 2024
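Retrieval submissions to NIST are typically delivered as TREC-style run files: six whitespace-separated columns per line (query ID, the literal `Q0`, document ID, rank, score, run tag). A minimal sketch of writing one is below, assuming the standard TREC run format; the track guidelines are authoritative on the exact required format, and the function name here is illustrative:

```python
# Sketch: format a ranked list in the conventional six-column TREC run
# format. Verify the exact submission requirements against the official
# track guidelines before submitting.

def format_trec_run(query_id, ranking, run_tag):
    """ranking: list of (doc_id, score) pairs, best first."""
    lines = []
    for rank, (doc_id, score) in enumerate(ranking, start=1):
        # Columns: query_id  Q0  doc_id  rank  score  run_tag
        lines.append(f"{query_id} Q0 {doc_id} {rank} {score:.4f} {run_tag}")
    return "\n".join(lines)

# Hypothetical two-document ranking for query "101".
run_lines = format_trec_run("101", [("docA", 3.5), ("docB", 2.25)], "myrun")
```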
What's new in 2024:

- The technical documents pilot has been promoted to a full task.
- Later submission deadline!
- A pilot of a query-driven report generation task (Generative IR) from the multilingual document set.
Organizers, in alphabetical order:
- Dawn Lawrie, Johns Hopkins University, HLTCOE
- Sean MacAvaney, University of Glasgow
- James Mayfield, Johns Hopkins University, HLTCOE
- Paul McNamee, Johns Hopkins University, HLTCOE
- Douglas W. Oard, University of Maryland
- Luca Soldaini, Allen Institute for AI
- Eugene Yang, Johns Hopkins University, HLTCOE
- Mailing List (for the latest announcement and news)
- TREC Slack #neuclir-2024 channel (once registered for TREC)
- Twitter @neuclir
- Any questions: neuclir-organizers@googlegroups.com
NeuCLIR previously ran in 2022 and 2023. You can find the previous versions of the task below: