ICTNET@ TREC 2018 News Track
This repository contains the code I tried on the TREC 2018 News Track tasks.
- Data processing and observation (code here)
  - lowercase
  - stemming
  - stop-word removal
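The three preprocessing steps above can be sketched roughly as follows. This is an illustration, not the repository's code: the stop-word list is a tiny sample, and `crude_stem` is a hypothetical suffix-stripping stand-in for a real stemmer (such as a Porter stemmer), since these notes do not say which stemmer was used. Removing stop words before stemming is also an assumption made here.

```python
import re

# Tiny illustrative stop-word list (assumption); a real run would use a full list.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "is", "are", "was", "were"}

# Suffixes stripped by the stand-in stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(token):
    """Hypothetical stand-in for a real stemmer: strip one common suffix."""
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize, drop stop words, then stem the remaining tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    # Assumption: stop words are removed before stemming so that the list
    # can be matched against unstemmed word forms.
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess("The reporters were linking articles"))
```

Swapping `crude_stem` for a proper stemmer only changes one function; the rest of the pipeline stays the same.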
- Elasticsearch BM25 (Background Linking), code here
  - Build index: title + body
  - Query: title + body (other query expansion methods, such as named entities and TF-IDF, performed no better)
  - results here
| Metric | 2018 dataset | 2019 dataset |
|---|---|---|
| nDCG@5 | 0.4541 | 0.5801 |
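As a rough sketch of the background-linking setup described above, the index and query can be expressed as plain Elasticsearch request bodies. The index name `news`, the use of the built-in `english` analyzer, and the choice of a `multi_match` query are assumptions for illustration; the notes only state that title + body is indexed and that title + body is used as the query. BM25 is the default similarity for text fields in Elasticsearch, so no extra similarity setting is shown.

```python
INDEX_NAME = "news"  # hypothetical index name


def build_index_body():
    """Create-index body: title and body as analyzed text fields (BM25 by default)."""
    return {
        "mappings": {
            "properties": {
                # Assumption: the built-in "english" analyzer, which lowercases,
                # removes English stop words, and stems.
                "title": {"type": "text", "analyzer": "english"},
                "body": {"type": "text", "analyzer": "english"},
            }
        }
    }


def build_query_body(title, body, size=5):
    """Search body: the query article's own title + body is the search text."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": f"{title} {body}",
                "fields": ["title", "body"],
            }
        },
    }


if __name__ == "__main__":
    print(build_query_body("Example headline", "Example article text."))
```

These dictionaries would be sent as the JSON bodies of `PUT /news` and `POST /news/_search`. In practice the query article itself would probably need to be filtered out of its own result list, which this sketch does not do.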
- Elasticsearch BM25 + Wiki Dump (Entity ranking), code here
  - Build index: wiki pages whose enlink refers to exactly one entity; extract 100 wiki pages per entity
  - Query: news title + body
  - Rank entities by the BM25 scores of their wiki pages
  - results here
| Metric | 2018 dataset | 2019 dataset |
|---|---|---|
| nDCG@5 | 0.7191 | 0.7315 |
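The entity-ranking step above turns per-page BM25 scores into a per-entity ranking. The notes do not say how the scores of an entity's wiki pages are combined, so the sketch below takes the aggregation as a parameter and defaults to the maximum; that default, the function name `rank_entities`, and the `(entity_id, score)` input shape are all assumptions.

```python
from collections import defaultdict


def rank_entities(hits, aggregate=max):
    """Rank entities from (entity_id, bm25_score) pairs of retrieved wiki pages.

    `aggregate` combines the scores of all retrieved pages for one entity.
    The default (max) is an assumption; `sum` is another plausible choice.
    """
    scores_by_entity = defaultdict(list)
    for entity_id, score in hits:
        scores_by_entity[entity_id].append(score)

    ranked = [(entity, aggregate(scores)) for entity, scores in scores_by_entity.items()]
    # Highest aggregated score first; ties broken by entity id for determinism.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    example_hits = [("e1", 3.0), ("e2", 5.0), ("e1", 4.0), ("e3", 1.0)]
    print(rank_entities(example_hits))
    print(rank_entities(example_hits, aggregate=sum))
```

With `max`, an entity is ranked by its single best-matching page; with `sum`, entities with many moderately matching pages can overtake it, so the choice matters when each entity contributes up to 100 pages.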