(cleaning in progress)
This repository contains the resources developed for the following paper:
Xinshi Lin, Kwun Ping Lai, Zihao Wang and Wai Lam. “Entity Retrieval via Query Graph Inference”, EYRE 2018
- Collect data from DBpedia and store it in a MongoDB database (see https://github.com/linxinshi/DBpedia-Wikipedia-Toolkit).
- Build the graph representation of the Wikipedia Category System (see folder "wikipedia_category_system").
- Build the index (see folder "build_index").
- Edit config.py, config_object.py and mongo_object.py to specify the retrieval model parameters, the index path, etc.
- Run the command "python main.py".
- Check the results in the folder Retrieval_results (created by the program and named after the time of execution).
* This implementation supports multi-processing: set NUM_PROCESS in config.py. The program splits the queries into several parts, and each process handles one of them. Finally, the program merges all results and outputs a single complete result.
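The split, process, and merge pattern described above can be sketched roughly as follows. This is only an illustration, not the code in main.py: the function names (split_queries, process_part, merge_results, run_all) are hypothetical, process_part is a placeholder that stands in for the actual retrieval step, and only NUM_PROCESS comes from this README.

```python
from multiprocessing import Pool


def split_queries(queries, num_process):
    """Split the query list into at most num_process contiguous, near-equal parts."""
    num_parts = max(1, min(num_process, len(queries)))
    size, extra = divmod(len(queries), num_parts)
    parts, start = [], 0
    for i in range(num_parts):
        # The first `extra` parts receive one additional query each.
        end = start + size + (1 if i < extra else 0)
        parts.append(queries[start:end])
        start = end
    return parts


def process_part(part):
    """Placeholder worker (assumption): returns (query_id, result) pairs for one part."""
    return [(query_id, "results for {}".format(text)) for query_id, text in part]


def merge_results(partial_results):
    """Concatenate the per-process results, keeping the original query order."""
    merged = []
    for part in partial_results:
        merged.extend(part)
    return merged


def run_all(queries, num_process):
    """Run one worker process per part and merge their outputs."""
    parts = split_queries(queries, num_process)
    with Pool(processes=len(parts)) as pool:
        partial = pool.map(process_part, parts)  # map preserves the order of parts
    return merge_results(partial)
```

Under these assumptions, a call such as run_all(queries, NUM_PROCESS), with queries given as a list of (query_id, text) pairs, would return one merged result list. On Windows, the call has to be made under an `if __name__ == "__main__":` guard, because worker processes are started by re-importing the main module.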
Requirements:
- Python 3.4+
- NLTK, Gensim
- NetworkX <= 1.11
- PyLucene 6.x
(This implementation works on both Linux and Windows. If you have trouble installing PyLucene on Windows, please refer to http://lxsay.com/archives/365)
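Since the NetworkX requirement is an upper bound, it may be worth checking the environment before running main.py. The following is a minimal sketch that is not part of this repository: parse_version and check_environment are hypothetical helpers, and only the Python and NetworkX constraints are checked (NLTK, Gensim and PyLucene are not).

```python
import re
import sys


def parse_version(text):
    """Return the leading numeric parts of a version string, e.g. '1.11' -> (1, 11)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_environment(python_version, networkx_version):
    """Return a list of problems; an empty list means both constraints are met."""
    problems = []
    if tuple(python_version[:2]) < (3, 4):
        problems.append("Python 3.4+ is required")
    if networkx_version is None:
        problems.append("NetworkX is not installed")
    elif parse_version(networkx_version)[:2] > (1, 11):
        problems.append("NetworkX <= 1.11 is required, found {}".format(networkx_version))
    return problems


if __name__ == "__main__":
    try:
        import networkx
        nx_version = networkx.__version__
    except ImportError:
        nx_version = None
    for problem in check_environment(sys.version_info, nx_version):
        print(problem)
```

The check compares only the major and minor version numbers, so a version string such as "1.11rc2" is treated the same as "1.11".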