nlpaueb/aueb-bioasq6

AUEB at BioASQ 6: Document and Snippet Retrieval

This software accompanies the following paper:

G. Brokos, P. Liosis, R. McDonald, D. Pappas and I. Androutsopoulos, "AUEB at BioASQ 6: Document and Snippet Retrieval". Proceedings of the workshop BioASQ: Large-scale Biomedical Semantic Indexing and Question Answering, at the Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), Brussels, Belgium, 2018. [PDF]

Instructions

This is a Python 3.6 project.

Step 1: Install the required Python packages:

pip3 install -r requirements.txt

Step 2: Download the data that will be used as input to the models:

sh get_bioasq6_data.sh

The following data are provided (among other files):

  • Top-k documents retrieved by a BM25-based search engine (Galago) for each BioASQ query
  • Biomedical pre-trained word embeddings
  • IDF values

Note: Download time may vary depending on server availability.
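
For readers unfamiliar with BM25, the sketch below illustrates how IDF values can be combined with term frequencies to pick the top-k documents for a query. It is a minimal illustration of the standard Okapi BM25 formula and is not the Galago implementation used to produce the downloaded files. The function names (compute_idf, bm25_score, top_k), the toy documents and the parameter values k1=1.2 and b=0.75 are assumptions made for this example only.

```python
import math
from collections import Counter


def compute_idf(docs):
    """Return a smoothed IDF value for every term in a tokenized corpus."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {
        term: math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for term, df in doc_freq.items()
    }


def bm25_score(query, doc, idf, avg_len, k1=1.2, b=0.75):
    """Score one tokenized document against a tokenized query (Okapi BM25)."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        if term not in tf:
            continue  # terms absent from the document contribute nothing
        # Length normalization: longer-than-average documents are penalized.
        norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
        score += idf.get(term, 0.0) * tf[term] * (k1 + 1) / norm
    return score


def top_k(query, docs, k=2):
    """Return the indices of the k highest-scoring documents."""
    idf = compute_idf(docs)
    avg_len = sum(len(d) for d in docs) / len(docs)
    scored = [(bm25_score(query, d, idf, avg_len), i) for i, d in enumerate(docs)]
    # Sort by descending score, breaking ties by document index.
    return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))[:k]]


# Toy corpus (hypothetical, not taken from the BioASQ data).
docs = [
    "aspirin reduces the risk of heart attack".split(),
    "the gene brca1 is linked to breast cancer".split(),
    "aspirin and ibuprofen are common painkillers".split(),
]
ranking = top_k("aspirin heart".split(), docs, k=2)
print(ranking)
```

In this toy example the first document matches both query terms and the third matches one, so they are ranked first and second. The models in this repository then re-rank such a candidate list.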

Step 3: Navigate to the directory of a model to train it and evaluate its performance on each of the five test batches. For example, navigate to the TERM-PACRR model for document ranking:

cd models/documents/term-pacrr

Consult each model's README file for dedicated instructions (e.g. the instructions for TERM-PACRR).
