# Experiments

Please note that we lack the time and resources to maintain this code repository. The code here is published mainly for transparency and to allow validating the findings in our paper; it is not optimized for production use.

To run an annotation experiment, proceed as follows:

1. Predict label candidates for a batch using `./qa/predict.py`.
2. Simulate annotations on that batch using `./annotation/annotate.py`.
3. Retrain the QA model on the annotated samples using `./bert/run_squad.py`.

A bash script executing that experiment could look like the following:

```bash
# Location of the pretrained BERT model directory (must be set).
export BERT_BASE_DIR=

# Step 1: predict label candidates for a batch (initial runs, commented out here).
#python ./qa/predict.py --vocab_file=${BERT_BASE_DIR}/vocab.txt --bert_config_file=${BERT_BASE_DIR}/bert_config.json --output_dir=./data/reader_output/ --init_checkpoint=/tmp/trainonsquad/ --do_predict=True --predict_file=./data/nq_train_questions_title.json  --retriever_model=./data/wikidump/database-tfidf-ngram\=2-hash\=16777216-tokenizer\=simple.npz --doc_db=./data/wikidump/database --out_name=logits_train_final --batch_number=0 --max_seq_length=512 --predict_batch_size=32
#python ./qa/predict.py --vocab_file=${BERT_BASE_DIR}/vocab.txt --bert_config_file=${BERT_BASE_DIR}/bert_config.json --output_dir=./data/reader_output/ --init_checkpoint=/tmp/trainonsquad/ --do_predict=True --predict_file=./data/nq_train_questions_title.json  --retriever_model=./data/wikidump/database-tfidf-ngram\=2-hash\=16777216-tokenizer\=simple.npz --doc_db=./data/wikidump/database --out_name=logits_train_final --batch_number=1 --max_seq_length=512 --predict_batch_size=32

# Loop over annotation batches.
k=100
while [[ ${k} -lt 108 ]]
do
    # Step 2: simulate annotations on the current batch.
    python annotation/annotate.py --current-batch ${k} --mc-feature-template logits_train_final-feat-batch-{}.txt --out-name final

    # Step 3: retrain the QA model on the annotated samples.
    rm -r /tmp/finetune
    mkdir /tmp/finetune
    python ./bert/run_squad.py  --vocab_file="${BERT_BASE_DIR}"/vocab.txt --bert_config_file="${BERT_BASE_DIR}"/bert_config.json --init_checkpoint=/tmp/trainonsquad --do_train=True --train_file=./final.json --do_predict=False --train_batch_size=16 --learning_rate=3e-5 --num_train_epochs=2.0 --max_seq_length=368 --doc_stride=128 --output_dir=/tmp/finetune

    # Step 1 again: predict label candidates for the next batch with the retrained model.
    k=$((k+1))
    python ./qa/predict.py --vocab_file=${BERT_BASE_DIR}/vocab.txt --bert_config_file=${BERT_BASE_DIR}/bert_config.json --output_dir=./data/reader_output/ --init_checkpoint=/tmp/finetune/ --do_predict=True --predict_file=./data/nq_train_questions_title.json  --retriever_model=./data/wikidump/database-tfidf-ngram\=2-hash\=16777216-tokenizer\=simple.npz  --max_seq_length=512 --predict_batch_size=16 --doc_db=./data/wikidump/database --out_name=logits_train_final --batch_number=${k}
done
```
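
The script assumes an existing checkpoint in `/tmp/trainonsquad`, i.e. a QA model already trained on SQuAD, which is loaded via `--init_checkpoint`. A minimal sketch of how such a checkpoint could be produced with `./bert/run_squad.py` is shown below; the flags mirror the fine-tuning call above, but the training file path, the `bert_model.ckpt` checkpoint name, and the hyperparameters are placeholders rather than the exact settings used in the paper.

```bash
# Hypothetical one-off training run to create the /tmp/trainonsquad checkpoint.
# ./squad_train.json stands in for a SQuAD-style training file; adjust paths as needed.
mkdir -p /tmp/trainonsquad
python ./bert/run_squad.py \
  --vocab_file="${BERT_BASE_DIR}"/vocab.txt \
  --bert_config_file="${BERT_BASE_DIR}"/bert_config.json \
  --init_checkpoint="${BERT_BASE_DIR}"/bert_model.ckpt \
  --do_train=True \
  --train_file=./squad_train.json \
  --do_predict=False \
  --train_batch_size=16 \
  --learning_rate=3e-5 \
  --num_train_epochs=2.0 \
  --max_seq_length=368 \
  --doc_stride=128 \
  --output_dir=/tmp/trainonsquad
```

After this run, `/tmp/trainonsquad` can serve as the initial checkpoint for the commented-out `predict.py` calls and for the first fine-tuning step of the loop.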