This repository contains implementations of the Dual Encoder LSTM model (deep learning) and the Vector Space Model, both used to evaluate and analyze the impact of context size on response quality.
Part of the code was produced in this hands-on, which implements the Dual Encoder LSTM model from this paper. The repository also implements the Vector Space Model, which served as the baseline in this research.
The code uses Python 3. Clone the repository and install the required packages:
1. install tensorflow (versions 0.11 and above work correctly; version 0.10 has not been tested)
2. (optional) install cuda + cudnn (recommended for GPU support)
3. pip install -U pip
4. pip install -r requirements.txt
Experiments can be run on the Ubuntu Dialogue Corpus version 2.0, introduced in this paper, whose generation script is available in this repository. However, because the goal of the research was to understand how context size affects next-utterance prediction, the generation script had to be modified to produce training sets whose number of turns is passed as an argument. The modified script is in the scripts folder.
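To make the idea of a fixed number of turns concrete, here is a minimal sketch of restricting a context to a given number of turns. It assumes contexts use the `__eot__` end-of-turn marker and that the most recent turns are the ones kept; the function name `keep_last_turns` is hypothetical and this is not necessarily how the modified script does it.

```python
# Assumption: "__eot__" marks the end of a turn inside a context string.
EOT = "__eot__"


def keep_last_turns(context, num_turns):
    """Return the context restricted to its last `num_turns` turns."""
    if num_turns < 1:
        raise ValueError("num_turns must be at least 1")
    # Split on the end-of-turn marker and drop empty pieces.
    turns = [turn.strip() for turn in context.split(EOT) if turn.strip()]
    kept = turns[-num_turns:]
    # Re-attach the marker after every kept turn.
    return " ".join("{} {}".format(turn, EOT) for turn in kept)


example = (
    "hi __eou__ __eot__ "
    "hello __eou__ how are you __eou__ __eot__ "
    "fine __eou__ __eot__"
)
print(keep_last_turns(example, 2))
```

If the context has fewer turns than requested, the sketch simply returns all of them.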
To generate training sets, follow the steps described in this repository. The only difference is an additional argument on the train sub-parser that sets the desired number of turns:
train: training set generator
-t: desired number of turns
Example for generating a set consisting of contexts with 2 turns:
python create_ubuntu_dataset_modificado.py --data_root ./dados -o 'train.csv' -t -s -l train -t 2
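The command above passes `-t 2` to the `train` sub-command. A sketch of how such an argument could be declared with `argparse` follows. Only `--data_root`, `-o`, `train`, and `-t` come from the example; the `dest` names and help texts are assumptions, and the other flags of the real script are omitted.

```python
import argparse


def build_parser():
    """Build a reduced, illustrative version of the generator's CLI."""
    parser = argparse.ArgumentParser(description="Dataset generator (sketch)")
    parser.add_argument("--data_root", default=".",
                        help="directory holding the raw dialogues")
    parser.add_argument("-o", dest="output", default=None,
                        help="output CSV file")

    subparsers = parser.add_subparsers(dest="mode")
    train = subparsers.add_parser("train", help="training set generator")
    # Hypothetical declaration of the number-of-turns argument.
    train.add_argument("-t", dest="turns", type=int, default=None,
                       help="desired number of turns")
    return parser


args = build_parser().parse_args(
    ["--data_root", "./dados", "-o", "train.csv", "train", "-t", "2"]
)
print(args.mode, args.turns)
```

Because the argument belongs to the sub-parser, it has to appear after `train` on the command line.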
Generate the training set with the modified script, but use the original script for the validation and test sets, or download all required sets here. Finally, move all files to the ./Data folder.
Before training the deep learning model, convert the sets from CSV to TFRecord:
cd scripts
python prepare-data.py
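As a rough illustration of what this kind of conversion involves, the sketch below reads CSV rows and turns each text into a fixed-length list of integer ids, which is the shape of data typically serialized afterwards. It uses only the standard library and does not write TFRecord files. The column names, the maximum lengths, and the helpers `build_vocab` and `encode` are assumptions, not the contents of prepare-data.py.

```python
import csv
import io
from collections import Counter

PAD_ID = 0  # id used to pad short sequences
UNK_ID = 1  # id used for tokens missing from the vocabulary


def build_vocab(texts, min_count=1):
    """Map every sufficiently frequent whitespace token to an integer id."""
    counts = Counter(tok for text in texts for tok in text.split())
    vocab = {"<pad>": PAD_ID, "<unk>": UNK_ID}
    for tok, count in sorted(counts.items()):
        if count >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(text, vocab, max_len):
    """Convert text to exactly `max_len` ids, truncating or padding."""
    ids = [vocab.get(tok, UNK_ID) for tok in text.split()][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


# Assumed column layout for a training CSV.
sample_csv = (
    "Context,Utterance,Label\n"
    "how do i install __eou__ __eot__,use apt __eou__,1\n"
)
rows = list(csv.DictReader(io.StringIO(sample_csv)))
vocab = build_vocab(r["Context"] + " " + r["Utterance"] for r in rows)
records = [
    {
        "context": encode(r["Context"], vocab, max_len=8),
        "utterance": encode(r["Utterance"], vocab, max_len=4),
        "label": int(r["Label"]),
    }
    for r in rows
]
print(records[0])
```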
Train the model:
python udc_train.py
Evaluate a trained model:
python udc_test.py --model_dir=...
Example:
python udc_test.py --model_dir=./runs/1481183770/
Run predictions with a trained model:
python udc_predict.py --model_dir=...
Example:
python udc_predict.py --model_dir=./runs/1481183770/
As a baseline, we used the Vector Space Model, available in the notebooks folder.
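For readers unfamiliar with this kind of baseline, the following is a minimal sketch of a vector space approach to response ranking: context and candidate responses become TF-IDF vectors, and candidates are ordered by cosine similarity to the context. It is not the notebook's code. The function names are hypothetical, and fitting the IDF weights on the context plus candidates (rather than on a training corpus) is a simplification.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict) per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token appears.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed inverse document frequency.
    idf = {tok: math.log((1 + n_docs) / (1 + count)) + 1.0
           for tok, count in df.items()}
    return [
        {tok: freq * idf[tok] for tok, freq in Counter(toks).items()}
        for toks in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(tok, 0.0) for tok, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_responses(context, candidates):
    """Return candidate indices sorted from most to least similar."""
    vectors = tfidf_vectors([context] + candidates)
    scores = [cosine(vectors[0], vec) for vec in vectors[1:]]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


context = "how do i install python on ubuntu"
candidates = [
    "try rebooting the router",
    "use apt to install python",
    "no idea sorry",
]
ranking = rank_responses(context, candidates)
print(candidates[ranking[0]])
```

Here the second candidate ranks first because it is the only one sharing tokens with the context.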