This repository contains the implementation of SARG: A Novel Semi Autoregressive Generator for Multi-turn Incomplete Utterance Restoration. It was developed with Python 3.6 and PyTorch 1.5.1.
To install requirements:
pip install -r requirements.txt
Note: install the GPU build of PyTorch that matches your CUDA version.
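To confirm that the installed PyTorch build matches the local CUDA setup, a quick check along the following lines may help. This is only a sketch: the function name describe_environment is hypothetical and not part of this repository, and it reports what it finds rather than enforcing the versions listed above.

```python
import sys


def describe_environment():
    """Summarise the Python, PyTorch and CUDA setup as a dict."""
    info = {
        "python": "%d.%d" % sys.version_info[:2],
        "torch": None,           # stays None when PyTorch is not installed
        "cuda_build": None,      # CUDA version PyTorch was built against
        "cuda_available": False,
    }
    try:
        import torch
    except (ImportError, OSError):
        return info
    info["torch"] = torch.__version__
    info["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    info["cuda_available"] = bool(torch.cuda.is_available())
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print("%s: %s" % (key, value))
```

If cuda_build is None or cuda_available is False on a GPU machine, the installed wheel is probably a CPU-only build or was built for a different CUDA version.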
- First, download the pretrained models: RoBERTa-wwm-ext, Chinese for the Chinese dataset (rename the directory to chinese_roberta_wwm_ext_pytorch) and bert-base-uncased for the English dataset.
- Second, rename bert_config.json to config.json in chinese_roberta_wwm_ext_pytorch.
- Finally, convert the BERT pretrained weights to the initial weights of SARG by running:
python covert_weight_from_bert_to_sarg.py
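The renaming in the second step can also be scripted. The following is a minimal sketch, assuming the downloaded Chinese checkpoint has been unpacked into a local directory; the helper rename_bert_config is hypothetical and not part of this repository.

```python
from pathlib import Path


def rename_bert_config(model_dir):
    """Rename bert_config.json to config.json inside model_dir.

    Returns the path to config.json. If config.json already exists,
    nothing is changed.
    """
    model_dir = Path(model_dir)
    src = model_dir / "bert_config.json"
    dst = model_dir / "config.json"
    if dst.exists():
        return dst
    if not src.exists():
        raise FileNotFoundError("expected to find %s" % src)
    src.rename(dst)
    return dst
```

Calling rename_bert_config("chinese_roberta_wwm_ext_pytorch") before running the conversion script should leave the directory in the layout described above.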
For the model with the coverage mechanism, we first optimize the model for 14000 steps without the coverage loss and then train it until convergence with the coverage loss weighted to .
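The two-stage schedule can be pictured as a step-dependent weight on the coverage loss. The sketch below is an illustration only: the function names are hypothetical, steps are assumed to be counted from 0, and the default weight of 1.0 is a placeholder, not the value used in the experiments.

```python
def coverage_loss_weight(step, warmup_steps=14000, weight=1.0):
    """Weight of the coverage loss at a given 0-based training step.

    The first `warmup_steps` steps use no coverage loss. `weight` is a
    placeholder value chosen for illustration.
    """
    return 0.0 if step < warmup_steps else weight


def total_loss(generation_loss, coverage_loss, step, weight=1.0):
    """Combine the generation loss with the scheduled coverage loss."""
    return generation_loss + coverage_loss_weight(step, weight=weight) * coverage_loss
```

With this formulation the second stage simply continues from the first stage's parameters, with the extra term switched on.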
Our experiments on Restoration-200k were conducted on 7 Tesla P40 GPUs. To reproduce the best performance reported in the paper, we recommend training as follows:
sh scripts/run_train_chinese.sh
If you have fewer GPUs, one possible solution is to set gradient_accumulation_steps
to an appropriate value.
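One way to pick that value is to keep the effective batch size (per-GPU batch size × number of GPUs × accumulation steps) at least as large as in the 7-GPU setup. The sketch below rests on two assumptions: the per-GPU batch size is unchanged, and the reference run uses an accumulation of 1, which should be checked against scripts/run_train_chinese.sh. The helper name is hypothetical.

```python
import math


def accumulation_steps(reference_gpus, available_gpus, reference_accumulation=1):
    """Smallest gradient_accumulation_steps that keeps the effective batch
    size at or above the reference setup, for a fixed per-GPU batch size."""
    if available_gpus < 1:
        raise ValueError("available_gpus must be at least 1")
    return math.ceil(reference_gpus * reference_accumulation / available_gpus)


# 7 reference GPUs (from the text above) on smaller machines
for gpus in (1, 2, 4, 7):
    print(gpus, "GPU(s) ->", accumulation_steps(7, gpus))
```

When 7 is not divisible by the number of available GPUs, the effective batch size ends up slightly larger than the reference, so results may differ a little from those reported.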
Our experiments on CANARD were conducted on a single GPU. We also found that adding the coverage loss does not help the overall model on this dataset. Train as follows:
sh scripts/run_train_english.sh
To evaluate the model on Restoration-200k, run:
sh scripts/run_eval_chinese.sh
To evaluate the model on CANARD, run:
sh scripts/run_eval_english.sh
If you use this code in your research, please cite our paper:
@article{huang2020sarg,
title={SARG: A Novel Semi Autoregressive Generator for Multi-turn Incomplete Utterance Restoration},
author={Huang, Mengzuo and Li, Feng and Zou, Wuhe and Zhang, Weidong},
journal={arXiv preprint arXiv:2008.01474},
year={2020}
}