Code for the Findings of EMNLP 2022 short paper "CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model".
- `paper/`: "CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model"
- `models/`: models in CDGP
  - `CSG/`: the models used as the Candidate Set Generator
  - `DS/`: the models used as the Distractor Selector
- `datasets/`: datasets for fine-tuning and testing
  - `CLOTH.zip`: CLOTH dataset
  - `DGen.zip`: DGen dataset
- `fine-tune/`: code for fine-tuning
- `test/`: code for testing
  - `dis_generator(BERT).py`: distractor generator based on BERT
  - `dis_generator(SciBERT).py`: distractor generator based on SciBERT
  - `dis_generator(RoBERTa).py`: distractor generator based on RoBERTa
  - `dis_generator(BART).py`: distractor generator based on BART
  - `dis_evaluator.py`: distractor evaluator
  - `results/`: results from the distractor generators
  - `evaluations/`: evaluations from the distractor evaluator
- `demo.ipynb`: code for the CDGP demo
Models are available on Hugging Face.
The CSG takes the stem and the answer as input and outputs a candidate set of distractors (see the usage sketch after the table).
Models | CLOTH | DGen |
---|---|---|
BERT | cdgp-csg-bert-cloth | cdgp-csg-bert-dgen |
SciBERT | cdgp-csg-scibert-cloth | cdgp-csg-scibert-dgen |
RoBERTa | cdgp-csg-roberta-cloth | cdgp-csg-roberta-dgen |
BART | cdgp-csg-bart-cloth | cdgp-csg-bart-dgen |
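As a quick-start reference, the CSG checkpoints are masked-LM models, so they can typically be queried with the Transformers fill-mask pipeline. The sketch below is not the repo's test script: the hub ID is assumed to live under the AndyChiangSH namespace (substitute the exact model ID linked in the table above), and the input format (stem with the blank replaced by `[MASK]`, followed by `[SEP]` and the answer) is an assumption based on the description above.

```python
# Minimal sketch (not the repo's test script): query a CSG checkpoint with the
# fill-mask pipeline to get candidate distractors.
from transformers import pipeline

# Assumption: the checkpoint is hosted under the AndyChiangSH namespace on
# Hugging Face; substitute the exact model ID from the table above.
csg = pipeline("fill-mask", model="AndyChiangSH/cdgp-csg-bert-cloth")

# Assumed input format: stem with the blank as [MASK], then [SEP] and the answer.
text = "I ate an [MASK] for breakfast. [SEP] apple"
for cand in csg(text, top_k=10):
    print(cand["token_str"], round(cand["score"], 4))
```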
The DS takes the stem, the answer, and the candidate set of distractors as input and outputs the top 3 distractors (a rough ranking sketch follows below).
fastText: cdgp-ds-fasttext
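To illustrate the idea (this is not the exact CDGP scoring used in the test scripts), one could rank the CSG candidates by fastText word similarity to the answer and keep the top 3. The sketch below assumes gensim's downloader and a generic pretrained fastText model rather than the `cdgp-ds-fasttext` checkpoint.

```python
# Rough illustration only: rank candidate distractors by fastText similarity to
# the answer and keep the top 3. The real DS logic lives in /test/dis_generator(*).py.
import gensim.downloader as api

ft = api.load("fasttext-wiki-news-subwords-300")  # generic fastText vectors (assumption)

answer = "apple"
candidates = ["orange", "banana", "run", "grape", "quickly"]

def similarity_to_answer(word):
    try:
        return ft.similarity(answer, word)  # cosine similarity of word vectors
    except KeyError:
        return 0.0  # treat out-of-vocabulary words as unrelated

top3 = sorted(candidates, key=similarity_to_answer, reverse=True)[:3]
print(top3)  # the three candidates most similar to the answer
```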
Datasets are available at Hugging Face and GitHub.
CLOTH is a collection of nearly 100,000 cloze questions from middle school and high school English exams. The details of the CLOTH dataset are shown below.
Number of questions | Train | Valid | Test |
---|---|---|---|
Middle school | 22056 | 3273 | 3198 |
High school | 54794 | 7794 | 8318 |
Total | 76850 | 11067 | 11516 |
You can download the CLOTH dataset from Hugging Face or GitHub.
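After unzipping `CLOTH.zip` into `/datasets/`, each passage is stored as a JSON file. The reader sketch below is illustrative: the directory layout and file name are assumptions, and the field names follow the public CLOTH release, so adjust the path to match the unzipped archive.

```python
# Sketch: read one CLOTH passage. The path below is hypothetical; the field
# names ("article", "options", "answers") follow the public CLOTH release.
import json
from pathlib import Path

path = Path("datasets/CLOTH/train/high/high0001.json")  # hypothetical example file
with path.open(encoding="utf-8") as f:
    passage = json.load(f)

print(passage["article"][:200])  # cloze passage with numbered blanks
print(passage["options"][0])     # the four options for the first blank
print(passage["answers"][0])     # the gold answer ("A"-"D") for the first blank
```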
DGen is a cloze question dataset covering multiple domains, including science, vocabulary, common sense, and trivia. It is compiled from a wide variety of datasets, including SciQ, MCQL, AI2 Science Questions, etc. The details of the DGen dataset are shown below.
DGen dataset | Train | Valid | Test | Total |
---|---|---|---|---|
Number of questions | 2321 | 300 | 259 | 2880 |
You can download the DGen dataset from Hugging Face or GitHub.
The evaluations of these models as the Candidate Set Generator in CDGP are shown as follows:
Evaluation on the CLOTH dataset:

Models | P@1 | F1@3 | F1@10 | MRR | NDCG@10 |
---|---|---|---|---|---|
cdgp-csg-bert-cloth | 18.50 | 13.80 | 15.37 | 29.96 | 37.82 |
cdgp-csg-scibert-cloth | 8.10 | 9.13 | 12.22 | 19.53 | 28.76 |
cdgp-csg-roberta-cloth | 10.50 | 9.83 | 10.25 | 20.42 | 28.17 |
cdgp-csg-bart-cloth | 14.20 | 11.07 | 11.37 | 24.29 | 31.74 |
Evaluation on the DGen dataset:

Models | P@1 | F1@3 | MRR | NDCG@10 |
---|---|---|---|---|
cdgp-csg-bert-dgen | 10.81 | 7.72 | 18.15 | 24.47 |
cdgp-csg-scibert-dgen | 13.13 | 12.23 | 25.12 | 34.17 |
cdgp-csg-roberta-dgen | 13.13 | 9.65 | 19.34 | 24.52 |
cdgp-csg-bart-dgen | 8.49 | 8.24 | 16.01 | 22.66 |
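For reference, these are standard ranking metrics computed over the generated candidate list versus the gold distractors. The sketch below follows the common definitions with binary relevance; it is illustrative and not necessarily identical to `/test/dis_evaluator.py`.

```python
# Illustrative ranking metrics (binary relevance). The official numbers come
# from /test/dis_evaluator.py; this sketch only shows the common definitions.
import math

def f1_at_k(ranked, gold, k):
    hits = sum(1 for c in ranked[:k] if c in gold)
    precision = hits / k
    recall = hits / len(gold) if gold else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def mrr(ranked, gold):
    # Reciprocal rank of the first gold distractor in the ranked list.
    for rank, c in enumerate(ranked, start=1):
        if c in gold:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked, gold, k):
    dcg = sum(1.0 / math.log2(r + 1) for r, c in enumerate(ranked[:k], start=1) if c in gold)
    idcg = sum(1.0 / math.log2(r + 1) for r in range(1, min(len(gold), k) + 1))
    return dcg / idcg if idcg else 0.0

ranked = ["orange", "banana", "grape", "pear", "run"]  # made-up ranked candidates
gold = {"orange", "grape", "melon"}                    # made-up gold distractors
print(f1_at_k(ranked, gold, 3), mrr(ranked, gold), ndcg_at_k(ranked, gold, 10))
```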
- Clone or download this repo.
git clone https://github.com/AndyChiangSH/CDGP.git
- Move into this repo.
cd ./CDGP/
- Set up a virtual environment.
python -m venv CDGP-env
Python version: 3.8.8
- Pip install the required packages.
pip install -r requirements.txt
Our models are fine-tuned on Colab, so you can upload these Jupyter Notebooks to Colab and run them yourself!
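If you just want to see the general shape of masked-LM fine-tuning outside Colab, here is a single-example sketch. It is illustrative only and not the repo's notebook code: the example sentence, the learning rate, and the exact input format are assumptions; the notebooks in `/fine-tune/` are authoritative.

```python
# Illustrative only (not the repo's notebooks): one fine-tuning step where the
# [MASK] position is trained to predict a distractor token.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # made-up learning rate

# Hypothetical training example: stem with the blank as [MASK], answer appended,
# and a gold distractor as the training target for the masked position.
text = "I ate an [MASK] for breakfast. [SEP] apple"
distractor = "orange"

inputs = tokenizer(text, return_tensors="pt")
labels = torch.full_like(inputs["input_ids"], -100)  # -100 = ignored by the loss
mask_positions = inputs["input_ids"] == tokenizer.mask_token_id
labels[mask_positions] = tokenizer.convert_tokens_to_ids(distractor)

loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
print(float(loss))
```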
Testing is done locally, so you need to download the datasets and models first.

- Unzip the CLOTH or DGen dataset in `/datasets/`.
- The CSG models will be downloaded from Hugging Face when you run the code, so you don't have to do anything!
- If you want to use your own CSG model, put it in the new directory `/models/CSG/`.
- However, you have to download the DS models yourself.
- Then, move the DS models into the new directory `/models/DS/`.
- Run `/test/dis_generator(BERT).py` to generate distractors based on BERT.
- Run `/test/dis_generator(SciBERT).py` to generate distractors based on SciBERT.
- Run `/test/dis_generator(RoBERTa).py` to generate distractors based on RoBERTa.
- Run `/test/dis_generator(BART).py` to generate distractors based on BART.
- Check the generated results as `.json` files in `/test/results/`.
- Run `/test/dis_evaluator.py` to evaluate the generated results.
- Check the evaluations as a `.csv` file in `/test/evaluations/`.
@inproceedings{chiang-etal-2022-cdgp,
title = "{CDGP}: Automatic Cloze Distractor Generation based on Pre-trained Language Model",
author = "Chiang, Shang-Hsuan and
Wang, Ssu-Cheng and
Fan, Yao-Chung",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2022",
month = dec,
year = "2022",
address = "Abu Dhabi, United Arab Emirates",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.findings-emnlp.429",
pages = "5835--5840",
abstract = "Manually designing cloze test consumes enormous time and efforts. The major challenge lies in wrong option (distractor) selection. Having carefully-design distractors improves the effectiveness of learner ability assessment. As a result, the idea of automatically generating cloze distractor is motivated. In this paper, we investigate cloze distractor generation by exploring the employment of pre-trained language models (PLMs) as an alternative for candidate distractor generation. Experiments show that the PLM-enhanced model brings a substantial performance improvement. Our best performing model advances the state-of-the-art result from 14.94 to 34.17 (NDCG@10 score). Our code and dataset is available at https://github.com/AndyChiangSH/CDGP.",
}
- Shang-Hsuan Chiang (@AndyChiangSH)
- Ssu-Cheng Wang (@shiro-wang)
- Yao-Chung Fan (@yfan)