In recent years, interpersonal attraction in conversation has been studied extensively.
Creating figurative language generation modules can make chatbots more human-like.
Some recent studies have proposed chatbots that generate sarcasm.
However, they do not focus on generating rhetorical questions (RQs).
Chatbots need to generate RQs to be more human-like because RQs are commonly used in daily conversation
and social media dialog.
RQs are questions that are not meant to obtain an answer.
People usually use them to express their opinions in conversation. However, a question cannot be recognized as an RQ if
the answer to the question is known only to the speaker.
To recognize a question as an RQ, the listener needs to rely on knowledge shared with the speaker.
Furthermore, RQs are closely related to irony, so they are typically used to express negative opinions.
Questions based on valence-reversed commonsense knowledge can be easily recognized as RQs because both the speaker and
the listener know that their answers are negative.
For example, the commonsense knowledge "Giving money to the poor will make a good world"
can be converted into an RQ: "Will giving money to the rich make a good world?"
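The conversion in the example above can be illustrated with a toy sketch. This is not the method used in this project: the antonym table, the function names, and the "Will ... ?" template are all hypothetical and only reproduce the single example from the text.

```python
import re

# Toy antonym table; a hypothetical stand-in for a real valence lexicon.
ANTONYMS = {"poor": "rich", "rich": "poor", "good": "bad", "bad": "good"}


def reverse_valence(phrase: str, target: str) -> str:
    """Replace the whole word `target` in `phrase` with its antonym."""
    if target not in ANTONYMS:
        raise KeyError(f"no antonym known for {target!r}")
    return re.sub(rf"\b{re.escape(target)}\b", ANTONYMS[target], phrase)


def to_rhetorical_question(cause: str, effect: str, target: str) -> str:
    """Build 'Will <valence-reversed cause> <effect>?' from a cause/effect pair."""
    reversed_cause = reverse_valence(cause, target)
    return f"Will {reversed_cause} {effect}?"


if __name__ == "__main__":
    # Commonsense knowledge: "Giving money to the poor will make a good world"
    print(to_rhetorical_question("giving money to the poor", "make a good world", "poor"))
```

Reversing the valence of one word in the cause while keeping the effect unchanged is what makes the expected answer negative for both speaker and listener.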
This study aims to generate negative-answer RQs from valence-reversed commonsense knowledge sentences to make the chatbot more appropriate and human-like in a conversation. Additionally, we use a situation classifier that analyzes the previous context to decide when to generate a literal response, a sarcastic response, or an RQ.
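The role of the situation classifier can be sketched as a simple dispatch step. This is a minimal illustration under stated assumptions: the `Situation` labels, the `respond` function, and the stub classifier and generators are hypothetical names, not the interfaces of this repository.

```python
from enum import Enum
from typing import Callable, Dict


class Situation(Enum):
    """Response types named in the text; the string values are hypothetical labels."""
    LITERAL = "literal"
    SARCASM = "sarcasm"
    RQ = "rq"


def respond(context: str,
            classify: Callable[[str], Situation],
            generators: Dict[Situation, Callable[[str], str]]) -> str:
    """Classify the previous context, then call the matching generator."""
    situation = classify(context)
    return generators[situation](context)


if __name__ == "__main__":
    # Stub classifier and generators, for illustration only.
    def stub_classify(context: str) -> Situation:
        return Situation.RQ if context.endswith("!") else Situation.LITERAL

    stub_generators = {
        Situation.LITERAL: lambda c: f"[literal reply to] {c}",
        Situation.SARCASM: lambda c: f"[sarcastic reply to] {c}",
        Situation.RQ: lambda c: f"[rhetorical question about] {c}",
    }
    print(respond("I love waiting in line!", stub_classify, stub_generators))
```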
For more information, see the report (2 pages) and the thesis for the Degree of Master of Engineering (53 pages).
- Clone this project:
git clone https://github.com/sun510001/RQ_Chatbot.git
cd RQ_Chatbot/
- Install the environment
conda env create -f environment.yml
- Download models
- Transformers pre-trained models
cd Situation_Classification_for_SRL/
python run_preproc.py
cd ..
- Fine-tuned models
Download the files from https://drive.google.com/drive/folders/1XlXAV2fIEeTSwyBMx0dKCVevsA3XfWM_?usp=sharing
mv Master_research_model/roberta-base_model Situation_Classification_for_SRL/data/
mv Master_research_model/bert-base-uncased-model RQ_generator/data/
- Download the sarcasm generation module, set it up by following its README.md, and then replace its files with those in sg_file/.
git clone https://github.com/tuhinjubcse/SarcasmGeneration-ACL2020.git
cd SarcasmGeneration-ACL2020/
cat README.md
# do settings...
mv ../sg_file/* .
cd ..
- Set up the RQ generator module.
- Download bert-gec
cd RQ_generator/
git clone https://github.com/kanekomasahiro/bert-gec.git
- Download the commonsense knowledge representation model for scoring arbitrary tuples.
cd data/
wget https://ttic.uchicago.edu/~kgimpel/comsense_resources/ckbc-demo.tar.gz
tar -xvzf ckbc-demo.tar.gz
rm ckbc-demo.tar.gz
- Download stanford-parser-4.2.0.zip
wget https://nlp.stanford.edu/software/stanford-parser-4.2.0.zip
unzip stanford-parser-4.2.0.zip
rm stanford-parser-4.2.0.zip
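Before running the chatbot, it may help to check that the setup steps above left everything where the later commands expect it. The following is a minimal sketch, not part of the repository: `EXPECTED_PATHS` and `missing_paths` are hypothetical names, and the paths are inferred from the `mv` and `git clone` commands above, relative to the `RQ_Chatbot/` root.

```python
from pathlib import Path
from typing import List

# Paths inferred from the setup commands above (assumption: run from RQ_Chatbot/).
EXPECTED_PATHS = [
    "Situation_Classification_for_SRL/data/roberta-base_model",
    "RQ_generator/data/bert-base-uncased-model",
    "RQ_generator/bert-gec",
    "SarcasmGeneration-ACL2020",
]


def missing_paths(root: str = ".") -> List[str]:
    """Return the expected paths that do not exist under `root`."""
    base = Path(root)
    return [p for p in EXPECTED_PATHS if not (base / p).exists()]


if __name__ == "__main__":
    missing = missing_paths()
    if missing:
        print("Missing after setup:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All expected paths are present.")
```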
After completing the preparations, you can run run_chatbot.py directly.
python run_chatbot.py
The evaluation mode can output all types of responses in any situation of a conversation.
- In every Python file run below, uncomment the code under `predict/for evaluation` and comment out the code under `for chatbot`.
- Generate literal responses
- Run the literal generator
python run_generate_evaluation.py
- Generate the situation classification
cd Situation_Classification_for_SRL/
python run_predict.py
cd ..
- Generate the sarcastic responses
git clone https://github.com/tuhinjubcse/SarcasmGeneration-ACL2020.git
cd SarcasmGeneration-ACL2020/
cat README.md
# do settings...
mv ../sg_file/* .
cd ..
- Change conda_path in generate_sarcasm.py to your Python environment path
cd SarcasmGeneration-ACL2020/
vim generate_sarcasm.py
# inside generate_sarcasm.py, set: conda_path = '/home/aquamarine/sunqifan/anaconda3/envs/r_cla/bin/python3.6'
python generate_sarcasm.py
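To find the value to use for conda_path, you can ask the interpreter of your activated environment for its own location. This is a small sketch using the standard `sys.executable` attribute; the helper name `current_interpreter` is hypothetical.

```python
import sys


def current_interpreter() -> str:
    """Return the absolute path of the Python interpreter running this script."""
    return sys.executable


if __name__ == "__main__":
    # Run this inside the activated conda environment, then copy the printed line.
    print(f"conda_path = {current_interpreter()!r}")
```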
- Generate the RQ responses
python run_train_classifier.py
- If your RAM or GPU memory is too small to process the whole dataset at once, you can run it in parts by changing lines 153 and 203-217 of run_train_classifier.py.
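The general idea of running the data in parts can be sketched as follows. This is a generic illustration, not the code at lines 153 and 203-217 of run_train_classifier.py; `iter_chunks` and the chunk size are hypothetical.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_chunks(rows: Sequence[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `rows`, each with at most `chunk_size` items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(rows), chunk_size):
        yield list(rows[start:start + chunk_size])


if __name__ == "__main__":
    contexts = [f"context {i}" for i in range(5)]
    for part_number, part in enumerate(iter_chunks(contexts, 2)):
        # Each part could be processed and saved before the next one is loaded.
        print(part_number, part)
```

Processing one part at a time keeps peak memory proportional to the chunk size rather than to the whole dataset.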
If you want to train the models for the situation classification and the RQ generator yourself, follow the steps below.
- We use the Twitter and Reddit data for the Shared Task as the dataset.
- The pre-processed dataset is sarcasm_merge_triple_v8.csv in Situation_Classification_for_SRL/data/.
- For the situation classifier, you can set the type of model to train in __init__.py/TrainModelConfig.
cd Situation_Classification_for_SRL/
python run_train.py
- For the RQ generator, you can set the type of model to train in __init__.py/TrainModelConfigV2.
cd RQ_generator/
python run_train.py
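Before training, it can be useful to look at the first rows of the pre-processed dataset mentioned above. The sketch below uses only the standard `csv` module and makes no assumption about column names; `preview_rows` is a hypothetical helper, and UTF-8 encoding is an assumption.

```python
import csv
from itertools import islice
from pathlib import Path
from typing import Iterable, List


def preview_rows(lines: Iterable[str], n_rows: int = 3) -> List[List[str]]:
    """Return the header row plus the first `n_rows` data rows of CSV text."""
    return list(islice(csv.reader(lines), n_rows + 1))


if __name__ == "__main__":
    path = Path("Situation_Classification_for_SRL/data/sarcasm_merge_triple_v8.csv")
    if path.exists():
        # Assumption: the file is UTF-8 encoded.
        with path.open(newline="", encoding="utf-8") as handle:
            for row in preview_rows(handle):
                print(row)
    else:
        print(f"{path} not found; run this from the RQ_Chatbot/ root.")
```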
Please email me at sqf121@gmail.com with any problems or questions. You can also raise issues on GitHub or suggest improvements. Please leave a star and cite us if you use our code, data, or thesis.
@misc{weko_9919_1,
author = "Sun,Qifan",
title = "Automatic Generation of Rhetorical Questions and Its Application to a Chatbot"
}