This repository contains the dataset and code for our ACL 2020 paper: Sentiment and Emotion help Sarcasm? A Multi-task Learning Framework for Multi-Modal Sarcasm, Sentiment and Emotion Analysis
The original MUStARD dataset was released with Towards Multimodal Sarcasm Detection (An Obviously Perfect Paper). MUStARD is a multimodal video corpus for research in automated sarcasm discovery, compiled from popular TV shows including Friends, The Golden Girls, The Big Bang Theory, and Sarcasmaholics Anonymous. It consists of audiovisual utterances annotated with sarcasm labels, and each utterance is accompanied by its context, which provides additional information on the scenario in which the utterance occurs.
We manually annotated this multi-modal MUStARD sarcasm dataset with sentiment and emotion classes, both implicit and explicit. You can download the SE-MUStARD dataset from here (text only). For the rest of the modalities, i.e., visual and acoustic, please follow this GitHub repository.
Key | Value |
---|---|
utterance | The text of the target utterance to classify. |
speaker | Speaker of the target utterance. |
context | List of utterances (in chronological order) preceding the target utterance. |
context_speakers | Respective speakers of the context utterances. |
sarcasm | Binary label for sarcasm tag. |
implicit-sentiment | Three labels for the implicit sentiment tag. |
explicit-sentiment | Three labels for the explicit sentiment tag. |
implicit-emotion | Nine labels for the implicit emotion tag. |
explicit-emotion | Nine labels for the explicit emotion tag. |
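The fields above can be sketched as a single illustrative record. The file name and the concrete sentiment/emotion label values below are assumptions for illustration only; consult the released file for the real key scheme and label inventory.

```python
import json

# One hypothetical record in the annotated file (e.g. "sarcasm_data.json").
# The utterance ID ("1_60") and all label values here are illustrative
# assumptions, not taken from the actual release.
raw = """
{
  "1_60": {
    "utterance": "It's just a privilege to watch your mind at work.",
    "speaker": "SHELDON",
    "context": ["I never would have identified the fingerprints of string theory."],
    "context_speakers": ["LEONARD"],
    "sarcasm": true,
    "implicit-sentiment": "negative",
    "explicit-sentiment": "positive",
    "implicit-emotion": "anger",
    "explicit-emotion": "joy"
  }
}
"""
data = json.loads(raw)
for uid, entry in data.items():
    # Each context utterance lines up with its speaker, index by index.
    print("{} | {} | sarcastic={}".format(uid, entry["speaker"], entry["sarcasm"]))
```

In the real file, `json.load(open(...))` on the release replaces the inline string above.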
There are two setups (speaker-dependent and speaker-independent), each with two feature archives:
- `datasetTrue_fasttext.zip`: contains only text features (fastText, 300d).
  Note: see the function `featuresExtraction_fastext(foldNum, exMode)` in `trimodal_true.py`, where `foldNum` ranges over [0-4] and `exMode = True`.
- `datasetTrue_original.zip`: contains acoustic and visual features (from here).
  Note: see the function `featuresExtraction_original(foldNum, exMode)` in `trimodal_true.py`, where `foldNum` ranges over [0-4] and `exMode = True`.
- `datasetFalse_fasttext.zip`: contains only text features (fastText, 300d).
  Note: see the function `featuresExtraction_fastext(foldNum, exMode)` in `trimodal_false.py`, where `foldNum = 3` and `exMode = False`.
- `datasetFalse_original.zip`: contains acoustic and visual features (from here).
  Note: see the function `featuresExtraction_original(foldNum, exMode)` in `trimodal_false.py`, where `foldNum = 3` and `exMode = False`.
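The four archives feed two extraction entry points. The sketch below simply enumerates the `(function, foldNum, exMode)` calls described above; function and script names are taken from this README, and the repo's scripts themselves target Python 2.7.

```python
# Enumerate the extraction calls for both setups (a sketch, not the
# repository's own code -- it only mirrors the parameters listed above).
calls = []
# Speaker-dependent setup (trimodal_true.py): five folds, exMode = True.
for fold_num in range(5):
    calls.append(("trimodal_true.py", "featuresExtraction_fastext", fold_num, True))
    calls.append(("trimodal_true.py", "featuresExtraction_original", fold_num, True))
# Speaker-independent setup (trimodal_false.py): foldNum = 3, exMode = False.
calls.append(("trimodal_false.py", "featuresExtraction_fastext", 3, False))
calls.append(("trimodal_false.py", "featuresExtraction_original", 3, False))

for script, func, fold_num, ex_mode in calls:
    print("{}: {}(foldNum={}, exMode={})".format(script, func, fold_num, ex_mode))
```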
Download all the feature archives, place them in the folder **feature_extraction**, and then run the code.
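A minimal sketch of that step, assuming the four archives have been downloaded into the current directory (any archive not yet present is skipped):

```shell
# Create the target folder and extract each downloaded archive into it.
mkdir -p feature_extraction
for f in datasetTrue_fasttext.zip datasetTrue_original.zip \
         datasetFalse_fasttext.zip datasetFalse_original.zip; do
  if [ -f "$f" ]; then unzip -o "$f" -d feature_extraction/; fi
done
```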
There are two training setups, as follows:
- Five-fold speaker-dependent weights: `python2 trimodal_true.py`
- Speaker-independent weights: `python2 trimodal_false.py`
Please cite the following paper if you find this dataset useful in your research:
@inproceedings{chauhan-etal-2020-sentiment,
title = "Sentiment and Emotion help Sarcasm? A Multi-task Learning Framework for Multi-Modal Sarcasm, Sentiment and Emotion Analysis",
author = "Chauhan, Dushyant Singh and
S R, Dhanush and
Ekbal, Asif and
Bhattacharyya, Pushpak",
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.acl-main.401",
pages = "4351--4360",
}
Requirements:
- python: 2.7
- keras: 2.2.8
- tensorflow: 1.9.0