This repository contains the dataset and code for our ACL 2020 paper: Sentiment and Emotion help Sarcasm? A Multi-task Learning Framework for Multi-Modal Sarcasm, Sentiment and Emotion Analysis
The original MUStARD dataset was released with Towards Multimodal Sarcasm Detection (An Obviously Perfect Paper). MUStARD is a multimodal video corpus for research in automated sarcasm discovery, compiled from popular TV shows including Friends, The Golden Girls, The Big Bang Theory, and Sarcasmaholics Anonymous. It consists of audiovisual utterances annotated with sarcasm labels, and each utterance is accompanied by its context, which provides additional information on the scenario in which the utterance occurs.
We manually annotated this multi-modal MUStARD sarcasm dataset with sentiment and emotion classes, both implicit and explicit. You can download the SE-MUStARD dataset from here (text only). For the rest of the modalities, i.e., visual and acoustic, please follow this GitHub repository.
Key | Value |
---|---|
utterance | The text of the target utterance to classify. |
speaker | Speaker of the target utterance. |
context | List of utterances (in chronological order) preceding the target utterance. |
context_speakers | Respective speakers of the context utterances. |
sarcasm | Binary label for sarcasm tag. |
implicit-sentiment | Three labels for the implicit sentiment tag. |
explicit-sentiment | Three labels for the explicit sentiment tag. |
implicit-emotion | Nine labels for the implicit emotion tag. |
explicit-emotion | Nine labels for the explicit emotion tag. |
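The fields above can be sketched as a single illustrative record. The file name and the concrete sentiment/emotion label values below are assumptions for illustration only; consult the released file for the real key scheme and label inventory.

```python
import json

# One hypothetical record in the annotated file (e.g. "sarcasm_data.json").
# The utterance ID ("1_60") and all label values here are illustrative
# assumptions, not taken from the actual release.
raw = """
{
  "1_60": {
    "utterance": "It's just a privilege to watch your mind at work.",
    "speaker": "SHELDON",
    "context": ["I never would have identified the fingerprints of string theory."],
    "context_speakers": ["LEONARD"],
    "sarcasm": true,
    "implicit-sentiment": "negative",
    "explicit-sentiment": "positive",
    "implicit-emotion": "anger",
    "explicit-emotion": "joy"
  }
}
"""
data = json.loads(raw)
for uid, entry in data.items():
    # Each context utterance lines up with its speaker, index by index.
    print("{} | {} | sarcastic={}".format(uid, entry["speaker"], entry["sarcasm"]))
```

In the real file, `json.load(open(...))` on the release replaces the inline string above.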
There are two setups (speaker-dependent and speaker-independent), each with two feature archives:
- `datasetTrue_fasttext.zip`: contains only text features (fastText, 300d).
  Note: see the function `featuresExtraction_fastext(foldNum, exMode)` in `trimodal_true.py`, where `foldNum` ranges over [0-4] and `exMode = True`.
- `datasetTrue_original.zip`: contains acoustic and visual features (from here).
  Note: see the function `featuresExtraction_original(foldNum, exMode)` in `trimodal_true.py`, where `foldNum` ranges over [0-4] and `exMode = True`.
- `datasetFalse_fasttext.zip`: contains only text features (fastText, 300d).
  Note: see the function `featuresExtraction_fastext(foldNum, exMode)` in `trimodal_false.py`, where `foldNum = 3` and `exMode = False`.
- `datasetFalse_original.zip`: contains acoustic and visual features (from here).
  Note: see the function `featuresExtraction_original(foldNum, exMode)` in `trimodal_false.py`, where `foldNum = 3` and `exMode = False`.
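The four archives feed two extraction entry points. The sketch below simply enumerates the `(function, foldNum, exMode)` calls described above; function and script names are taken from this README, and the repo's scripts themselves target Python 2.7.

```python
# Enumerate the extraction calls for both setups (a sketch, not the
# repository's own code -- it only mirrors the parameters listed above).
calls = []
# Speaker-dependent setup (trimodal_true.py): five folds, exMode = True.
for fold_num in range(5):
    calls.append(("trimodal_true.py", "featuresExtraction_fastext", fold_num, True))
    calls.append(("trimodal_true.py", "featuresExtraction_original", fold_num, True))
# Speaker-independent setup (trimodal_false.py): foldNum = 3, exMode = False.
calls.append(("trimodal_false.py", "featuresExtraction_fastext", 3, False))
calls.append(("trimodal_false.py", "featuresExtraction_original", 3, False))

for script, func, fold_num, ex_mode in calls:
    print("{}: {}(foldNum={}, exMode={})".format(script, func, fold_num, ex_mode))
```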
Download all the feature archives, place them in the folder **feature_extraction**, and then run the code.
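A minimal sketch of that step, assuming the four archives have been downloaded into the current directory (any archive not yet present is skipped):

```shell
# Create the target folder and extract each downloaded archive into it.
mkdir -p feature_extraction
for f in datasetTrue_fasttext.zip datasetTrue_original.zip \
         datasetFalse_fasttext.zip datasetFalse_original.zip; do
  if [ -f "$f" ]; then unzip -o "$f" -d feature_extraction/; fi
done
```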
There are two training setups, as follows:
- Five-fold speaker-dependent weights: `python2 trimodal_true.py`
- Speaker-independent weights: `python2 trimodal_false.py`
Please cite the following paper if you find this dataset useful in your research:
@inproceedings{chauhan-etal-2020-sentiment,
title = "Sentiment and Emotion help Sarcasm? A Multi-task Learning Framework for Multi-Modal Sarcasm, Sentiment and Emotion Analysis",
author = "Chauhan, Dushyant Singh and
S R, Dhanush and
Ekbal, Asif and
Bhattacharyya, Pushpak",
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.acl-main.401",
pages = "4351--4360",
}
Requirements:
- python: 2.7
- keras: 2.2.8
- tensorflow: 1.9.0