MuSe-CarASTE: A comprehensive dataset for aspect sentiment triplet extraction in automotive review videos
Code and data for the paper "MuSe-CarASTE: A comprehensive dataset for aspect sentiment triplet extraction in automotive review videos".
Aspect Sentiment Triplet Extraction (ASTE) annotations for the MuSe-Car dataset, a multi-modal dataset consisting of many hours of YouTube video footage and transcripts of automotive vehicle reviews, mainly in English. It is the largest dataset repository curated for ASTE.
This repository addresses Aspect Sentiment Triplet Extraction (ASTE) in the automotive review domain. ASTE was introduced by Peng et al. [1] and is one of the 7 sub-tasks of aspect-based sentiment analysis (ABSA). It gives a complete picture of a product by extracting triplets from review sentences. Each triplet <a, o, s> consists of an aspect a, an opinion o, and a sentiment s. For example, from the sentence "the gearbox is rubbish", the triplet (gearbox, rubbish, NEG) is extracted. Sometimes an additional aspect category is also predicted.
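To make the triplet structure concrete, here is a minimal Python sketch; the class and field names are illustrative only, not the dataset's actual schema:

from dataclasses import dataclass

@dataclass
class Triplet:
    aspect: str     # aspect term, e.g. "gearbox"
    opinion: str    # opinion term, e.g. "rubbish"
    sentiment: str  # polarity label, e.g. "POS", "NEG", "NEU"

sentence = "the gearbox is rubbish"
triplets = [Triplet(aspect="gearbox", opinion="rubbish", sentiment="NEG")]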
Benchmark: the current benchmark for Aspect Sentiment Triplet Extraction is ASTE-V2 [9], i.e., the four SemEval datasets (14lap, 14res, 15res, 16res) combined.
This dataset can be applied to:
Aspect Sentiment Triplet Extraction (ASTE)
Aspect Based Sentiment Analysis (ABSA)
Target Aspect Sentiment Detection (TASD)
Aspect Category Classification (ACC)
Sentiment Classification
Opinion Mining
If you use this dataset, cite the following paper, star our repo, follow the instructions below, and cite the additional relevant citations where applicable:
Main paper:
- Muse-ASTE:
@article{USMANI2025125695,
title = {MuSe-CarASTE: A comprehensive dataset for aspect sentiment triplet extraction in automotive review videos},
journal = {Expert Systems with Applications},
volume = {262},
pages = {125695},
year = {2025},
issn = {0957-4174},
doi = {https://doi.org/10.1016/j.eswa.2024.125695},
url = {https://www.sciencedirect.com/science/article/pii/S0957417424025624},
author = {Atiya Usmani and Saeed {Hamood Alsamhi} and Muhammad {Jaleed Khan} and John Breslin and Edward Curry}
}
Additional citations:
- Original Muse-Dataset and MuSe-challenge:
@article{stappen2021multimodal,
title={The multimodal sentiment analysis in car reviews (muse-car) dataset: Collection, insights and improvements},
author={Stappen, Lukas and Baird, Alice and Schumann, Lea and Schuller, Bj{\"o}rn},
journal={IEEE Transactions on Affective Computing},
volume={14},
number={2},
pages={1334--1350},
year={2021},
publisher={IEEE}
}
@inproceedings{stappen2020muse,
title={Muse 2020 challenge and workshop: Multimodal sentiment analysis, emotion-target engagement and trustworthiness detection in real-life media: Emotional car reviews in-the-wild},
author={Stappen, Lukas and Baird, Alice and Rizos, Georgios and Tzirakis, Panagiotis and Du, Xinchen and Hafner, Felix and Schumann, Lea and Mallol-Ragolta, Adria and Schuller, Bj{\"o}rn W and Lefter, Iulia and others},
booktitle={Proceedings of the 1st International on Multimodal Sentiment Analysis in Real-life Media Challenge and Workshop},
pages={35--44},
year={2020}
}
If you use any of the baseline code (BMRC [7], BART-ABSA [5], Span-ASTE [6], GAS [8]), cite the corresponding papers too, where relevant.
- The main gold dataset and annotations are in the dataset folder. The processed dataset for comparison with the baselines is in the processed dataset folder (see the parsing sketch after the baseline list below).
- The text transcripts are included with permission from MuSe [2]. Citing our paper [10] (provided under Research Citation) is mandatory; please also cite the MuSe dataset [2] and challenge [4].
- Code for the baselines is taken from the original repositories and adapted to our dataset; the original repositories are cited in the respective README files. If you use any baseline code (see the baselines below), cite it. We also provide our experimental settings and environment file.
- Additionally, if you want to do supervised topic modelling or ACC (Aspect Category Classification), go to the primary dataset MuSe-Car 2020 [2,3] (link in [3]) and acquire the MuSe-Topic dataset to get access to the topic/category labels that may be of interest to you.
- The baselines run on both our dataset and the SemEval datasets; hence, all four SemEval datasets [9] are also contained in the repository.
Baselines:
- BMRC: machine reading comprehension based approach [7].
- GAS: generative approach based on a large language model (T5) [8].
- BART-ABSA: pointer-based index generation/prediction approach; predicts the start and end positions of a tag [5].
- Span-ASTE: tagging-based span prediction method [6].
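The SemEval ASTE-V2 files [9] store one sentence per line, followed by "####" and the triplets as token-index spans. A minimal reader for that layout could look like the sketch below; verify it against the actual files in the dataset and processed dataset folders before relying on it, since the MuSe-CarASTE processed files may differ in detail.

import ast

def parse_aste_line(line: str):
    # Split the sentence from the triplet list, which is a Python literal.
    sentence, raw = line.rstrip("\n").split("####")
    tokens = sentence.split()
    triplets = []
    for a_idx, o_idx, polarity in ast.literal_eval(raw):
        aspect = " ".join(tokens[i] for i in a_idx)
        opinion = " ".join(tokens[i] for i in o_idx)
        triplets.append((aspect, opinion, polarity))
    return sentence, triplets

# Toy line in the ASTE-V2 layout; the real files live in the dataset folders.
print(parse_aste_line("the gearbox is rubbish####[([1], [3], 'NEG')]"))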
Added November 11: the demo gives you a sneak peek into one of our ASTE knowledge graphs and lets you play with it. We created topic and ASTE labels for one car in our dataset (alternatively, you can substitute segment-wise topic labels from the primary dataset for more) and implemented an online graph inspection demo using Streamlit. It also gives an insight into the aspect, sentiment, and opinion annotations.
Requirements:
pip install pandas
pip install matplotlib
pip install streamlit
pip install networkx
pip install streamlit-extras
pip install scipy
To run the (demo) code:
streamlit run demo/demo.py
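For orientation, a toy Streamlit + networkx viewer in the same spirit as the demo might look like the following sketch. The actual implementation is demo/demo.py; the triplets and widgets here are made up for illustration and do not mirror its code.

# toy_graph_viewer.py -- run with: streamlit run toy_graph_viewer.py
import matplotlib.pyplot as plt
import networkx as nx
import streamlit as st

# Toy triplets in the (aspect, opinion, sentiment) form described above.
triplets = [
    ("gearbox", "rubbish", "NEG"),
    ("interior", "premium", "POS"),
]

st.title("ASTE knowledge graph (toy example)")
wanted = st.multiselect("Sentiment filter", ["POS", "NEG", "NEU"],
                        default=["POS", "NEG", "NEU"])

# Build a graph with one aspect-opinion edge per triplet that passes the filter.
G = nx.Graph()
for aspect, opinion, sentiment in triplets:
    if sentiment in wanted:
        G.add_edge(aspect, opinion, sentiment=sentiment)

fig, ax = plt.subplots()
nx.draw_networkx(G, ax=ax, node_color="lightblue")
st.pyplot(fig)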
References:
[1] Haiyun Peng, Lu Xu, Lidong Bing, Fei Huang, Wei Lu, and Luo Si. Knowing what, how and why: A near complete solution for aspect-based sentiment analysis. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 8600–8607, 2020.
[2] Stappen, Lukas, Alice Baird, Lea Schumann, and Björn Schuller. "The multimodal sentiment analysis in car reviews (muse-car) dataset: Collection, insights and improvements." IEEE Transactions on Affective Computing (2021).
[3] MuSe 2020 - ACM MM 2020. Available at: https://sites.google.com/view/muse2020 (Accessed: 07 November 2023).
[4] Stappen, L., Baird, A., Rizos, G., Tzirakis, P., Du, X., Hafner, F., Schumann, L., Mallol-Ragolta, A., Schuller, B.W., Lefter, I. and Cambria, E., 2020, October. Muse 2020 challenge and workshop: Multimodal sentiment analysis, emotion-target engagement and trustworthiness detection in real-life media: Emotional car reviews in-the-wild. In Proceedings of the 1st International on Multimodal Sentiment Analysis in Real-life Media Challenge and Workshop (pp. 35-44).
[5] Yan, H., Dai, J., Ji, T., Qiu, X., & Zhang, Z. (2021, August). A Unified Generative Framework for Aspect-based Sentiment Analysis. In C. Zong, F. Xia, W. Li, & R. Navigli (Eds.), Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 2416–2429). doi:10.18653/v1/2021.acl-long.188
[6] Xu, L., Chia, Y. K., & Bing, L. (2021, August). Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction. In C. Zong, F. Xia, W. Li, & R. Navigli (Eds.), Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 4755–4766). doi:10.18653/v1/2021.acl-long.367
[7] Chen, S., Wang, Y., Liu, J., & Wang, Y. (2021, May). Bidirectional machine reading comprehension for aspect sentiment triplet extraction. In Proceedings of the AAAI conference on artificial intelligence (Vol. 35, No. 14, pp. 12666-12674).
[8] Zhang, W., Li, X., Deng, Y., Bing, L., & Lam, W. (2021, August). Towards generative aspect-based sentiment analysis. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers) (pp. 504-510).
[9] xuuuluuu (2020). SemEval-Triplet-data: Aspect Sentiment Triplet Extraction (ASTE) dataset in AAAI 2020, EMNLP 2020 and ACL 2021. GitHub. https://github.com/xuuuluuu/SemEval-Triplet-data (Accessed 12 Mar. 2024).
[10] Usmani, A., Alsamhi, S.H., Khan, M.J., Breslin, J. and Curry, E., 2025. MuSe-CarASTE: A comprehensive dataset for aspect sentiment triplet extraction in automotive review videos. Expert Systems with Applications, 262, p.125695.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License for noncommercial (academic and research) purposes only and must not be used for any other purpose without the authors' explicit permission.