To download female utterances (zip file):
wget -O female.zip "https://www.dropbox.com/s/4t6mep8mo4yf81f/female.zip?dl=0"
To download male utterances (zip file):
wget -O male.zip "https://www.dropbox.com/s/xfi3hi927yxixa9/male.zip?dl=0"
To download labels & transcripts (json file):
wget https://github.com/pariajm/sharif-emotional-speech-dataset/raw/master/shemo.json
Credits to Mehrdad Farahani for the following:
- Speech emotion detection in Persian (fa) using wav2vec 2.0
- Speech emotion detection in Persian (fa) using HuBERT
- Speech gender detection in Persian (fa) using HuBERT
- Automatic speech recognition in Persian (fa) using XLSR-53
Feature | Status |
---|---|
access | open source |
language | Persian (fa) |
modality | speech |
duration | 3 hours and 25 minutes |
#utterances | 3000 |
#speakers | 87 (31 females, 56 males) |
#emotions | 5 basic emotions (anger, fear, happiness, sadness and surprise) and neutral state |
orthographic transcripts | available |
phonetic transcripts | available |
Read our paper on Springer or arXiv.
The characters used in the filenames and their corresponding meaning:
- A : angry
- F : female speaker (if used at the beginning of the label, e.g. F14A09) or fearful (if used in the middle of the label, e.g. M02F01)
- H : happy
- M : male speaker
- N : neutral
- S : sad
- W : surprised
e.g. F03S02
F means the speaker is female, 03 is the speaker code, S is the underlying emotion of the utterance (sadness), and 02 means this is the speaker's second utterance with the sad emotion.
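The naming scheme above can be decoded programmatically. The following is a minimal sketch, not part of the dataset tooling: the function name `parse_utterance_id` and the returned field names are hypothetical, the emotion words are taken from the character list above (the labels stored in `shemo.json` may be spelled differently), and the pattern assumes every ID has the form gender letter, speaker digits, emotion letter, utterance digits.

```python
import re

# Mappings taken from the character list in this README.
GENDERS = {"F": "female", "M": "male"}
EMOTIONS = {
    "A": "angry",
    "F": "fearful",
    "H": "happy",
    "N": "neutral",
    "S": "sad",
    "W": "surprised",
}

# Assumed layout: <gender><speaker digits><emotion><utterance digits>
ID_PATTERN = re.compile(r"([FM])(\d+)([AFHNSW])(\d+)")


def parse_utterance_id(utterance_id):
    """Split an ID such as 'F03S02' into its parts (hypothetical helper)."""
    match = ID_PATTERN.fullmatch(utterance_id)
    if match is None:
        raise ValueError(f"unexpected utterance ID: {utterance_id!r}")
    gender, speaker, emotion, index = match.groups()
    return {
        "speaker_id": gender + speaker,  # e.g. 'F03'
        "gender": GENDERS[gender],       # first letter is always the gender
        "emotion": EMOTIONS[emotion],    # letter in the middle is the emotion
        "utterance_index": int(index),   # per-speaker, per-emotion counter
    }
```

Because the position decides the meaning, `F` is read as the gender when it is the first character and as the emotion when it follows the speaker code, so `M02F01` decodes to a male speaker with a fearful utterance.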
Here is a sample of data instances:
"F21N37": {
"speaker_id": "F21",
"gender": "female",
"emotion": "neutral",
"transcript": "مگه من به تو نگفته بودم که باید راجع به دورانت سکوت کنی؟",
"ipa": "mӕge mæn be to nægofte budӕm ke bɑyæd rɑdʒeʔ be dorɑnt sokut koni"
}
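As a rough illustration of working with records like the one above, the sketch below loads the label file and tallies utterances by a chosen field. It assumes that `shemo.json` is a single JSON object mapping utterance IDs to records shaped like the sample; the helper names `load_shemo` and `count_by` are hypothetical and not part of the dataset.

```python
import json
from collections import Counter
from pathlib import Path


def load_shemo(path="shemo.json"):
    """Load the label file; assumes a top-level {utterance_id: record} object."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def count_by(records, field):
    """Count records per value of `field`, e.g. 'emotion' or 'gender'."""
    return Counter(entry[field] for entry in records.values())


# Example usage, once shemo.json has been downloaded:
# records = load_shemo()
# print(count_by(records, "emotion"))
# print(count_by(records, "gender"))
```

Opening the file with `encoding="utf-8"` matters here because the `transcript` and `ipa` fields contain non-ASCII characters.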
Click here to download the paper.
If you use this dataset, please cite the following paper:
@Article{MohamadNezami2019,
author = {Mohamad Nezami, Omid and Jamshid Lou, Paria and Karami, Mansoureh},
title = {ShEMO: a large-scale validated database for Persian speech emotion detection},
journal = {Language Resources and Evaluation},
year = {2019},
volume = {53},
number = {1},
pages = {1--16},
issn = {1574-0218},
doi = {10.1007/s10579-018-9427-x},
url = {https://doi.org/10.1007/s10579-018-9427-x}
}
Paria Jamshid Lou paria.jamshid-lou@hdr.mq.edu.au
Omid Mohamad Nezami omid.mohamad-nezami@hdr.mq.edu.au