Automatically generated CSV and README from JSON update

SuperKogito · Sep 29, 2024 · 57936b1
1 parent bee7bc3

Showing 2 changed files with 3 additions and 1 deletion.
diff --git a/README.md b/README.md
@@ -1,4 +1,4 @@
-***Speech Emotion Recognition (SER) Datasets:*** *A collection of datasets (count=75) for the purpose of emotion recognition/detection in speech.
+***Speech Emotion Recognition (SER) Datasets:*** *A collection of datasets (count=76) for the purpose of emotion recognition/detection in speech.
 The table is chronologically ordered and includes a description of the content of each dataset along with the emotions included.
 The table can be browsed, sorted and searched under https://superkogito.github.io/SER-datasets/*
 | Dataset                                                                                                                                           | Year            | Content                                                                                                                                                                                                                                                                          | Emotions                                                                                                                                                                                                                                                                     | Format                        | Size                 | Language                                                          | Paper                                                                                                                                                                                                                                                                                                                                                     | Access                    | License                                                                                                                                        |
@@ -9,6 +9,7 @@ The table can be browsed, sorted and searched under https://superkogito.github.i
 | <sub>[CAVES](https://rds.westernsydney.edu.au/Institutes/MARCS/2024/Christopher_Davis/)</sub>                                                     | <sub>2023</sub> | <sub>Full hd visual recordings of 10 native cantonese speakers uttering 50 sentences.</sub>                                                                                                                                                                                      | <sub>Anger, happiness, sadness, surprise, fear, disgust and neutral</sub>                                                                                                                                                                                                    | <sub>Audio</sub>              | <sub>47 GB</sub>     | <sub>Chinese (cantonese)</sub>                                    | <sub>[A Cantonese Audio-Visual Emotional Speech (CAVES) dataset](https://link.springer.com/article/10.3758/s13428-023-02270-7)</sub>                                                                                                                                                                                                                      | <sub>Open</sub>           | <sub>Available for research purposes only</sub>                                                                                                |
 | <sub>[BANSpEmo](https://data.mendeley.com/datasets/rdwn4bs5ky/2)</sub>                                                                            | <sub>2023</sub> | <sub>792 utterance recordings from 22 unprofessional speakers (11 males and 11 females) of six basic emotional reactions of two sets of sentences.</sub>                                                                                                                         | <sub>angry, disgusted, happy, surprised, sad, fear</sub>                                                                                                                                                                                                                     | <sub>Audio</sub>              | <sub>0.555 GB</sub>  | <sub>Bangla</sub>                                                 | <sub>[BANSpEmo: A Bangla Emotional Speech Recognition Dataset](https://arxiv.org/abs/2312.14020)</sub>                                                                                                                                                                                                                                                    | <sub>Open</sub>           | <sub>[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)</sub>                                                                           |
 | <sub>[KBES](https://data.mendeley.com/datasets/vsn37ps3rx/4)</sub>                                                                                | <sub>2023</sub> | <sub>900 audio signals from 35 actors (20 females and 15 males). Each emotion is represented with two intensity levels (low & high)</sub>                                                                                                                                        | <sub>angry, disgusted, happy, neutral, sad</sub>                                                                                                                                                                                                                             | <sub>Audio</sub>              | <sub>0.337 GB</sub>  | <sub>Bangla</sub>                                                 | <sub>[KBES: A dataset for realistic Bangla speech emotion recognition with intensity level](https://www.sciencedirect.com/science/article/pii/S2352340923008107)</sub>                                                                                                                                                                                    | <sub>Open</sub>           | <sub>[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)</sub>                                                                           |
+| <sub>[RESD](https://huggingface.co/datasets/Aniemore/resd_annotated)</sub>                                                                        | <sub>2022</sub> | <sub>Russian emotional speech dialogue dataset ~3.5 hours of actor-voiced dialogues, each ~3 minutes long, with speech files (16000 or 44100Hz), with speech-to-text transcripts</sub>                                                                                           | <sub>anger, disgust, fear, enthusiasm, happiness, neutral, sadness</sub>                                                                                                                                                                                                     | <sub>Audio</sub>              | <sub>0.48 GB</sub>   | <sub>Russian</sub>                                                | <sub>[EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark](https://arxiv.org/abs/2406.07162)</sub>                                                                                                                                                                                                                         | <sub>Open</sub>           | <sub>[MIT](https://choosealicense.com/licenses/mit/)</sub>                                                                                     |
 | <sub>[Hi, KIA](https://zenodo.org/records/7091465)</sub>                                                                                          | <sub>2022</sub> | <sub>A shared short Wakeup Word database focusing on perceived emotion in speech The dataset contains 488 Wakeup Word speech</sub>                                                                                                                                               | <sub>angry, happy, sad, neutral</sub>                                                                                                                                                                                                                                        | <sub>Audio</sub>              | <sub>0.75 GB</sub>   | <sub>Korean</sub>                                                 | <sub>[Hi, KIA: A Speech Emotion Recognition Dataset for Wake-Up Words](https://arxiv.org/abs/2211.03371)</sub>                                                                                                                                                                                                                                            | <sub>Open</sub>           | <sub>[CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/)</sub>                                                                     |
 | <sub>[Emozionalmente](https://zenodo.org/records/6569824)</sub>                                                                                   | <sub>2022</sub> | <sub>6902 labeled samples acted out by 431 amateur actors while verbalizing 18 different sentences</sub>                                                                                                                                                                         | <sub>anger, disgust, fear, joy, sadness, surprise, neutral</sub>                                                                                                                                                                                                             | <sub>Audio</sub>              | <sub>0.581 GB</sub>  | <sub>Italian</sub>                                                | <sub>--</sub>                                                                                                                                                                                                                                                                                                                                             | <sub>Open</sub>           | <sub>[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)</sub>                                                                           |
 | <sub>[BanglaSER](https://data.mendeley.com/datasets/t9h6p943xy/5)</sub>                                                                           | <sub>2022</sub> | <sub>1467 Bangla speech-audio recordings by 34 non-professional participating actors (17 male and 17 female) from diverse age groups between 19 and 47 years.</sub>                                                                                                              | <sub>angry, happy, neutral, sad, surprise</sub>                                                                                                                                                                                                                              | <sub>Audio</sub>              | <sub>0.425 GB</sub>  | <sub>Bangla</sub>                                                 | <sub>[BanglaSER: A speech emotion recognition dataset for the Bangla language](https://www.sciencedirect.com/science/article/pii/S235234092200302X)</sub>                                                                                                                                                                                                 | <sub>Open</sub>           | <sub>[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)</sub>                                                                           |

diff --git a/src/ser-datasets.csv b/src/ser-datasets.csv
@@ -5,6 +5,7 @@ Dataset,Year,Content,Emotions,Format,Size,Language,Paper,Access,License
 `CAVES <https://rds.westernsydney.edu.au/Institutes/MARCS/2024/Christopher_Davis/>`_,2023,Full hd visual recordings of 10 native cantonese speakers uttering 50 sentences.,"Anger, happiness, sadness, surprise, fear, disgust and neutral",Audio,47 GB,Chinese (cantonese),`A Cantonese Audio-Visual Emotional Speech (CAVES) dataset <https://link.springer.com/article/10.3758/s13428-023-02270-7>`_,Open,Available for research purposes only
 `BANSpEmo <https://data.mendeley.com/datasets/rdwn4bs5ky/2>`_,2023,792 utterance recordings from 22 unprofessional speakers (11 males and 11 females) of six basic emotional reactions of two sets of sentences.,"angry, disgusted, happy, surprised, sad, fear",Audio,0.555 GB,Bangla,`BANSpEmo: A Bangla Emotional Speech Recognition Dataset <https://arxiv.org/abs/2312.14020>`_,Open,`CC BY 4.0 <https://creativecommons.org/licenses/by/4.0/>`_
 `KBES <https://data.mendeley.com/datasets/vsn37ps3rx/4>`_,2023,900 audio signals from 35 actors (20 females and 15 males). Each emotion is represented with two intensity levels (low & high),"angry, disgusted, happy, neutral, sad",Audio,0.337 GB,Bangla,`KBES: A dataset for realistic Bangla speech emotion recognition with intensity level <https://www.sciencedirect.com/science/article/pii/S2352340923008107>`_,Open,`CC BY 4.0 <https://creativecommons.org/licenses/by/4.0/>`_
+`RESD <https://huggingface.co/datasets/Aniemore/resd_annotated>`_,2022,"Russian emotional speech dialogue dataset ~3.5 hours of actor-voiced dialogues, each ~3 minutes long, with speech files (16000 or 44100Hz), with speech-to-text transcripts","anger, disgust, fear, enthusiasm, happiness, neutral, sadness",Audio,0.48 GB,Russian,`EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark <https://arxiv.org/abs/2406.07162>`_,Open,`MIT <https://choosealicense.com/licenses/mit/>`_
 "`Hi, KIA <https://zenodo.org/records/7091465>`_",2022,A shared short Wakeup Word database focusing on perceived emotion in speech The dataset contains 488 Wakeup Word speech,"angry, happy, sad, neutral",Audio,0.75 GB,Korean,"`Hi, KIA: A Speech Emotion Recognition Dataset for Wake-Up Words <https://arxiv.org/abs/2211.03371>`_",Open,`CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0/>`_
 `Emozionalmente <https://zenodo.org/records/6569824>`_,2022,6902 labeled samples acted out by 431 amateur actors while verbalizing 18 different sentences,"anger, disgust, fear, joy, sadness, surprise, neutral",Audio,0.581 GB,Italian,--,Open,`CC BY 4.0 <https://creativecommons.org/licenses/by/4.0/>`_
 `BanglaSER <https://data.mendeley.com/datasets/t9h6p943xy/5>`_,2022,1467 Bangla speech-audio recordings by 34 non-professional participating actors (17 male and 17 female) from diverse age groups between 19 and 47 years.,"angry, happy, neutral, sad, surprise",Audio,0.425 GB,Bangla,`BanglaSER: A speech emotion recognition dataset for the Bangla language <https://www.sciencedirect.com/science/article/pii/S235234092200302X>`_,Open,`CC BY 4.0 <https://creativecommons.org/licenses/by/4.0/>`_
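
The two hunks above add the same RESD entry in two notations: reStructuredText links (`` `Name <url>`_ ``) in `src/ser-datasets.csv`, and Markdown links wrapped in `<sub>` tags in `README.md`. The repository's actual generator is not part of this diff, so the following is only a minimal sketch of how one CSV row could be mapped to a README table row. The function names `rst_cell_to_markdown` and `csv_row_to_markdown` are hypothetical, and the sketch does not reproduce the column padding seen in the README.

```python
import csv
import io
import re

# Matches an RST hyperlink of the form `Label <url>`_ (assumed to be the only
# RST construct used in the CSV cells shown in this diff).
RST_LINK = re.compile(r"`([^`<]+?)\s*<([^>]+)>`_")


def rst_cell_to_markdown(cell: str) -> str:
    """Rewrite RST links as Markdown links and wrap the cell in <sub> tags."""
    converted = RST_LINK.sub(r"[\1](\2)", cell)
    return f"<sub>{converted}</sub>"


def csv_row_to_markdown(line: str) -> str:
    """Parse one CSV line (honouring quoted commas) into a Markdown table row."""
    fields = next(csv.reader(io.StringIO(line)))
    return "| " + " | ".join(rst_cell_to_markdown(f) for f in fields) + " |"


if __name__ == "__main__":
    sample = '"`Hi, KIA <https://zenodo.org/records/7091465>`_",2022,Korean'
    print(csv_row_to_markdown(sample))
```

Parsing with the `csv` module rather than splitting on commas matters here because several cells, such as the `Hi, KIA` name and the emotion lists, contain commas inside quoted fields.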