# Mel-ResNet

ResNet for classifying Mel-spectrograms.

*(Mel-ResNet figure)*

- We developed a GAN-based generative model that takes EEG data recorded during Inner Speech (Imagined Speech) as input, with the mel spectrogram of the corresponding Spoken Speech as its target.
- The mel spectrogram generated from the EEG is then fed into a ResNet trained on mel spectrograms of actual speech to predict the imagined word.
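For reference, the target representation (a log-mel spectrogram of the spoken speech) can be computed from a waveform roughly as sketched below. This is an illustrative NumPy version; the project likely uses a library such as librosa, and the parameter values here (16 kHz sample rate, 512-point FFT, 128-sample hop, 40 mel bins) are assumptions, not taken from the repository.

```python
import numpy as np

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filterbank (a simplified version of librosa.filters.mel)."""
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):           # rising slope
            fb[i, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):          # falling slope
            fb[i, k] = (right - k) / max(right - center, 1)
    return fb

def mel_spectrogram(y, sr=16000, n_fft=512, hop=128, n_mels=40):
    """Log-mel spectrogram via a Hann-windowed framed FFT and a mel filterbank."""
    frames = [y[i:i + n_fft] * np.hanning(n_fft)
              for i in range(0, len(y) - n_fft + 1, hop)]
    power = np.abs(np.fft.rfft(np.array(frames), axis=1)) ** 2  # power spectrum
    return np.log(mel_filterbank(sr, n_fft, n_mels) @ power.T + 1e-10)
```

The resulting `(n_mels, n_frames)` matrix is the image-like input that both the GAN target and the ResNet classifier operate on.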

## Requirements

`Python >= 3.7`

All code is written in Python 3.7.

You can install the libraries used in our project by running:

```
pip install -r requirements.txt
```

## Dataset

We collected word utterance recordings for a total of 13 classes using the voices of 5 contributors and TTS technology.
Additionally, to mitigate data scarcity, we applied augmentation techniques such as time stretching, pitch shifting, and noise addition.
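Two of these augmentations can be sketched in plain NumPy as below. This is an illustration, not the project's pipeline: the repository likely uses library routines (e.g. librosa's `effects.time_stretch` / `effects.pitch_shift`), and the SNR and stretch-rate values here are arbitrary.

```python
import numpy as np

def add_noise(y, snr_db=20.0, rng=None):
    """Add white Gaussian noise at a target signal-to-noise ratio (dB)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    signal_power = np.mean(y ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return y + rng.normal(0.0, np.sqrt(noise_power), size=y.shape)

def time_stretch(y, rate=1.1):
    """Naive time stretch by linear resampling.
    (A phase-vocoder stretch, as in librosa.effects.time_stretch,
    would change duration while preserving pitch.)"""
    n_out = int(len(y) / rate)
    return np.interp(np.linspace(0.0, 1.0, n_out),
                     np.linspace(0.0, 1.0, len(y)), y)

y = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s, 440 Hz tone
augmented = add_noise(time_stretch(y, rate=1.2), snr_db=20.0)
```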

The recorded words are as follows:

- Call
- Camera
- Down
- Left
- Message
- Music
- Off
- On
- Receive
- Right
- Turn
- Up
- Volume

## Model & Training (ongoing)

ResNet-50

(Data collection is ongoing, and hyperparameter tuning will follow after training.)


## Results (ongoing)

Performance metrics: Accuracy, F1 score
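These two metrics could be computed with scikit-learn as sketched below; the label arrays here are hypothetical placeholders, not project results. Macro-averaged F1 weights all 13 classes equally, which matters if the augmented dataset is not perfectly balanced.

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical true/predicted class indices over the 13 word classes (0-12).
y_true = [0, 1, 2, 2, 5, 7]
y_pred = [0, 1, 1, 2, 5, 7]

acc = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred, average="macro")
```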

(Learning curves and metric results will be added here.)


## References