Project for recording and training subvocalization EMG data with the Cyton Board.
by Mateus de Aquino Batista for the Bachelor's Degree Final Project.
Dysarthria is a change in the normal pronunciation of words, usually caused by neurological disturbances. In advanced cases, when speech therapy is not enough to enable practical communication, a silent speech interface can be used to carry out a conversation with a reduced vocabulary, by identifying subtle speech movements captured by a surface electromyography (sEMG) device.
The present study covers the instrumentation of a low-cost silent speech interface, the development of a neural network to identify subvocalized words, and an experimental case study to validate the equipment. An OpenBCI modular electromyography board will be used to read muscle activity data from the face; these data will then be processed and classified by a convolutional neural network to identify the subvocalized words.
For the validation, the success rates of the silent speech interface's neural network will be used to evaluate both the overall performance and the individual performance for each participant. An overall hit rate between 90% and 95% is expected.
If the interface yields positive results, more words can be added later to make it more useful in the daily lives of patients with communication difficulties.
This project requires setting up a Cyton Biosensing Board (8-Channels), which is a neural interface board developed by OpenBCI with a 32-bit processor that can be used to sample EEG, EMG, and ECG activity. Follow the starter guide to make sure you get it right.
This project also allows using the Synthetic board as a mock for the Cyton board. However, because it generates random data, training and validating the neural networks with the Synthetic option will not work.
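To see why random input cannot be trained on, here is a minimal sketch using only the Python standard library. It is not the project's neural network: a simple nearest-centroid classifier stands in for it, and one random value per channel stands in for a recording. All function names are hypothetical. Because the labels are unrelated to the data, the hit rate stays near chance level (about 25% for four classes).

```python
import random

WORDS = ["Sim", "Não", "Talvez", "Silence"]  # the four classes used in the study
N_CHANNELS = 8  # the Cyton board used here has 8 channels


def random_sample(rng):
    """One fake 'recording': a single random value per channel (stand-in features)."""
    return [rng.gauss(0.0, 1.0) for _ in range(N_CHANNELS)]


def fit_centroids(samples, labels):
    """Average the feature vectors of each class (nearest-centroid 'training')."""
    sums = {w: [0.0] * N_CHANNELS for w in WORDS}
    counts = {w: 0 for w in WORDS}
    for x, y in zip(samples, labels):
        counts[y] += 1
        for i, value in enumerate(x):
            sums[y][i] += value
    return {w: [s / max(counts[w], 1) for s in sums[w]] for w in WORDS}


def predict(centroids, x):
    """Return the class whose centroid is closest to x (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda w: sum((a - b) ** 2 for a, b in zip(centroids[w], x)),
    )


def chance_experiment(n_train=2000, n_test=2000, seed=0):
    """Train and evaluate on random data whose labels carry no information."""
    rng = random.Random(seed)
    train_x = [random_sample(rng) for _ in range(n_train)]
    train_y = [rng.choice(WORDS) for _ in range(n_train)]  # labels unrelated to data
    test_x = [random_sample(rng) for _ in range(n_test)]
    test_y = [rng.choice(WORDS) for _ in range(n_test)]
    centroids = fit_centroids(train_x, train_y)
    hits = sum(predict(centroids, x) == y for x, y in zip(test_x, test_y))
    return hits / n_test


if __name__ == "__main__":
    print(f"Hit rate on random data: {chance_experiment():.2%}")
```

The same reasoning applies to any classifier: with no relationship between signal and label, there is nothing to learn.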
This study is currently underway, and as such, the findings outlined in this article are preliminary and subject to change. The ongoing development phase of the project means that its present iteration is intended solely for research purposes. If you're interested, you can explore the progress in the Papers section (Portuguese).
After setting up your Cyton Board, you'll need to install the package dependencies:
python -m venv .venv # optional: install requirements into a virtual env
source .venv/bin/activate # optional: activate virtual env
pip install -r requirements.txt
Once you're done, simply run:
python ./start.py
It should be accessible at localhost:8000. If the Cyton dongle is not available, you might need to run with administrator privileges.
The main page shows the filtered Time Series for all 8 channels, along with some logging information and access to the board session at the top of the page. Once the session is started, you'll have access to the Recording tab, where you set up the words and the amount of data you'll want to train on later. Note that the default words are currently hardcoded into the HTML file, but they can be changed at any time:
EMG Tab | Recording Tab |
---|---|
After recording your first session (automatically saved as a CSV file), the Neural Network tab becomes available for training. This is where you select the recordings to include and set up the training configuration. Once training has started, you can follow its progress in real time. After training is complete, you'll have access to the Evaluation tab, where you can test the predictive capability of the models you've trained.
Neural Network Tab | Evaluator Tab |
---|---|
The montage was defined based on MIT's preliminary study of the top muscle regions, evaluated in a pilot user study:
Electrodes placement schema on Cyton Board |
---|
Region | Color | Pin |
---|---|---|
Earlobe | Black | BIAS |
Mental | Yellow | N3P (upper) |
Inner laryngeal | Blue | N1P (upper) |
Outer laryngeal | Red | N1P (lower) |
Hyoid | Green | N3P (lower) |
Inner infra-orbital | Purple | N2P (upper) |
Outer infra-orbital | Brown | N4P |
Buccal | Orange | N2P (lower) |
Selection of the final electrode target areas through feature selection on muscular areas of interest |
---|
Arnav Kapur, S. Kapur, P. Maes, et al. (2018) |
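If you need the montage in code (for example, to label channels in plots), the region, color and pin table above can be captured as a small data structure. This is only a sketch: the `Electrode` type and the `pins_by_region` helper are hypothetical names, and the pin labels are copied verbatim from the table rather than mapped to channel indexes.

```python
from typing import NamedTuple


class Electrode(NamedTuple):
    region: str  # facial/neck region where the electrode is placed
    color: str   # cable color used in the montage
    pin: str     # pin label on the Cyton board, as written in the table


# Values copied from the montage table above.
MONTAGE = [
    Electrode("Earlobe", "Black", "BIAS"),
    Electrode("Mental", "Yellow", "N3P (upper)"),
    Electrode("Inner laryngeal", "Blue", "N1P (upper)"),
    Electrode("Outer laryngeal", "Red", "N1P (lower)"),
    Electrode("Hyoid", "Green", "N3P (lower)"),
    Electrode("Inner infra-orbital", "Purple", "N2P (upper)"),
    Electrode("Outer infra-orbital", "Brown", "N4P"),
    Electrode("Buccal", "Orange", "N2P (lower)"),
]


def pins_by_region(montage):
    """Return a {region: pin} lookup built from the montage list."""
    return {electrode.region: electrode.pin for electrode in montage}


if __name__ == "__main__":
    for electrode in MONTAGE:
        print(f"{electrode.region:<22}{electrode.color:<8}{electrode.pin}")
```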
The PoC containing the processing and training steps is available in Complete processing.ipynb. This Jupyter Notebook has all the important pieces of code needed to reproduce the experiment, along with some graphs for a better understanding.
Synthetic 8-Channels Input | Words visualization |
---|---|
If you want to see the PoC with public EMG data instead, you can check Public data.ipynb, processing a public EMG hand gesture dataset.
Also, if you want to run the notebooks in a virtualenv, make sure you set up Jupyter correctly:
source .venv/bin/activate
python -m pip install ipykernel # install ipykernel / jupyter in the venv if not present
python -m ipykernel install --user --name=venv # self-install
# > Then, open Jupyter Notebook and select venv in "Switch kernel" option
- Desenvolvimento de um software para gravação e processamento de dados de eletromiografia para reconhecimento de comandos e termos:
- This paper was showcased during the XXII Congresso Saúde e Qualidade de Vida - Qualivitae (2023), marking the initial phase of our research. The primary objective of this phase was the creation of a customizable backend and frontend capable of capturing and analyzing EMG data. It's important to note that this paper exclusively employed synthetic data; no human data was involved in this particular study.
- Desenvolvimento de uma interface de fala silenciosa utilizando deep learning e emg no processamento de subvocalização
- This paper is currently in progress and is scheduled for presentation at the XXVII Encontro Latino Americano de Iniciação Científica. The main objective of this endeavor is to employ the identical processing algorithms and neural network that were developed during the previous research, this time using publicly available EMG data.
This marks the concluding phase of our research. For this publicly available repository, we included data from human participants and shared the electrodes placement, as we have obtained approval from Brazil's Ethics Committee (CAAE: 65587722.5.0000.5503, Parecer 6112574).
You can view the processing and results obtained from the human sEMG in the third PoC file: Subvocalization.ipynb.
The dataset folder /saves is where the application stores all EMG data by default. In this repository, the folder has been organized to include the data from all 10 participants of the study. The sessions recorded from all participants are grouped by: Participant code -> Speech style -> Words.
The Speech style can be one of:
- F: Normal speech ("Fala")
- A: Lip articulation ("Articulação labial")
- S: Subvocalization ("Subvocalização")
The four words selected for this study were Yes ("Sim"), No ("Não"), Maybe ("Talvez") and Silence. Silence itself is stored inside the word folders, as the intervals between the words.
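The folder layout described above can be read back with a few lines of Python. The following is a minimal sketch, assuming a path of the form `saves/<participant code>/<speech style>/<word>/<file>`. The participant code `P01`, the word folder name `Sim` and the file name `session.csv` are hypothetical examples, so check them against the actual folder names in the repository.

```python
from pathlib import PurePosixPath
from typing import NamedTuple

# Speech style codes as listed above.
SPEECH_STYLES = {
    "F": "Normal speech",
    "A": "Lip articulation",
    "S": "Subvocalization",
}


class RecordingInfo(NamedTuple):
    participant: str
    speech_style: str
    word: str


def parse_recording_path(path, root="saves"):
    """Extract participant, speech style and word from a recording path.

    Assumes the three folders directly below `root` are, in order,
    participant code, speech style code and word.
    """
    parts = PurePosixPath(path).parts
    start = parts.index(root) + 1  # raises ValueError if `root` is missing
    levels = parts[start:start + 3]
    if len(levels) != 3:
        raise ValueError(f"Expected participant/style/word below '{root}': {path}")
    participant, style_code, word = levels
    if style_code not in SPEECH_STYLES:
        raise ValueError(f"Unknown speech style code: {style_code!r}")
    return RecordingInfo(participant, SPEECH_STYLES[style_code], word)


if __name__ == "__main__":
    # Hypothetical path, for illustration only.
    print(parse_recording_path("saves/P01/S/Sim/session.csv"))
```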
The results demonstrate that this processing and training method is effective for detecting strong gestures, such as speech and articulation. However, for more subtle muscle movements, such as subvocalization, further improvements in noise reduction and data augmentation are necessary.
It is important to note that the same methods, montage, and algorithm yielded varying results across different participants. Some exhibited lower accuracy in detecting easily recognizable speech, while others achieved higher accuracy in identifying less discernible subvocalizations.
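Because results vary between participants, it helps to report the hit rate both overall and per participant. The sketch below shows one way to compute both from a list of predictions. The `hit_rates` helper is a hypothetical name, and the sample data is made up purely for illustration; it does not reproduce any result from the study.

```python
from collections import defaultdict


def hit_rates(results):
    """Compute overall and per-participant hit rates.

    `results` is an iterable of (participant, true_word, predicted_word) tuples.
    Returns (overall_rate, {participant: rate}).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for participant, true_word, predicted_word in results:
        totals[participant] += 1
        hits[participant] += int(true_word == predicted_word)
    if not totals:
        raise ValueError("No results to evaluate")
    per_participant = {p: hits[p] / totals[p] for p in totals}
    overall = sum(hits.values()) / sum(totals.values())
    return overall, per_participant


if __name__ == "__main__":
    # Made-up predictions, for illustration only.
    sample = [
        ("P01", "Sim", "Sim"),
        ("P01", "Não", "Sim"),
        ("P02", "Talvez", "Talvez"),
        ("P02", "Sim", "Sim"),
    ]
    overall, per_participant = hit_rates(sample)
    print(f"Overall: {overall:.0%}")
    for participant, rate in sorted(per_participant.items()):
        print(f"{participant}: {rate:.0%}")
```

Note that the overall rate weights every prediction equally, so participants with more recordings influence it more than the simple average of the per-participant rates would.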
All source code is made available under a BSD 3-clause license. You can freely use and modify the code, without warranty, so long as you provide attribution to the author. See LICENSE for the full license text.
The author reserves the rights to the article content.