African Language Speech Recognition - Speech-to-Text

The World Food Program wants to deploy an intelligent form that collects nutritional information about food bought and sold at markets in two African countries, Ethiopia and Kenya. This project aims to build a web app that does just that: it lets users register the list of items they bought using only their voice. The project uses deep learning models that transcribe speech to text, delivering speech-to-text technology for the two chosen African languages: Amharic and Swahili.

Project Structure

Data

Dataset for Amharic https://github.com/getalp/ALFFA_PUBLIC

Data Features

Input features (X): audio clips of spoken words
Target labels (y): a text transcript of what was spoken
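
To train on (X, y) pairs, each transcript usually has to be turned into a sequence of integers. The following is a minimal sketch of a character-level encoding, not the project's actual pipeline: the function names, the example transcripts, and the choice to reserve index 0 for padding are all assumptions, and the repository's own labels.json may define a different mapping.

```python
# Sketch: map transcripts (y) to integer sequences and back.
# The vocabulary is built from the example transcripts below; it is an
# illustration only and may differ from the project's labels.json.

def build_vocab(transcripts):
    """Return a char -> index mapping; index 0 is reserved for padding."""
    chars = sorted({ch for text in transcripts for ch in text})
    return {ch: i + 1 for i, ch in enumerate(chars)}

def encode(text, vocab):
    """Convert a transcript into a list of integer labels."""
    return [vocab[ch] for ch in text]

def decode(indices, vocab):
    """Convert integer labels back into text, skipping padding (0)."""
    inverse = {i: ch for ch, i in vocab.items()}
    return "".join(inverse[i] for i in indices if i != 0)

transcripts = ["ሰላም", "habari yako"]  # hypothetical example transcripts
vocab = build_vocab(transcripts)
encoded = encode("habari", vocab)
```

Decoding the encoded list with the same vocabulary returns the original text, which is a quick sanity check before training.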

Requirements

Python, PyTorch/TensorFlow, librosa, scikit-learn

Model Architecture

CNN (Convolutional Neural Network) plus RNN-based (Recurrent Neural Network) architecture
RNN-based sequence-to-sequence network
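
One practical consequence of placing convolutional layers in front of a recurrent network is that strided convolutions shorten the time axis the RNN sees. The sketch below applies the standard output-length formula for a 1-D convolution; the layer settings (kernel size 3, stride 2, padding 1) and the 400-frame input are hypothetical values chosen for illustration, not the project's configuration.

```python
def conv_output_length(n_frames, kernel_size, stride, padding=0):
    """Length of a 1-D convolution output along the time axis."""
    return (n_frames + 2 * padding - kernel_size) // stride + 1

def frames_seen_by_rnn(n_frames, conv_layers):
    """Apply each (kernel_size, stride, padding) layer in turn."""
    for kernel_size, stride, padding in conv_layers:
        n_frames = conv_output_length(n_frames, kernel_size, stride, padding)
    return n_frames

# Hypothetical front-end: two conv layers, each halving the time axis.
conv_layers = [(3, 2, 1), (3, 2, 1)]
rnn_steps = frames_seen_by_rnn(400, conv_layers)  # 400 -> 200 -> 100
```

For models that emit at most one symbol per time step, the resulting number of RNN steps should be at least as long as the longest transcript, so this kind of check is worth running before settling on strides.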

Tasks:

Setting up DVC and MLflow
Exploring the data and extracting useful information
Preprocessing and augmenting the data
Extracting features
Modelling and Deployment using MLOps
Serving predictions on a web interface
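
The augmentation step in the list above can be sketched with two common waveform transforms, noise injection and time shifting. This is a dependency-free illustration operating on plain Python lists; the function names and default values are assumptions, and in practice the same operations would typically run on NumPy arrays such as those returned by librosa.load.

```python
import random

def add_noise(samples, noise_level=0.005, rng=None):
    """Add Gaussian noise, scaled by noise_level, to each sample."""
    rng = rng or random.Random()
    return [s + noise_level * rng.gauss(0.0, 1.0) for s in samples]

def time_shift(samples, shift):
    """Shift right by `shift` samples (left if negative), padding with silence."""
    n = len(samples)
    if shift == 0 or n == 0:
        return list(samples)
    if abs(shift) >= n:
        return [0.0] * n  # everything is shifted out of the clip
    if shift > 0:
        return [0.0] * shift + list(samples[: n - shift])
    return list(samples[-shift:]) + [0.0] * (-shift)

clip = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5]  # toy waveform, not real audio
shifted = time_shift(clip, 2)
noisy = add_noise(clip, rng=random.Random(0))  # seeded for repeatability
```

Both transforms keep the clip length unchanged, so the augmented audio can reuse the original transcript as its label.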

Current Status

Integrating preprocessing and augmentation into the codebase

Coming Changes

Modelling and Deployment using MLOps

Reference

https://towardsdatascience.com/audio-deep-learning-made-simple-automatic-speech-recognition-asr-how-it-works-716cfce4c706
https://www.kaggle.com/CVxTz/audio-data-augmentation
