Speech keyword detection uses a deep learning model to recognize a keyword when it is spoken.
- librosa
- numpy
- pandas
- scikit-learn
- tensorflow>=1.15
```shell
pip install -r requirements.txt
```
Extract Mel-spectrogram and MFCC features from the audio dataset.
```shell
python ./utils/feature_extraction.py
```
```shell
python train.py --model [MODEL_TYPE] --data [DATA_FEATURE_TYPE]
```
```shell
python test.py --model [MODEL_TYPE] --data [DATA_FEATURE_TYPE]
```
- FC DNN
- CNN
- ResNet
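The training and testing entry points take `--model` and `--data` flags. A minimal sketch of how such a CLI might be parsed with `argparse`; the accepted values here are hypothetical, so check `train.py` and `test.py` for the actual flag values:

```python
import argparse

def build_parser():
    # Hypothetical choices; the real scripts may use different value names.
    parser = argparse.ArgumentParser(description="Keyword detection training")
    parser.add_argument("--model", choices=["dnn", "cnn", "resnet"], required=True,
                        help="model architecture to train")
    parser.add_argument("--data", choices=["log_mel", "mfcc"], required=True,
                        help="input feature type")
    return parser

# Example invocation with explicit arguments instead of sys.argv.
args = build_parser().parse_args(["--model", "cnn", "--data", "log_mel"])
print(args.model, args.data)
```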
The experiment was designed specifically to run on a small embedded system such as the NVIDIA Jetson Nano 2GB.
In this experiment, the dataset was created manually. It was recorded with three different microphone locations and three different reverberation times, giving nine combinations, and mixed with six different genres of TV programs.
- Training dataset = 24,006
- Testing dataset = 31,968
The CNN trained on non-normalized log-Mel features was the best model in this experiment.
| Normalization | Feature | Model | Validation Accuracy | EER |
|---|---|---|---|---|
| No-Normalization | Log Mels | CNN | 95.75% | 4.04% |
| No-Normalization | Log Mels | FC DNN | - | - |
| No-Normalization | Log Mels | ResNet | 88.59% | 13.60% |
| No-Normalization | MFCC | CNN | 77.43% | 15.32% |
| No-Normalization | MFCC | FC DNN | - | - |
| No-Normalization | MFCC | ResNet | 92.08% | 7.91% |
| Max-Normalization | Log Mels | CNN | 89.15% | 13.37% |
| Max-Normalization | Log Mels | FC DNN | 91.58% | 8.30% |
| Max-Normalization | Log Mels | ResNet | 81.06% | 18.58% |
| Max-Normalization | MFCC | CNN | 81.13% | 15.78% |
| Max-Normalization | MFCC | FC DNN | 90.33% | 17.18% |
| Max-Normalization | MFCC | ResNet | 87.24% | 12.84% |
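The EER (equal error rate) column above is the operating point where the false-acceptance rate equals the false-rejection rate. A minimal NumPy sketch of how it can be estimated from detection scores, using toy data rather than the experiment's actual outputs:

```python
import numpy as np

def equal_error_rate(labels, scores):
    """Estimate the EER: the threshold where FPR and FNR cross.

    labels: 1 for keyword, 0 for background; scores: higher = more keyword-like.
    """
    thresholds = np.sort(np.unique(scores))
    pos, neg = scores[labels == 1], scores[labels == 0]
    fpr = np.array([(neg >= t).mean() for t in thresholds])  # false accepts
    fnr = np.array([(pos < t).mean() for t in thresholds])   # false rejects
    i = np.argmin(np.abs(fpr - fnr))
    return (fpr[i] + fnr[i]) / 2

# Toy scores: keyword examples centered at +1, background at -1.
rng = np.random.default_rng(0)
labels = np.repeat([1, 0], 500)
scores = np.concatenate([rng.normal(1.0, 1.0, 500), rng.normal(-1.0, 1.0, 500)])
print(f"EER = {equal_error_rate(labels, scores):.3f}")
```

Averaging FPR and FNR at the closest crossing keeps the estimate stable when the two curves do not intersect exactly at a sampled threshold.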