Speech Transcriber

A web app and library for transcribing speech.

Installation

Install Python 3.9
Install ffmpeg
- Windows: Download the zip and add ffmpeg/bin to the PATH environment variable
- Linux: apt-get install ffmpeg
Install the Python dependencies: pip install -r requirements.txt
(Optional) Download the punctuator model and save it as INTERSPEECH-T-BRNN.pcl
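
The first two prerequisites can be checked from Python before installing anything else. The following is a minimal sketch, not part of the project: check_environment is a hypothetical helper, and treating any version other than 3.9 as a problem is an assumption based on the version named in the steps above.

```python
import shutil
import sys


def check_environment(version=None, which=shutil.which):
    """Return a list of problems found; an empty list means the checks passed.

    `version` and `which` are injectable so the function is easy to test;
    by default they use the running interpreter and the real PATH lookup.
    """
    if version is None:
        version = tuple(sys.version_info[:2])

    problems = []
    # Assumption: only Python 3.9 is treated as supported, as in the steps above.
    if tuple(version) != (3, 9):
        problems.append(
            "Python 3.9 expected, found {}.{}".format(version[0], version[1])
        )
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

An empty result only confirms the interpreter version and that an ffmpeg executable is discoverable; it does not verify the packages from requirements.txt.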

Usage

Web app

Run pip install flask before running the web app.

Then run python app.py to open the web app at http://localhost:5000/

CLI

python main.py --path filename --transcriber transcriber

--path: Path to the audio/video file to transcribe
--transcriber: Transcription model to use. Choose from:
- cmu_sphinx
- librispeech
- silero
- vosk
- wav2vec2
- wav2vec2_commonvoice
- whisper
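
To illustrate how the two options fit together, here is a hedged sketch of an argument parser that accepts the same flags and restricts --transcriber to the names listed above. It is not the project's main.py: build_parser is a hypothetical name, and making both flags required is an assumption.

```python
import argparse

# Transcriber names as listed in this README.
TRANSCRIBERS = [
    "cmu_sphinx",
    "librispeech",
    "silero",
    "vosk",
    "wav2vec2",
    "wav2vec2_commonvoice",
    "whisper",
]


def build_parser():
    """Build a parser mirroring the documented --path and --transcriber flags."""
    parser = argparse.ArgumentParser(description="Transcribe an audio/video file")
    parser.add_argument(
        "--path",
        required=True,  # assumption: the README does not say whether this is required
        help="Path to the audio/video file to transcribe",
    )
    parser.add_argument(
        "--transcriber",
        required=True,  # assumption: no default model is documented
        choices=TRANSCRIBERS,
        help="Transcription model to use",
    )
    return parser


# Example: parse an explicit argument list instead of sys.argv.
# "interview.mp4" is a placeholder file name.
args = build_parser().parse_args(["--path", "interview.mp4", "--transcriber", "vosk"])
print(args.path, args.transcriber)
```

With choices set, argparse rejects an unknown model name with a usage error before any audio is loaded.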

Transcription models

Transcription models were selected against the following requirements:

Must be supported in Python 3.9
Must work locally (without the usage of an API)
Must have a straightforward installation process
- Should not require building from source
- Should not require additional OS libraries
- Should not require manually downloading additional files

The comparison below was produced by running the analysis script on the LibriSpeech test-clean dataset.

Name	Dependencies	Model Size	Average processing time	Score
Wav2Vec2 CommonVoice	speechbrain	1.18GB	3.351s	0.87
Librispeech	torch, transformers, torchaudio, librosa	113MB	0.558s	0.85
Wav2Vec2	torch, transformers, torchaudio, librosa	360MB	1.325s	0.8
Whisper	whisper	138MB	3.848s	0.77
Vosk	vosk	67.7MB	1.206s	0.76
Silero	torch, transformers, torchaudio, librosa, omegaconf	111MB	0.261s	0.68
CMU Sphinx	SpeechRecognition, pocketsphinx	33.9MB*	1.123s	0.55

*size of pocketsphinx package
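
One way to read the table is to pick the fastest model that still meets a minimum score. The sketch below copies the time and score columns from the table; fastest_with_min_score is a hypothetical helper, and it assumes that a higher score is better, which the descending order of the table suggests but the README does not state.

```python
# (name, average processing time in seconds, score), copied from the table above.
MODELS = [
    ("Wav2Vec2 CommonVoice", 3.351, 0.87),
    ("Librispeech", 0.558, 0.85),
    ("Wav2Vec2", 1.325, 0.80),
    ("Whisper", 3.848, 0.77),
    ("Vosk", 1.206, 0.76),
    ("Silero", 0.261, 0.68),
    ("CMU Sphinx", 1.123, 0.55),
]


def fastest_with_min_score(models, min_score):
    """Return the name of the fastest model scoring at least min_score, or None."""
    # Assumption: a higher score means a better transcription.
    candidates = [m for m in models if m[2] >= min_score]
    if not candidates:
        return None
    # Among the acceptable models, choose the lowest average processing time.
    return min(candidates, key=lambda m: m[1])[0]


print(fastest_with_min_score(MODELS, 0.80))  # Librispeech
print(fastest_with_min_score(MODELS, 0.60))  # Silero
```

Model size and dependencies are left out of this sketch; they could be added as further filters in the same way.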

BenAAndrew/speech-transcriber
