POC repository for set speech to text bizdev project

thinkingmachines/speechtotext-poc

Overview

This project aims to fine-tune an open-source Whisper model for Thai speech-to-text.

Whisper is a state-of-the-art transformer model that can transcribe speech signals into text with high accuracy and low latency. We will use Hugging Face's Whisper implementation to fine-tune the model on our own GPU infrastructure, using various custom datasets of audio recordings and transcripts.
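As a rough sketch of the starting point, the snippet below loads a pretrained Whisper checkpoint and its processor through the Hugging Face `transformers` library, configured for Thai transcription. The checkpoint name `openai/whisper-small` and the helper name `load_whisper_for_thai` are assumptions made for illustration; this repository does not state which checkpoint size it fine-tunes.

```python
# Assumed checkpoint for illustration only; the repository does not name one.
MODEL_NAME = "openai/whisper-small"


def load_whisper_for_thai(model_name: str = MODEL_NAME):
    """Load a Whisper processor and model set up for Thai transcription.

    Hypothetical helper, not part of this repository. The import is kept
    inside the function so that the module can be imported without
    `transformers` installed; calling the function downloads the weights.
    """
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    # The processor bundles the feature extractor (audio -> log-mel features)
    # and the tokenizer (text <-> token ids) for the chosen language and task.
    processor = WhisperProcessor.from_pretrained(
        model_name, language="Thai", task="transcribe"
    )
    model = WhisperForConditionalGeneration.from_pretrained(model_name)
    return processor, model
```

Calling `processor, model = load_whisper_for_thai()` requires network access on first use, since the checkpoint is fetched from the Hugging Face Hub.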

We will also monitor the training process and evaluate model performance with TensorBoard, a visualization tool for machine learning experiments.
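The evaluation metric is not specified here. One common choice for Thai, assumed in this sketch, is character error rate (CER): Thai is written without spaces between words, so word error rate depends on the choice of word segmenter. The functions below are a minimal pure-Python illustration and are not taken from this repository.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / reference length, ignoring ASCII spaces.

    Characters are Unicode code points, so Thai vowel and tone marks
    count as separate characters.
    """
    ref = reference.replace(" ", "")
    hyp = hypothesis.replace(" ", "")
    if not ref:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref, hyp) / len(ref)
```

A value computed this way could be logged to TensorBoard as a scalar at each evaluation step, so it can be tracked alongside the training loss.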

The tools used in this repository for fine-tuning are shown below:

[Figure: tools]

Setup dev environment

    poetry env use python3.10
    poetry update
    poetry install
    poetry run pre-commit install
