This repository contains an implementation for training the Whisper Automatic Speech Recognition (ASR) model on the LibriSpeech dataset. Whisper is an encoder-decoder Transformer designed for speech recognition.
Install the required libraries:

pip install torch torchaudio git+https://github.com/snakers4/whisper
The code uses the LibriSpeech dataset for training. It automatically downloads the specified split (e.g., "test-clean") and preprocesses the audio data.
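As a rough illustration, the split can be fetched with torchaudio's built-in LibriSpeech loader; the exact preprocessing in train_whisper.py may differ:

```python
import torchaudio

# Download the "test-clean" split of LibriSpeech into ./data.
dataset = torchaudio.datasets.LIBRISPEECH("./data", url="test-clean", download=True)

# Each item is (waveform, sample_rate, transcript, speaker_id, chapter_id, utterance_id).
waveform, sample_rate, transcript, *_ = dataset[0]
print(sample_rate, transcript)
```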
Usage

Clone the repository:
git clone https://github.com/Aktharnvdv/whisper.git
cd whisper
Install the required libraries as mentioned above.
Run the training script:
python train_whisper.py
Configuration

You can customize the training configuration, such as the batch size, number of workers, learning rate, and model dimensions, by modifying the corresponding variables at the beginning of the script.
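The variable names below are purely illustrative; check the top of train_whisper.py for the actual names and defaults used in this repository:

```python
# Hypothetical configuration block at the top of train_whisper.py --
# adjust these values to suit your hardware and dataset.
BATCH_SIZE = 16               # samples per training batch
NUM_WORKERS = 4               # DataLoader worker processes
LEARNING_RATE = 1e-4          # optimizer learning rate
NUM_EPOCHS = 10               # number of training epochs
DATASET_SPLIT = "test-clean"  # LibriSpeech split to download
```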
The Whisper model is initialized with the dimensions specified in the ModelDimensions class and trained on the LibriSpeech dataset for a specified number of epochs.
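A minimal sketch, assuming the installed package exposes the same ModelDimensions dataclass and Whisper module as the openai/whisper code base; the field values below are illustrative (roughly the "tiny" configuration) and may not match this repository's script:

```python
from whisper.model import ModelDimensions, Whisper

# Illustrative dimensions; the training script may use different values.
dims = ModelDimensions(
    n_mels=80,
    n_audio_ctx=1500,
    n_audio_state=384,
    n_audio_head=6,
    n_audio_layer=4,
    n_vocab=51865,
    n_text_ctx=448,
    n_text_state=384,
    n_text_head=6,
    n_text_layer=4,
)
model = Whisper(dims)
```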
The train_whisper function initializes the dataset, data loader, model, and optimizer. It then trains the Whisper model for the specified number of epochs.
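The sketch below shows what such a training function might look like; the actual train_whisper in this repository may differ in its batch format, tokenization, and loss computation, and the (mel, tokens) batches are an assumption:

```python
import torch
from torch.utils.data import DataLoader

def train_whisper(model, dataset, num_epochs, batch_size, lr, collate_fn=None):
    # Wrap the dataset in a DataLoader and set up the optimizer.
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True, collate_fn=collate_fn)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device).train()

    for epoch in range(num_epochs):
        total_loss = 0.0
        for mel, tokens in loader:          # mel spectrograms and target token ids
            mel, tokens = mel.to(device), tokens.to(device)
            logits = model(mel, tokens)     # forward pass through encoder + decoder
            # Standard next-token cross-entropy over the decoder outputs.
            loss = torch.nn.functional.cross_entropy(
                logits[:, :-1].reshape(-1, logits.size(-1)),
                tokens[:, 1:].reshape(-1),
            )
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        print(f"epoch {epoch + 1}: mean loss {total_loss / len(loader):.4f}")
```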
After training, the Whisper model's state dictionary is saved to a file named whisper_model.pth.
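Saving follows the standard PyTorch pattern; to restore the weights later, rebuild a model with the same dimensions and load the state dictionary:

```python
import torch

# Save after training (as the script does).
torch.save(model.state_dict(), "whisper_model.pth")

# Later: rebuild a model with the same dimensions and load the weights.
model = Whisper(dims)
model.load_state_dict(torch.load("whisper_model.pth", map_location="cpu"))
model.eval()
```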