Transcription Stream

Created by https://transcription.stream, with special thanks to MahmoudAshraf97 for his work on whisper-diarization, and to jmorganca for Ollama and its amazing ease of use.

Overview

Create a turnkey self-hosted offline transcription and diarization service with Transcription Stream.

FORK

This fork removes the SSH and web interfaces, hardcodes diarization, and pins the Whisper model to medium. Copying example.env to .env defines the local path of the volume.
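
A minimal sketch of that step, assuming the volume path is set through a key in example.env (the key name below is illustrative; check example.env for the real one):

    cp example.env .env
    # edit .env and point the volume path at a local directory, e.g.:
    # TS_VOLUME_PATH=/srv/transcriptionstream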

Prerequisite: NVIDIA GPU

Warning: The resulting ts-gpu image is 23.7GB and might take a hot second to build
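
Before building, it can help to confirm that Docker can see the GPU. A quick sanity check (the CUDA image tag is only an example):

    nvidia-smi
    docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi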

Build and Run Instructions

Automated Setup and Run

chmod +x install.sh
./install.sh

Manual Setup

Creating Volume

  • Transcription Stream Volume:
    docker volume create --name=transcriptionstream
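
    To confirm the volume exists and see where Docker stores it:
    docker volume inspect transcriptionstream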

Build Images from their respective folders

  • ts-gpu Image: (23.7GB - includes necessary models and files to run offline)
    docker build -t ts-gpu:latest .
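
    Assuming the Dockerfile lives in the ts-gpu folder (per the heading above), the full sequence might look like:
    cd ts-gpu
    docker build -t ts-gpu:latest .
    docker image ls ts-gpu   # confirm the ~23.7GB image was created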

Run the Service

  • Start the service using docker-compose; running it in the foreground prints status updates from running jobs:
    docker-compose -p transcriptionstream up
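
    To run it in the background instead and follow the logs on demand (standard docker-compose flags):
    docker-compose -p transcriptionstream up -d
    docker-compose -p transcriptionstream logs -f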

Additional Information

Customization and Troubleshooting

  • Change the password for transcriptionstream in the ts-gpu Dockerfile.
  • Uncomment the ts-gpt section in docker-compose.yml to enable the built-in Ollama Mistral model. Update install.sh and run.sh to handle the Mistral model install and updates.
  • Update the Ollama API endpoint URL in ts-gpu/transcribe_example_d.sh prior to install/build if you are not running ts-gpt (a quick endpoint check is sketched after this list).
  • The transcription option uses WhisperX but was originally designed for Whisper; note that the raw text output for transcriptions might not display correctly in the console.
  • Both the large-v3 and large-v2 models are included in the initial build.
  • Change the prompt text in ts-gpu/ts-summarize.py to fit your needs. Update ts-web/templates/transcription.html if you want to call it something other than "summary".
  • 12GB of VRAM is not enough to run both whisper-diarization and Ollama Mistral. Whisper-diarization is fairly light on GPU memory out of the box, but Ollama's runner holds over 10GB of GPU memory open for quite some time after generating, causing the next diarization/transcription to run out of CUDA memory. Since I can't run both on the same host, I've set the batch size for both whisper-diarization and whisperx to 16, from their default of 8.
  • I need to fix an issue with ts-web: it throws an error to the console when loading a transcription if a matching summary.txt file does not exist.
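
As referenced in the list above, a quick way to verify the Ollama endpoint is reachable before pointing transcribe_example_d.sh at it; the host and port are assumptions (11434 is Ollama's default), and the prompt is only a placeholder:

    curl http://localhost:11434/api/generate -d '{
      "model": "mistral",
      "prompt": "Say hello in one sentence.",
      "stream": false
    }'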
