This is a transcriber that uses Whisper (by OpenAI) to transcribe recorded lectures from TU Wien and Uni Wien. It ships as a Docker container that runs on any machine with Docker installed.
I noticed that some of the lectures at TU Wien and Uni Wien have poor scripts or none at all. This is a problem for students who want to learn from the lectures but don't have the time to watch them. This project aims to help by providing transcripts of the lectures. Of course, this is not a perfect solution, but combined with AI tools like ChatGPT or the Bing Chat Bot, it can make lectures easier to consume and turn the transcripts into a place to look things up.
Note: You need to have Docker installed on your machine. If you don't have Docker installed, you can find the installation instructions here.
I do not recommend installing this project on your own machine; put it on a server instead. The transcriber uses a lot of resources, and even a single transcription can take many hours to complete. You can use your own machine if you want to, but you have been warned.
- First of all you need to clone this repository to your local machine:
git clone git@github.com:bananensplit/VO-Transcriber.git
cd VO-Transcriber
- Before you build the image, please check in the Dockerfile that the correct lines are commented out. Depending on your architecture, you need to uncomment the correct lines.
# This is for ARM64
RUN wget https://github.com/wkhtmltopdf/packaging/releases/download/0.12.6-1/wkhtmltox_0.12.6-1.buster_arm64.deb && \
    apt install -y ./wkhtmltox_0.12.6-1.buster_arm64.deb && \
    apt install -y openssl build-essential libssl-dev libxrender-dev git-core libx11-dev libxext-dev libfontconfig1-dev libfreetype6-dev fontconfig
# This is for AMD64
# RUN wget https://github.com/wkhtmltopdf/packaging/releases/download/0.12.6-1/wkhtmltox_0.12.6-1.buster_amd64.deb && \
#     apt install -y ./wkhtmltox_0.12.6-1.buster_amd64.deb && \
#     apt install -y openssl build-essential libssl-dev libxrender-dev git-core libx11-dev libxext-dev libfontconfig1-dev libfreetype6-dev fontconfig
To check which architecture you have, you can run the following command:
uname -m                      # on Linux
$Env:PROCESSOR_ARCHITECTURE   # on Windows (PowerShell)
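The mapping from the reported architecture to the Dockerfile block can be sketched as a small shell helper. The function name wkhtmltox_arch is hypothetical and not part of this repository; it only translates the common values reported by uname -m (or by PROCESSOR_ARCHITECTURE on Windows) into the arm64/amd64 suffix used in the wkhtmltox package names.

```shell
# Hypothetical helper (not part of the repository): map an architecture name
# to the suffix used in the wkhtmltox package names in the Dockerfile.
wkhtmltox_arch() {
    case "$1" in
        aarch64|arm64|ARM64) echo "arm64" ;;
        x86_64|amd64|AMD64)  echo "amd64" ;;
        *)                   echo "unknown" ;;
    esac
}

# Tell the user which block of the Dockerfile should be active on this machine.
echo "Uncomment the $(wkhtmltox_arch "$(uname -m)") block in the Dockerfile"
```

If the helper prints "unknown", neither of the two blocks in the Dockerfile is likely to fit your machine as-is.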
- Then you can build the image:
docker build -t vo-transcriber:0.1.0 .
The following sections describe how to run the container and how to set the correct parameters.
- --help: If this parameter is set, the help message is printed.
- --uni: The university you want to transcribe the VOs from. Possible values are tu and uw.
- --vos: The name of the VO you want to transcribe. This parameter can be used multiple times. If you don't provide this parameter, nothing will be transcribed.
- -p: The path to the file containing the VO-data. Can be used with the --uni parameter set to tu or uw.
- -k: The link to the VO-data. Can be used with the --uni parameter set to uw. Usually the link to the data looks something like this: https://ustream.univie.ac.at/search/episode.json?limit=200&offset=0&sid=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
- -m: The name of the Whisper model you want to use for transcribing. Possible values can be looked up on the Whisper GitHub.
- -v: If this parameter is set, the verbose parameter is passed to Whisper and you can follow the transcription in real time.
- -l: The language of the VO. Possible languages are listed here. Default is de.
- --txt: If this parameter is set, the transcription is saved as a txt file.
- --vtt: If this parameter is set, the transcription is saved as a vtt file (format for subtitles).
- --srt: If this parameter is set, the transcription is saved as an srt file (format for subtitles).
- --pdf: If this parameter is set, the transcription is saved as a pdf file.
- -o: The output folder. Must be output, because that is the folder the Docker commands in this README mount into the container.
Note: This only works for Uni Wien. TU Wien (unlike Uni Wien) requires authentication, which complicates things a bit. You can't provide a link to the data because this project doesn't support authentication (yet).
Before you start, make sure the link you provide is correct and public (no login is required to retrieve the data).
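One way to check this from the command line is to look at the HTTP status code the server returns. The following is a minimal sketch that assumes curl is installed; the function names link_status and status_means_public are hypothetical and not part of this project. A 200 response is a good sign but not a guarantee, since a server could also answer with a login page and status 200.

```shell
# Hypothetical helper: print the HTTP status code the server returns for a URL.
# Redirects are not followed, so a redirect to a login page shows up as 3xx.
link_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Hypothetical helper: treat only a plain 200 as "public".
# 3xx, 401 and 403 usually mean that a login is involved.
status_means_public() {
    [ "$1" = "200" ]
}

# Usage (commented out so that nothing is requested by accident):
# url="https://ustream.univie.ac.at/search/episode.json?limit=200&offset=0&sid=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
# if status_means_public "$(link_status "$url")"; then
#     echo "link looks public"
# else
#     echo "link is not reachable without a login"
# fi
```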
- To look up which VOs are available, run the following command:
docker run --rm --name vo-transcribe \
    vo-transcriber:0.1.0 \
    --uni "uw" \
    -k "https://ustream.univie.ac.at/search/episode.json?limit=200&offset=0&sid=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
- Once you have chosen which VOs you want to transcribe, run the following commands:
mkdir output
docker run -d --name vo-transcribe \
    -v $(pwd)/output:/usr/src/app/output \
    vo-transcriber:0.1.0 \
    --uni "uw" \
    -k "https://ustream.univie.ac.at/search/episode.json?limit=200&offset=0&sid=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX" \
    --vos "<name of the VO you want to transcribe>" \
    --vos "<second VO you want to transcribe>" \
    --vos "<... VO you want to transcribe>" \
    --vos "<last VO you want to transcribe>" \
    -o "output" \
    -m "medium" \
    -l "de" \
    --txt \
    --vtt \
    --srt \
    --pdf
You can of course adjust the parameters to your needs. This command would
- transcribe 4 VOs
- use the medium model (takes about 3 hours for 1.5 hours of VO)
- use the de language
- save the transcription as txt, vtt, srt and pdf files
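When the list of VOs grows, typing one --vos per lecture becomes tedious. The following sketch builds the repeated --vos arguments from a text file with one VO name per line. Both the helper run_with_vos and the file name vos.txt are hypothetical and not part of this project, and the sketch assumes that the program accepts --vos after the other parameters.

```shell
# Hypothetical helper: run a command with "--vos <name>" appended for every
# non-empty line of a list file.
# Usage: run_with_vos <list-file> <command> [args...]
run_with_vos() {
    list_file=$1
    shift
    # "$@" now holds the base command; append one --vos pair per VO name.
    while IFS= read -r name || [ -n "$name" ]; do
        if [ -n "$name" ]; then
            set -- "$@" --vos "$name"
        fi
    done < "$list_file"
    "$@"
}

# Usage (commented out; vos.txt is an assumed file with one VO name per line):
# run_with_vos vos.txt docker run -d --name vo-transcribe \
#     -v "$(pwd)/output:/usr/src/app/output" \
#     vo-transcriber:0.1.0 \
#     --uni "uw" \
#     -k "https://ustream.univie.ac.at/search/episode.json?limit=200&offset=0&sid=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX" \
#     -o "output" -m "medium" -l "de" --txt
```

Because each name is appended as a separate argument, VO names that contain spaces are passed through unchanged.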
Note: In the following instructions the --uni parameter is set to tu (TU Wien), but the same steps also work for Uni Wien (uw).
- Create the mount folder and the data file which contains the VO-data.
mkdir output
echo "data" > output/data.json
- To look up which VOs are available, run the following command:
docker run --rm --name vo-transcribe \
    -v $(pwd)/output:/usr/src/app/output \
    vo-transcriber:0.1.0 \
    --uni "tu" \
    -p "output/data.json"
- Once you have chosen which VOs you want to transcribe, run the following command:
docker run -d --name vo-transcribe \
    -v $(pwd)/output:/usr/src/app/output \
    vo-transcriber:0.1.0 \
    --uni "tu" \
    -p "output/data.json" \
    --vos "<name of the VO you want to transcribe>" \
    --vos "<second VO you want to transcribe>" \
    --vos "<... VO you want to transcribe>" \
    --vos "<last VO you want to transcribe>" \
    -o "output" \
    -m "medium" \
    -l "de" \
    --txt \
    --vtt \
    --srt \
    --pdf
You can of course adjust the parameters to your needs. This command would
- transcribe 4 VOs
- use the medium model (takes about 3 hours for 1.5 hours of VO)
- use the de language
- save the transcription as txt, vtt, srt and pdf files
These are instructions for people who want to contribute to this project and need to get it running on their local machine:
Note: This project uses Python 3.10, and I recommend using virtualenv to manage the dependencies.
# Clone this repository
git clone git@github.com:bananensplit/VO-Transcriber.git
cd VO-Transcriber
# Create a virtual environment
python -m virtualenv venv
source venv/bin/activate
# Install the dependencies
pip install -r requirements.txt
This is as far as the setup goes. You can now run the project with the following command:
python main.py
# The main.py takes the same parameters as the docker image
# When no parameters are provided, the program will return its help message
To build the docker image, run the following command:
docker build -t vo-transcriber:0.1.0 .
If you find a bug in the source code or a mistake in the documentation, you can help me by submitting an issue in the issue tracker. Even better, you can submit a pull request with a fix.
Furthermore, if you have an idea for a new feature, feel free to submit an issue with a proposal. Please add as much detail as possible to the issue description; this helps me understand your idea and discuss it with you.
Thanks for making this project better!
Jeremiasz Zrolka - jeremiasz.zrolka@gmail.com
- Twitter: @jeremiasz_z
- Instagram: @jeremiasz_z
- LinkedIn: jeremiasz-zrolka