A Python tool that downloads YouTube videos as audio and generates detailed transcriptions with speaker diarization using Google's Gemini AI. The tool supports batch processing of multiple URLs.
Here's a sample transcript generated by the tool:
The excerpt below comes from the 48-minute video "Elon Musk - CEO of Tesla Motors and SpaceX | Entrepreneurship | Khan Academy". The full example is in ExampleTranscript.md.
Elon Musk: Um yeah, I mean I I I mean the goal is not uh just it's not sort of to become a bra big brand or to compete with uh Honda Civics uh but rather to advance the course of electric vehicles.
Interviewer: Hmm.
Elon Musk: So we're just going to keep making more and more electric cars and driving the price point down until the industry is very firmly electric. You know, like maybe half of all cars made are electric or something like that. Which is not to say that we expect to make all half of all cars. We we we want to just have that catalytic effect until at least that occurs. And I think at the point which there's you know we're approaching half of all new cars made are electric then I think that's I would consider that to be kind of the victory condition.
Interviewer: Wow.
Elon Musk: Um And and so the faster we can bring that day the the better.
Interviewer: Wow. When when would be your guess when that happens?
Elon Musk: Um well, I made a bet with someone about 3 years ago that it would be sooner than 20 years, so it's 17 years from now. But I but that's I think I I that's conservative. I think it's probably you know but maybe maybe 13 or 14 years, something like that.
Interviewer: Wow. Right right about the time Right when we're going to Mars. It'll it'll be it'll be exciting exciting times.
Elon Musk: Yeah. Absolutely. True. That's that's it just could yeah. Exactly. I was just thinking about that. It was like oh those time frames are kind of coincident.
So the tool transcribes the video and also identifies the speakers, although the speaker identification is only approximate.
- Downloads YouTube videos as MP3 audio files
- Splits long audio files into manageable chunks
- Generates detailed transcriptions with speaker identification
- Supports batch processing of multiple URLs
- Includes rate limiting and retry mechanisms
- Progress tracking with tqdm
- Concurrent processing with ThreadPoolExecutor
- A Google Gemini API key
- Clone the repository:
git clone https://github.com/madeyexz/youtube2transcripts.git
cd youtube2transcripts
- Install required packages:
pip install -r requirements.txt
- Create a .env file in the project root and add your Gemini API key:
GEMINI_API_KEY=your_api_key_here
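For readers wondering how the key in .env might reach the script, here is a minimal sketch using only the standard library. The helper names `load_env_file` and `get_gemini_api_key` are hypothetical and are not taken from this repository; the project may well use a dedicated package for this instead.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines into a dict, skipping blanks and comments.

    Hypothetical helper for illustration; not part of this repository.
    """
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_gemini_api_key(path=".env"):
    """Prefer an already-exported variable, then fall back to the .env file."""
    key = os.environ.get("GEMINI_API_KEY") or load_env_file(path).get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set; add it to your .env file")
    return key
```

Failing early with a clear message when the key is missing tends to be friendlier than letting the first API call fail.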
- Create and activate a virtual environment:
uv venv .venv && source .venv/bin/activate && uv pip install -r requirements.txt
- Run the script in one of two ways:
A. Using the GUI interface:
python run.py
This opens a graphical interface where you can paste a URL and read the transcript.
B. Using the command-line interface:
python youtube_transcriber.py
When prompted, enter YouTube URLs one per line. Press Enter twice when done:
https://youtube.com/watch?v=example1
https://youtube.com/watch?v=example2
[Press Enter twice to start processing]
The script will:
- Download audio from each URL
- Split audio into 20-minute chunks (Gemini has an output limit of 8k tokens, which is roughly 30 minutes of speech, so 20-minute chunks leave some headroom)
- Process each chunk through Gemini 1.5 Flash
- Generate and save transcripts in the transcripts_better directory
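The chunking step above can be sketched as a small boundary calculation. This is an illustration only: `chunk_boundaries` is a hypothetical name, and the actual script may split the audio differently (for example by working on the audio data directly rather than on offsets).

```python
def chunk_boundaries(total_seconds, chunk_minutes=20):
    """Return (start, end) offsets in seconds that cover the whole recording.

    Hypothetical helper: the last chunk is simply shorter when the
    duration is not a multiple of the chunk size.
    """
    if total_seconds < 0 or chunk_minutes <= 0:
        raise ValueError("duration must be >= 0 and chunk size must be > 0")
    chunk_seconds = chunk_minutes * 60
    return [
        (start, min(start + chunk_seconds, total_seconds))
        for start in range(0, total_seconds, chunk_seconds)
    ]


# A 48-minute video (2880 seconds) yields two full chunks and one 8-minute tail.
print(chunk_boundaries(48 * 60))
```

With the default of 20 minutes, the 48-minute sample video would be sent to the model as three separate requests.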
Transcripts are saved as Markdown files in the transcripts_better directory with the following naming convention:
transcript_[sanitized_video_title].md
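As a rough illustration of that naming convention, the sketch below turns a video title into a safe file name. The exact sanitization rules used by the script are not documented here, so the regular expression and the name `transcript_filename` are assumptions.

```python
import re


def transcript_filename(video_title):
    """Build transcript_[sanitized_video_title].md from a raw title.

    Assumption: every run of characters other than letters, digits,
    underscores, and hyphens collapses to a single underscore.
    """
    sanitized = re.sub(r"[^\w\-]+", "_", video_title).strip("_")
    return f"transcript_{sanitized or 'untitled'}.md"


print(transcript_filename("My Video: Part 1/2"))
```

Replacing characters such as `/`, `:`, and `|` avoids paths that are invalid on some operating systems.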
The LLM currently used is gemini-1.5-flash, a multimodal model that accepts audio input, which is what makes this project possible.
The key constants in the script are listed below. Changing them is not recommended, especially if you are on the free tier of the Gemini API.
- CALLS_PER_SECOND: API rate limit (default: 1.8)
- MAX_WORKERS: Maximum concurrent jobs (default: 2)
- chunk_duration: Audio chunk size in minutes (default: 20)
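To show how CALLS_PER_SECOND and MAX_WORKERS could interact, here is a hedged sketch of a thread-safe rate limiter combined with ThreadPoolExecutor. The `RateLimiter` class and `process_all` function are hypothetical; the script may rely on a library or a decorator for rate limiting instead.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

CALLS_PER_SECOND = 1.8  # default from the list above
MAX_WORKERS = 2         # default from the list above


class RateLimiter:
    """Space calls at least 1 / calls_per_second apart across all threads."""

    def __init__(self, calls_per_second):
        self._min_interval = 1.0 / calls_per_second
        self._lock = threading.Lock()
        self._next_allowed = time.monotonic()

    def wait(self):
        # Reserve the next free time slot under the lock, then sleep outside it
        # so other threads can reserve their own slots in the meantime.
        with self._lock:
            now = time.monotonic()
            slot = max(now, self._next_allowed)
            self._next_allowed = slot + self._min_interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)


def process_all(items, worker, calls_per_second=CALLS_PER_SECOND,
                max_workers=MAX_WORKERS):
    """Run worker(item) concurrently while respecting the shared rate limit."""
    limiter = RateLimiter(calls_per_second)

    def limited(item):
        limiter.wait()
        return worker(item)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(limited, items))  # results keep the input order
```

In this sketch, raising MAX_WORKERS alone does not increase the request rate, because every worker shares the same limiter.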
The script includes:
- Automatic retries for failed API calls
- Rate limiting to prevent API throttling
- Comprehensive logging
- File existence checks to prevent duplicate downloads
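The automatic retries mentioned above can be sketched as follows. The exponential backoff, the default values, and the name `call_with_retries` are assumptions made for illustration and may differ from what the script actually does.

```python
import logging
import time

logger = logging.getLogger(__name__)


def call_with_retries(func, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on any exception with exponential backoff.

    Hypothetical helper: waits base_delay, then 2 * base_delay, and so on,
    and re-raises the last error once max_attempts is reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            if attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning("Attempt %d failed (%s); retrying in %.1fs",
                           attempt, exc, delay)
            sleep(delay)
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.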
Contributions are welcome! Please feel free to submit a pull request.
This tool is for educational purposes only. Please ensure you have the right to download and process any YouTube content before using this tool.