This project scrapes captions (transcripts) from YouTube videos using the YouTube Data API. It retrieves captions for either individual videos or entire playlists and stores the extracted data in a CSV file.
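As a rough illustration of the CSV output step, the sketch below writes caption rows with Python's standard csv module. The column names (video_id, start, duration, text) and the function name write_captions_csv are assumptions made for this example; the actual layout produced by Youtube.py may differ.

```python
import csv

# Column names are illustrative assumptions, not taken from Youtube.py.
FIELDNAMES = ["video_id", "start", "duration", "text"]


def write_captions_csv(rows, path):
    """Write caption rows (dicts keyed by FIELDNAMES) to a CSV file."""
    # newline="" lets the csv module control line endings on every platform.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
    return path
```

Using DictWriter keeps the header and the row values aligned by name, and the csv module quotes caption text that contains commas or line breaks.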
The Python script uses the googleapiclient library to interact with the YouTube Data API v3 and the youtube_transcript_api library to access and retrieve the captions.
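The caption retrieval step might look roughly like the sketch below. To keep it independent of any particular library release, the transcript fetcher is passed in as a callable; collect_caption_rows and fetch_transcript are hypothetical names for this example. Releases of youtube-transcript-api before 1.0 exposed YouTubeTranscriptApi.get_transcript(video_id), which returns a list of dicts with "text", "start", and "duration" keys and would fit here directly; newer releases use a different interface, so check the installed version.

```python
def collect_caption_rows(video_ids, fetch_transcript):
    """Flatten per-video transcript snippets into one list of row dicts.

    fetch_transcript is any callable that takes a video ID and returns a
    list of dicts with "text", "start" and "duration" keys (an assumption
    about the transcript shape, not a guarantee for every library version).
    """
    rows = []
    for video_id in video_ids:
        try:
            snippets = fetch_transcript(video_id)
        except Exception as error:  # e.g. captions are disabled for the video
            print(f"Skipping {video_id}: {error}")
            continue
        for snippet in snippets:
            rows.append(
                {
                    "video_id": video_id,
                    "start": snippet["start"],
                    "duration": snippet["duration"],
                    "text": snippet["text"],
                }
            )
    return rows
```

Skipping a failed video instead of aborting means one video without captions does not discard the rest of a playlist.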
To run the script, follow these steps:
- Install the necessary Python packages:
  pip install google-api-python-client youtube-transcript-api
- Obtain a YouTube Data API key from the Google Cloud Console:
  - Visit https://console.cloud.google.com/
  - Create a project (if not created already).
  - Enable the YouTube Data API v3 for your project.
  - Generate an API key.
- Replace 'YOUR_API_KEY' in the Python script (Youtube.py) with your obtained API key.
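Once the key is in place, creating the API client could look like the sketch below. It also guards against running with the placeholder still set. The helper names and the YOUTUBE_API_KEY environment variable are assumptions for this example and are not part of Youtube.py; the build("youtube", "v3", developerKey=...) call comes from google-api-python-client.

```python
import os

PLACEHOLDER = "YOUR_API_KEY"


def resolve_api_key(hard_coded=PLACEHOLDER, env=os.environ):
    """Return a usable API key, or raise if only the placeholder is present."""
    # YOUTUBE_API_KEY is a hypothetical variable name chosen for this sketch.
    key = env.get("YOUTUBE_API_KEY") or hard_coded
    if not key or key == PLACEHOLDER:
        raise ValueError("Set a real YouTube Data API key before running.")
    return key


def make_youtube_client(api_key):
    """Build a YouTube Data API v3 client (needs google-api-python-client)."""
    # Imported lazily so the key check above works without the package.
    from googleapiclient.discovery import build

    return build("youtube", "v3", developerKey=api_key)
```

Reading the key from the environment is optional; it simply avoids committing a real key to version control.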
The script can be used to scrape captions for either a single video or a playlist.
To fetch captions for a single video, set the playlist_or_video_ids variable in Youtube.py to the desired video ID as a plain string (not within a list):
playlist_or_video_ids = "YOUR_VIDEO_ID"
Run the script from the terminal with the following command:
python Youtube.py
To fetch captions for a playlist, set the variable to a list containing the playlist ID:
playlist_or_video_ids = ["YOUR_PLAYLIST_ID"]
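For orientation, expanding a playlist into its video IDs with the YouTube Data API v3 could be sketched as follows. The function names are hypothetical, and the rule that a string means one video while a list means playlist IDs is inferred from the two examples above rather than confirmed from Youtube.py. The youtube argument is a client built with googleapiclient.discovery.build.

```python
def playlist_video_ids(youtube, playlist_id):
    """Return all video IDs in a playlist, following pagination."""
    video_ids = []
    page_token = None
    while True:
        response = youtube.playlistItems().list(
            part="contentDetails",
            playlistId=playlist_id,
            maxResults=50,  # largest page size the API allows
            pageToken=page_token,
        ).execute()
        for item in response.get("items", []):
            video_ids.append(item["contentDetails"]["videoId"])
        # A missing nextPageToken means the last page has been read.
        page_token = response.get("nextPageToken")
        if not page_token:
            return video_ids


def expand_ids(playlist_or_video_ids, youtube):
    """Treat a string as one video ID and a list as playlist IDs (assumed)."""
    if isinstance(playlist_or_video_ids, str):
        return [playlist_or_video_ids]
    video_ids = []
    for playlist_id in playlist_or_video_ids:
        video_ids.extend(playlist_video_ids(youtube, playlist_id))
    return video_ids
```

Playlists longer than 50 videos span several pages, which is why the loop keeps requesting pages until no nextPageToken is returned.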