Welcome to the WhisperMe Audio Transcriber🎙️, the ultimate audio transcription tool powered by OpenAI's Whisper API built for the mission - "Accessible AI Education and Technology for all" ❤️🌏🫱🏽🫲🏼❤️
Designed with simplicity and efficiency in mind, WhisperMe aims to revolutionize the way we transcribe audio by offering at least 98% accurate, high-quality, accessible, and user-friendly transcription services that can be integrated by anyone, anywhere.
If you like the project and the mission, do leave a Star ⭐ on this repository. And if you want to further support the cause, you can donate any amount you wish here - Github Sponsors Link.
- High-Quality Transcription: Leverages the cutting-edge Whisper API for accurate transcriptions. File formats supported are -
*.flac *.wav *.mp3 *.ogg *.webm *.mpeg *.mpga *.mp4
- Live Recording Feature: Includes a live recording feature with timer and saves the audio file as an uncompressed .wav file for the highest quality recording.
- Extract Audio from Video Files/Youtube (New): No more thinking about how to transcribe video files. With the packed audio extraction feature, you can now directly extract audio from any video file or paste a link from Youtube and hit Transcribe Now.
- Large File Handling: Smartly splits large audio files into manageable chunks for transcription, ensuring efficient processing.
- User-Friendly GUI: Offers a simple, straightforward interface for hassle-free operation.
- Real-Time Logging: Provides immediate feedback and status updates during transcription with logs for each operation.
- Simplest Export Option: Easily save your transcripts as plain .txt files, the simplest and easiest-to-use format.
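The Large File Handling feature above can be pictured with a small sketch. This is not WhisperMe's actual splitting code; the function name `plan_chunks` and the millisecond-based interface are assumptions made for illustration. It only computes where the cuts would fall, leaving the audio slicing itself to whatever library the project uses.

```python
from typing import List, Tuple


def plan_chunks(duration_ms: int, max_chunk_ms: int) -> List[Tuple[int, int]]:
    """Return (start, end) offsets in milliseconds covering the whole recording.

    Hypothetical helper: every chunk is at most max_chunk_ms long and the
    chunks are contiguous, so no audio is skipped or duplicated.
    """
    if duration_ms < 0:
        raise ValueError("duration_ms must not be negative")
    if max_chunk_ms <= 0:
        raise ValueError("max_chunk_ms must be positive")

    chunks = []
    start = 0
    while start < duration_ms:
        end = min(start + max_chunk_ms, duration_ms)
        chunks.append((start, end))
        start = end
    return chunks


if __name__ == "__main__":
    # A 25-minute recording cut into pieces of at most 10 minutes each.
    for start, end in plan_chunks(25 * 60 * 1000, 10 * 60 * 1000):
        print(f"chunk from {start} ms to {end} ms")
```

Each chunk would then be transcribed on its own and the resulting texts joined in order.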
- Clone the Repository: git clone https://github.com/Shreyan1/WhisperMe.git ; cd WhisperMe
- Install Dependencies: pip install -r requirements.txt
- Add the API key: Add your OpenAI developer API key to apikey.py
- Launch WhisperMe: ./run.sh on Linux or MacOS; .\run.bat on Windows using Powershell
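For the API key step, one possible way to guard against an empty key is sketched below. The helper `resolve_api_key` and the environment-variable fallback are assumptions, not part of WhisperMe; the project itself only asks for the key to be placed in apikey.py.

```python
import os


def resolve_api_key(file_value: str, env_var: str = "OPENAI_API_KEY") -> str:
    """Pick the API key from apikey.py, falling back to an environment variable.

    Hypothetical helper: file_value stands for whatever apikey.py holds.
    The environment fallback is an assumption added for illustration.
    """
    key = (file_value or "").strip()
    if not key:
        key = os.environ.get(env_var, "").strip()
    if not key:
        raise ValueError(
            f"No API key found: fill in apikey.py or set the {env_var} variable"
        )
    return key


if __name__ == "__main__":
    # "sk-example" is a placeholder, not a real key.
    print(resolve_api_key("  sk-example  "))
```

Failing early with a clear message is friendlier than letting the first transcription request fail with an authentication error.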
The ideal maximum audio file size is 25 MB, so anything smaller than that will work without issue. Larger files are also supported, since large file handling is built into this code, but staying under the limit remains the safer choice.
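A quick way to tell in advance whether a file falls above the 25 MB mark is sketched here. The name `needs_splitting` is hypothetical, and counting a megabyte as 1024 × 1024 bytes is an assumption; WhisperMe's own check may differ.

```python
import os

# 25 MB as stated above; treating 1 MB as 1024 * 1024 bytes is an assumption.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def needs_splitting(path: str, limit_bytes: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True when the file at path is larger than limit_bytes."""
    return os.path.getsize(path) > limit_bytes


if __name__ == "__main__":
    import sys

    # Usage: python check_size.py recording.wav
    for candidate in sys.argv[1:]:
        verdict = "will be split" if needs_splitting(candidate) else "fits in one request"
        print(f"{candidate}: {verdict}")
```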
Old Version : This repository contained 2 versions -
- USE-in-CODE : Has been deprecated
- GUI Version : Has been renamed to src/
Latest Version : This repository now contains 1 version of the source code -
- src : This packs a graphical user interface, designed with the Tkinter library so it can be used across all platforms. This GUI repo has 2 pairs of special .sh/.bat files, namely clear_logs.sh/.bat and run.sh/.bat.
Let's understand what they do -
- clear_logs.sh/.bat : Deletes all the log files inside the log folder within a second. Doing this by hand can take a long time once many logs have piled up, and an overcrowded log folder can lead to a disk space issue.
- run.sh/.bat : Runs the GUI.py file directly, so you do not have to open and execute it manually through an editor/terminal. This gives a software-like UX while interacting with the GUI.
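What the clear_logs scripts do can be approximated in Python as follows. This is a sketch rather than a translation of the actual .sh/.bat files: the function name `clear_logs`, the `*.log` pattern, and the folder passed in are all assumptions.

```python
from pathlib import Path


def clear_logs(log_dir: str, pattern: str = "*.log") -> int:
    """Delete files matching pattern directly inside log_dir.

    Returns how many files were removed. Subfolders and files that do not
    match the pattern are left untouched. The "*.log" pattern is an
    assumption about how the log files are named.
    """
    folder = Path(log_dir)
    if not folder.is_dir():
        return 0

    removed = 0
    for entry in folder.glob(pattern):
        if entry.is_file():
            entry.unlink()
            removed += 1
    return removed


if __name__ == "__main__":
    # "log" is a placeholder folder name for this sketch.
    print(f"Removed {clear_logs('log')} log file(s)")
```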
There is also a folder with Example Transcripts, generated using this repo's code by transcribing multilingual audio files in Chinese, English, French, German, Hindi, Italian, Japanese, Korean, Russian, Spanish, and Tamil.
Check out the folder to view the power of the Whisper Model.
- Start WhisperMe: Run the run.sh script to launch the application.
- Record Live / Select Audio File: Click on Start Recording to record live and save it as a .wav file, or Transcribe Now to select the audio file you wish to transcribe.
- Extract Audio: Click on the Extract Audio button to extract audio directly by pasting a link from Youtube or by choosing a video file from your folders. Supported files are *.mp4, *.mkv, *.avi
- Transcription: Wait for the transcription to complete. A "Please wait" message will display during this process.
- Save Transcript: Once transcription is complete, choose where to save your transcript file.
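Two of the steps above, checking that a file qualifies for audio extraction and saving the finished transcript, can be sketched as below. The helpers `can_extract_audio` and `save_transcript` are hypothetical names; only the extension list and the .txt output come from this README.

```python
from pathlib import Path

# Video formats listed in the Extract Audio step of this README.
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".avi"}


def can_extract_audio(path: str) -> bool:
    """Return True when the file extension is one of the listed video formats."""
    return Path(path).suffix.lower() in VIDEO_EXTENSIONS


def save_transcript(text: str, destination: str) -> Path:
    """Write the transcript as UTF-8 text, appending .txt when it is missing."""
    target = Path(destination)
    if target.suffix.lower() != ".txt":
        target = target.with_name(target.name + ".txt")
    target.write_text(text, encoding="utf-8")
    return target


if __name__ == "__main__":
    print(can_extract_audio("lecture.mkv"))
    print(save_transcript("Hello from WhisperMe", "demo_transcript"))
```

Writing with an explicit UTF-8 encoding matters here because transcripts may contain non-Latin scripts such as Hindi, Tamil, or Japanese.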
You can access and use it all you want but there are 3 options to access the core service -
- The Flash way⚡: If you already have an OpenAI API Subscription, just put your API key inside the apikey.py file, in the variable APIKEY= or,
- The Quicksilver Way🏃🏻‍♂️ : Mail the audio file directly to my mailing address - shreyan.github@gmail.com and let me know if you need anything specific and within how many days. I'll mail you the price depending on the length of the audio. It typically starts from $0.50 / ₹29.00 for any audio file shorter than 20 minutes.
- The Sonic-the-Hedgehog Way🦔: Subscription for one-time usage -
- $0.99 /₹49.00 for 3 hours of WhisperMe Usage.
- $1.95 /₹95.00 for 6 hours of usage.
- $2.90 /₹125.00 for 9 hours of usage.
You can directly pay through the Github Sponsors link for this repo by clicking on the Sponsors button or by clicking on the link here- Pay via Github Sponsors and then drop a mail at shreyan.github@gmail.com
We'll send a private API key to your email address within a few hours, valid for the subscribed hours. The API key will automatically get deactivated after the subscribed hours of usage. To extend it, you can either subscribe again or pre-subscribe with more hours for as long as you want, and the price will be calculated as: $(0.99+(n−1)×0.95) or ₹(-2.5n^2 + 57.5n - 6) for n hours.
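The two price expressions can be evaluated with a few lines of Python. This sketch only computes the formulas exactly as written; the function names are hypothetical, and what n counts is taken as-is from the text above. For n = 1 both expressions give the first listed tier ($0.99 / ₹49.00).

```python
def price_usd(n: int) -> float:
    """Evaluate $(0.99 + (n - 1) * 0.95), rounded to whole cents."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return round(0.99 + (n - 1) * 0.95, 2)


def price_inr(n: int) -> float:
    """Evaluate ₹(-2.5 * n^2 + 57.5 * n - 6)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return -2.5 * n ** 2 + 57.5 * n - 6


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"n={n}: ${price_usd(n):.2f} or ₹{price_inr(n):.2f}")
```

When in doubt about the amount for a longer subscription, confirm it by mail before paying.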
We warmly welcome contributions from the community. Whether you're fixing bugs, adding new features, or improving documentation, your help is appreciated. Please check out our Contributing Guide for more details on submitting pull requests to the project.
Encounter a bug or need support? Open an issue or start a discussion🗣️ here now to engage and help the community to understand more.
- Wikimedia Audio Library
- OpenAI Team for the Whisper API
- All you contributors who will help the community by shaping WhisperMe
WhisperMe is released under the GPL-3.0 License. See the LICENSE file for more details.
WhisperMe is created by Shreyan Basu Ray, a passionate advocate for the mission - "Accessible AI Education and Technology for all".
Let us connect together at -