All notable changes to this project will be documented in this file.
A complete rewrite of the application. This is now split into two parts:
- A small Python utility package called frogbase that contains all the backend logic for the UI. This can be used as a standalone package or integrated into other applications.
- A slimmer Streamlit UI that provides a thin wrapper around the frogbase package, built purely with self-hosted applications in mind.
Features
- More content sources & formats
  - The use of pytube has been replaced with yt_dlp. This unlocks content download from a broad range of media platforms like YouTube (channels, playlists, videos), TikTok, Vimeo etc. (full list)
  - Local files can now be ingested from a directory instead of just a single file.
  - Sources can now be added as a list of URLs and/or local file paths.
- Semantic Search
  - The search functionality now includes semantic search over transcript contents instead of a simple substring search. This is done using sentence-transformers and hnswlib.
- Updated Streamlit UI
  - The UI now includes the concept of Libraries to further organize media downloads. Libraries are simply subdirectories within the main data directory.
  - Filter & search functionality have been simplified and made more intuitive.
- Merged feature from @Eidenz to add translation in addition to transcription
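To give a rough idea of what the semantic search entry above involves, here is a minimal sketch of the embed-then-rank flow. It is not the frogbase implementation: word counts stand in for sentence-transformers embeddings (so they cannot actually capture meaning), and a brute-force cosine ranking stands in for an hnswlib index. The names `embed`, `cosine`, and `search` are hypothetical and only serve the illustration.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, segments: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Score every segment against the query and keep the k best (no index)."""
    q = embed(query)
    scored = [(cosine(q, embed(s)), s) for s in segments]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Hypothetical transcript segments, for illustration only.
segments = [
    "the frog jumped into the pond",
    "whisper transcribes audio into text",
    "playlists can be downloaded in bulk",
]
top = search("transcribes audio", segments, k=1)
```

With a real embedding model the vectors would be dense and the nearest-neighbour lookup approximate, but the overall shape (embed segments once, embed the query, rank by similarity) stays the same.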
Since there was some appetite for this, I've rewritten this to make it a tad cleaner with a few additional features based on issues raised and personal preferences.
- Ability to download entire YouTube playlists and upload multiple files at once
- Ability to browse, filter, and search through saved audio files (For now, this is done with a simple SQLite database & SQLAlchemy ORM)
- Auto-export of transcriptions in multiple formats (was a feature request)
- Simple substring-based search for transcript segments. This is done with a simple LIKE query on the SQLite database.
- Fully reworked UI with a cleaner layout and more intuitive navigation.
- Ability to save Whisper configurations and reuse them, so the same parameters don't have to be re-entered every time.
- Removed the ability to crop audio after download to simplify the codebase. Also, temporarily removed summarization until GPT-3 integration is complete.
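For reference, the substring search over transcript segments mentioned in the list above boils down to something like the sketch below. The app itself goes through the SQLAlchemy ORM; this version uses the standard-library sqlite3 module directly to stay self-contained, and the `segment` table, its columns, and `search_segments` are assumed names, not the actual schema.

```python
import sqlite3

# In-memory database with a hypothetical table of transcript segments.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE segment (id INTEGER PRIMARY KEY, media TEXT, text TEXT)")
conn.executemany(
    "INSERT INTO segment (media, text) VALUES (?, ?)",
    [
        ("talk.mp3", "Welcome to the show"),
        ("talk.mp3", "Today we discuss frogs"),
        ("intro.mp3", "Frogs are amphibians"),
    ],
)


def search_segments(conn, term):
    """Return (media, text) rows whose text contains `term` as a substring."""
    # Escape LIKE wildcards so the user's input is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT media, text FROM segment WHERE text LIKE ? ESCAPE '\\' ORDER BY id",
        (f"%{escaped}%",),
    )
    return rows.fetchall()


matches = search_segments(conn, "frogs")
```

SQLite's LIKE is case-insensitive for ASCII characters by default, so the query for "frogs" also matches "Frogs are amphibians". It only finds literal substrings, though, not related wording.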
Initial release for demand testing (PR #1).
Features:
- Ability to process media from YouTube & local files
- Whisper transcription
- Basic Hugging Face integration for summarization