1.0.99996 | Alpha 4.3 for 1.1 | Current: Version 1.1-A4.3
Portable + GUI is at itch.io! Pick it up there.
Version 1.1-A4.3
This update primarily focuses on enhancements to the GUI experience, subtitle window functionality, and a few minor bug fixes.
GUI Enhancements
- Improved caption window:
  - Dynamic text building: The caption window now dynamically builds the displayed text based on the user's selection of original, translated, or transcribed captions. This provides a cleaner and more organized display.
  - Resizing capabilities: Users can now resize the caption window by clicking and dragging its edges, providing more flexibility in how the captions are displayed.
  - Movement by dragging: The caption window can be moved around the screen by clicking and dragging the header text, making it easier to position the captions as desired.
  - Background toggle: A menu option has been added to toggle the background of the caption window on or off, allowing for better integration with other applications.
  - Auto-size mode: Users can now toggle the auto-size mode for the captions on and off. When on, the caption box will automatically resize to fit the text content.
  - Updated Help Menu: The help menu (?) in the caption window has been updated to provide more information on the available features and their usage.
- Visual Cues for Resizing: The mouse cursor now changes to indicate resizing options when hovering over the edges of the caption window, providing a more intuitive experience.
- Transparency Settings: Options have been added to control the transparency of the caption window when in "Plant" mode, allowing for better visual integration with other applications.
- Minor Bug Fixes:
  - Fixed a minor bug related to loading the blacklist file path.
  - Addressed a potential issue where the caption window might not correctly display dynamic text when the form is closing.
Version 1.1-A4
Full Fixes located at #112
What's Changed
- GUI Update Fixes by @cyberofficial in #94
- Enchantments and Bug Fixes by @cyberofficial in #101
- Bleeding to Main by @cyberofficial in #103
- Version 1.1-A4 by @cyberofficial in #112
- requirements update by @cyberofficial in #113
Changelog
Remote Microphone Server (Start of 1.1)
- Remote Microphone Server
- Fixed an issue where the stream module would show blank text if nothing is said.
- RMS Password Protected
- Made the stream module Multi-Threaded
- GUI updated to suit the HLS password system
- Fixed some issues with latency
Ignore sending data to API if data is empty
- Will not update API call if data is empty.
Journey to 1.1
- Remote Microphone Server
- RMS Password Protected
- Made the stream module Multi-Threaded
- GUI updated to suit the HLS password system
GUI Wrapper Update
- Added save functionality for HLS additional elements
- Added icons to bring up GitHub or Itch easily.
- Unchecks wipe setting checkbox if user canceled the wipe.
Fixed minor error
- Removed a leftover sleep timer from audio processing (oops).
Fix Win 2 Error For Portable Version [temporary]
- Fixed an issue where Win Error 2 would occur with the portable version. This is a temporary fix; a better one will follow.
- New UI Style (Experimenting with styling)
Remote Microphone update and Stream Module Update
Improvements:
- Enhanced Error Handling and Retries in Downloading Segments:
  - Improved the download_segment function in stream_transcription_module.py to include more robust error handling and retries for downloading segments.
  - Added a max_retries parameter to the download_segment function to specify the number of retry attempts.
  - The function now handles requests.exceptions.RequestException specifically for network-related errors, providing more informative error messages.
  - Implemented a retry mechanism with an optional retry_delay to handle temporary network issues.
  - Added error handling for http.client.IncompleteRead exceptions, which can occur if the connection drops during download.
- Improved M3U8 Playlist Loading:
  - The load_m3u8_with_retry function in stream_transcription_module.py has been enhanced to retry loading the M3U8 playlist file if there are errors, making it more resilient to network fluctuations.
  - The function now handles potential requests.exceptions.RequestException and http.client.IncompleteRead errors during playlist loading, retrying with a delay until successful.
- Optimized Segment Downloading and Skipping Logic:
  - Streamlined the segment downloading logic to avoid re-downloading already processed segments, improving efficiency.
  - Modified the code to skip segments that have already been downloaded successfully, preventing unnecessary downloads.
- Prep for native audio capture support
  - Added sounddevice, soundfile, and pydub to requirements.txt for audio capture. These will be used to capture audio in a more native way.
- GUI Improvements
  - Adjusted vertical spacing for headers in player.html to address potential overlapping issues. This change ensures better visual clarity and prevents elements from overlapping, enhancing the user interface.
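The retry behavior described for download_segment can be pictured with a minimal sketch. This is not the code from stream_transcription_module.py: the function name download_with_retries and the fetch callable are hypothetical, and only the max_retries and retry_delay parameter names come from the notes above. The sketch catches OSError because requests.exceptions.RequestException is a subclass of it, which keeps the example runnable with the standard library alone.

```python
import http.client
import time


def download_with_retries(fetch, url, max_retries=3, retry_delay=0.0):
    """Call fetch(url) up to max_retries times; return its result or None.

    `fetch` is a hypothetical callable (for example, a thin wrapper around
    requests.get) that returns the segment bytes or raises on failure.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return fetch(url)
        except (OSError, http.client.IncompleteRead) as exc:
            # OSError covers requests.exceptions.RequestException, which
            # subclasses it; IncompleteRead signals a dropped connection.
            print(f"Attempt {attempt}/{max_retries} failed for {url}: {exc!r}")
            if attempt < max_retries and retry_delay > 0:
                time.sleep(retry_delay)
    # Every attempt failed; the caller decides whether to skip the segment.
    return None
```

Returning None instead of raising is one possible design choice; it lets a streaming loop skip a bad segment and keep going.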
Bug Fixes:
- Fixed Issue with Segment Downloading in stream_transcription_module.py:
  - Resolved a bug where the code was not properly skipping already downloaded segments in certain situations. This fix ensures that only new segments are downloaded instead of trying to decode broken audio files, improving efficiency. Fixes #108
- Fixed Issue with Incorrect HLS Flags in remote_microphone.py:
  - Corrected the HLS flags used in the FFmpeg command in remote_microphone.py. This change ensures that the generated HLS playlist is created correctly, addressing potential issues with live stream playback.
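As a rough illustration of the segment-skipping idea mentioned above, the sketch below filters a freshly polled playlist against a set of segments that were already downloaded. The function name pending_segments and the segment names are made up for the example; the actual logic in stream_transcription_module.py may differ.

```python
def pending_segments(playlist_uris, downloaded):
    """Return URIs from the playlist that are not yet in `downloaded`, in order."""
    pending = []
    queued = set()
    for uri in playlist_uris:
        if uri in downloaded or uri in queued:
            continue  # already handled, or listed twice in this poll
        queued.add(uri)
        pending.append(uri)
    return pending


downloaded = set()
first_poll = ["seg1.ts", "seg2.ts"]
for uri in pending_segments(first_poll, downloaded):
    downloaded.add(uri)  # in real code, mark only after a successful download

# A live playlist slides forward, so the next poll overlaps the previous one.
second_poll = ["seg2.ts", "seg3.ts"]
print(pending_segments(second_poll, downloaded))  # ['seg3.ts']
```

Marking a segment as downloaded only after it succeeds means a failed segment is offered again on the next poll.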
Lock numpy to version 1.26.4
- Fixes #111 - Lock numpy to version 1.26.4
Update README.md
Version Bump
Additional Fixes
- UI Update: Added a reminder to change the port number before use, which prevents issue 107 (Fixes #107)
Full Changelog: 1.0.99995a3...1.0.99995a4
Version 1.1-A3
A2 -> A3
- Fixed an issue where Win Error 2 would show up. #106
- A portable version of FFmpeg is now packed alongside the program. This is a temporary fix; FFmpeg will be baked into the program soon.
- GUI style updated. (Source version, not current release)
Synthalingua Change Log: Version 1.0.99995
This update focuses on improving the streaming functionality and user experience.
New Features:
- Remote Microphone Streaming: You can now stream audio from your microphone to a web server, allowing remote access for translation and transcription. This is useful for situations where you need to capture audio from a device that is not directly connected to the computer running Synthalingua.
- HLS Stream Password Support: Added support for HLS streams that require a password. You can now specify the password ID and password using the --remote_hls_password_id and --remote_hls_password flags.
- Improved Stream Downloading: Enhanced the reliability of segment downloading with retry mechanisms and error handling for streams with password protection or invalid credentials.
- Concurrent Download Limiting: Implemented a semaphore to limit the number of concurrent segment downloads, preventing potential network congestion and improving overall stability.
- Reduced Latency: Added a delay between segment downloads to improve the smoothness of stream processing and reduce latency.
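The concurrent download limit mentioned above can be sketched with a semaphore from Python's threading module. This is only an illustration under assumptions: download_all, fetch, and max_concurrent are hypothetical names, and the notes do not say which semaphore type or limit Synthalingua actually uses.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def download_all(uris, fetch, max_concurrent=2):
    """Fetch every URI, allowing at most max_concurrent fetches at once."""
    gate = threading.Semaphore(max_concurrent)

    def guarded(uri):
        with gate:  # blocks while max_concurrent fetches are in flight
            return fetch(uri)

    # One worker per URI so that the semaphore, not the pool size, is the limit.
    with ThreadPoolExecutor(max_workers=max(1, len(uris))) as pool:
        return list(pool.map(guarded, uris))
```

The results come back in the same order as the input list, because pool.map preserves ordering even when fetches finish out of order.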
Bug Fixes:
- Stream Transcription Crashes: Resolved several issues that could cause the stream transcription process to crash unexpectedly.
- Blacklist Filtering: Fixed a bug where blacklist filtering was not applied correctly in certain cases.
- User Interface Enhancements: The GUI wrapper now provides more informative messages and tooltips to guide users through the configuration process.
Other Changes:
- Updated dependencies to their latest versions.
- Refined code for better readability and maintainability.
Please note: This changelog highlights the key changes in version 1.0.99995. For a comprehensive list of all changes, please refer to the commit history on the GitHub repository.
ℹ️ Important information:
The streaming server for the microphone records 30 one-second chunks. If you want a 6-second record time, set 6 chunks; for 3 seconds of audio, set 3 chunks. 1 chunk = 1 second.
Synthalingua Remote Microphone Streaming and HLS Password Support
This update introduces two major features to Synthalingua: remote microphone streaming and support for password-protected HLS streams.
Remote Microphone Streaming
You can now stream audio from your microphone to a web server, enabling remote access for translation and transcription. This is particularly useful when:
- Capturing audio from a different device: If your microphone is connected to another computer or device within your network, you can utilize this feature to stream the audio to the machine running Synthalingua.
- Sharing audio with collaborators: Collaborators on different machines can access and translate/transcribe the audio stream from your microphone, facilitating remote collaboration. For example, a user with a powerful PC and a 24 GB GPU, running the 2gb RAM option with English, can support a few people while leaving enough GPU memory for personal use as well.
To use remote microphone streaming:
- Run the remote_microphone.py script on the machine with the microphone input or virtual audio input. This script creates a local web server that streams the microphone audio as an HLS stream. You will be prompted to select the microphone and choose a server port and stream key. For instance, if the script generates the stream key "your_secret_key", the HLS playlist URL would be:
  http://localhost:8888/index.m3u8?key=your_secret_key
- On the machine running Synthalingua, use the --stream flag with the HLS playlist URL:
  python transcribe_audio.py --stream "http://localhost:8888/index.m3u8?key=your_secret_key" --stream_translate
  Remember to replace "your_secret_key" with the actual stream key generated by the remote_microphone.py script.
- Add the --remote_hls_password_id and --remote_hls_password flags to provide the stream key ID and key, along with the other relevant arguments:
  python transcribe_audio.py --ram 2gb --stream_original_text --stream "http://10.0.0.100:8888/index.m3u8?key=pMt3hgV1cVQRW08L9k3VDw" --stream_language English --stream_chunks 6 --device cuda --ignorelist "X:\github\Real-Time-Translation\blacklist.txt" --portnumber 2000 --condition_on_previous_text --remote_hls_password_id "key" --remote_hls_password "pMt3hgV1cVQRW08L9k3VDw"
  This ensures that only authorized users with the correct stream key can access the audio stream. Password protection is hard-coded on, and there will be no option to disable it. Security first!
  This command does the following:
  - --ram 2gb sets the VRAM usage to 2 GB.
  - --stream_original_text shows the original text.
  - --stream_language English identifies the input language as English, so there is no need to transcribe or translate.
  - --stream_chunks 6 sets the stream chunk count to 6 (6 seconds of audio recording time).
  - --device cuda uses CUDA (the GPU).
  - --ignorelist uses a custom blacklist of phrases.
  - --portnumber 2000 sets the Synthalingua web server to port 2000 on the local machine.
  - --condition_on_previous_text tries to prevent spammed repeats of words.
  - --remote_hls_password_id "key" sets the HLS password ID to "key".
  - --remote_hls_password takes the password, which is the value after the = sign in the stream URL.
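To make the relationship between the stream URL and the two password flags concrete, here is a small sketch that derives the flag values from a URL of the form shown above. The helper name hls_password_args is hypothetical and is not part of Synthalingua; it only assumes that the password ID is the name of the first query parameter and the password is its value.

```python
from urllib.parse import parse_qsl, urlsplit


def hls_password_args(stream_url):
    """Build the two password flags from the first query parameter of the URL."""
    # Note: parse_qsl decodes percent-escapes and turns "+" into a space.
    pairs = parse_qsl(urlsplit(stream_url).query)
    if not pairs:
        raise ValueError("stream URL has no query parameter to use as a password")
    password_id, password = pairs[0]
    return [
        "--remote_hls_password_id", password_id,
        "--remote_hls_password", password,
    ]


url = "http://10.0.0.100:8888/index.m3u8?key=pMt3hgV1cVQRW08L9k3VDw"
print(hls_password_args(url))
```

For the example URL, the password ID is "key" and the password is the string after the = sign, matching the flags in the command above.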
HLS Stream Password Support
Synthalingua now supports HLS streams that require a password through URL authentication, allowing you to access and translate/transcribe content from premium or protected sources that pass HLS passwords directly in the URL. Please remember that not all password-protected HLS streams will work this way; this is still early alpha and will be improved later.
To use HLS stream password support (which the microphone server also uses):
- Obtain the HLS URL and password information. This might involve inspecting network requests or utilizing browser developer tools to identify the HLS playlist URL and any required authentication parameters.
- Use the --stream flag with the HLS URL:
  python transcribe_audio.py --stream "https://your-protected-stream.com/playlist.m3u8" --stream_translate
- Include the --remote_hls_password_id and --remote_hls_password flags to provide the password ID and password:
  python transcribe_audio.py --stream "https://your-protected-stream.com/playlist.m3u8" --stream_translate --remote_hls_password_id "your_id" --remote_hls_password "your_password"
- Replace "your_id" and "your_password" with the actual values required for authentication. The specific format and values for these parameters will vary depending on the stream provider.
Important Notes:
- The remote microphone server is also password-protected. Therefore, you still need to use the --remote_hls_password_id and --remote_hls_password flags when connecting to the microphone stream.
- HLS password support is currently in early alpha and may not work with all streams. We are actively working to improve compatibility and reliability.
- Ensure that ports are open if the microphone server is not within your local network.
- The default password ID used by the remote_microphone.py script is "key".
I hope these new features enhance your experience with Synthalingua! Please feel free to report any issues or provide feedback on the GitHub repository.