
1.0.99996 | Alpha 4.3 for 1.1 | Current: Version 1.1-A4.3

@cyberofficial cyberofficial released this 10 Jul 00:44
· 48 commits to master since this release

Portable + GUI is at itch.io! Pick it up there.

Version 1.1-A 4.3

This update primarily focuses on enhancements to the GUI experience, subtitle window functionality, and a few minor bug fixes.

GUI Enhancements

  • Improved caption window:

    • Dynamic text building: The caption window now dynamically builds the displayed text based on the user's selection of original, translated, or transcribed captions. This provides a cleaner and more organized display.
    • Resizing capabilities: Users can now resize the caption window by clicking and dragging its edges, providing more flexibility in how the captions are displayed.
    • Movement by dragging: The caption window can be moved around the screen by clicking and dragging the header text, making it easier to position the captions as desired.
    • Background toggle: A menu option has been added to toggle the background of the caption window on or off, allowing for better integration with other applications.
    • Auto-size mode: Users can now toggle the auto-size mode for the captions on and off. When on, the caption box will automatically resize to fit the text content.
    • Updated Help Menu: The help menu (?) in the caption window has been updated to provide more information on the available features and their usage.
  • Visual Cues for Resizing: The mouse cursor now changes to indicate resizing options when hovering over the edges of the caption window, providing a more intuitive experience.

  • Transparency Settings: Options have been added to control the transparency of the caption window when in "Plant" mode, allowing for better visual integration with other applications.

  • Minor Bug Fixes:

    • Fixed a minor bug related to loading the blacklist file path.
    • Addressed a potential issue where the caption window might not correctly display dynamic text when the form is closing.

Version 1.1-A4

Full Fixes located at #112

What's Changed

Changelog

Remote Microphone Server (Start of 1.1)

  • Remote Microphone Server
  • Fixed an issue where the stream module would show blank text if nothing is said.
  • RMS Password Protected
  • Made the stream module Multi-Threaded
  • GUI updated to suit the HLS password system
  • Fixed some issues with latency

Ignore sending data to API if data is empty

  • Will not update API call if data is empty.

Journey to 1.1

  • Remote Microphone Server
  • RMS Password Protected
  • Made the stream module Multi-Threaded
  • GUI updated to suit the HLS password system

GUI Wrapper Update

  • Added save functionality for HLS additional elements
  • Added icons to bring up GitHub or Itch easily.
  • Unchecks wipe setting checkbox if user canceled the wipe.

Fixed minor error

  • Removed a sleep timer that had been left in the audio processing code by mistake.

Fix Win 2 Error For Portable Version [temporary]

  • Fixed an issue where Win 2 Error would happen with the portable version. This is a temporary fix.
  • New UI Style (Experimenting with styling)

Remote Microphone update and Stream Module Update

Improvements:

  • Enhanced Error Handling and Retries in Downloading Segments:
    • Improved the download_segment function in stream_transcription_module.py to include more robust error handling and retries for downloading segments.
    • Added a max_retries parameter to the download_segment function to specify the number of retry attempts.
    • The function now handles requests.exceptions.RequestException specifically for network-related errors, providing more informative error messages.
    • Implemented a retry mechanism with an optional retry_delay to handle temporary network issues.
    • Added error handling for http.client.IncompleteRead exceptions, which can occur if the connection drops during download.
  • Improved M3U8 Playlist Loading:
    • The load_m3u8_with_retry function in stream_transcription_module.py has been enhanced to retry loading the M3U8 playlist file if there are errors, making it more resilient to network fluctuations.
    • The function now handles potential requests.exceptions.RequestException and http.client.IncompleteRead errors during playlist loading, retrying with a delay until successful.
  • Optimized Segment Downloading and Skipping Logic:
    • Streamlined the segment downloading logic to avoid re-downloading already processed segments, improving efficiency.
    • Modified the code to skip segments that have already been downloaded successfully, preventing unnecessary downloads.
  • Prep for native audio capture support
    • Added sounddevice, soundfile, and pydub to requirements.txt for audio capture, in preparation for a more native way of capturing audio.
  • GUI Improvements
    • Adjusted vertical spacing for headers in player.html to address potential overlapping issues. This change ensures better visual clarity and prevents elements from overlapping, enhancing the user interface.
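The retry behaviour described for download_segment and load_m3u8_with_retry can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the helper name download_with_retry and the injected fetch callable are hypothetical, and ConnectionError stands in for requests.exceptions.RequestException so that the sketch runs with the standard library only. The max_retries and retry_delay parameter names follow the notes above.

```python
import http.client
import time
from typing import Callable, Optional

# Errors treated as transient. The release notes name
# requests.exceptions.RequestException and http.client.IncompleteRead;
# ConnectionError is a stand-in for the former (assumption for this sketch).
TRANSIENT_ERRORS = (ConnectionError, http.client.IncompleteRead)


def download_with_retry(
    fetch: Callable[[str], bytes],
    url: str,
    max_retries: int = 3,
    retry_delay: float = 1.0,
) -> Optional[bytes]:
    """Call fetch(url), retrying on transient errors.

    Returns the payload, or None once max_retries attempts have failed.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return fetch(url)
        except TRANSIENT_ERRORS as exc:
            print(f"Attempt {attempt}/{max_retries} failed for {url}: {exc!r}")
            if attempt < max_retries:
                time.sleep(retry_delay)  # wait before the next attempt
    return None
```

Passing the fetch function in as an argument keeps the retry loop independent of the HTTP client, so the same loop could wrap either a segment download or a playlist load.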

Bug Fixes:

  • Fixed Issue with Segment Downloading in stream_transcription_module.py:
    • Resolved a bug where the code was not properly skipping already downloaded segments in certain situations. This fix ensures that only new segments are downloaded instead of trying to decode broken audio files, improving efficiency. Fixes #108
  • Fixed Issue with Incorrect HLS Flags in remote_microphone.py:
    • Corrected the HLS flags used in the FFmpeg command in remote_microphone.py. This change ensures that the generated HLS playlist is created correctly, addressing potential issues with live stream playback.
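The segment-skipping behaviour mentioned in the improvements and in the fix for #108 might look something like the sketch below. The function name fetch_new_segments, the processed set, and the fetch callable are hypothetical names chosen for illustration; the real logic in stream_transcription_module.py may differ.

```python
from typing import Callable, Iterable, List, Optional, Set


def fetch_new_segments(
    segment_urls: Iterable[str],
    processed: Set[str],
    fetch: Callable[[str], Optional[bytes]],
) -> List[bytes]:
    """Download only segments that have not been processed yet.

    `processed` is updated in place, so it can be reused each time the
    playlist is reloaded.
    """
    new_payloads: List[bytes] = []
    for url in segment_urls:
        if url in processed:
            continue  # already downloaded successfully, skip it
        payload = fetch(url)
        if payload is None:
            continue  # failed download: leave unmarked so a later poll retries it
        processed.add(url)
        new_payloads.append(payload)
    return new_payloads
```

Only successful downloads are recorded as processed, so a broken or partial segment is never handed on for decoding and can still be retried on the next playlist refresh.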

Lock numpy to version 1.26.4

  • Fixes #111 - Lock numpy to version 1.26.4

Update README.md

Version Bump

Additional Fixes

  • UI Update: Adds a reminder to change the port number before use, which prevents issue 107 (Fixes #107)

Full Changelog: 1.0.99995a3...1.0.99995a4


Version 1.1-A3
A2 -> A3

  • Fixed an issue where Win Error 2 would show up. #106
  • A portable version of FFmpeg is now packed alongside the program. This is a temporary fix; FFmpeg will be baked into the program soon.
  • GUI style updated. (Source version, not current release)

Synthalingua Change Log: Version 1.0.99995

This update focuses on improving the streaming functionality and user experience.

New Features:

  • Remote Microphone Streaming: You can now stream audio from your microphone to a web server, allowing remote access for translation and transcription. This is useful for situations where you need to capture audio from a device that is not directly connected to the computer running Synthalingua.
  • HLS Stream Password Support: Added support for HLS streams that require a password. You can now specify the password ID and password using the --remote_hls_password_id and --remote_hls_password flags.
  • Improved Stream Downloading: Enhanced the reliability of segment downloading with retry mechanisms and error handling for streams with password protection or invalid credentials.
  • Concurrent Download Limiting: Implemented a semaphore to limit the number of concurrent segment downloads, preventing potential network congestion and improving overall stability.
  • Reduced Latency: Added a delay between segment downloads to improve the smoothness of stream processing and reduce latency.
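As a rough illustration of the semaphore-based limiting and the delay between downloads described above, here is a minimal sketch. The function name download_segments, the fetch callable, and the default values are assumptions made for the example and are not taken from the Synthalingua source.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def download_segments(
    urls: Sequence[str],
    fetch: Callable[[str], bytes],
    max_concurrent: int = 2,
    delay: float = 0.0,
) -> List[bytes]:
    """Fetch every URL, allowing at most max_concurrent fetches at once."""
    limiter = threading.Semaphore(max_concurrent)

    def guarded(url: str) -> bytes:
        # Threads beyond the limit block here until a slot frees up.
        with limiter:
            data = fetch(url)
            if delay:
                time.sleep(delay)  # optional pause before releasing the slot
            return data

    workers = min(8, max(1, len(urls)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as urls.
        return list(pool.map(guarded, urls))
```

The thread pool may hold more workers than max_concurrent; the semaphore is what actually caps how many downloads are in flight at a time.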

Bug Fixes:

  • Stream Transcription Crashes: Resolved several issues that could cause the stream transcription process to crash unexpectedly.
  • Blacklist Filtering: Fixed a bug where blacklist filtering was not applied correctly in certain cases.
  • User Interface Enhancements: The GUI wrapper now provides more informative messages and tooltips to guide users through the configuration process.

Other Changes:

  • Updated dependencies to their latest versions.
  • Refined code for better readability and maintainability.

Please note: This changelog highlights the key changes in version 1.0.99995. For a comprehensive list of all changes, please refer to the commit history on the GitHub repository.

ℹ️ Important information:

The streaming server for the microphone records 30 one-second chunks. For a 6-second record time, set 6 chunks; for 3 seconds of audio, set 3 chunks. 1 chunk = 1 second.


Synthalingua Remote Microphone Streaming and HLS Password Support

This update introduces two major features to Synthalingua: remote microphone streaming and support for password-protected HLS streams.

Remote Microphone Streaming

You can now stream audio from your microphone to a web server, enabling remote access for translation and transcription. This is particularly useful when:

  • Capturing audio from a different device: If your microphone is connected to another computer or device within your network, you can utilize this feature to stream the audio to the machine running Synthalingua.
  • Sharing audio with collaborators: Collaborators on different machines can access and translate/transcribe the audio stream from your microphone, facilitating remote collaboration. For example, a user with a powerful PC and a 24 GB GPU, running the 2gb RAM option with English, can serve a few people while leaving enough GPU for personal use as well.

To use remote microphone streaming:

  1. Run the remote_microphone.py script on the machine with the microphone input or virtual audio input. This script creates a local web server that streams the microphone audio as an HLS stream. You will be prompted to select the microphone and choose a server port and stream key. For instance, if the script generates the stream key "your_secret_key", the HLS playlist URL would be:

    http://localhost:8888/index.m3u8?key=your_secret_key 
    
  2. On the machine running Synthalingua, use the --stream flag with the HLS playlist URL:

    python transcribe_audio.py --stream "http://localhost:8888/index.m3u8?key=your_secret_key" --stream_translate
    

    Remember to replace "your_secret_key" with the actual stream key generated by the remote_microphone.py script.

  3. Add the --remote_hls_password_id and --remote_hls_password flags to provide the stream key ID and key, along with any other relevant arguments:

    python transcribe_audio.py --ram 2gb --stream_original_text --stream "http://10.0.0.100:8888/index.m3u8?key=pMt3hgV1cVQRW08L9k3VDw" --stream_language English --stream_chunks 6 --device cuda --ignorelist "X:\github\Real-Time-Translation\blacklist.txt" --portnumber 2000 --condition_on_previous_text --remote_hls_password_id "key" --remote_hls_password "pMt3hgV1cVQRW08L9k3VDw"
    

    This ensures that only authorized users with the correct stream key can access the audio stream. Password protection is hard-coded on, and there will be no option to disable it. Security first! The command above does the following:

      • --ram 2gb sets the VRAM usage to 2 GB.
      • --stream_original_text shows the original text.
      • --stream_language English identifies the input language as English, so no translation is needed.
      • --stream_chunks 6 sets the stream chunk count to 6 (6 seconds of audio recording time).
      • --device cuda runs on the GPU.
      • --ignorelist uses a custom blacklist of phrases.
      • --portnumber 2000 sets the Synthalingua web server port to 2000 on the local machine.
      • --condition_on_previous_text tries to prevent spammed repeats of words.
      • --remote_hls_password_id is "key", and --remote_hls_password is the value after the = sign in the URL.
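Since the password ID is the query parameter name in the playlist URL and the password is the value after the = sign, the two flag values can be read straight off the URL. The sketch below shows one way to do that with the standard library; the helper names credentials_from_url and password_flags are hypothetical and are not part of Synthalingua.

```python
from typing import List, Tuple
from urllib.parse import parse_qsl, urlsplit


def credentials_from_url(playlist_url: str) -> Tuple[str, str]:
    """Return (password_id, password) from the first query parameter."""
    pairs = parse_qsl(urlsplit(playlist_url).query)
    if not pairs:
        raise ValueError("playlist URL has no query parameter to use as a key")
    return pairs[0]


def password_flags(playlist_url: str) -> List[str]:
    """Build the two password arguments for transcribe_audio.py."""
    password_id, password = credentials_from_url(playlist_url)
    return [
        "--remote_hls_password_id", password_id,
        "--remote_hls_password", password,
    ]


if __name__ == "__main__":
    url = "http://10.0.0.100:8888/index.m3u8?key=pMt3hgV1cVQRW08L9k3VDw"
    print(" ".join(password_flags(url)))
```

Deriving the values from the URL avoids copy-and-paste slips such as a stray space inside the quoted password.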

HLS Stream Password Support

Synthalingua now supports HLS streams that require a password through URL authentication, allowing you to access and translate/transcribe content from premium or protected sources that pass the HLS password directly in the URL. Please note that not all password-protected HLS streams will work this way; this is still early alpha and will be improved later.

To use HLS stream password support (the same mechanism the microphone server uses):

  1. Obtain the HLS URL and password information. This might involve inspecting network requests or utilizing browser developer tools to identify the HLS playlist URL and any required authentication parameters.

  2. Use the --stream flag with the HLS URL:

    python transcribe_audio.py --stream "https://your-protected-stream.com/playlist.m3u8" --stream_translate 
    
  3. Include the --remote_hls_password_id and --remote_hls_password flags to provide the password ID and password:

    python transcribe_audio.py --stream "https://your-protected-stream.com/playlist.m3u8" --stream_translate --remote_hls_password_id "your_id" --remote_hls_password "your_password" 
    
  4. Replace "your_id" and "your_password" with the actual values required for authentication. The specific format and values for these parameters will vary depending on the stream provider.
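One plausible reading of "URL authentication" is that the password ID and password are attached as a query parameter to each URL requested from the stream. The sketch below illustrates that idea only; it is an assumption about the mechanism, not the project's confirmed implementation, and the helper name with_password and the segment file name are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_password(url: str, password_id: str, password: str) -> str:
    """Return url with password_id=password set in its query string."""
    parts = urlsplit(url)
    # Keep existing parameters, dropping any old value for password_id.
    query = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name != password_id
    ]
    query.append((password_id, password))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(with_password(
        "https://your-protected-stream.com/segment001.ts",
        "your_id",
        "your_password",
    ))
```

Providers differ in how they expect credentials to be sent, so the parameter name and placement should be checked against the actual stream before relying on this pattern.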

Important Notes:

  • The remote microphone server is also password-protected. Therefore, you still need to use the --remote_hls_password_id and --remote_hls_password flags when connecting to the microphone stream.
  • HLS password support is currently in early alpha and may not work with all streams. We are actively working to improve compatibility and reliability.
  • If the microphone server is not within your local network, ensure that the required ports are open.
  • The default password ID used by the remote_microphone.py script is "key".

I hope these new features enhance your experience with Synthalingua! Please feel free to report any issues or provide feedback on the GitHub repository.