Releases: KoljaB/RealtimeSTT
v0.3.7
RealtimeSTT 0.3.7
- fixed a bug so the client terminates gracefully (previously a websocket error was logged in debug mode)
- reworked the CLI and added shorter commands (for example, --writechunks is now -W or --write; for more information, see the Client Server Readme)
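As a rough illustration of the short/long flag pairing mentioned above, here is a minimal argparse sketch. Only the -W and --write spellings come from the release note; the parser name "stt-sketch", the write_path attribute, and the assumption that the flag takes a file path are hypothetical and not taken from the actual CLI source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser for illustration only; not the real stt / stt-server CLI.
    parser = argparse.ArgumentParser(prog="stt-sketch")
    # One option registered under both a short and a long spelling.
    # Assumption: the flag takes the path of the WAV file to write.
    parser.add_argument(
        "-W", "--write",
        metavar="FILE",
        dest="write_path",
        default=None,
        help="save audio chunks to a WAV file",
    )
    return parser


# Both spellings resolve to the same destination attribute.
short_form = build_parser().parse_args(["-W", "out.wav"])
long_form = build_parser().parse_args(["--write", "out.wav"])
```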
v0.3.6
RealtimeSTT 0.3.6
- more logging for client/server:
  Additional parameters for server:
  - --use_extended_logging, writes extensive log messages for the recording worker that processes the audio chunks
  - --debug, enables debug logging for detailed server operations
  - --logchunks, enables logging of incoming audio chunks (periods)
  - --writechunks, saves received audio chunks to a WAV file
  Additional parameters for client:
  - --debug, enables debug logging for detailed client operations
  - --writechunks, saves recorded audio chunks to a WAV file
- more logging for AudioToTextRecorder when called with use_extended_logging = True
- new init_realtime_after_seconds parameter for AudioToTextRecorder to fine-tune the default of 0.2s
v0.3.5
v0.3.4
v0.3.2
RealtimeSTT 0.3.2
New Features:
- server/stt_server.py and the AudioToTextRecorderClient class now support wake words (all parameters and callbacks of AudioToTextRecorder should now be implemented in AudioToTextRecorderClient; please open an issue if any functionality is missing)
- update microphone reconnect
v0.3.1
RealtimeSTT 0.3.1
New Features:
- AudioToTextRecorderClient class: automatically starts a server if none is running and connects to it. The class shares the same interface as AudioToTextRecorder, making it easy to upgrade or switch between the two. (Work in progress: most parameters and callbacks of AudioToTextRecorder are already implemented in AudioToTextRecorderClient, but not all. Also, the server cannot handle concurrent (parallel) requests yet.)
- New reworked CLI interface: "stt-server" to start the server, "stt" to start the client; see the "server" folder for more info
- fixed #127
- integrated PR #131
v0.3.0
RealtimeSTT 0.3.0
New Features:
- Soundcard Compatibility: Automatically adjusts from 48kHz downwards if 16kHz is unsupported, resampling to 16kHz.
- Early Transcription: Added early_transcription_on_silence parameter to enable transcription during speech pauses, reducing overall latency.
- Transcription Process Optimizations: Transcription process moved into a separate class, with optimized pipe communication for more stability and speed, leading to fewer occurrences of audio chunks being discarded due to queue size overflows.
- Immediate Listen State: Fixed an issue so the system immediately returns to the listening state right after stopping the recording, preventing lost chunks.
- Improved Logging: Always logs debug messages to a file, even if not explicitly configured. File logging can be disabled with the no_log_file parameter.
- Transcription Time Display: New print_transcription_time parameter to show model processing time.
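The soundcard fallback described under "Soundcard Compatibility" can be sketched roughly as follows. This is not the library's implementation: the candidate rate list, the is_supported callback, and the function names are assumptions made for illustration, and the linear interpolation here stands in for whatever resampler the library actually uses (it applies no anti-aliasing filter, so it is only suitable as a demonstration).

```python
from typing import Callable, List, Optional, Sequence

TARGET_RATE = 16_000
# Assumed candidate list; the release note only says "from 48kHz downwards".
FALLBACK_RATES = (48_000, 44_100, 32_000, 22_050)


def pick_sample_rate(
    is_supported: Callable[[int], bool],
    fallbacks: Sequence[int] = FALLBACK_RATES,
    target: int = TARGET_RATE,
) -> Optional[int]:
    """Prefer the target rate; otherwise try fallbacks from highest to lowest."""
    if is_supported(target):
        return target
    for rate in sorted(fallbacks, reverse=True):
        if is_supported(rate):
            return rate
    return None  # no usable rate found


def resample_linear(samples: List[float], src_rate: int, dst_rate: int) -> List[float]:
    """Resample by linear interpolation (no low-pass filtering)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    out_len = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # source samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(out_len):
        pos = i * step
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


# A device that rejects 16 kHz but accepts 44.1 kHz and 22.05 kHz.
chosen = pick_sample_rate(lambda rate: rate in {44_100, 22_050})
# 48 samples at 48 kHz become 16 samples at 16 kHz.
ramp = resample_linear([float(n) for n in range(48)], 48_000, TARGET_RATE)
```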
Bugfixes:
- Chunk Handling: Enhanced chunk handling with the new allowed_latency_limit parameter, reducing dropped data during high-latency scenarios.
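One way to picture a latency limit on chunk handling is a buffer that discards the oldest unprocessed chunks once a threshold is exceeded. The sketch below assumes that allowed_latency_limit counts queued chunks and that the oldest ones are dropped first; the release note does not state either detail, and the ChunkBuffer class is hypothetical.

```python
from collections import deque
from typing import Deque, Optional


class ChunkBuffer:
    """Hypothetical buffer that keeps at most `allowed_latency_limit` chunks."""

    def __init__(self, allowed_latency_limit: int) -> None:
        self.allowed_latency_limit = allowed_latency_limit
        self._chunks: Deque[bytes] = deque()
        self.dropped = 0  # number of chunks discarded so far

    def put(self, chunk: bytes) -> None:
        self._chunks.append(chunk)
        # Assumption: when the consumer falls behind, drop the oldest chunks
        # so the backlog never exceeds the limit.
        while len(self._chunks) > self.allowed_latency_limit:
            self._chunks.popleft()
            self.dropped += 1

    def get(self) -> Optional[bytes]:
        return self._chunks.popleft() if self._chunks else None


buffer = ChunkBuffer(allowed_latency_limit=3)
for n in range(5):
    buffer.put(str(n).encode())
```

A higher limit tolerates longer stalls in the consumer at the cost of more buffered audio; a lower limit keeps latency down but drops data sooner.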
v0.2.42
v0.2.41
v0.2.4
- new parameter that allows using the same model for both realtime and final transcriptions:
  use_main_model_for_realtime (bool, default=False)
  If set to True, the main transcription model will be used for both regular and real-time transcription. If False, a separate model specified by realtime_model_type will be used for real-time transcription.
  Using a single model can save memory and potentially improve performance, but may not be optimized for real-time processing. Using separate models allows for a smaller, faster model for real-time transcription while keeping a more accurate model for final transcription.
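The effect of use_main_model_for_realtime can be summarized with a small selection sketch. The function name models_to_load and the model names "large-v2" and "tiny" are illustrative assumptions; only use_main_model_for_realtime and realtime_model_type come from the release note, and this is not how the library itself is structured.

```python
from typing import List


def models_to_load(
    main_model: str,
    realtime_model_type: str,
    use_main_model_for_realtime: bool = False,
) -> List[str]:
    """Return the distinct models needed, main model first."""
    if use_main_model_for_realtime:
        # One model serves both final and real-time transcription.
        return [main_model]
    # A separate (typically smaller, faster) model handles real-time output.
    return [main_model, realtime_model_type]


shared = models_to_load("large-v2", "tiny", use_main_model_for_realtime=True)
separate = models_to_load("large-v2", "tiny")
```

With the flag enabled only one model stays in memory; with it disabled, two are loaded and the smaller one handles the real-time path.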