I have the impression that transcription is triggered too easily. I only have to lay my hands on the table or breathe in a little louder than normal and it says "Speech detected." The VAD filter helps a lot, at least by sparing me hallucinated output like "Thank you." I also raised the minimum recording duration from 100 ms to 400 ms. Nonetheless, WhisperWriter keeps starting and stopping needlessly (at least in the "continuous" recording mode). Every false start costs time spent "Transcribing...", and because it can't record while it transcribes, it can miss something you say.
Is there anything that can be done, such as a volume threshold (ideally one that does not chop off the beginning of speech before it is transcribed)?
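To make the threshold idea concrete, here is a minimal sketch of an energy gate with a pre-roll buffer. It is not WhisperWriter code: `rms`, `gate_frames`, and all parameter names and default values are hypothetical, and frames are assumed to be lists of float samples in [-1, 1]. Recording only starts after several consecutive loud frames, so a single bump on the table is ignored, while the frames buffered just before the trigger are kept so the onset of speech is not lost.

```python
from collections import deque
import math


def rms(frame):
    """Root-mean-square level of one frame of float samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0


def gate_frames(frames, threshold=0.02, preroll_frames=3, min_loud_frames=2):
    """Return the frames captured once the gate opens, pre-roll included.

    All names and defaults are illustrative assumptions, not WhisperWriter
    settings. The gate opens after `min_loud_frames` consecutive frames at
    or above `threshold`; an isolated loud frame resets the counter.
    """
    # Keep enough history to hold the loud run plus the quiet lead-in.
    history = deque(maxlen=preroll_frames + min_loud_frames)
    loud_run = 0
    captured = []
    triggered = False
    for frame in frames:
        if triggered:
            captured.append(frame)
            continue
        history.append(frame)
        loud_run = loud_run + 1 if rms(frame) >= threshold else 0
        if loud_run >= min_loud_frames:
            triggered = True
            captured.extend(history)  # includes the frames before the trigger
    return captured


quiet = [0.001] * 160
click = [0.5] * 160   # one loud frame, e.g. a hand hitting the table
loud = [0.1] * 160    # sustained louder signal, standing in for speech

only_click = gate_frames([quiet] * 5 + [click] + [quiet] * 5)
with_speech = gate_frames([quiet] * 5 + [click] + [quiet] * 2 + [loud] * 4)
```

In this sketch the isolated click never opens the gate, whereas the sustained signal does and the capture starts a few frames before it. A real setup would still need the threshold tuned to the microphone and room.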
Can the VAD filter be configured better?
Or would it be possible to continue recording during transcription?
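As a rough illustration of that last question, recording and transcription could be decoupled with a producer/consumer queue. This is only a sketch under assumptions, not how WhisperWriter is structured: `run_pipeline` and `slow_transcribe` are hypothetical names, the "segments" are plain strings standing in for audio, and the sleeps stand in for real recording and model time.

```python
import queue
import threading
import time


def run_pipeline(segments, transcribe, record_interval=0.01):
    """Record and transcribe concurrently; return transcripts in order.

    `segments` stands in for audio chunks coming from the microphone and
    `transcribe` for the speech-to-text call (both hypothetical here).
    """
    pending = queue.Queue()
    results = []

    def worker():
        # Consumer: transcribe segments as they arrive, one at a time.
        while True:
            segment = pending.get()
            if segment is None:  # sentinel: recording has finished
                break
            results.append(transcribe(segment))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()

    # Producer: the recording loop never waits for transcription.
    for segment in segments:
        time.sleep(record_interval)  # simulated recording time
        pending.put(segment)

    pending.put(None)
    thread.join()
    return results


def slow_transcribe(segment):
    time.sleep(0.05)  # deliberately slower than recording
    return segment.upper()


transcripts = run_pipeline(["one", "two", "three"], slow_transcribe)
```

Because segments wait in the queue, nothing spoken during a slow transcription is dropped; it is just transcribed later. Whether a thread is enough in practice depends on whether the transcription backend releases the GIL while it runs; a separate process would be the alternative.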