Making model more resilient to bad recordings #66
etlweather started this conversation in General
Replies: 1 comment
-
Perhaps part of the answer is found in this repo: https://github.com/daanzu/kaldi_ag_training
-
I am using the Vosk API with the vosk-model-en-us-daanzu-20200905 model to transcribe recordings of meetings and phone calls. Your model, with Vosk, produced the best transcripts of all the models and solutions I tried (Deepspeech, wav2vec, etc.). This works great for good recordings. However, transcript quality quickly degrades when there is AC noise, the speaker is farther from the recorder, the audio tonality is poor, etc.
So I was thinking I could take the Common Voice dataset, apply various transformations to make the audio quality match my recordings, and train a model on the result.
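To make the idea concrete, here is a rough sketch of the kind of transformation I have in mind: mixing background noise into clean speech at a chosen signal-to-noise ratio, then dulling the high frequencies to imitate a distant or poor microphone. The function names (`rms`, `mix_at_snr`, `muffle`), the `alpha` value, and the synthetic tone and noise are all made up for illustration; this is not taken from any existing training recipe, and a real pipeline would read actual audio files and probably use an array library instead of plain lists.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the speech-to-noise ratio is snr_db."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    # Gain that brings the noise to the level implied by the target SNR.
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20))
    return [s + gain * n for s, n in zip(speech, noise)]


def muffle(samples, alpha=0.3):
    """One-pole low-pass filter; smaller alpha removes more high frequencies."""
    out, prev = [], 0.0
    for s in samples:
        prev = prev + alpha * (s - prev)
        out.append(prev)
    return out


# Synthetic stand-ins (assumptions): a 440 Hz tone plays the role of clean
# speech and uniform white noise plays the role of background noise.
rate = 16000
rng = random.Random(0)
speech = [0.5 * math.sin(2 * math.pi * 440 * t / rate) for t in range(rate)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(rate)]

noisy = mix_at_snr(speech, noise, snr_db=10)  # noise sits 10 dB below speech
degraded = muffle(noisy)                      # then dull the high end
```

The SNR and `alpha` values would need to be tuned by ear, or by measurement, until the degraded Common Voice clips sound like my real recordings.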
Thus I am wondering whether you have a script for building your already good model. I saw various issues with similar questions, and I am sorry for somewhat duplicating them, but I am not really familiar with this and I get lost in those issues.