Use several TTS engines to produce a collection of wakeword (and not wakeword) samples.
Collecting wakeword samples is a very common task in voice applications, but doing it manually is tedious. This tool helps you generate a collection of samples for your wakeword.
This has been integrated into the Precise Wakeword Model Maker.
pip install -r requirements.txt
For picovoice you need to install:
sudo apt-get install libttspico0
sudo apt-get install libttspico-utils
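Before running the generators, it can help to confirm that the Pico command-line tool is actually on your PATH. This is a minimal sketch, assuming the libttspico-utils package provides the pico2wave binary (as it does on Debian and Ubuntu). The helper name pico_available is hypothetical and not part of this repo.

```python
import shutil


def pico_available(binary: str = "pico2wave") -> bool:
    """Return True if the Pico TTS command-line tool is found on PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if pico_available():
        print("pico2wave found")
    else:
        print("pico2wave missing: install libttspico-utils")
```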
For Larynx, you may want to set up your own server and set larynx_host in config/TTS_engine_config.json to the server's address and port number (e.g. https://127.0.0.1:5002). You can also leave it as null, in which case the default server (Neon) is used.
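The null fallback for larynx_host could be handled roughly as follows. This is only a sketch: it assumes larynx_host is a top-level key in config/TTS_engine_config.json, and DEFAULT_LARYNX_HOST is a made-up placeholder, not the real Neon server address. The function names are hypothetical too.

```python
import json
from pathlib import Path
from typing import Optional

# Hypothetical placeholder: the real default (Neon) URL is defined by the project.
DEFAULT_LARYNX_HOST = "https://default-larynx.example"


def load_engine_config(path: str = "config/TTS_engine_config.json") -> dict:
    """Load the TTS engine config as a dictionary."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def resolve_larynx_host(config: dict, default: str = DEFAULT_LARYNX_HOST) -> str:
    """Use the configured host if it is set, otherwise fall back to the default."""
    host: Optional[str] = config.get("larynx_host")  # assumed top-level key
    return host.rstrip("/") if host else default


if __name__ == "__main__":
    print(resolve_larynx_host({"larynx_host": None}))
    print(resolve_larynx_host({"larynx_host": "https://127.0.0.1:5002"}))
```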
- Add your wakeword and the syllables of your wakeword to config/TTS_wakeword_config.json
- Edit config/TTS_engine_config.json for the TTS engines and voices you would like to use
- Run python TTS_wakeword_generator.py (you can edit the name of the sub directory it creates in out/ in this file with wakeword_model_name)
- (OPTIONAL) Run python TTS_words_generator.py if you want to scrape a bunch of random popular words (for EN-US this is already in the data/ directory, so you don't have to). WARNING: This can take a very long time to complete and should only be performed if you require another language. I wouldn't recommend doing it for more than 3 or 4 voices, as it takes so long!
- If you want to use the default random TTS data instead of generating your own: move random_TTS_mp3s/ to the out/ directory and run python convert_prescraped_data.py
- All of the converted files are in out/converted/
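Once the steps above have finished, a quick way to sanity-check the result is to count what ended up in out/converted/. The sketch below makes no assumption about the sub directory layout or audio format. It simply walks the tree and groups files by extension. The helper name summarize_converted is hypothetical.

```python
from collections import Counter
from pathlib import Path


def summarize_converted(root: str = "out/converted") -> Counter:
    """Count files under root (recursively), grouped by lowercase extension."""
    base = Path(root)
    if not base.is_dir():
        return Counter()
    return Counter(p.suffix.lower() for p in base.rglob("*") if p.is_file())


if __name__ == "__main__":
    for suffix, count in sorted(summarize_converted().items()):
        print(f"{suffix or '(no extension)'}: {count}")
```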
It's pretty simple:
- wakeword: every TTS voice says the wakeword (e.g. 'hey Jarvis')
- not-wakeword: every TTS voice says the individual syllables of the wakeword (e.g. 'hey', 'jar', 'vis')
- not-wakeword: every TTS voice says all of the syllable pairs of the wakeword (e.g. 'hey jar', 'Jarvis')
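The three phrase groups above can be derived from the syllables alone. The following is a sketch of that idea, not the code in TTS_wakeword_generator.py. It assumes the wakeword is given as a list of words, each split into syllables, so that pairs inside one word are joined without a space ('jarvis') and pairs across words with a space ('hey jar'). The function name build_phrases is hypothetical.

```python
from typing import Dict, List


def build_phrases(words: List[List[str]]) -> Dict[str, List[str]]:
    """Build wakeword and not-wakeword phrases from per-word syllable lists."""
    # Flatten to (syllable, word_index) so we know where word boundaries fall.
    flat = [(syl, i) for i, word in enumerate(words) for syl in word]
    wakeword = " ".join("".join(word) for word in words)

    syllables = [syl for syl, _ in flat]

    pairs = []
    for (a, word_a), (b, word_b) in zip(flat, flat[1:]):
        sep = "" if word_a == word_b else " "
        phrase = a + sep + b
        # A two-syllable wakeword has one pair that IS the wakeword: skip it.
        if phrase != wakeword:
            pairs.append(phrase)

    return {"wakeword": [wakeword], "not-wakeword": syllables + pairs}


if __name__ == "__main__":
    print(build_phrases([["hey"], ["jar", "vis"]]))
    # {'wakeword': ['hey jarvis'], 'not-wakeword': ['hey', 'jar', 'vis', 'hey jar', 'jarvis']}
```

Skipping any pair that equals the full wakeword matters for short wakewords, where it would otherwise be labelled as both classes.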
The config/google-10000-english.txt file has been used to generate additional not-wakeword samples using TTS_words_generator.py, skipping any words with fewer than 4 characters.
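The length cut-off could look like the sketch below. It assumes the word list has one word per line and is not taken from TTS_words_generator.py. The name load_words and its min_chars parameter are hypothetical.

```python
from pathlib import Path
from typing import List


def load_words(path: str, min_chars: int = 4) -> List[str]:
    """Read one word per line and drop words shorter than min_chars."""
    words = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        word = line.strip()
        if len(word) >= min_chars:
            words.append(word)
    return words


if __name__ == "__main__":
    words = load_words("config/google-10000-english.txt")
    print(f"{len(words)} words kept")
```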
It takes a long time to generate all of the samples, so you can use the pre-generated files in data/random_TTS_mp3s/.
These files are great for not-wakeword samples. If you are incrementally training a wakeword model, you can use them as a starting point for the not-wakeword class: test for false wake-ups and add the audio that fails to your data set. :)
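The false wake-up check described above can be sketched independently of any particular model. Everything here is an assumption: score_fn stands in for whatever your wakeword model exposes (a function from audio file path to activation score), and the 0.5 threshold is arbitrary. Neither comes from this repo or from Precise.

```python
from typing import Callable, Iterable, List


def find_false_wakeups(
    paths: Iterable[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Return the not-wakeword files whose score reaches the threshold."""
    return [p for p in paths if score_fn(p) >= threshold]


if __name__ == "__main__":
    # Stand-in scorer for demonstration only: replace with your model.
    fake_scores = {"home.wav": 0.1, "jarring.wav": 0.8}
    failures = find_false_wakeups(fake_scores, fake_scores.get)
    print(failures)  # files to add to the not-wakeword set
```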
WARNING: config/google-10000-english.txt is a list of the most popular words in English according to Google searches. This can include 'dirty and offensive words'. But do you really want your wakeword to wake up when it hears something like a swear word?