Installation
Get a copy of µSpeech from the home page. It is available as both a zip file and a tarball. Install it like any other Arduino library; you can read more about installing libraries here: http://arduino.cc/en/Guide/Libraries. Advanced users can use the µSpeech 4.0 library. More information is available here
In uSpeech.h:
#define F_CONSTANT 380
The 380 is my value. To find yours, compile and upload the example debug_uspeech (found in the installation directory) and open the serial monitor. First enter "a" when you are shown the menu. You will now see a stream of numbers. Generally, when you say "ffff" (note: you have to be close to the microphone) the values will rise above a certain threshold. For me it was 380; yours will differ depending on your microphone hardware.
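One way to turn the numbers from the serial monitor into a value for F_CONSTANT is sketched below. This is only an illustration and not part of µSpeech: the helper name suggestThreshold and the midpoint rule are assumptions. It expects two sets of readings that you copy by hand, one taken while saying "ffff" and one taken while making other sounds.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (not part of µSpeech): suggest a value for F_CONSTANT
// from readings copied out of the serial monitor.
//   other:     values seen while making sounds other than "ffff"
//   fricative: values seen while saying "ffff" close to the microphone
// Returns the midpoint between the largest "other" value and the smallest
// "ffff" value, or -1 if either set is empty or the two sets overlap.
int suggestThreshold(const std::vector<int>& other,
                     const std::vector<int>& fricative) {
    if (other.empty() || fricative.empty()) {
        return -1;
    }
    const int otherMax = *std::max_element(other.begin(), other.end());
    const int fricativeMin =
        *std::min_element(fricative.begin(), fricative.end());
    if (fricativeMin <= otherMax) {
        return -1;  // no clean gap: take more readings or move closer
    }
    return otherMax + (fricativeMin - otherMax) / 2;
}
```

A result of -1 means the two sets of readings overlap, which usually suggests recording again closer to the microphone before picking a value.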
You will also have to set up your microphone so that it only starts picking up values above a certain threshold. To do this, close your serial monitor and reopen it. Enter the option "c" (send it to your Arduino). You will now see another set of values. When you talk, these values should be above a certain threshold, and when you are silent they should be below it.
In the file uSpeech.h:
#define SILENCE 1500
My threshold is around 1500; you will have to change it to your own threshold value.
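Before committing a value for SILENCE, it can help to check a candidate against readings you noted down from option "c". The sketch below is an assumption-laden illustration, not library code: countMisclassified is a hypothetical helper, and it treats a reading equal to the threshold as speech.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of µSpeech): count how many recorded readings
// a candidate SILENCE value would put on the wrong side.
//   silent:  readings noted while nobody was talking
//   talking: readings noted while talking
// A silent reading at or above the threshold, or a talking reading below it,
// counts as one error.
std::size_t countMisclassified(const std::vector<int>& silent,
                               const std::vector<int>& talking,
                               int silenceThreshold) {
    std::size_t errors = 0;
    for (int value : silent) {
        if (value >= silenceThreshold) {
            ++errors;
        }
    }
    for (int value : talking) {
        if (value < silenceThreshold) {
            ++errors;
        }
    }
    return errors;
}
```

Trying a few candidate values and keeping the one with the fewest errors is one simple way to settle on a threshold for your hardware.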
The LED_test example should now work.
To get more advanced speech recognition for 'v' and 's'/'sh', you will have to find your own thresholds for these sounds. They use the coeff algorithm described here: https://arjo129.wordpress.com/2013/10/02/how-%C2%B5speech-works/ The code for this is in the following block in phoneme.cpp:
if(coeff<30 && coeff>20){
    return 'u';
}
else {
    if(coeff<33){
        return 'e';
    }
    else{
        if(coeff<46){
            return 'o';
        }
        else{
            if(coeff<60){
                return 'v'; // /z/ and /v/ fall here
            }
            else{
                if(coeff<80){
                    return 'h'; //means /sh/ sound
                }
                else{
                    if(coeff>80){
                        return 's';
                    }
                    else{
                        return 'm'; //unreachable normally, reached only by error
                    }
                }
            }
        }
    }
}
The values in the if statements were derived experimentally. Usually 'u', 'e' and 'o' don't really work, so you should concentrate on 'v', 'h' (the /sh/ sound) and 's'. The coeff numbers are again thresholds that vary depending on hardware, so you will have to set them up yourself. When you say 'v', 'sh' and 's', the coeff values will fall within a certain range for each sound. To derive the ranges for your microphone, open debug_uspeech and select option 'd'. You will see a stream of numbers. You will have to determine a threshold for 'v', then 'sh' and finally 's'. Again, say each of these sounds into your microphone and estimate (guess) its typical upper boundary, ignoring outliers. After you are done, you can test the result using option 'b' in debug_uspeech.
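If you prefer not to guess the upper boundary by eye, the sketch below shows one possible way to compute it from values copied out of option 'd'. Everything here is an assumption made for illustration: upperBoundary and classifyFricative are hypothetical helpers, and using the 90th percentile to ignore outliers is a choice of this example, not something µSpeech does. The classifier is a simplified version of the phoneme.cpp block that leaves out the 'u', 'e' and 'o' branches and the 'm' case.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of µSpeech): estimate the upper boundary of
// the coeff values recorded for one sound. Outliers are ignored by taking the
// 90th percentile (nearest-rank method) instead of the maximum.
// Returns -1 if no samples were given.
int upperBoundary(std::vector<int> samples) {
    if (samples.empty()) {
        return -1;
    }
    std::sort(samples.begin(), samples.end());
    // 1-based nearest rank: ceil(0.9 * n), computed with integers.
    const std::size_t rank = (samples.size() * 9 + 9) / 10;
    return samples[rank - 1];
}

// Simplified stand-in for the phoneme.cpp block, covering only the sounds
// worth tuning. vUpper and shUpper play the role of the 60 and 80 constants.
char classifyFricative(int coeff, int vUpper, int shUpper) {
    if (coeff < vUpper) {
        return 'v';  // /z/ and /v/
    }
    if (coeff < shUpper) {
        return 'h';  // /sh/
    }
    return 's';
}
```

For example, ten 'v' readings of 51 to 59 plus a single stray 140 give a boundary of 59, so the stray value does not inflate the threshold. The numbers you end up with would then replace the 60 and 80 in phoneme.cpp.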
You are now ready to start your first project. Take a look at the LED on/off tutorial, and for your own project see the getPhoneme() reference.