Watson Speech Android SDK

The Watson Speech SDK for the Android platform provides easy, lightweight interaction with IBM's Watson Speech-To-Text (STT) and Text-To-Speech (TTS) services on Bluemix. The SDK supports recording audio and streaming it to the STT service in real time, returning a transcript of the audio as you speak. This project includes an example application that demonstrates interaction with both the STT and TTS Watson services in the cloud.

The current version of the SDK uses a minSdkVersion of 9, while the example application uses a minSdkVersion of 16.

Installation

Using the library

Download the speech-android-wrapper.aar.
Once unzipped, drag the watsonsdk.aar file into the libs folder in your Android Studio project view.
Open the build.gradle file of your app and set the dependencies as follows:

    dependencies {
        compile fileTree(dir: 'libs', include: ['*.jar'])
        compile (name:'watsonsdk',ext:'aar')
        compile 'com.android.support:appcompat-v7:22.0.0'
    }
    repositories{
        flatDir{
            dirs 'libs'
        }
    }

Clean and run the Android Studio project

Getting credentials

Follow the instructions at http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/getting_started/gs-credentials.shtml to get service credentials.

Speech To Text

Implement the SpeechDelegate and SpeechRecorderDelegate in the MainActivity

These delegates define the callbacks that are invoked when a response is received from the server or when the recorder sends back audio data. SpeechRecorderDelegate is optional.

   public class MainActivity extends Activity implements SpeechDelegate{}

Or with SpeechRecorderDelegate

   public class MainActivity extends Activity implements SpeechDelegate, SpeechRecorderDelegate{}

Instantiate the SpeechToText instance

   SpeechToText.sharedInstance().initWithContext(this.getHost(STT_URL), this.getApplicationContext(), new SpeechConfiguration());
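The getHost call above is a helper on the calling activity, not something this README defines. A minimal sketch of such a helper, assuming initWithContext expects a java.net.URI and that the helper only converts the service URL string; the class name HostHelper and the null-on-error behavior are assumptions for illustration:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper; in an app this would typically be a private method on the activity.
final class HostHelper {
    private HostHelper() {}

    // Converts a service URL string to a URI, or returns null if the string is malformed.
    static URI getHost(String url) {
        try {
            return new URI(url);
        } catch (URISyntaxException e) {
            return null;
        }
    }
}
```

Returning null keeps the sketch short; a real application would more likely report the malformed URL to the user or fail fast at startup.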

Enabling audio compression

By default, audio sent to the server is uncompressed PCM-encoded data. Compressed audio using the Opus codec can be enabled instead.

   SpeechToText.sharedInstance().initWithContext(this.getHost(STT_URL), this.getApplicationContext(), new SpeechConfiguration(SpeechConfiguration.AUDIO_FORMAT_OGGOPUS));

Or this way:

    // Configuration
    SpeechConfiguration sConfig = new SpeechConfiguration(SpeechConfiguration.AUDIO_FORMAT_OGGOPUS);
    // STT
    SpeechToText.sharedInstance().initWithContext(this.getHost(STT_URL), this.getApplicationContext(), sConfig);

Set the Credentials and the delegate

   SpeechToText.sharedInstance().setCredentials(this.USERNAME,this.PASSWORD);
   SpeechToText.sharedInstance().setDelegate(this);

Alternatively, pass a token provider object that the SDK uses to retrieve tokens for authenticating against the STT service.

   SpeechToText.sharedInstance().setTokenProvider(new MyTokenProvider(this.strSTTTokenFactoryURL));
   SpeechToText.sharedInstance().setDelegate(this);
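MyTokenProvider is not defined in this README. The following is a minimal sketch of what such a class might look like, assuming the SDK expects an object with a getToken() method that returns the token as a String, and assuming the token factory URL returns the token as a plain-text response body. The TokenProvider interface is redeclared locally only so that the snippet compiles on its own; the timeouts and the readToken helper are illustrative choices.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Local stand-in for the SDK's token provider interface (assumed shape).
interface TokenProvider {
    String getToken();
}

// Hypothetical provider that fetches a token from a server you control.
class MyTokenProvider implements TokenProvider {
    private final String tokenFactoryUrl;

    MyTokenProvider(String tokenFactoryUrl) {
        this.tokenFactoryUrl = tokenFactoryUrl;
    }

    // Performs a blocking HTTP GET; returns null if the request fails.
    @Override
    public String getToken() {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) new URL(tokenFactoryUrl).openConnection();
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            try (InputStream in = conn.getInputStream()) {
                return readToken(in);
            }
        } catch (IOException e) {
            return null;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    // Reads the whole response body and trims surrounding whitespace.
    static String readToken(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line);
            }
        }
        return sb.toString().trim();
    }
}
```

Because getToken() blocks on the network in this sketch, it should not be called from the Android main thread. The same class could be passed to the TTS service with a different token factory URL.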

Get a list of models supported by the service

   JSONObject models = getModels();

Get details of a particular model

   JSONObject model = getModelInfo("en-US_BroadbandModel");

Pick the model to be used

   SpeechToText.sharedInstance().setModel("en-US_BroadbandModel");

Start Audio Transcription

   SpeechToText.sharedInstance().recognize();

If you implemented SpeechRecorderDelegate and need to process the recorded audio data, set the recorder delegate as well.

   SpeechToText.sharedInstance().recognize();
   SpeechToText.sharedInstance().setRecorderDelegate(this);

Delegate methods to receive messages from the SDK

    public void onOpen() {
        // the  connection to the STT service is successfully opened 
    }

    public void onError(String error) {
        // error interacting with the STT service
    }

    public void onClose(int code, String reason, boolean remote) {
        // the connection with the STT service was just closed
    }

    public void onMessage(String message) {
        // a message comes from the STT service with recognition results 
    }
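To show how the four callbacks above fit together, here is a minimal sketch of a delegate that tracks the connection state and collects the raw messages it receives. The SpeechDelegate interface is redeclared locally only so that the snippet compiles on its own; in an app you would implement the SDK's interface instead. The class name TranscriptCollector and its accessors are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Local stand-in mirroring the four callbacks listed above.
interface SpeechDelegate {
    void onOpen();
    void onError(String error);
    void onClose(int code, String reason, boolean remote);
    void onMessage(String message);
}

// Hypothetical delegate that records what the service reports.
class TranscriptCollector implements SpeechDelegate {
    private final List<String> messages = new ArrayList<>();
    private boolean open;
    private String lastError;

    @Override
    public synchronized void onOpen() {
        open = true;
        lastError = null;
    }

    @Override
    public synchronized void onError(String error) {
        lastError = error;
    }

    @Override
    public synchronized void onClose(int code, String reason, boolean remote) {
        open = false;
    }

    @Override
    public synchronized void onMessage(String message) {
        // Store the raw message; parsing is left to the caller.
        messages.add(message);
    }

    synchronized boolean isOpen() {
        return open;
    }

    synchronized String getLastError() {
        return lastError;
    }

    // Returns a read-only snapshot of the messages received so far.
    synchronized List<String> getMessages() {
        return Collections.unmodifiableList(new ArrayList<>(messages));
    }
}
```

The methods are synchronized on the assumption that callbacks may arrive on a background thread; if you update views from them, post the update to the UI thread (for example with runOnUiThread).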

End Audio Transcription

   SpeechToText.sharedInstance().stopRecording();

Receive speech power levels during recognition

The amplitude is calculated from the audio data buffer, and the volume (in dB) is calculated based on it.

    @Override
    public void onAmplitude(double amplitude, double volume) {
        // your code here
    }
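The SDK's exact formulas are not given here. As an illustration of how such values are commonly derived, the sketch below computes a root-mean-square amplitude from a buffer of 16-bit PCM samples and converts it to decibels relative to full scale. The class and method names are hypothetical, and the choice of RMS and of 32768 as the reference level are assumptions.

```java
// Hypothetical helper showing one common way to derive a level from PCM data.
final class AudioLevel {
    private AudioLevel() {}

    // Root-mean-square amplitude of 16-bit PCM samples, in the range 0..32768.
    static double rmsAmplitude(short[] samples) {
        if (samples.length == 0) {
            return 0.0;
        }
        double sumSquares = 0.0;
        for (short s : samples) {
            sumSquares += (double) s * s;
        }
        return Math.sqrt(sumSquares / samples.length);
    }

    // Level in dB relative to full scale: 0 dB is the loudest possible value,
    // quieter signals are negative, and silence maps to negative infinity.
    static double volumeDb(double amplitude) {
        if (amplitude <= 0.0) {
            return Double.NEGATIVE_INFINITY;
        }
        return 20.0 * Math.log10(amplitude / 32768.0);
    }
}
```

With this convention, a signal at half of full scale comes out at roughly -6 dB, which makes the value convenient for driving a level meter in the UI.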

Text To Speech

Instantiate the TextToSpeech instance

   TextToSpeech.sharedInstance().initWithContext(this.getHost(TTS_URL));

Set the Credentials

   TextToSpeech.sharedInstance().setCredentials(this.USERNAME,this.PASSWORD);

Alternatively, pass a token provider object that the SDK uses to retrieve tokens for authenticating against the TTS service.

   TextToSpeech.sharedInstance().setTokenProvider(new MyTokenProvider(this.strTTSTokenFactoryURL));

Get a list of voices supported by the service

   TextToSpeech.sharedInstance().voices();

Pick the voice to be used

   TextToSpeech.sharedInstance().setVoice("en-US_MichaelVoice");

Generate and play audio

  TextToSpeech.sharedInstance().synthesize(ttsText);

shenybluemix/speech-android-sdk

Folders and files

Latest commit

History

Repository files navigation

Watson Speech Android SDK

Table of Contents

Installation

Getting credentials

Speech To Text

Implement the SpeechDelegate and SpeechRecorderDelegate in the MainActivity

Instantiate the SpeechToText instance

Get a list of models supported by the service

Get details of a particular model

Pick the model to be used

Start Audio Transcription

End Audio Transcription

Receive speech power levels during the recognize

Text To Speech

Instantiate the TextToSpeech instance

Get a list of voices supported by the service

Pick the voice to be used

Generate and play audio

Common issues

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages