This project allows a user to generate natural-sounding voices using the Amazon Polly service. You will need an AWS account to run this project yourself.
You will need to set up your AWS account on your local machine. Instructions on how to do this can be found in Amazon's documentation.
You will also need to create an S3 bucket with a unique name to store the generated files. You pass the bucket name as a command line argument when you run the program.
Finally, fill the inputs.csv file with the translations you want to synthesize. Use the AWS region codes as column headers, one column per language.
Warning: In the program's current state, adding new languages will require code changes.
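Because every synthesis request is billed, it can be worth checking inputs.csv before running the program. The following is a minimal sketch, not part of the project, that lists each column header and how many non-empty cells it contains. It assumes only that the first row of the file holds the column headers and that the file is UTF-8 encoded; the function name summarize_inputs is hypothetical.

```python
import csv


def summarize_inputs(path="inputs.csv"):
    """Return {column header: number of non-empty cells} for a CSV file.

    Assumption: the first row contains the column headers.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        counts = {header: 0 for header in (reader.fieldnames or [])}
        for row in reader:
            for header in counts:
                # Treat missing cells and whitespace-only cells as empty.
                if (row.get(header) or "").strip():
                    counts[header] += 1
    return counts
```

Calling summarize_inputs() from the project folder and printing the result should show at a glance whether a language column is missing entries.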
Run the program from the command line. The examples below assume the bucket name is "gazeplay-voice-bucket":
> gradlew run --args gazeplay-voice-bucket # Windows
$ ./gradlew run --args gazeplay-voice-bucket # Unix
Be patient. Each synthesis runs asynchronously in the cloud, but the program waits for each one to complete before moving on to the next. This keeps the request rate low and helps you avoid high cloud usage fees.
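The wait-for-each-task behaviour can be illustrated with a short Python sketch. This is not the project's own code, only an approximation of the idea: start one asynchronous Polly task, poll its status until it finishes, and only then return so the caller can start the next one. It assumes a client that behaves like boto3's Polly client; the function name synthesize_and_wait, the MP3 output format, and the five second polling interval are assumptions made for illustration.

```python
import time


def synthesize_and_wait(polly, text, voice_id, bucket,
                        poll_seconds=5.0, sleep=time.sleep):
    """Start one Polly synthesis task and block until it finishes.

    `polly` is expected to behave like boto3.client("polly").
    Returns the output URI of the generated file.
    """
    response = polly.start_speech_synthesis_task(
        OutputFormat="mp3",          # assumed format for this sketch
        OutputS3BucketName=bucket,
        Text=text,
        VoiceId=voice_id,
    )
    task = response["SynthesisTask"]

    # Poll until the task leaves the "scheduled" / "inProgress" states.
    while task["TaskStatus"] in ("scheduled", "inProgress"):
        sleep(poll_seconds)
        task = polly.get_speech_synthesis_task(
            TaskId=task["TaskId"]
        )["SynthesisTask"]

    if task["TaskStatus"] != "completed":
        reason = task.get("TaskStatusReason", "unknown reason")
        raise RuntimeError(f"Synthesis failed: {reason}")
    return task["OutputUri"]
```

With boto3 installed and AWS credentials configured, this could be called as synthesize_and_wait(boto3.client("polly"), "Hello", "Joanna", "gazeplay-voice-bucket"). Calling it in a plain loop keeps only one task in flight at a time.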
The sounds will reside in the root of your chosen bucket. Each file will have the correct GazePlay-compatible name, except for a unique identifier inserted just before the file extension. Once all the files have been generated, download them and remove this identifier. They will then be ready for inclusion in GazePlay.
You can download the contents of a bucket to a local folder using the AWS CLI:
$ aws s3 sync s3://gazeplay-voice-bucket ./gazeplay-voices
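Here is a handy Python script to quickly rename all the files you download:
import os

folder = 'gazeplay-voices'
for file in os.listdir(folder):
    parts = file.split('.')
    if len(parts) != 5:
        continue  # skip files that do not have the expected five dot-separated parts
    # Drop parts[3], the unique identifier added during synthesis
    name = '.'.join(parts[:3] + parts[4:])
    print(name)
    os.rename(os.path.join(folder, file), os.path.join(folder, name))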