Voice synthesis for the GazePlay project, powered by AWS Polly

GazePlay/gazeplay-voice-synthesis

GazePlay Voice Synthesis

This project allows a user to generate natural-sounding voices using the Amazon Polly service. You will need an AWS account to run this project yourself.

Set up

You will need to set up your AWS account on your local machine. Instructions on how to do this can be found in Amazon's documentation.

You will also need to create a uniquely named S3 bucket to store the generated files. You will pass its name as a command-line argument when running the program.

Finally, fill the inputs.csv file with the translations you want to synthesize. Use the AWS region codes as column headers, one column per language.

Warning: In the program's current state, adding new languages will require code changes.
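Before running a synthesis, it can help to check what is in inputs.csv. The following is a minimal sketch, assuming one column per language and one phrase per row; that layout is an assumption, not something this README confirms. The column headers and phrases in the sample are placeholders, and `phrases_by_column` is a hypothetical helper that is not part of this project.

```python
import csv
import io

# Hypothetical file contents: headers and phrases are illustrative only.
SAMPLE = """en-GB,fr-FR
Hello,Bonjour
Well done,Bravo
"""


def phrases_by_column(csv_file):
    """Group the cells of a CSV file by column header, skipping empty cells."""
    reader = csv.DictReader(csv_file)
    columns = {header: [] for header in (reader.fieldnames or [])}
    for row in reader:
        for header, text in row.items():
            if header is None:
                continue  # extra cells beyond the header row are ignored
            if text and text.strip():
                columns[header].append(text.strip())
    return columns


if __name__ == "__main__":
    for header, phrases in phrases_by_column(io.StringIO(SAMPLE)).items():
        print(header, len(phrases), phrases)
```

To inspect a real file, pass an open file handle instead of the sample, for example `phrases_by_column(open('inputs.csv', newline='', encoding='utf-8'))`.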

Running the program

Run the program from the command line, passing the bucket name as the argument. The examples below assume the bucket is named "gazeplay-voice-bucket":

> gradlew run --args gazeplay-voice-bucket      # Windows
$ ./gradlew run --args gazeplay-voice-bucket    # Unix

Be patient. Each synthesis runs asynchronously in the cloud, but the program waits for one to complete before starting the next. This limits the request rate and keeps you from incurring high cloud usage fees.
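The one-at-a-time behaviour described above can be sketched in Python as follows. This is not the project's own code, which is run through Gradle; it only illustrates the start-then-poll pattern. It assumes a client object exposing the Polly operations `start_speech_synthesis_task` and `get_speech_synthesis_task` as boto3 names them. The function name, the job tuple layout, and the polling interval are hypothetical.

```python
import time


def synthesize_sequentially(polly, bucket, jobs, poll_seconds=5.0, sleep=time.sleep):
    """Run one Polly synthesis task at a time and wait for each to finish.

    jobs is an iterable of (text, voice_id, key_prefix) tuples (assumed layout).
    Returns a list of (key_prefix, final_status) pairs.
    """
    results = []
    for text, voice_id, key_prefix in jobs:
        task = polly.start_speech_synthesis_task(
            Text=text,
            VoiceId=voice_id,
            OutputFormat="mp3",
            OutputS3BucketName=bucket,
            OutputS3KeyPrefix=key_prefix,
        )["SynthesisTask"]
        status = task["TaskStatus"]
        # Poll until the task leaves the "scheduled" / "inProgress" states,
        # so that only one task is ever running at a time.
        while status not in ("completed", "failed"):
            sleep(poll_seconds)
            response = polly.get_speech_synthesis_task(TaskId=task["TaskId"])
            status = response["SynthesisTask"]["TaskStatus"]
        results.append((key_prefix, status))
    return results
```

With boto3 installed and AWS credentials configured, the client could be created with `boto3.client('polly')` and passed in as `polly`. Waiting on each task trades total run time for a predictable request rate.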

Retrieving the sounds

The sounds will reside in the root of your chosen bucket. Each file will have the correct GazePlay-compatible name, except for a unique identifier inserted just before the file extension. Download all the files once they have been generated, then remove this identifier. They will then be ready for inclusion in GazePlay.

You can download the contents of a bucket to a local folder using the AWS CLI command aws s3 sync s3://gazeplay-voice-bucket ./gazeplay-voices.

Here is a handy Python script to quickly rename all the files you download:

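import os

folder = 'gazeplay-voices'

for file in os.listdir(folder):
    parts = file.split('.')
    # Expect five dot-separated parts; the fourth is the unique identifier
    if len(parts) != 5:
        continue  # leave anything that does not match untouched
    # Rebuild the name without the identifier (parts[3])
    name = '.'.join(parts[:3] + parts[4:])
    print(name)
    os.rename(os.path.join(folder, file), os.path.join(folder, name))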
