The Neural Network (NN) based Speech Synthesis System

This repository contains the Neural Network (NN) based Speech Synthesis System
developed at the Centre for Speech Technology Research (CSTR), University of Edinburgh.

Step 0: download data from http://homepages.inf.ed.ac.uk/zwu2/dnn_baseline/dnn_baseline_practice.tar.gz and put it in your working directory

Step 1: Install Theano version 0.6 and above (http://deeplearning.net/software/theano/). You may need other third-party python libraries such as matplotlib.

Step 2: Modify the configuration file 'myconfigfile_mvn_DNN_basic.conf' to fit your working environment. a) Change the following settings: [Paths] work: <YOUR_OWN_WORKING_DIRECTORY> data: <WHERE_IS_YOUR_DATA>

        log_config_file: <CODE_DIRECTORY>/configuration/myloggingconfigfile.conf

        sptk: <WHERE_ARE_SPTK_BINARY_FILES>
        straight: <WHERE_ARE_STRAIGHT_BINARIES>
        
b) By default, the program will run through the following processes if you keep them to be 'True'. You could run step by step by tuning on one process and tuning off all the other processes.
    NORMLAB  : True
    MAKECMP  : True
    NORMCMP  : True
    TRAINDNN : True
    DNNGEN   : True
    GENWAV   : False
    CALMCD   : True

Note: current recipe required C-version STRAIGHT, which cannot be distributed. Please replace the GENWAV step by using your own vocoder.

Step 3: Create myloggingconfigfile.conf based on exampleloggingconfigfile.conf

Step 4: Run './submit.sh ./run_lstm.py ' script. It will lock a GPU automatically. If you do not want to lock GPU, please modify the ./run_dnn.sh script.

If everything goes smoothly, after 3 to 5 hours, the program will give objective results, and synthesised voices if you have your own vocoder.

Feel free to contact us if you have any questions.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
configuration		configuration
doc		doc
frontend		frontend
io_funcs		io_funcs
layers		layers
logplot		logplot
models		models
recipes		recipes
tools		tools
training_schemes		training_schemes
utils		utils
work_in_progress		work_in_progress
COPYING		COPYING
CREDITS.md		CREDITS.md
LICENSE		LICENSE
README.md		README.md
gpu_lock.py		gpu_lock.py
run_dnn.py		run_dnn.py
run_dnn.sh		run_dnn.sh
run_dnn_bottleneck.py		run_dnn_bottleneck.py
run_lstm.py		run_lstm.py
run_mdn.py		run_mdn.py
run_mvn_MDN.sh		run_mvn_MDN.sh
store_model.py		store_model.py
submit.sh		submit.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Licenses found

Repository files navigation

The Neural Network (NN) based Speech Synthesis System

About

Licenses found

Releases

Packages

Languages

License

Licenses found

zhizhengwu/merlin

Folders and files

Latest commit

History

Repository files navigation

The Neural Network (NN) based Speech Synthesis System

About

Resources

License

Licenses found

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages