Flask-Tacotron2-TTS-Web-App

This repo was forked from NVIDIA/Tacotron2 for inference testing only (not for training).

Because I didn't know Flask well, I forked CodeDem/flask-musing-streaming for the web app part.

If you want to test NVIDIA Tacotron2 models in a Jupyter notebook, you are better off using the inference demo in NVIDIA/Tacotron2.

Installation

Install PyTorch 1.0 (an NVIDIA CUDA GPU is required).
pip install -r requirements.txt
Clone the WaveGlow repo: https://github.com/NVIDIA/waveglow.git

or run git submodule init; git submodule update
You need both a Tacotron2 model and a WaveGlow model:
1. NVIDIA/Tacotron2's models for the inference demo: Tacotron 2, WaveGlow
2. or my trained models:

  Tacotron2: English_90k_steps (LJSpeech dataset), Korean_162k_steps (KSS dataset)

  WaveGlow: waveglow_152k_steps, trained on the Korean dataset

python app.py

Or you can test TTS on the console: python console_test.py

You can change the model paths in config.json.
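Since the model paths live in config.json, it can help to check them before starting the app. The following is only a minimal sketch: the key names `tacotron2_path` and `waveglow_path` and both helper functions are hypothetical, so check config.json in this repo for the real key names.

```python
import json
from pathlib import Path

# Hypothetical key names, used for illustration only.
# Look at config.json in this repo for the keys it actually uses.
REQUIRED_KEYS = ("tacotron2_path", "waveglow_path")


def load_model_paths(config_path, required_keys=REQUIRED_KEYS):
    """Read a JSON config and return {key: Path} for each required key."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    missing = [k for k in required_keys if k not in config]
    if missing:
        raise KeyError(f"config is missing keys: {missing}")
    return {k: Path(config[k]) for k in required_keys}


def missing_files(paths):
    """Return the keys whose checkpoint file does not exist on disk."""
    return [k for k, p in paths.items() if not p.is_file()]
```

Calling `missing_files(load_model_paths("config.json"))` before `python app.py` would list any checkpoint that has not been downloaded yet, assuming the keys match.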

You may see Warning! Decoder Max on the console.

In that case, the synthesized audio will be about 11 seconds long and sound weird.

This problem happens often with my Korean-trained model, but rarely with my English-trained model.
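The 11-second figure most likely comes from the decoder step limit. The sketch below assumes this repo's hparams.py still has the upstream NVIDIA/Tacotron2 defaults (max_decoder_steps=1000, hop_length=256, sampling_rate=22050, n_frames_per_step=1); the helper name `max_audio_seconds` is only for illustration.

```python
def max_audio_seconds(max_decoder_steps=1000, hop_length=256,
                      sampling_rate=22050, n_frames_per_step=1):
    """Longest audio the decoder can produce before it is cut off.

    The default values are assumed to match the upstream hparams.py;
    adjust them if this repo's hparams.py differs.
    """
    # Each decoder step emits n_frames_per_step mel frames,
    # and each mel frame covers hop_length audio samples.
    frames = max_decoder_steps * n_frames_per_step
    return frames * hop_length / sampling_rate


print(round(max_audio_seconds(), 2))  # 11.61
```

When the decoder never predicts a stop token, it runs to this limit, which would explain why the broken outputs all have the same length. Under the same assumptions, raising max_decoder_steps makes the cutoff later but does not fix the missing stop prediction.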

I can't find any difference in the synthesized audio between waveglow_256channels.pt (the WaveGlow demo model) and my waveglow_152k.
