Fix sil (#22)
* Update infore dataset

* Use new textgrid data.

- Update download URL and hash.
- Use `sil` instead of `sp`.
- Normalize audio to match HiFiGAN preprocessing.
- Randomly drop tokens when training the duration model to prevent overfitting.
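
As an illustration of the token dropout item above, here is a minimal sketch in plain Python. It is not the code from this commit: the function name `drop_tokens`, the `MASK_ID` value, and the 25% rate are all assumptions chosen for the example. The idea is that each input token is independently replaced by a reserved id with some probability, so the duration model cannot rely on memorizing exact phoneme sequences.

```python
import random

# Hypothetical id reserved for "dropped" tokens (an assumption, not from the repo).
MASK_ID = 0


def drop_tokens(tokens, rate, rng):
    """Replace each token with MASK_ID independently with probability `rate`."""
    return [MASK_ID if rng.random() < rate else t for t in tokens]


# Example: a short sequence of made-up phoneme ids.
rng = random.Random(0)
phonemes = [5, 12, 7, 3, 9, 12, 4, 8]
dropped = drop_tokens(phonemes, rate=0.25, rng=rng)

# The sequence length is unchanged, so duration targets stay aligned
# with their token positions.
print(phonemes)
print(dropped)
```

Dropout of this kind would be applied only during training; at inference time the tokens are passed through unchanged.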

* Load phoneme set from config instead of from the lexicon file.

This keeps the phoneme set unchanged even if the dataset or the lexicon file changes.
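
To show why a fixed phoneme set matters, the sketch below compares the two approaches. Everything in it is hypothetical: the `PHONEMES` list, the helper names, and the toy lexicons are illustrative, and deriving the set by sorting lexicon entries is only an assumption about how a lexicon-based set might be built. The point is that ids derived from a lexicon can shift when the lexicon changes, which would silently break a pretrained model's embedding table, while ids taken from a config list stay put.

```python
# Hypothetical config value: the phoneme inventory is written out explicitly.
PHONEMES = ["sil", "a", "b", "k", "o"]  # illustrative, not the repo's real set


def build_phoneme_index(phonemes):
    """Map each phoneme to an integer id based on its position in the list."""
    return {p: i for i, p in enumerate(phonemes)}


def phonemes_from_lexicon(lexicon):
    """Assumed lexicon-based alternative: collect and sort all phonemes seen."""
    return sorted({p for pron in lexicon.values() for p in pron})


lexicon_v1 = {"ba": ["b", "a"], "ko": ["k", "o"]}
lexicon_v2 = {"ba": ["b", "a"], "ko": ["k", "o"], "da": ["d", "a"]}  # new word

# Ids derived from the lexicon shift once "d" appears.
ids_v1 = build_phoneme_index(phonemes_from_lexicon(lexicon_v1))
ids_v2 = build_phoneme_index(phonemes_from_lexicon(lexicon_v2))
print(ids_v1["k"], ids_v2["k"])

# Ids derived from the config do not depend on the lexicon at all.
index = build_phoneme_index(PHONEMES)
print(index["k"])
```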

* Use `jax.tree_map` instead of `jax.tree_multimap`.

* Better log file names

* Remove colab links in notebooks

* Fix `zero_silence_segments` script.

* Update pretrained models
NTT123 authored May 16, 2022
1 parent 07a5d8a commit 8d2ee1f
Showing 18 changed files with 8,346 additions and 4,454 deletions.
7 changes: 4 additions & 3 deletions README.md
@@ -1,7 +1,7 @@
A Vietnamese TTS
================

-Tacotron + HiFiGAN vocoder for vietnamese datasets.
+Duration model + Acoustic model + HiFiGAN vocoder for vietnamese text-to-speech application.

Online demo at https://huggingface.co/spaces/ntt123/vietTTS.

@@ -32,12 +32,13 @@ Download InfoRe dataset
-----------------------

```sh
-bash ./scripts/download_aligned_infore_dataset.sh
+python ./scripts/download_aligned_infore_dataset.py
```

**Note**: this is a denoised and aligned version of the original dataset which is donated by the InfoRe Technology company (see [here](https://www.facebook.com/groups/j2team.community/permalink/1010834009248719/)). You can download the original dataset (**InfoRe Technology 1**) at [here](https://github.com/TensorSpeech/TensorFlowASR/blob/main/README.md#vietnamese).

-The Montreal Forced Aligner (MFA) is used to align transcript and speech (textgrid files). [Here](https://colab.research.google.com/gist/NTT123/c99b5a391af56e0cb8f7b190d3d7f0ee/infore-mfa-example.ipynb) is a Colab notebook to align InfoRe dataset. Visit [MFA](https://montreal-forced-aligner.readthedocs.io/en/latest/) for more information on how to create textgrid files.
+See `notebooks/denoise_infore_dataset.ipynb` for instructions on how to denoise the dataset. We use the Montreal Forced Aligner (MFA) to align transcript and speech (textgrid files).
+See `notebooks/align_text_audio_infore_mfa.ipynb` for instructions on how to create textgrid files.

Train duration model
--------------------
Binary file modified assets/infore/clip.wav
Binary file not shown.