EA-SVC

An implementation of "Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training"

Data preparation

  1. PPG features (10 ms frame shift)
  2. F0 features (10 ms frame shift)
  3. Speaker embedding (one embedding per wav file)
  4. Audio files (WAV format, 24000 Hz sample rate, mono)
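
The numbers above imply 240 audio samples per feature frame (24000 Hz × 10 ms). A minimal sketch of a sanity check on frame counts follows. The helper names, the no-padding framing convention, and the `tolerance` parameter are assumptions for illustration and are not part of this repository.

```python
# Sketch only: the exact framing convention (padding, rounding) used by the
# feature extractors is an assumption; the README does not specify it.

SAMPLE_RATE = 24000    # Hz, from the README
FRAME_SHIFT_MS = 10    # ms, from the README


def hop_length(sample_rate=SAMPLE_RATE, frame_shift_ms=FRAME_SHIFT_MS):
    """Number of audio samples covered by one feature frame."""
    return sample_rate * frame_shift_ms // 1000


def expected_num_frames(num_samples, sample_rate=SAMPLE_RATE,
                        frame_shift_ms=FRAME_SHIFT_MS):
    """Approximate frame count, assuming one frame per full hop (no padding)."""
    return num_samples // hop_length(sample_rate, frame_shift_ms)


def is_aligned(num_ppg_frames, num_f0_frames, num_samples, tolerance=2):
    """Check that both feature streams roughly match the audio length.

    `tolerance` (in frames) is a hypothetical slack for extractors that
    handle the edges of the signal differently.
    """
    target = expected_num_frames(num_samples)
    return (abs(num_ppg_frames - target) <= tolerance
            and abs(num_f0_frames - target) <= tolerance)
```

For a 2-second file (48000 samples), this convention expects about 200 PPG frames and 200 F0 frames.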

Write Configuration

Set paths, directories, and other options in the .json files in the "configs" directory. Then rewrite the data loading function in utils/dataset.py for your own data.
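
Before training, it can help to confirm that every path set in a config file exists. The following is a minimal sketch, not code from this repository. The key names used in the usage note (`data.wav_dir` and so on) are hypothetical, since the actual keys depend on the .json files in "configs".

```python
import json
import os


def load_config(path):
    """Read a JSON config file into a dict."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def flatten(config, prefix=""):
    """Flatten nested dicts into dotted keys, e.g. {"a": {"b": 1}} -> {"a.b": 1}."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def missing_paths(flat_config, path_keys):
    """Return the keys in `path_keys` that are absent or point to nothing on disk.

    `path_keys` is supplied by the caller; this sketch does not guess which
    config entries are paths.
    """
    return [key for key in path_keys
            if key not in flat_config
            or not os.path.exists(str(flat_config[key]))]
```

A possible use, with hypothetical key names: `missing_paths(flatten(load_config("configs/stage1.json")), ["data.wav_dir", "data.ppg_dir"])` returns the entries that still need to be fixed.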

Model Training

Single GPU

CUDA_VISIBLE_DEVICES=0 python train.py -c configs/stage1.json
CUDA_VISIBLE_DEVICES=0 python train.py -c configs/stage2.json
CUDA_VISIBLE_DEVICES=0 python train.py -c configs/stage3.json
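
The three commands above can also be chained from a small Python wrapper. This is a sketch that assumes the stages are meant to run in order, which the numbering suggests but the README does not state. The function names are illustrative and not part of this repository.

```python
import os
import subprocess
import sys

STAGES = (1, 2, 3)


def build_stage_command(stage, python=sys.executable):
    """Build the train.py command line for one stage, as shown in the README."""
    return [python, "train.py", "-c", f"configs/stage{stage}.json"]


def run_all_stages(gpu_id="0"):
    """Run stage 1, 2, and 3 one after another on a single GPU."""
    # Same effect as prefixing each command with CUDA_VISIBLE_DEVICES=<gpu_id>.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    for stage in STAGES:
        # check=True raises CalledProcessError if a stage fails,
        # so later stages do not start after a failure.
        subprocess.run(build_stage_command(stage), env=env, check=True)
```

Calling `run_all_stages("0")` from the repository root would launch the three training runs in sequence.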