Voice DeepFake and Pose Transfer using GANs
UCR CS228 Deep Learning Final Project
MaskGAN-VC
- Install dependencies with "pip3 install -r requirements.txt".
- Dataset:
- Training:
- Run "train_submission.ipynb". The notebook will output loss graphs and MCDs. It will also output wavs into "/code/outputs/generated_audio
For all augmentations:
- Run "visualizations_(just_plots)_submission.ipynb" to produce all augmentation data visualizations (the blue audio-signal plots and the log-mel spectrograms).
Dance GAN (visualization file and dance code provided as Drive files)
- Run the notebook "run_Dance.ipynb"
- Dataset: you need the "DL Project" folder (linked above) in your personal Google Drive (see the Colab mounting sketch after this list).
- Important: you must stop the notebook where it tells you to stop (the notebook will instruct you), then follow the instructions in the notebook.
- For the DanceGAN output figures, all source frames are written to "DLproject/EverybodyDanceNow_reproduce_pytorch/data/source/images/[frame number].png", and the frames of the result video are written to "DLproject/EverybodyDanceNow_reproduce_pytorch/results/target/test_latest/images/images/".
- The pose-transfer dance video is written to "DLproject/EverybodyDanceNow_reproduce_pytorch/Final_Output.mp4" (a sketch for rebuilding it from the result frames follows below).
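As noted in the Dataset step above, the notebook expects the "DL Project" folder in your Drive. Assuming the notebook is run in Google Colab (an assumption, since the README only says "personal drive"), mounting Drive looks like this:

```python
# Assumption: the notebook runs in Google Colab. Mount Drive so the
# "DL Project" folder (and the DLproject/... paths) are visible to the notebook.
from google.colab import drive

drive.mount('/content/drive')
# After mounting, the project lives under a path like:
#   /content/drive/MyDrive/DL Project/   (exact location depends on your Drive)
```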
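If you ever need to rebuild the final video yourself from the result frames (for example after re-running only the rendering step), the sketch below stitches the PNG frames in the results directory into an MP4 with OpenCV; the frame rate and the assumption that frames sort correctly by filename are mine, not taken from the project code.

```python
# Sketch: stitch result frames into an MP4 (assumptions: OpenCV is installed,
# frames sort correctly by filename, and 24 fps is an arbitrary choice).
import glob
import cv2

frames = sorted(glob.glob(
    "DLproject/EverybodyDanceNow_reproduce_pytorch/"
    "results/target/test_latest/images/images/*.png"))

first = cv2.imread(frames[0])
height, width = first.shape[:2]

writer = cv2.VideoWriter(
    "DLproject/EverybodyDanceNow_reproduce_pytorch/Final_Output.mp4",
    cv2.VideoWriter_fourcc(*"mp4v"), 24, (width, height))

for path in frames:
    writer.write(cv2.imread(path))
writer.release()
```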