This repository contains the official PyTorch implementation of the following paper:
Generative Adversarial Graph Convolutional Networks for Human Action Synthesis, Bruno Degardin, João Neves, Vasco Lopes, João Brito, Ehsan Yaghoubi and Hugo Proença, WACV 2022. [Arxiv Preprint]
Material related to our paper is available via the following links:
- Paper: https://arxiv.org/abs/2110.11191
- Video: Youtube Demo
- Code: https://github.com/DegardinBruno/Kinetic-GAN
- Poster: Kinetic-GAN Poster
- Datasets (ready to use, send email / open issue if server is down):
- Both Linux and Windows are supported, but we strongly recommend Linux for performance and compatibility reasons.
- 64-bit Python 3.7+ installation. We recommend pip.
- PyTorch >= 1.7.1
- A GPU is not mandatory, but we highly recommend one for reproducibility of results and speed.
pip install -r requirements.txt # use flag --user if permission needed
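As a quick sanity check of the environment (a minimal sketch that only illustrates the requirements listed above), you can verify the PyTorch version and GPU availability:

```python
# Quick environment check: PyTorch version and GPU availability.
import torch

print("PyTorch version:", torch.__version__)          # should be >= 1.7.1
print("CUDA available: ", torch.cuda.is_available())  # a GPU is recommended but not mandatory
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```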
We provide comprehensive benchmarks to evaluate the supported models on different datasets using the standard evaluation setup. All models can be downloaded from the provided links. Send an email or open an issue if the server is down.
arch | benchmark | actions | frame length x coordinate dimensions | FID | Config | Model
---|---|---|---|---|---|---
kinetic-gan-mlp4 | cross-subject | 60 | 64 x 3 | 3.618 | config | weights
kinetic-gan-mlp8 | cross-subject | 60 | 64 x 3 | 4.396 | config | weights
kinetic-gan-mlp6 | cross-view | 60 | 64 x 3 | 4.235 | config | weights
kinetic-gan-mlp8 | cross-view | 60 | 64 x 3 | 4.610 | config | weights
*FID results may vary slightly due to the random normal distribution and random noise.
**Action control is better with MLP depth 8 (check for yourself with the visualization scripts).
arch | benchmark | actions | frame length x coordinate dimensions | FID | Config | Model
---|---|---|---|---|---|---
kinetic-gan-mlp8 | cross-subject | 120 | 64 x 3 | 5.967 | config | weights
kinetic-gan-mlp8 | cross-setup | 120 | 64 x 3 | 6.751 | config | weights
*FID results may vary slightly due to the random normal distribution and random noise.
**Action control is better with MLP depth 8 (check for yourself with the visualization scripts).
arch | actions | frame length x coordinate dimensions | MMDa | MMDs | Config | Model
---|---|---|---|---|---|---
kinetic-gan-mlp4 | 10 | 32 x 2 | 0.071 | 0.079 | config | weights
kinetic-gan-mlp8 | 10 | 64 x 2 | 0.074 | 0.088 | config | weights
kinetic-gan-mlp8 | 10 | 128 x 2 | 0.076 | 0.102 | config | weights
kinetic-gan-mlp8 | 10 | 256 x 2 | 0.081 | 0.112 | config | weights
kinetic-gan-mlp4 | 10 | 512 x 2 | 0.087 | 0.115 | config | weights
kinetic-gan-mlp4 | 10 | 1024 x 2 | 0.092 | 0.121 | config | weights
*MMD results may vary slightly due to the random normal distribution and random noise.
**Note that the MMD metric is not as stable and descriptive as FID; check the paper results and the visual quality.
You can generate your own samples with a pre-trained Kinetic-GAN, using the specified config and weights, as follows:
- Edit or use generate.py to specify the dataset the model was trained on and the remaining arguments.
- Run the generation script (check class indexes (label - 1) at NTU RGB+D Datasets):
python generate.py --model model_path --n_classes number_classes --label class_index --gen_qtd how_many_samples # Check generate.py file
- The experiments (config and samples) are written to a newly created directory `runs/kinetic-gan/exp<id>` (see the loading sketch after this list).
- Synthesising is really fast, even for huge amounts of samples (GPU recommended but not mandatory).
- To visualize your samples (`action_ntu.py` for NTU RGB+D and NTU-120 RGB+D, `action_h36m.py` for Human3.6M):
python visualization/action_ntu.py --path path_samples --labels path_labels --indexes 0 1 2 # Example for Kinetic-GAN trained on NTU or NTU-120
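If you want to quickly inspect what generate.py wrote to `exp<id>` before visualizing, a minimal NumPy check can help. The file names below are placeholders (the actual names inside `exp<id>` may differ, so adjust them to your run):

```python
# Inspect generated samples and labels (paths/file names are placeholders for your exp<id>).
import numpy as np

samples = np.load("runs/kinetic-gan/exp1/generated_samples.npy")  # hypothetical file name
labels  = np.load("runs/kinetic-gan/exp1/generated_labels.npy")   # hypothetical file name

print("samples shape:", samples.shape)   # e.g. (num_samples, coords, frames, joints, persons)
print("labels shape: ", labels.shape)
print("classes present:", np.unique(labels))
```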
You can visualize your samples (`action_ntu.py` for NTU RGB+D and NTU-120 RGB+D, `action_h36m.py` for Human3.6M) by specifying the synthetic samples and labels as follows:
python visualization/action_ntu.py --path path_samples --labels path_labels --indexes 0 1 2 # Example for Kinetic-GAN trained on NTU or NTU-120, check action_ntu.py file
Training saves 10 samples per class at each specified iteration interval. For training on NTU RGB+D, classes repeat every 60 samples, so run:
python visualization/action_ntu.py --path path_samples --indexes 26 86 146 # ... Example for `jump up` action.
python visualization/action_ntu.py --path path_samples --indexes 23 83 143 # ... Example for `kicking something` action.
python visualization/action_ntu.py --path path_samples --indexes 58 118 178 # ... Example for `walking` action.
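Since 10 samples are saved per class and, for NTU RGB+D, classes repeat every 60 samples, the indexes above can also be computed programmatically. A small sketch, assuming the 0-based class convention from generate.py (label - 1):

```python
# Compute the saved-sample indexes of one NTU RGB+D class.
# Assumes classes repeat every 60 samples and 10 samples are saved per class.

def class_indexes(class_index, n_classes=60, samples_per_class=10):
    """Return positions of `class_index` (0-based, i.e. label - 1) in the saved samples."""
    return [class_index + k * n_classes for k in range(samples_per_class)]

print(class_indexes(26)[:3])  # `jump up`           -> [26, 86, 146]
print(class_indexes(23)[:3])  # `kicking something` -> [23, 83, 143]
print(class_indexes(58)[:3])  # `walking`           -> [58, 118, 178]
```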
Blender visualization (with mesh) is used only to obtain a more appealing visualization. To access and reproduce our visualization, use specifically our blender.py with Blender 2.9+ and its bundled Python interpreter, together with the blend file. [IMPORTANT] The armature needs to start at that specific position (ctrl-z to return), and configure the rotation on the X-axis for a better visualization.
Kinetic-GAN generates up to 120 different skeleton actions and is trained on skeleton-based datasets, which do not specify bone rotations (these depend on the parent bones); this may occasionally cause poor visualizations with a mesh (check the initial gif at early iterations).
Datasets are ready to use; after downloading them from the resources above, you can train your own Kinetic-GAN networks as follows (a quick data-loading sanity check is sketched after this list):
- Edit or use kinetic-gan.py to specify the dataset, training configuration and arguments.
- Run the training script with:
python kinetic-gan.py --data_path path_train_data.npy --label_path path_train_labels.pkl --dataset ntu_or_h36m # check kinetic-gan.py file
- The experiments (files, loss, weights and samples) are written to a newly created directory `runs/kinetic-gan/exp<id>`.
- To follow the training loss, run the command below (it will save an image in the `exp<id>` directory):
python visualization/plot_loss.py --batches num_batches_per_epoch --runs kinetic-gan # check plot_loss.py file
- Training may take from 24 up to 72 hours to complete (using a GPU), depending on the configuration and dataset.
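Before launching a long training run, it can help to confirm that the dataset files load correctly. The sketch below assumes the common skeleton-dataset layout of a NumPy array for the data and a pickled `(sample_names, labels)` pair for the labels; adjust it if your files are organized differently:

```python
# Sanity-check the training data before running kinetic-gan.py.
import pickle
import numpy as np

data = np.load("path_train_data.npy", mmap_mode="r")   # memory-map to avoid loading everything
with open("path_train_labels.pkl", "rb") as f:
    sample_names, labels = pickle.load(f)               # assumed (names, labels) layout

print("data shape:", data.shape)     # e.g. (num_samples, coords, frames, joints, persons)
print("num labels:", len(labels))
print("num classes:", len(set(labels)))
```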
As in image modelling, evaluation is performed on the dataset where the generative model was trained, in order to compare the real and synthetic distributions. We strongly recommend using the FID metric in future work, since it is far more descriptive than MMD (check the paper results to see the gap between the best and worst configurations w.r.t. both metrics).
- Generate your samples (the synthetic distribution). All experiments were computed by generating 1000 samples per action class on all datasets. Check the model's config to set up your generator:
python generate.py --model model_path --n_classes number_classes --dataset ntu_or_h36m --gen_qtd 1000 # Check model configs to set up the generator
- Evaluate FID on the generated samples (saved in the last `exp<id>` directory); this can take up to 10-15 minutes (a sketch of the core computation follows this list):
python evaluation/fid-actions.py path_real_samples path_real_labels path_fake_samples path_fake_labels
- To evaluate MMD (avg - MMDa, joint - MMDs) on the generated samples (saved in the last `exp<id>` directory), do not forget to specify the dataset (check `evaluation/mmd-actions.py` for details; an MMD sketch also follows this list). This can take up to 5-10 minutes:
python evaluation/mmd-actions.py --mmd_mode avg_or_joint --data_real real_data --labels_real real_labels --data_fake fake_samples --labels_fake fake_labels
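For reference, FID compares the real and synthetic feature distributions through Gaussian statistics (mean and covariance). Below is a minimal NumPy/SciPy sketch of the Fréchet distance itself; it does not reproduce the feature extraction used in evaluation/fid-actions.py and is only an illustration of the metric:

```python
# Fréchet distance between Gaussians fitted to two feature sets (the core of FID).
import numpy as np
from scipy import linalg

def frechet_distance(feats_real, feats_fake, eps=1e-6):
    """feats_*: arrays of shape (n_samples, feat_dim)."""
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    covmean, _ = linalg.sqrtm(cov_r.dot(cov_f), disp=False)
    if not np.isfinite(covmean).all():                 # numerical fallback for singular covariances
        offset = np.eye(cov_r.shape[0]) * eps
        covmean = linalg.sqrtm((cov_r + offset).dot(cov_f + offset))
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    diff = mu_r - mu_f
    return float(diff.dot(diff) + np.trace(cov_r + cov_f - 2.0 * covmean))
```

MMD, in contrast, compares the two sample sets directly through kernel similarities. The sketch below uses a single Gaussian (RBF) kernel; the kernel bandwidths and the per-joint/averaged variants used in evaluation/mmd-actions.py may differ:

```python
# (Biased) squared MMD between two sample sets with a Gaussian (RBF) kernel.
import numpy as np

def mmd2(x, y, sigma=1.0):
    """x: (n, d) array, y: (m, d) array; returns the biased squared-MMD estimate."""
    def rbf(a, b):
        d2 = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
        return np.exp(-d2 / (2.0 * sigma**2))
    return rbf(x, x).mean() + rbf(y, y).mean() - 2.0 * rbf(x, y).mean()
```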
Coming soon. Contact me in case of urgency.
If you find this repository useful, please consider giving a star ⭐ and citation 🦖:
@inproceedings{degardin2022generative,
title={Generative Adversarial Graph Convolutional Networks for Human Action Synthesis},
author={Degardin, Bruno and Neves, Jo{\~a}o and Lopes, Vasco and Brito, Jo{\~a}o and Yaghoubi, Ehsan and Proen{\c{c}}a, Hugo},
booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
pages={1150--1159},
year={2022}
}