This mini-project estimates 2D skeleton data from 3D skeleton data, specifically 2D skeleton coordinates in the IR frames of the Kinect v2. Exact formulas exist, but we are too lazy to implement them ourselves. The motivation behind this work is that the NTU RGB+D dataset provides 2D IR skeleton data, while the PKU-MMD action dataset does not, and we need it.
Note that a similar approach would work for 2D RGB skeletons, but the IR and RGB spaces are different and would therefore require different labels (also available in the NTU RGB+D dataset).
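For context, the transformation the network learns is essentially a pinhole-camera projection from 3D camera space onto the 512×424 IR image plane. Below is a minimal sketch of that projection; the intrinsics `fx`, `fy`, `cx`, `cy` are rough placeholders, not the calibrated Kinect v2 values, which is precisely why we learn the mapping instead of hard-coding it.

```python
import numpy as np

def project_to_ir(xyz, fx=365.0, fy=365.0, cx=256.0, cy=212.0):
    """Pinhole projection of 3D camera-space points to 2D IR pixel coordinates.

    xyz: (N, 3) array of (x, y, z) positions in meters, z > 0.
    fx, fy, cx, cy: placeholder intrinsics for a 512x424 sensor, NOT calibrated values.
    """
    x, y, z = xyz[:, 0], xyz[:, 1], xyz[:, 2]
    u = fx * x / z + cx   # horizontal pixel coordinate
    v = cy - fy * y / z   # vertical pixel coordinate (image y-axis points down)
    return np.stack([u, v], axis=1)
```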
├── LICENSE
├── Makefile           <- Makefile with commands like `make create_environment` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   └── processed      <- The final, canonical data sets for modeling.
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
└── setup.py           <- Makes project pip installable (`pip install -e .`) so src can be imported.
The first step to replicating our results is to download the project and create a virtual environment using the Makefile.
The data comes from the NTU RGB+D dataset. Make sure to download the skeleton data and to create two .h5 datasets with the code from our GitHub repository: the "SKELETON" dataset and the "IR_SKELETON" dataset.
- Clone the project:
  `git clone https://github.com/gnocchiflette/3D-to-2D-skeleton-data-with-deep-learning.git`
- Create a virtual environment:
  `make create_environment`
- Install the requirements:
  `make requirements`
- Place the two .h5 datasets in the ./data/processed/ folder. This folder already contains samples_names.txt, a text file listing the names of all the samples from the NTU RGB+D dataset. A quick sanity check is sketched right after this list.
- That's it! All the code lives in the ./notebooks/ folder. A single detailed notebook acts as the main file. Feel free to rerun and modify our architecture.
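To verify the datasets are in place, something like the snippet below can be run. The file names "SKELETON.h5" and "IR_SKELETON.h5" are assumptions on our part; use whatever names the dataset-creation code actually produced.

```python
import h5py

# NOTE: the file names below are hypothetical; adjust them to match
# the output of the dataset-creation code.
for name in ("SKELETON.h5", "IR_SKELETON.h5"):
    with h5py.File(f"./data/processed/{name}", "r") as f:
        print(name, "->", len(f.keys()), "top-level entries")
```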
We split the entire NTU RGB+D dataset into 50-50 train/test sets and do not use a validation set. The reasoning is as follows: the problem is simple enough, since the MLP only has to approximate an existing transformation (see the Kinect v2 documentation), and the training data is chosen in such a way that the network is unlikely to see the same point twice.
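A 50-50 split like ours can be set up in one line; the sketch below uses scikit-learn's `train_test_split` for illustration, which is not necessarily how the notebook does it.

```python
from sklearn.model_selection import train_test_split

# Load the NTU RGB+D sample names shipped in ./data/processed/.
with open("./data/processed/samples_names.txt") as f:
    sample_names = f.read().split()

# 50-50 train/test split, no validation set held out.
train_names, test_names = train_test_split(sample_names, test_size=0.5, random_state=0)
```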
In just 5 epochs, with a network weighing just 4 kB, we are able to approximate the 2D IR skeleton coordinates with an error in the 5-10 pixel range. We provide the model for each epoch in the ./models/ folder.
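For a sense of scale, a network of that size could look like the PyTorch sketch below. The layer widths are illustrative guesses chosen to land near ~4 kB of float32 weights, not necessarily the exact architecture in the notebook.

```python
import torch.nn as nn

# Maps one 3D joint (x, y, z) to its 2D IR pixel (u, v).
# 3 -> 48 -> 16 -> 2 gives ~1,000 parameters, i.e. roughly 4 kB in float32.
model = nn.Sequential(
    nn.Linear(3, 48),
    nn.ReLU(),
    nn.Linear(48, 16),
    nn.ReLU(),
    nn.Linear(16, 2),
)
```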
Better results could probably be obtained with a better use of the data and with augmentation, but the goal of this project was to be operational quickly.
Below is the loss per epoch during training.
Project based on the cookiecutter data science project template. #cookiecutterdatascience