This repository implements the CPM and Hourglass models in TensorFlow. Instead of standard convolutions, the models use inverted residual modules (as introduced in MobileNet V2) for faster inference.
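To give a rough feel for why inverted residual modules can be cheaper than standard convolutions, the sketch below counts multiply-adds for both. The feature-map size, channel widths, and expansion factor are illustrative assumptions, not values taken from this repository's networks.

```python
def conv_madds(h, w, c_in, c_out, k):
    """Multiply-adds of a standard k x k convolution (stride 1, 'same' padding)."""
    return h * w * c_in * c_out * k * k


def inverted_residual_madds(h, w, c_in, c_out, k, expansion):
    """Multiply-adds of a MobileNet V2 style inverted residual block (stride 1)."""
    c_mid = c_in * expansion
    expand = h * w * c_in * c_mid       # 1x1 pointwise expansion
    depthwise = h * w * c_mid * k * k   # k x k depthwise convolution
    project = h * w * c_mid * c_out     # 1x1 linear projection
    return expand + depthwise + project


# Illustrative sizes only (assumptions, not taken from this repository).
h = w = 48
c_in = c_out = 64
standard = conv_madds(h, w, c_in, c_out, k=3)
for expansion in (1, 2, 3):
    inverted = inverted_residual_madds(h, w, c_in, c_out, k=3, expansion=expansion)
    print(expansion, standard, inverted)
```

The saving depends on the expansion factor and the channel widths: a large expansion factor at equal input and output widths can cost more than a plain 3x3 convolution, so these values are worth checking when modifying the architecture.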
Model | FLOPs | PCKh | Inference Time |
CPM | 0.5G | 93.78 | ~100 ms on Snapdragon 845, ~30 ms on iPhone X |
Hourglass | 0.5G | 91.81 | |
You can modify the network architectures to train a model with a higher PCKh.
The repository contains:
- Code for training the CPM and Hourglass models
- Android demo source code
- iOS demo source code
The GIF below was captured on a Mi Mix 2S (~10 FPS).
Download the APK of the demo.
Issues and PRs are welcome if you encounter any problems.
- Python3
- TensorFlow >= 1.4
The training dataset is available through Google Drive.
Unzipping it produces the following file structure:
# root @ ubuntu in ~/hdd/ai_challenger
$ tree -L 1 .
.
├── ai_challenger_train.json
├── ai_challenger_valid.json
├── train
└── valid
The training dataset contains only single-person images and comes from the AI Challenger competition.
- 22446 training examples
- 1500 testing examples
I converted the annotations into COCO format so that the data augmentation code from the tf-pose-estimation repository can be used.
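As a quick sanity check after unzipping, the annotation files can be inspected with the standard library. This is a minimal sketch that assumes the JSON follows the usual COCO layout with top-level `images` and `annotations` lists and an `image_id` field on each annotation; the helper name `summarize_coco` is hypothetical.

```python
import json
from collections import Counter


def summarize_coco(anno_path):
    """Return (number of images, number of annotations, max annotations per image)."""
    with open(anno_path, "r") as f:
        data = json.load(f)
    images = data.get("images", [])
    annotations = data.get("annotations", [])
    # Count how many annotations refer to each image.
    per_image = Counter(a["image_id"] for a in annotations)
    most = max(per_image.values()) if per_image else 0
    return len(images), len(annotations), most


# Example call (path taken from the file structure above):
# print(summarize_coco("/root/hdd/ai_challenger/ai_challenger_train.json"))
```

For a single-person dataset, the third value would be expected to be 1 if every image carries exactly one person annotation.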
During training, cfg files in the experiments folder are used to pass the hyper-parameters.
Below is the content of mv2_cpm.cfg.
[Train]
model: 'mv2_cpm'
checkpoint: False
datapath: '/root/hdd/ai_challenger'
imgpath: '/root/hdd/'
visible_devices: '0, 1, 2'
multiprocessing_num: 8
max_epoch: 1000
lr: '0.001'
batchsize: 5
decay_rate: 0.95
input_width: 192
input_height: 192
n_kpoints: 14
scale: 2
modelpath: '/root/hdd/trained/mv2_cpm/models'
logpath: '/root/hdd/trained/mv2_cpm/log'
num_train_samples: 20000
per_update_tensorboard_step: 500
per_saved_model_step: 2000
pred_image_on_tensorboard: True
The cfg file does not cover all of the model's parameters; some remain in network_mv2_cpm.py.
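The cfg format above can be read with Python's standard `configparser`. The following is a minimal sketch of one way to load it, assuming each value is a Python literal (quoted string, number, or boolean); it is not necessarily how `src/train.py` reads the file, and `load_train_params` is a hypothetical helper name.

```python
import ast
import configparser


def load_train_params(cfg_text):
    """Parse the [Train] section and convert each value to a Python literal."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    params = {}
    for key, raw in parser["Train"].items():
        try:
            params[key] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            params[key] = raw  # keep the raw string if it is not a literal
    return params


# A shortened copy of mv2_cpm.cfg, used here only for illustration.
sample = """
[Train]
model: 'mv2_cpm'
checkpoint: False
lr: '0.001'
batchsize: 5
visible_devices: '0, 1, 2'
"""
params = load_train_params(sample)
print(params["model"], float(params["lr"]), params["batchsize"])
```

Note that `lr` is stored as a quoted string in the cfg, so it has to be converted with `float()` before use.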
Build the Docker image with the following command:
cd training/docker
docker build -t single-pose .
or
docker pull edvardhua/single-pose
Then run the following command to train the model:
nvidia-docker run -it -d \
-v <dataset_path>:/data5 -v <training_code_path>/training:/workspace \
-p 6006:6006 -e LOG_PATH=/root/hdd/trained/mv2_cpm/log \
-e PARAMETERS_FILE=experiments/mv2_cpm.cfg edvardhua/single-pose
This also starts TensorBoard on port 6006. Make sure nvidia-docker is installed.
- Install the dependencies.
cd training
pip3 install -r requirements.txt
You also need to install cocoapi.
- Edit the parameter files in the experiments folder; they contain almost all of the hyper-parameters and other configuration needed for training. Then pass a parameter file to start training:
cd training
python3 src/train.py experiments/mv2_cpm.cfg
After 12 hours of training on 3 Nvidia 1080Ti graphics cards, the model has almost converged. Below is the corresponding plot on TensorBoard.
Run the following command to evaluate the PCKh of your model.
python3 src/benchmark.py --frozen_pb_path=hourglass/model-360000.pb \
--anno_json_path=/root/hdd/ai_challenger/ai_challenger_valid.json \
--img_path=/root/hdd \
--output_node_name=hourglass_out_3
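For readers unfamiliar with the metric, PCKh counts a predicted keypoint as correct when its distance to the ground truth is within a fraction (commonly 0.5) of the head size. The sketch below illustrates that definition for a single person; it is not the code in `src/benchmark.py`, and the way the head size is obtained from the annotations is left as an assumption (it is simply passed in).

```python
import math


def pckh(pred, gt, head_size, visible=None, alpha=0.5):
    """Fraction of keypoints whose error is at most alpha * head_size.

    pred, gt: lists of (x, y) keypoints in the same order.
    visible: optional list of booleans; keypoints marked False are skipped.
    """
    if visible is None:
        visible = [True] * len(gt)
    threshold = alpha * head_size
    correct = 0
    counted = 0
    for (px, py), (gx, gy), vis in zip(pred, gt, visible):
        if not vis:
            continue
        counted += 1
        if math.hypot(px - gx, py - gy) <= threshold:
            correct += 1
    return correct / counted if counted else 0.0


# Two keypoints, head size 10: the first is exact, the second is 10 px off.
print(pckh([(0, 0), (10, 0)], [(0, 0), (0, 0)], head_size=10))
```

A dataset-level score would average this over all evaluated people, typically expressed as a percentage.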
CPM
Hourglass
After training the model, the following commands convert it to tflite.
# Convert to frozen pb.
cd training
python3 src/gen_frozen_pb.py \
--checkpoint=<you_training_model_path>/model-xxx --output_graph=<you_output_model_path>/model-xxx.pb \
--size=192 --model=mv2_cpm_2
# If you update tensorflow to 1.9, run following command.
python3 src/gen_tflite_coreml.py \
--frozen_pb=forzen_graph.pb \
--input_node_name='image' \
--output_node_name='Convolutional_Pose_Machine/stage_5_out' \
--output_path='./' \
--type=tflite
# Convert to tflite.
# See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/docs_src/mobile/tflite/devguide.md for more information.
bazel-bin/tensorflow/contrib/lite/toco/toco \
--input_file=<you_output_model_path>/model-xxx.pb \
--output_file=<you_output_tflite_model_path>/mv2-cpm.tflite \
--input_format=TENSORFLOW_GRAPHDEF --output_format=TFLITE \
--inference_type=FLOAT \
--input_shape="1,192,192,3" \
--input_array='image' \
--output_array='Convolutional_Pose_Machine/stage_5_out'
Then, place the tflite file in android_demo/app/src/main/assets and modify the parameters in ImageClassifierFloatInception.kt.
......
......
// parameters need to modify in ImageClassifierFloatInception.kt
/**
* Create ImageClassifierFloatInception instance
*
* @param imageSizeX Get the image size along the x axis.
* @param imageSizeY Get the image size along the y axis.
* @param outputW The output width of model
* @param outputH The output height of model
* @param modelPath Get the name of the model file stored in Assets.
* @param numBytesPerChannel Get the number of bytes that is used to store a single
* color channel value.
*/
fun create(
activity: Activity,
imageSizeX: Int = 192,
imageSizeY: Int = 192,
outputW: Int = 96,
outputH: Int = 96,
modelPath: String = "mv2-cpm.tflite",
numBytesPerChannel: Int = 4
): ImageClassifierFloatInception =
ImageClassifierFloatInception(
activity,
imageSizeX,
imageSizeY,
outputW,
outputH,
modelPath,
numBytesPerChannel)
......
......
Finally, import the project into Android Studio and run it on your smartphone.
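The parameters above pair a 192x192 input with a 96x96 output, which suggests the model emits one heatmap per keypoint at half the input resolution. As a hedged illustration of how such heatmaps can be turned into keypoint coordinates, the sketch below takes the peak of each heatmap and scales it back to input pixels. Whether the demo app uses exactly this plain arg-max decoding is an assumption, and `decode_heatmaps` is a hypothetical helper.

```python
def decode_heatmaps(heatmaps, input_size=192):
    """Map each heatmap's peak to (x, y, score) in input-image pixels.

    heatmaps: list of 2D lists (rows of floats), one per keypoint.
    """
    keypoints = []
    for hm in heatmaps:
        out_h, out_w = len(hm), len(hm[0])
        best_row, best_col, best = 0, 0, hm[0][0]
        for r, row in enumerate(hm):
            for c, v in enumerate(row):
                if v > best:
                    best_row, best_col, best = r, c, v
        # Scale from heatmap grid coordinates to input-image pixels.
        x = best_col * input_size / out_w
        y = best_row * input_size / out_h
        keypoints.append((x, y, best))
    return keypoints


# Tiny 2x2 heatmap with its peak in the bottom-right cell, for illustration.
print(decode_heatmaps([[[0.0, 0.1], [0.2, 0.9]]], input_size=4))
```

With a 96x96 heatmap and a 192x192 input, each heatmap cell corresponds to a 2x2 pixel region of the input image.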
Thanks to tucan, you can now run the model on iOS.
First, convert the model into a CoreML model.
# Convert to frozen pb.
cd training
python3 src/gen_frozen_pb.py \
--checkpoint=<you_training_model_path>/model-xxx --output_graph=<you_output_model_path>/model-xxx.pb \
--size=192 --model=mv2_cpm_2
# Run the following command to get mlmodel
python3 src/gen_tflite_coreml.py \
--frozen_pb=forzen_graph.pb \
--input_node_name='image' \
--output_node_name='Convolutional_Pose_Machine/stage_5_out' \
--output_path='./' \
--type=coreml
Then, follow the instructions in PoseEstimation-CoreML.
[1] Paper of Convolutional Pose Machines
[2] Paper of Stacked Hourglass
[3] Paper of MobileNet V2
[4] Repository PoseEstimation-CoreML
[5] Repository of tf-pose-estimation
[6] Developer guide of TensorFlow Lite