Final Project for the open university course 22928 - Computer Vision

This is my very first deep learning project.

Project definition:

Create an algorithm that receives an image of a single face as input and outputs its pose in rotation-vector (rotvec) form.
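Hopenet itself predicts yaw, pitch and roll, so the output has to be converted to a rotation vector at some point. The following is a minimal sketch of one such conversion, not the code used in this project. The function names are hypothetical, and the axis convention (pitch about x, yaw about y, roll about z, composed as Rz * Ry * Rx) is an assumption; the convention expected by the course benchmark may differ.

```python
import math


def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def axis_quat(axis, degrees):
    """Unit quaternion for a rotation of `degrees` about a unit `axis`."""
    half = math.radians(degrees) / 2.0
    s = math.sin(half)
    return (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)


def euler_to_rotvec(yaw, pitch, roll):
    """Convert Euler angles in degrees to a rotation vector in radians.

    ASSUMPTION: pitch rotates about x, yaw about y, roll about z,
    and the rotations are composed as Rz(roll) * Ry(yaw) * Rx(pitch).
    """
    q = quat_mul(
        axis_quat((0.0, 0.0, 1.0), roll),
        quat_mul(axis_quat((0.0, 1.0, 0.0), yaw),
                 axis_quat((1.0, 0.0, 0.0), pitch)),
    )
    w, x, y, z = q
    if w < 0.0:
        # q and -q describe the same rotation; keep the shorter one.
        w, x, y, z = -w, -x, -y, -z
    norm = math.sqrt(x * x + y * y + z * z)
    if norm < 1e-12:
        return (0.0, 0.0, 0.0)  # no rotation
    angle = 2.0 * math.atan2(norm, w)
    scale = angle / norm  # rotvec = unit axis * angle
    return (x * scale, y * scale, z * scale)


# A pure 90 degree yaw gives a vector of length pi/2 along the y axis.
print(euler_to_rotvec(90.0, 0.0, 0.0))
```

The direction of the resulting vector is the rotation axis and its length is the rotation angle in radians.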

The success criterion is agreement with the output of another algorithm, which was not provided.

Optimizing directly against an unknown algorithm (which has error of its own) was not possible, and all I had was about 100 images tagged by it. I therefore decided to train the original Hopenet on the tagged data of 300W-LP (which required cleaning) and to choose the epoch that minimizes validation error on the small set I did have, the one tagged by the unknown algorithm.
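The epoch selection described above can be sketched as follows. This is an illustration under assumptions, not the project's code: the function names are hypothetical, the error metric (mean absolute error over all pose components) is an assumed choice, and the numbers in the usage example are toy values rather than real results.

```python
def mean_abs_error(predictions, targets):
    """Mean absolute error over all components of all pose vectors."""
    total = 0.0
    count = 0
    for pred, target in zip(predictions, targets):
        for p, t in zip(pred, target):
            total += abs(p - t)
            count += 1
    return total / count


def pick_best_epoch(preds_by_epoch, targets):
    """Return the epoch whose predictions have the lowest validation error.

    preds_by_epoch maps an epoch number to the list of pose vectors that
    the snapshot saved at that epoch predicts for the validation images.
    """
    return min(
        preds_by_epoch,
        key=lambda epoch: mean_abs_error(preds_by_epoch[epoch], targets),
    )


# Toy values for illustration only.
targets = [(0.0, 0.0, 0.0), (0.1, 0.2, 0.0)]
preds_by_epoch = {
    1: [(0.5, 0.5, 0.5), (0.6, 0.7, 0.5)],
    2: [(0.1, 0.0, 0.0), (0.1, 0.3, 0.0)],
    3: [(0.3, 0.2, 0.1), (0.4, 0.0, 0.2)],
}
print(pick_best_epoch(preds_by_epoch, targets))
```

With only about 100 validation images the selected epoch is a noisy estimate, which is one reason the approach is suboptimal.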

This is suboptimal, but it was the best I could do under the time constraints.

Again - the benchmark for success in the course was not based on the data of any dataset, but on comparison to some other, unknown algorithm.

See the report at HopenetReport.pdf

Making this work required migrating the code to Python 3, finding and cleaning the data, learning GCP, and making quite a few adaptations to the validation set.

I am quite pleased with the result for a first deep learning project.

This project was very insightful overall. Next time, I hope to build at least something of my own.

The code was never meant to be used after the course finished, and maintainability was abandoned as the deadline approached.

My results

Data sets:

300W-LP: https://drive.google.com/file/d/0B7OEHD3T4eCkVGs0TkhUWFN6N1k/view?usp=sharing

AFLW2000-3D (I didn't use it): http://www.cbsr.ia.ac.cn/users/xiangyuzhu/projects/3DDFA/Database/AFLW2000-3D.zip

Original Hopenet Readme

Hopenet is an accurate and easy to use head pose estimation network. Models have been trained on the 300W-LP dataset and have been tested on real data with good qualitative performance.

For details about the method and quantitative results please check the CVPR Workshop paper.

new GoT trailer example video

new Conan-Cruise-Car example video

To use Hopenet, please install PyTorch and OpenCV (for video). I believe that's all you need apart from the usual libraries such as numpy. You need a GPU to run Hopenet (for now).

To test on a video using dlib face detections (center of head will be jumpy):

python code/test_on_video_dlib.py --snapshot PATH_OF_SNAPSHOT --face_model PATH_OF_DLIB_MODEL --video PATH_OF_VIDEO --output_string STRING_TO_APPEND_TO_OUTPUT --n_frames N_OF_FRAMES_TO_PROCESS --fps FPS_OF_SOURCE_VIDEO

To test on a video using your own face detections (we recommend using dockerface, center of head will be smoother):

python code/test_on_video_dockerface.py --snapshot PATH_OF_SNAPSHOT --video PATH_OF_VIDEO --bboxes FACE_BOUNDING_BOX_ANNOTATIONS --output_string STRING_TO_APPEND_TO_OUTPUT --n_frames N_OF_FRAMES_TO_PROCESS --fps FPS_OF_SOURCE_VIDEO

Face bounding box annotations should be in Dockerface format (n_frame x_min y_min x_max y_max confidence).
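As an illustration of that format, here is a minimal sketch of a parser for such annotation lines. It is not the parsing code in test_on_video_dockerface.py; the names `FaceBox`, `parse_bbox_line` and `group_by_frame` are hypothetical, and whitespace-separated fields are an assumption.

```python
from collections import namedtuple

# One detection: frame index, box corners in pixels, detector confidence.
FaceBox = namedtuple("FaceBox", "frame x_min y_min x_max y_max confidence")


def parse_bbox_line(line):
    """Parse one 'n_frame x_min y_min x_max y_max confidence' line."""
    parts = line.split()  # ASSUMPTION: fields are whitespace-separated
    if len(parts) != 6:
        raise ValueError("expected 6 fields, got %d: %r" % (len(parts), line))
    frame = int(float(parts[0]))  # tolerate '12' as well as '12.0'
    x_min, y_min, x_max, y_max, confidence = (float(p) for p in parts[1:])
    return FaceBox(frame, x_min, y_min, x_max, y_max, confidence)


def group_by_frame(lines):
    """Group detections by frame number, skipping blank lines."""
    boxes = {}
    for line in lines:
        if line.strip():
            box = parse_bbox_line(line)
            boxes.setdefault(box.frame, []).append(box)
    return boxes


# Toy annotation lines for illustration only.
example = [
    "1 30.5 40.0 130.5 160.0 0.98",
    "1 300.0 50.0 380.0 150.0 0.61",
    "2 32.0 41.0 131.0 161.0 0.97",
]
boxes = group_by_frame(example)
print(len(boxes[1]), len(boxes[2]))
```

Grouping by frame makes it straightforward to pick, for example, the highest-confidence box in each frame before cropping the face.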

Pre-trained models:

300W-LP, alpha 1

300W-LP, alpha 2

300W-LP, alpha 1, robust to image quality

For more information on what alpha stands for, please read the paper. The first two models are for validating the paper's results. If you are working with real data, we suggest using the last model, as it is more robust to image quality and blur and gives good results on video.

Please open an issue if you have a problem.
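As a rough orientation, and based on my reading of the paper rather than on anything stated in this README: each angle is classified into 66 bins of 3 degrees covering -99 to 99 degrees, the continuous angle is recovered as the expected value of the softmax over the bins, and alpha weights a squared-error term on that expected value against the cross-entropy term. The sketch below computes this for a single angle in plain Python; the function names are hypothetical, and it is not the training code in this repository.

```python
import math

NUM_BINS = 66          # bins per angle
BIN_WIDTH_DEG = 3.0    # each bin spans 3 degrees
MIN_ANGLE_DEG = -99.0  # lower edge of the first bin


def log_softmax(logits):
    """Numerically stable log-softmax of a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_sum for v in logits]


def expected_angle(logits):
    """Continuous angle as the softmax-weighted mean bin index."""
    probs = [math.exp(lp) for lp in log_softmax(logits)]
    mean_index = sum(i * p for i, p in enumerate(probs))
    return mean_index * BIN_WIDTH_DEG + MIN_ANGLE_DEG


def combined_loss(logits, true_angle_deg, alpha):
    """Cross-entropy on the bin plus alpha times squared angle error."""
    true_bin = int((true_angle_deg - MIN_ANGLE_DEG) // BIN_WIDTH_DEG)
    true_bin = min(max(true_bin, 0), NUM_BINS - 1)
    classification = -log_softmax(logits)[true_bin]
    regression = (expected_angle(logits) - true_angle_deg) ** 2
    return classification + alpha * regression


# With uniform logits the expected bin index is 32.5, i.e. -1.5 degrees.
uniform = [0.0] * NUM_BINS
print(expected_angle(uniform))
print(combined_loss(uniform, 0.0, alpha=1.0))
```

A larger alpha puts more weight on getting the continuous angle right, while a smaller alpha emphasizes picking the correct bin.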

Some very cool implementations of this work on other platforms by some cool people:

Gluon

MXNet

TensorFlow with Keras

A really cool lightweight version of HopeNet:

Deep Head Pose Light

If you find Hopenet useful in your research please cite:

@InProceedings{Ruiz_2018_CVPR_Workshops,
author = {Ruiz, Nataniel and Chong, Eunji and Rehg, James M.},
title = {Fine-Grained Head Pose Estimation Without Keypoints},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2018}
}

Nataniel Ruiz, Eunji Chong, James M. Rehg

Georgia Institute of Technology

