RCR landmark detection training

Patrik Huber edited this page Jul 30, 2015 · 11 revisions

Guide to training an RCR model for facial landmark detection

This page explains how to use the app rcr-train to train a landmark detection model.

You will need:

  • LFPW with .pts landmarks from http://ibug.doc.ic.ac.uk/resources/facial-point-annotations/
  • the OpenCV face detector (haarcascade_frontalface_alt2.xml)
  • the mean from superviseddescent/examples/data/mean_ibug_lfpw_68.txt
  • two configs, superviseddescent/examples/data/rcr_training_22.cfg and superviseddescent/examples/data/rcr_eval.cfg.

To train a model, just run rcr-train with all the arguments (adjust the paths!):

$ rcr-train -d data/iBug_lfpw/trainset -m superviseddescent/examples/data/mean_ibug_lfpw_68.txt -f haarcascade_frontalface_alt2.xml -c rcr_training_22.cfg -e rcr_eval.cfg -t data/iBug_lfpw/testset -o out.bin

This will produce a .bin file that is (nearly) identical to the model in the repo. It is only "nearly" identical because there is a little randomness involved in the perturbations. The model should reach an accuracy of about 0.037 (error normalised by the inter-eye distance, IED) on LFPW-test. The model from our paper achieves around 0.031; there are a couple of improvements still to do in this library.
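For reference, the accuracy figure above is the mean point-to-point Euclidean error over all landmarks, divided by the inter-eye distance. A minimal sketch of that metric (not the library's actual evaluation code; the `Point` struct and function name are illustrative):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { float x, y; };

// Mean point-to-point Euclidean error over all landmarks, normalised by
// the inter-eye distance (IED). Sketch of the metric described above.
float normalised_error(const std::vector<Point>& predicted,
                       const std::vector<Point>& groundtruth,
                       const Point& right_eye, const Point& left_eye)
{
    const float ied = std::hypot(left_eye.x - right_eye.x,
                                 left_eye.y - right_eye.y);
    float sum = 0.0f;
    for (std::size_t i = 0; i < predicted.size(); ++i) {
        sum += std::hypot(predicted[i].x - groundtruth[i].x,
                          predicted[i].y - groundtruth[i].y);
    }
    return (sum / predicted.size()) / ied;
}
```

For example, a prediction that is 3 pixels off at every landmark on a face with an IED of 100 pixels yields an error of 0.03.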

Specifying which landmarks to train

In the file rcr_training_22.cfg, you can freely change the landmark IDs you want to train (they correspond to the indices found on the iBug homepage). The indices will be stored in the model and can later be used in rcr-detect.

Be sure to always include landmarks 37, 40, 43 and 46 when you train the model. They are used to calculate the inter-eye distance, which normalises the model update (and the test error). You can also edit rcr_eval.cfg and specify a list of landmarks for each eye (see rightEye, leftEye); the average of these will be used as the eye centre. However, modifying this file is not thoroughly tested yet.
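The averaging described above can be sketched as follows. This is an illustration of the idea, assuming each eye centre is simply the mean of the landmark positions listed for that eye in rcr_eval.cfg; the library's own implementation may differ in detail:

```cpp
#include <cmath>
#include <vector>

struct Point { float x, y; };

// Eye centre as the mean of the landmark positions listed for that eye
// (e.g. under rightEye / leftEye in rcr_eval.cfg).
Point eye_centre(const std::vector<Point>& eye_landmarks)
{
    Point c{0.0f, 0.0f};
    for (const auto& p : eye_landmarks) {
        c.x += p.x;
        c.y += p.y;
    }
    c.x /= eye_landmarks.size();
    c.y /= eye_landmarks.size();
    return c;
}

// Inter-eye distance between the two averaged eye centres.
float inter_eye_distance(const std::vector<Point>& right_eye,
                         const std::vector<Point>& left_eye)
{
    const Point r = eye_centre(right_eye);
    const Point l = eye_centre(left_eye);
    return std::hypot(l.x - r.x, l.y - r.y);
}
```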

Update: Make sure to also include landmark 58; it is needed before training by check_face(), which verifies that what the face detector found on the training (and test) images is actually the annotated face. rcr-train runs without error or warning if you leave it out, but some of the images it then learns from might contain garbage (the face detector is not perfect!).
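As a rough illustration of the idea behind such a check (hypothetical sketch only; the actual logic of check_face() in rcr-train may be more involved): a detection is plausible if a reference ground-truth landmark, such as ID 58 on the lower lip, falls inside the detected face box.

```cpp
struct Rect  { float x, y, width, height; };
struct Point { float x, y; };

// Accept a face detection only if the reference ground-truth landmark
// lies inside the detected box. Hypothetical sketch of the idea behind
// check_face(), not the library's actual implementation.
bool detection_covers_landmark(const Rect& box, const Point& landmark)
{
    return landmark.x >= box.x && landmark.x <= box.x + box.width &&
           landmark.y >= box.y && landmark.y <= box.y + box.height;
}
```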

Training time and some remarks

My CMake scripts should automatically add /openmp in Visual Studio and -fopenmp on Linux, so training should be fairly quick. If you're using clang, I think you need to specify an OpenMP runtime (e.g. -fopenmp=libomp) when building rcr-train.

The following are some reference numbers for training time and memory consumption on Windows with Visual Studio 2013/2015 (CPU is an i7-4700MQ):

| num landmarks | training time | RAM usage |
| ------------- | ------------- | --------- |
| 22            | 5 minutes     | <1 GB     |
| 42            | 15 minutes    | <1 GB     |
| 68            | ~45 minutes   | <2 GB     |