This repository provides an analysis-by-synthesis framework to fit a textured FLAME model to an image. FLAME is a lightweight generic 3D head model learned from over 33,000 head scans, but it does not come with an appearance space (see the scientific publication for details).
Variations of the texture space for the first five principal components. Each column shows the variation for ±2 standard deviations along one axis.
This repository
- describes how to build a texture space for FLAME from in-the-wild images, and provides
- code to fit a textured FLAME model to in-the-wild images, optimizing for FLAME's parameters, appearance, and lighting, and
- code to optimize for the FLAME texture to match an in-the-wild image.
The FLAME model and the texture space can be downloaded from the FLAME project website. You need to sign up and agree to the license for access.
The demos will be released soon.
The goal is to build a texture space from in-the-wild images in order to cover a large range of ethnicities, age groups, etc. We therefore randomly select 1500 images from the FFHQ dataset to build the texture space. This is done in the following steps:
1. Initialization
Building a texture space from in-the-wild images is a chicken-and-egg problem. Given a texture space, one can fit the 3D model to images in an analysis-by-synthesis fashion, and these fits can in turn be used to build a texture space. To get an initial texture space, we fit FLAME to the Basel Face Model (BFM) template and project the BFM vertex colors onto the FLAME mesh, which gives an initial texture basis.
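As a rough illustration of the color projection in the initialization step, the sketch below copies colors from the nearest source vertex once both meshes are aligned. This is a simplifying assumption and not necessarily how the projection was done for the released texture space. The function name `transfer_vertex_colors` and its arguments are hypothetical.

```python
import numpy as np

def transfer_vertex_colors(src_vertices, src_colors, dst_vertices, chunk=256):
    """Copy the color of the nearest source vertex to each destination vertex.

    src_vertices: (S, 3) aligned source mesh vertices (e.g. the BFM template).
    src_colors:   (S, 3) per-vertex colors of the source mesh.
    dst_vertices: (D, 3) destination mesh vertices (e.g. the fitted FLAME mesh).
    Assumption: nearest-vertex lookup stands in for a proper surface projection.
    """
    out = np.empty((len(dst_vertices), src_colors.shape[1]), dtype=src_colors.dtype)
    # Process destination vertices in chunks to keep the distance matrix small.
    for start in range(0, len(dst_vertices), chunk):
        block = dst_vertices[start:start + chunk]
        d2 = ((block[:, None, :] - src_vertices[None, :, :]) ** 2).sum(axis=2)
        out[start:start + chunk] = src_colors[d2.argmin(axis=1)]
    return out
```

Applying this to the vertex colors of each BFM basis vector would yield a per-vertex basis on the FLAME topology, under the alignment assumption above.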
2. Model fitting
We then fit FLAME to the FFHQ images, optimizing for the FLAME shape, pose, and expression parameters, the parameters of the initial texture space, the parameters for Spherical Harmonics (SH) lighting (we optimize for only 9 SH coefficients, shared across all three color channels), and a texture offset to capture texture details deviating from the initial texture space. The fitting minimizes a landmark loss, a photometric loss, and several regularizers for shape, pose, expression, appearance, and the texture offset.
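To make the lighting parameterization concrete, here is a minimal NumPy sketch of shading with 9 SH coefficients shared across the color channels. The basis ordering and constants follow the common real SH convention for bands 0 to 2; the convention used in the provided code may differ, and the names `sh_basis` and `shade` are hypothetical.

```python
import numpy as np

def sh_basis(normals):
    """Evaluate the 9 real SH basis functions (bands 0-2) at unit normals (N, 3)."""
    x, y, z = normals[:, 0], normals[:, 1], normals[:, 2]
    return np.stack([
        0.282095 * np.ones_like(x),      # band 0
        0.488603 * y,                    # band 1
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,                # band 2
        1.092548 * y * z,
        0.315392 * (3.0 * z ** 2 - 1.0),
        1.092548 * x * z,
        0.546274 * (x ** 2 - y ** 2),
    ], axis=1)

def shade(albedo, normals, sh_coeffs):
    """Multiply albedo (N, 3) by a scalar shading term per point.

    sh_coeffs has shape (9,) and is shared by R, G, and B, so the
    lighting changes intensity but not color.
    """
    shading = sh_basis(normals) @ sh_coeffs   # (N,)
    return albedo * shading[:, None]
```

Sharing the coefficients across channels keeps the lighting monochrome, so color variation has to be explained by the texture parameters and the texture offset.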
The landmark loss minimizes the difference between the landmarks projected from the face model's surface and the predicted 2D landmarks (predicted using the FAN landmark predictor). The photometric loss is computed for the skin region only (provided by the face segmentation network) to gain robustness to partial occlusions. See the provided code for details on how to fit a textured FLAME model to an image.
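The two data terms can be sketched as follows. This is an illustrative NumPy version only: the exact norms, weights, and normalization in the provided code may differ, and the function names are hypothetical.

```python
import numpy as np

def landmark_loss(projected_lmks, predicted_lmks):
    """Mean squared 2D distance between projected model landmarks and
    detected landmarks; both arrays have shape (K, 2)."""
    return float(np.mean(np.sum((projected_lmks - predicted_lmks) ** 2, axis=1)))

def masked_photometric_loss(rendered, target, skin_mask):
    """Mean absolute color difference restricted to skin pixels.

    rendered, target: (H, W, 3) images; skin_mask: (H, W) with 1 for skin.
    Pixels outside the mask (hair, glasses, occluders) do not contribute.
    """
    mask = skin_mask.astype(np.float64)[..., None]
    diff = np.abs(rendered - target) * mask
    # Normalize by the number of masked color values; guard against empty masks.
    return float(diff.sum() / max(mask.sum() * rendered.shape[-1], 1.0))
```

Restricting the photometric term to the skin mask is what keeps occluders from pulling the texture and lighting parameters toward wrong colors.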
3. Texture completion
After fitting, the computed texture offsets capture, for each image, the facial appearance of the non-occluded skin region. To complete the texture maps, we train an inpainting network adapted from GMCNN (across all texture maps) in a supervised manner by adding random strokes (i.e. strokes of random size and location) in the visible face region (visibility obtained from the fitted reconstruction) and learning to inpaint these strokes. Once trained, we inpaint all missing regions with the resulting inpainting network.
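The stroke-based supervision can be sketched as below: strokes are drawn only where the texture is known, so the removed pixels serve as ground truth. The stroke shape (straight segments of random width), all parameter defaults, and the function names are assumptions for illustration; the actual training code may generate strokes differently.

```python
import numpy as np

def random_stroke_mask(visible, num_strokes=4, max_width=6, max_length=40, rng=None):
    """Draw random thick line strokes, restricted to the visible region.

    visible: (H, W) boolean visibility mask of the texture map.
    Returns an (H, W) boolean mask that is True on stroke pixels.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = visible.shape
    ys, xs = np.nonzero(visible)
    strokes = np.zeros((h, w), dtype=bool)
    if len(ys) == 0:
        return strokes
    yy, xx = np.mgrid[0:h, 0:w]
    for _ in range(num_strokes):
        i = rng.integers(len(ys))                 # random start inside the visible region
        y0, x0 = float(ys[i]), float(xs[i])
        angle = rng.uniform(0.0, 2.0 * np.pi)
        length = rng.uniform(5.0, max_length)
        radius = rng.uniform(1.0, max_width)
        y1, x1 = y0 + length * np.sin(angle), x0 + length * np.cos(angle)
        # Stamp discs along the segment to get a stroke of the chosen width.
        for t in np.linspace(0.0, 1.0, int(length) + 1):
            cy, cx = y0 + t * (y1 - y0), x0 + t * (x1 - x0)
            strokes |= (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    return strokes & visible

def make_inpainting_pair(texture, visible, rng=None):
    """Return (corrupted input, stroke mask, target) for one training sample."""
    strokes = random_stroke_mask(visible, rng=rng)
    corrupted = texture.copy()
    corrupted[strokes] = 0.0                      # erase the known pixels under the strokes
    return corrupted, strokes, texture
```

At test time, the same network would receive the invisibility mask instead of synthetic strokes.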
4. Texture space computation
After completing these 1500 texture maps, we use principal component analysis (PCA) to compute a texture space.
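A minimal PCA sketch over flattened texture maps is shown below, together with a helper that samples the space in units of standard deviations (as in the ±2 standard deviation figure). The array layout and the names `build_texture_space` and `sample_texture` are assumptions; the released texture model may store its mean and basis differently.

```python
import numpy as np

def build_texture_space(textures, num_components):
    """PCA over texture maps of shape (N, H, W, 3).

    Returns the mean (D,), the principal components (k, D) with
    orthonormal rows, and the per-component standard deviations (k,).
    """
    n = textures.shape[0]
    data = textures.reshape(n, -1).astype(np.float64)
    mean = data.mean(axis=0)
    # SVD of the centered data avoids forming the (D, D) covariance matrix.
    _, s, vt = np.linalg.svd(data - mean, full_matrices=False)
    std = s[:num_components] / np.sqrt(max(n - 1, 1))
    return mean, vt[:num_components], std

def sample_texture(mean, basis, std, coeffs, shape):
    """Reconstruct a texture from coefficients given in units of std."""
    flat = mean + (np.asarray(coeffs) * std) @ basis
    return flat.reshape(shape)
```

For example, `sample_texture(mean, basis, std, [2, 0, 0, 0, 0], (H, W, 3))` would show the first component at +2 standard deviations, given five retained components.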
The single-image photometric fitting demo is implemented and tested in a conda environment with PyTorch 1.5 and PyTorch3D 0.2 on Python 3.8.3. For better CUDA support, we recommend installing PyTorch3D 0.2 via conda:
conda create -n pytorch3d python=3.8
conda activate pytorch3d
conda install -c pytorch pytorch=1.5.0 torchvision cudatoolkit=10.2
conda install -c conda-forge -c fvcore fvcore
conda install pytorch3d -c pytorch3d
ATTENTION: The pip and conda packages of PyTorch3D have different dependencies; please follow the PyTorch3D installation guide.
Run the demo with a specified FFHQ image name and computing device:
python demos/photometric_fitting.py 00000 cuda
Run the fitting on a custom image:
python demos/wj_fitting.py FFHQ/00000.png cuda
Reconstruct a face and drive its expression from a video:
python demos/exp_with_texture.py video.mp4 cuda
Run expression transfer:
python demos/transfer_exp.py video.mp4 basic_model.npy cuda
Facial landmarks: face-alignment
Face segmentation: face-parsing.PyTorch
The related model can be downloaded from Google Cloud or Baidu Yun (code: 1emq).
Another simple demo to sample the texture space can be found here.
The code is available for non-commercial scientific research purposes. The texture model is available under the Creative Commons BY-NC-SA 4.0 license. For details, see the Texture license.
We use FLAME.py from FLAME_PyTorch.
When using this code or the texture model in a scientific publication, please cite this GitHub repository and the FFHQ dataset. When using the FLAME geometry model, please cite the model (the up-to-date BibTeX can be found here).
For questions regarding the provided fitting code, please contact haiwen.feng@tuebingen.mpg.de. For FLAME-related questions, please contact flame@tuebingen.mpg.de.