This solution was one of the 4 winners of the hackathon's 1st stage.
Create a potentially viral app for the Photolab audience.
Technically, our app had to be embedded inside the Photolab app as a web view.
So we decided to create something fairly simple: what if we let users swap faces from several input photos, each with an arbitrary number of people, onto another photo of a crowd, in a fully unsupervised way? The result is then posted on Facebook and can be used to challenge friends to find someone familiar in the mixed photos.
Our web service must be easy to use, produce high-quality swaps, and do it fast.
Here is the algorithm:
- Find all faces in both the input and output photos. We use a detector based on HOG descriptors for that.
- Extract features from the detected faces with a pretrained model (any "metric learning" model for face recognition can be used, such as FaceNet). For instance, we map every face to a 128-D space. Then we measure L2 distances in this face-feature space between every face in the input photo and the faces in the destination photo. Finally, for every "input face" we find its "nearest neighbour" in the destination photo. Here is the illustration:
- After all pairs of faces are formed, we need landmarks for every face to be able to warp it while swapping. Again, we can use dlib's pretrained landmark predictor.
- Finally, do the face swapping!
Basically, we can go two ways:
- use only an affine transform on the "source" face mask (aka warp_2d in code);
- apply Delaunay triangulation to the "source" and destination face landmarks and warp each triangle of the source face (aka warp_3d in code). This approach transfers facial expressions more accurately.
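
The matching step from the list above (L2 distances between embeddings, then nearest neighbour) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `match_faces` and the toy 3-D vectors are made up for the example, and real embeddings would be 128-D vectors produced by the pretrained model.

```python
import math


def match_faces(input_embs, dest_embs):
    """For every input face, find the nearest destination face by L2 distance.

    Hypothetical helper: returns a list of (input_idx, dest_idx, distance).
    """
    if not dest_embs:
        return []
    pairs = []
    for i, emb in enumerate(input_embs):
        # L2 distance from this input face to every destination face
        dists = [math.dist(emb, d) for d in dest_embs]
        j = min(range(len(dists)), key=dists.__getitem__)
        pairs.append((i, j, dists[j]))
    return pairs


# Toy 3-D "embeddings" standing in for real 128-D face features
input_embs = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
dest_embs = [(0.9, 1.0, 1.1), (0.1, 0.0, -0.1), (5.0, 5.0, 5.0)]
pairs = match_faces(input_embs, dest_embs)
```

Note that in this sketch two input faces may pick the same destination face; how such collisions should be resolved is left open here.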
In our case, a simple heuristic works just fine: do the triangulation only if the detected face bbox is big enough to fill at least a fraction `k` of the image height, i.e. `k <= bbox_h/img_h` (where `k` is a given value).
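
As a rough sketch, the heuristic boils down to a single comparison. The function name `choose_warp` and the default `k=0.2` are assumptions made for illustration; only the condition `k <= bbox_h/img_h` and the names warp_2d / warp_3d come from the description above.

```python
def choose_warp(bbox_h, img_h, k=0.2):
    """Pick the warping mode from the relative height of the face bbox.

    Hypothetical helper; k=0.2 is an illustrative default, not the project's value.
    """
    if img_h <= 0:
        raise ValueError("img_h must be positive")
    # Triangulation (warp_3d) only when the face fills at least k of the image height
    return "warp_3d" if k <= bbox_h / img_h else "warp_2d"


big_face = choose_warp(bbox_h=300, img_h=1000)    # face takes 30% of the height
small_face = choose_warp(bbox_h=50, img_h=1000)   # face takes 5% of the height
```

Small faces therefore get the cheaper affine warp, where the extra accuracy of per-triangle warping would hardly be visible anyway.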
To speed up inference, the detected faces are split into `n_jobs` groups so that similarity calculation and face swapping run in `n_jobs` independent parallel processes (where `n_jobs` is, again, a given value).
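
One possible way to form the groups is sketched below. The helper name `split_into_groups` and the near-equal chunking strategy are assumptions; the description above only says that faces are split into `n_jobs` groups.

```python
def split_into_groups(items, n_jobs):
    """Split items into at most n_jobs contiguous groups of near-equal size.

    Hypothetical helper, shown only to illustrate the idea.
    """
    if n_jobs < 1:
        raise ValueError("n_jobs must be >= 1")
    if not items:
        return []
    n_groups = min(n_jobs, len(items))  # never create empty groups
    base, extra = divmod(len(items), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)  # first `extra` groups get one more item
        groups.append(items[start:start + size])
        start += size
    return groups


face_ids = list(range(7))  # stand-ins for detected faces
groups = split_into_groups(face_ids, n_jobs=3)
```

Each group could then be handed to its own worker process, for example through `multiprocessing.Pool`.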
The user starts by choosing a couple of photos of themselves and/or their friend(s), plus the photo where the input faces should be placed:
After ~2 seconds they get the result and can share it on various social platforms:
As you can see, we swapped the most similar faces and did it pretty well.
Check out more examples with me and Elon:
- Install Docker.
- Pull the latest image version to your machine from Docker Hub:
docker pull gasparjan/photolab_hack:latest
- AWS S3 is used to store uploaded images, so you need to configure access to S3 on the host machine:
- go to the .aws folder:
cd ~/.aws
- create a config file and fill it with the user name:
[profile %USER_NAME%]
- create a credentials file and fill it with the secret keys:
[%USER_NAME%]
aws_access_key_id = ""
aws_secret_access_key = ""
- Run docker:
docker run --rm -it -v /tmp:/tmp -v /root/.aws:/root/.aws \
-p 8080:8000 --ipc=host gasparjan/photolab_hack:latest
or go to the project folder and run the bash script:
cd ~/photolab_hack
./run_docker.sh
The server starts automatically.