The purpose of this repository is to explore facial expression recognition with deep learning.

Requirements: Python 2.7, PyTorch 0.3.0, GTX 1080.

- VGG Face finetuned for image-based expression recognition
- VGG+GRU for video-based expression recognition
- ResNet+GRU for video-based expression recognition
## VGG Face
We fine-tune VGG Face on the FER2013 dataset for classification. First, we converted the Caffe model to a PyTorch model with a conversion tool; we provide the converted PyTorch model. The FER2013 database consists of 35,887 images: 28,709 for training, 3,589 for the public test set, and 3,589 for the private test set. We use the training data and the public test data for training, and evaluate model performance on the private test data. We also provide the processed FER2013 dataset.
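As a quick sanity check on the split described above (a minimal sketch; the counts are the FER2013 split sizes stated in this README):

```python
# FER2013 split sizes: training, public test, private test
train, public_test, private_test = 28709, 3589, 3589

total = train + public_test + private_test
print(total)                  # 35887 images in all
print(train + public_test)    # 32298 images used for training here
```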
Usage:

First, download the FER2013 dataset (unzip it) and the PyTorch model (VGG_Face_torch.pth), and save them to the folder VGG_Finetune. Your directory should look like this:
```
VGG_Finetune
│  train.py
│  VGG_Face_torch.py
│  VGG_Face_torch.pth
│
├───train
│   │  0
│   │  1
│
└───test
    │  0
    │  1
```
Then run `python train.py` to begin training. Please note: if you run into memory problems, reduce the batch size.

You can also fine-tune on your own data: the folders 0~6 under train and test correspond to the different expressions. You can also use this code to train other classification problems.
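Preparing your own data just means mirroring that layout. A minimal sketch (the temporary root here is illustrative; only the train/test split and the 0~6 label folders matter):

```python
import os
import tempfile

# Build the train/ and test/ layout with one subfolder per expression
# label 0~6, as expected by train.py. The root path is illustrative.
root = tempfile.mkdtemp()
for split in ("train", "test"):
    for label in range(7):
        os.makedirs(os.path.join(root, "VGG_Finetune", split, str(label)))

created = sorted(os.listdir(os.path.join(root, "VGG_Finetune", "train")))
print(created)   # ['0', '1', '2', '3', '4', '5', '6']
```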
After training and evaluation, you should get 71.64% precision.
## VGG+GRU
We propose this model especially for the EmotiW audio-video challenge. It can also handle other video classification problems, such as action classification; just arrange your folders as we do.
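The idea behind the model is to run a CNN over each frame and feed the per-frame features to a GRU, classifying from the last hidden state. The following is a minimal stand-in, not the repository's VGG_gru.py: the tiny CNN here replaces the real VGG-Face features, and all layer sizes are illustrative.

```python
import torch
import torch.nn as nn

class CnnGru(nn.Module):
    """Sketch of CNN-per-frame + GRU video classifier (sizes illustrative)."""
    def __init__(self, feat_dim=32, hidden=16, num_classes=7):
        super().__init__()
        # Stand-in for VGG-Face conv features; real code would load VGG_Face_torch.pth.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 8, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(8, feat_dim)
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, clips):                      # clips: (B, T, 3, H, W)
        b, t = clips.shape[:2]
        x = clips.reshape(b * t, *clips.shape[2:]) # fold time into batch
        f = self.proj(self.cnn(x).flatten(1))      # (B*T, feat_dim)
        _, h = self.gru(f.reshape(b, t, -1))       # h: (layers, B, hidden)
        return self.fc(h[-1])                      # (B, num_classes)

model = CnnGru()
logits = model(torch.randn(2, 16, 3, 48, 48))      # 2 clips of 16 frames
print(tuple(logits.shape))                         # (2, 7)
```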
Usage:

First, edit lines 10~11 of getTraintest_Emotiw.py in TrainTestlist to set the video path. Your videos should be organized like this:
```
Emotiw-faces
│
├───Angry
│
├───Disgust
│   │
│   └───002138680
│          images0001.jpg
```
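Scanning that layout to collect (class, video, frame count) entries, the way a train/test list script might, can be sketched like this (the folder names match the layout above; the temporary root and the walk itself are illustrative, not the repository's getTraintest_Emotiw.py):

```python
import os
import tempfile

# Recreate a tiny copy of the layout shown above, then walk it.
root = os.path.join(tempfile.mkdtemp(), "Emotiw-faces")
os.makedirs(os.path.join(root, "Disgust", "002138680"))
open(os.path.join(root, "Disgust", "002138680", "images0001.jpg"), "w").close()

samples = []
for cls in sorted(os.listdir(root)):                    # emotion class folders
    for video in sorted(os.listdir(os.path.join(root, cls))):
        frames = sorted(os.listdir(os.path.join(root, cls, video)))
        samples.append((cls, video, len(frames)))

print(samples)   # [('Disgust', '002138680', 1)]
```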
If you don't train on seven classes, you need to change classInd.txt accordingly and adjust the size of the fully connected layer in VGG_gru.py.
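When changing the number of classes, the class list and the classifier head must stay in sync. A sketch, assuming classInd.txt uses one `<index> <name>` pair per line (this line format is an assumption; check the file shipped with the repository):

```python
import os
import tempfile

# Write a hypothetical classInd.txt for a 3-class setup; the "<index> <name>"
# line format is an assumption about this repository's file.
classes = ["Angry", "Happy", "Neutral"]
path = os.path.join(tempfile.mkdtemp(), "classInd.txt")
with open(path, "w") as f:
    for i, name in enumerate(classes):
        f.write("%d %s\n" % (i, name))

num_classes = len(classes)   # the final fully connected layer in VGG_gru.py
print(num_classes)           # must output this many logits: 3
```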
To run this code, you also need the pretrained model best7164.model; you can also obtain this model by training on the FER2013 dataset as above. Then run `python train.py` to begin training.
Performance:

All models are trained and tested on the EmotiW 2018 train and validation (only 379 videos) partitions.

Model | VGG+LSTM (lr 1e-4) | VGG+LSTM | VGG+LSTM cell (dropout 0.8) | VGG+GRU |
---|---|---|---|---|
precision | 47.76% | 49.08% | 49.87% | 51.19% |
## ResNet+GRU
We also propose this model for the EmotiW challenge. This ResNet is not the original ResNet, but a variant initially proposed for face recognition, so a model pretrained on a face dataset is available. It was first used with the center loss; we adapt it for facial expression recognition by adding batch normalization, convolutional dropout, and shortcut connections.
To run this code, you also need the pretrained model best_resnet.model. Then change getTraintest_Emotiw.py as for VGG+GRU. Lastly, run `python train.py` to begin training.
Performance:

Model | ResNet+GRU (lr 1e-4) | ResNet+GRU | ResNet+GRU+conv drop |
---|---|---|---|
precision | 46.97% | 48.29% | 50.13% |