This commit is contained in:
Jia Guo
2018-01-24 10:51:04 +08:00
2 changed files with 273 additions and 56 deletions

228
README.md

@@ -1,72 +1,188 @@
# InsightFace
Face Recognition Project
# *InsightFace* : Implementation for paper 'Additive Angular Margin Loss for Deep Face Recognition'
Paper by Jiankang Deng, Jia Guo, and Stefanos Zafeiriou
### License
InsightFace is released under the MIT License.
### Contents
0. [Introduction](#introduction)
0. [Citation](#citation)
0. [Requirements](#requirements)
0. [Installation](#installation)
0. [Usage](#usage)
0. [Models](#models)
0. [Results](#results)
0. [Contribution](#contribution)
0. [Contact](#contact)
### Introduction
  Paper link: [here](https://arxiv.org/abs/1801.07698).
  This repository contains the entire pipeline for deep face recognition with **`InsightFace`** and other popular methods, including Softmax, Triplet Loss, SphereFace and AMSoftmax/CosineFace.
  **InsightFace** is a recently proposed face recognition method. It was initially described in an [arXiv technical report](https://arxiv.org/abs/1801.07698). With InsightFace and this repository, a single model can reach 99.80%+ on LFW and 98%+ on MegaFace.
We provide a refined MS1M dataset for training here, already packed in MXNet binary format. It lets researchers and industrial engineers develop a deep face recognizer quickly in just two steps: 1. Download the binary dataset; 2. Run the training script.
InsightFace supports several popular network backbones, selectable with a single parameter. The current list is:
- ResNet
- MobileNet
- InceptionResNetV2
- DPN
- DenseNet
In our paper, we found overlapping identities between the FaceScrub dataset and the MegaFace distractors, which greatly affects identification performance. Removing these overlaps can sometimes improve results by more than 10 percent. The list will be made public soon in this repository.
We achieve state-of-the-art identification performance in the MegaFace Challenge, at 98%+.
### Citation
If you find **InsightFace** useful in your research, please consider citing our paper.
```
@misc{insightface2018,
author = {Jiankang Deng and Jia Guo and Stefanos Zafeiriou},
title = {Additive Angular Margin Loss for Deep Face Recognition},
journal = {arXiv preprint arXiv:1801.07698},
year = {2018}
}
```
### Requirements
1. The only requirement is `MXNet` with GPU support (Python 2.7).
### Installation
1. Install MXNet by
```
pip install mxnet-cu80
```
2. Clone the InsightFace repository. We'll refer to the directory you cloned InsightFace into as **`INSIGHTFACE_ROOT`**.
```Shell
git clone --recursive https://github.com/deepinsight/insightface.git
```
### Usage
*After successfully completing the [installation](#installation)*, you are ready to run all the following experiments.
#### Part 1: Dataset Downloading.
**Note:** In this part, we assume you are in the directory **`$INSIGHTFACE_ROOT/`**
1. Download the training set (`MS1M`) from [here] and place it in **`datasets/`**. Each training dataset includes the following 7 files:
```Shell
- train.idx
- train.rec
- property
- lfw.bin
- cfp_ff.bin
- cfp_fp.bin
- agedb_30.bin
```
The first three files are the dataset itself, while the last four are binary verification sets.
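Before starting a long training run, it can help to confirm that the download is complete. The following is a minimal sketch, not part of this repository: `missing_dataset_files` is a hypothetical helper, and the directory name is assumed from the `--data-dir` value used in the training examples.

```python
import os

# File names taken from the list above.
DATASET_FILES = ("train.idx", "train.rec", "property")
VERIFICATION_FILES = ("lfw.bin", "cfp_ff.bin", "cfp_fp.bin", "agedb_30.bin")


def missing_dataset_files(data_dir):
    """Return the expected file names that are absent from data_dir."""
    expected = DATASET_FILES + VERIFICATION_FILES
    return [name for name in expected
            if not os.path.isfile(os.path.join(data_dir, name))]


if __name__ == "__main__":
    # Assumed location relative to $INSIGHTFACE_ROOT; adjust as needed.
    missing = missing_dataset_files("datasets/faces_ms1m_112x112")
    print("missing files:", missing or "none")
```

An empty result means all seven files are present; it does not verify that their contents are intact.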
#### Part 2: Train
**Note:** In this part, we assume you are in the directory **`$INSIGHTFACE_ROOT/src/`**. Before starting any training procedure, make sure you set the correct environment variables for MXNet to ensure good performance.
```
export MXNET_CPU_WORKER_NTHREADS=24
export MXNET_ENGINE_TYPE=ThreadedEnginePerDevice
```
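When training is launched from a Python wrapper rather than a shell, the same settings can be applied in code. This is a minimal sketch under the assumption that the variables need to be in place before `mxnet` is imported; `apply_mxnet_env` is a hypothetical helper, not part of this repository.

```python
import os


def apply_mxnet_env(n_threads=24):
    """Set the MXNet variables shown above, keeping values already exported."""
    settings = {
        "MXNET_CPU_WORKER_NTHREADS": str(n_threads),
        "MXNET_ENGINE_TYPE": "ThreadedEnginePerDevice",
    }
    for name, value in settings.items():
        # setdefault leaves a value exported in the shell untouched.
        os.environ.setdefault(name, value)
    return {name: os.environ[name] for name in settings}


if __name__ == "__main__":
    print(apply_mxnet_env())
    # import mxnet as mx  # import only after the variables are set
```

Using `setdefault` means a value exported in the shell takes precedence over the defaults in the script.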
Some examples follow. All our experiments were run on Tesla P40 GPUs.
1. Train our method with LResNet100E-IR.
```Shell
CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train_softmax.py --network r100 --loss-type 4 --margin-m 0.5 --data-dir ../datasets/faces_ms1m_112x112 --prefix ../model-r100
```
It outputs verification results on *LFW*, *CFP-FF*, *CFP-FP* and *AgeDB-30* every 2000 batches. All command line options are listed in **train\_softmax.py**.
This model can achieve **LFW 99.80+ and MegaFace 98.0%+**
2. Train AMSoftmax/CosineFace with LResNet50E-IR.
```Shell
CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train_softmax.py --network r50 --loss-type 2 --margin-m 0.35 --data-dir ../datasets/faces_ms1m_112x112 --prefix ../model-r50-amsoftmax
```
3. Train Softmax with LMobileNetE.
```Shell
CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train_softmax.py --network m1 --loss-type 0 --data-dir ../datasets/faces_ms1m_112x112 --prefix ../model-m1-softmax
```
4. Retrain the Softmax model above with Triplet loss.
```Shell
CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train_softmax.py --network m1 --loss-type 12 --lr 0.005 --mom 0.0 --per-batch-size 150 --data-dir ../datasets/faces_ms1m_112x112 --pretrained ../model-m1-softmax,50 --prefix ../model-m1-triplet
```
5. Train Softmax with LDPN107E.
```Shell
CUDA_VISIBLE_DEVICES='0,1,2,3,4,5,6,7' python -u train_softmax.py --network p107 --loss-type 0 --per-batch-size 64 --data-dir ../datasets/faces_vgg_112x112 --prefix ../model-p107-softmax
```
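The `--loss-type` and `--margin-m` flags in examples 1 and 2 select different margin formulations. The sketch below illustrates the two published definitions, assuming `--loss-type 4` corresponds to the paper's additive angular margin and `--loss-type 2` to the additive cosine margin of AMSoftmax/CosineFace, as the examples suggest. It is not the code in `train_softmax.py`: the function names are hypothetical, the scale `s=64.0` is an assumed value, and the special handling that real implementations need when `theta + m` exceeds pi is left out.

```python
import math


def cosine_margin_logit(cos_theta, m, s=64.0):
    """Additive cosine margin: s * (cos(theta) - m) for the target class."""
    return s * (cos_theta - m)


def angular_margin_logit(cos_theta, m, s=64.0):
    """Additive angular margin: s * cos(theta + m) for the target class."""
    # Clamp before acos to guard against rounding slightly outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    return s * math.cos(theta + m)


if __name__ == "__main__":
    cos_theta = 0.5  # cosine between a feature and its class centre
    print("no margin     :", 64.0 * cos_theta)
    print("cosine margin :", cosine_margin_logit(cos_theta, m=0.35))
    print("angular margin:", angular_margin_logit(cos_theta, m=0.5))
```

Both margins lower the target-class logit relative to the unmodified `s * cos(theta)`, so the network has to pull each feature closer to its class centre to obtain the same softmax probability.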
### How to use
#### Part 3: MegaFace Test
1. Download the pre-aligned training dataset from our data repo, a large binary file in MXNet .rec format (may be ready soon), or align your dataset yourself and pack it to avoid random access to many small files. See the scripts under src/common and src/align.
2. Run src/train_softmax.py to train your model with suitable parameters. For example, loss-type=0 means pure softmax while loss-type=1 means SphereLoss. It outputs LFW accuracy every 2000 batches and saves the model if necessary.
**Note:** In this part, we assume you are in the directory **`$INSIGHTFACE_ROOT/src/megaface/`**
### Notes
1. Align all face images of the FaceScrub dataset and the MegaFace distractors. Please check the alignment scripts under **`$INSIGHTFACE_ROOT/src/align/`**. (We may release this data later, but this is not decided.)
The default image size is 112x96 unless specified otherwise; all face images are aligned.
2. Next, generate feature files for both facescrub and megaface images.
In the ResNet setting, \_v1 means original residual units, \_v2 means pre-activation units, and \_v3 means BCBACB residual units. LResNet means the first convolution layer uses a 3x3 kernel with stride 1 instead of the common 7x7 kernel with stride 2, to preserve high image resolution. ResNet50 does not use bottleneck layers; ResNet101 and ResNeXt101 do.
```Shell
python -u gen_megaface.py
```
The last several layers offer different options for how the embedding layer is built, which may affect performance. The whole network architecture can be viewed as {ConvLayers(->GlobalPool)->EmbeddingLayer->Softmax}. The embedding size is set to 512 except for OptionA, where it is determined by the filter size of the last convolution group.
3. Remove MegaFace noise, which generates new feature files.
- Option\*X: Same with Option\* but use dropout after GP. OptionAX is the default setting for inceptions.
- OptionA: Use global pooling layer(GP). This is the default setting for all networks except inceptions.
- OptionB: Use one FC layer after GP.
- OptionC: Use FC->BN after GP.
- OptionD: Use FC->BN->PRelu after GP.
- OptionE: Use BN->Dropout->FC->BN after last conv layer.
- OptionF: Use BN->PRelu->Dropout->FC->BN after last conv layer.
```Shell
python -u remove_noises.py
```
4. Run the MegaFace development kit to produce the final result.
### Models
1. We plan to make some models public soon.
### Results
We simply report the performance of **LResNet100E-IR** network trained on **MS1M** dataset with our method.
| Method | LFW(%) | CFP-FF(%) | CFP-FP(%) | AgeDB-30(%) | MegaFace1M(%) |
| ------- | ------ | --------- | --------- | ----------- | ------------- |
| Ours | 99.80+ | 99.85+ | 94.0+ | 97.90+ | **98.0+** |
### Contribution
- PRs and third-party contributions of any kind are welcome.
### Experiments
- **Softmax only on VGG2@112x112**
| Network/Dataset | LFW | ------ | ------ |
| :--------------------: | :--------------: | :--------------: | :--------------: |
| ResNet50D_v1 | 0.99350+-0.00293 | | |
| SE-ResNet50A\_v1 | 0.99367+-0.00233 | | |
| SE-ResNet50B_v1 | 0.99200+-0.00407 | | |
| SE-ResNet50C_v1 | 0.99317+-0.00404 | | |
| SE-ResNet50D_v1 | 0.99383+-0.00259 | | |
| SE-ResNet50E\_v1 | 0.99267+-0.00343 | | |
| SE-ResNet50F\_v1 | 0.99367+-0.00194 | | |
| SE-LResNet50C_v1 | 0.99567+-0.00238 | | |
| SE-LResNet50D_v1 | 0.99600+-0.00281 | | |
| SE-LResNet50E_v1 | 0.99650+-0.00174 | - | - |
| SE-LResNet50A_v3 | 0.99583+-0.00327 | | |
| SE-LResNet50D_v3 | 0.99617+-0.00358 | - | - |
| SE-LResNet50E_v3 | 0.99767+-0.00200 | - | - |
| LResNet50E_v3 | 0.99750+-0.00250 | | |
| SE-LResNet50F_v3 | | | |
| SE-LResNet50BX_v3 | 0.99350+-0.00263 | | |
| SE-ResNet101D_v3 | 0.99517+-0.00252 | | |
| SE-ResNet101E_v3 | 0.99467+-0.00221 | | |
| SE-ResNet152E_v3 | 0.99500+-0.00307 | | |
| Inception-ResNetBX | 0.99417+-0.00375 | - | - |
| SE-Inception-ResNet-v2 | - | - | - |
| MobileNetD | 0.99150+-0.00425 | - | - |
| LMobileNetD | 0.99383+-0.00409 | - | - |
| LMobileNetE | 0.99633+-0.00314 | - | - |
| LMobileNetF | 0.99617+-0.00211 | - | - |
| LResNeXt101E_v3 | | | |
How weight decay affects:
SE-LResNet50E-v3/vggface2/softmax:
### Contact
[Jia Guo](guojia[at]gmail.com) and [Jiankang Deng](https://ibug.doc.ic.ac.uk/people/jdeng)
Questions can also be left as issues in the repository. We will be happy to answer them.

101
src/utils/benchmark.py Normal file

@@ -0,0 +1,101 @@
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import os
import sys
import datetime
import mxnet as mx
from mxnet import ndarray as nd
import random
import argparse
import cv2
import time
import sklearn
from sklearn.decomposition import PCA
from easydict import EasyDict as edict
from sklearn.cluster import DBSCAN
import numpy as np
sys.path.append(os.path.join(os.path.dirname(__file__), '..', 'common'))
import face_image

def ch_dev(arg_params, aux_params, ctx):
    new_args = dict()
    new_auxs = dict()
    for k, v in arg_params.items():
        new_args[k] = v.as_in_context(ctx)
    for k, v in aux_params.items():
        new_auxs[k] = v.as_in_context(ctx)
    return new_args, new_auxs


def main(args):
    ctx = mx.gpu(args.gpu)
    args.ctx_num = 1
    prop = face_image.load_property(args.data)
    image_size = prop.image_size
    print('image_size', image_size)
    vec = args.model.split(',')
    prefix = vec[0]
    epoch = int(vec[1])
    print('loading',prefix, epoch)
    sym, arg_params, aux_params = mx.model.load_checkpoint(prefix, epoch)
    arg_params, aux_params = ch_dev(arg_params, aux_params, ctx)
    all_layers = sym.get_internals()
    sym = all_layers['fc1_output']
    #model = mx.mod.Module.load(prefix, epoch, context = ctx)
    model = mx.mod.Module(symbol=sym, context=ctx, label_names = None)
    #model.bind(data_shapes=[('data', (args.batch_size, 3, image_size[0], image_size[1]))], label_shapes=[('softmax_label', (args.batch_size,))])
    model.bind(data_shapes=[('data', (args.batch_size, 3, image_size[0], image_size[1]))])
    model.set_params(arg_params, aux_params)
    path_imgrec = os.path.join(args.data, 'train.rec')
    path_imgidx = os.path.join(args.data, 'train.idx')
    imgrec = mx.recordio.MXIndexedRecordIO(path_imgidx, path_imgrec, 'r') # pylint: disable=redefined-variable-type
    s = imgrec.read_idx(0)
    header, _ = mx.recordio.unpack(s)
    assert header.flag>0
    print('header0 label', header.label)
    header0 = (int(header.label[0]), int(header.label[1]))
    #assert(header.flag==1)
    imgidx = range(1, int(header.label[0]))
    stat = []
    count = 0
    data = nd.zeros( (1 ,3, image_size[0], image_size[1]) )
    label = nd.zeros( (1,) )
    for idx in imgidx:
        if len(stat)%100==0:
            print('processing', len(stat))
        s = imgrec.read_idx(idx)
        header, img = mx.recordio.unpack(s)
        img = mx.image.imdecode(img)
        img = nd.transpose(img, axes=(2, 0, 1))
        data[0][:] = img
        #input_blob = np.expand_dims(img.asnumpy(), axis=0)
        #arg_params["data"] = mx.nd.array(input_blob, ctx)
        #arg_params["softmax_label"] = mx.nd.empty((1,), ctx)
        time_now = datetime.datetime.now()
        #exe = sym.bind(ctx, arg_params ,args_grad=None, grad_req="null", aux_states=aux_params)
        #exe.forward(is_train=False)
        #_embedding = exe.outputs[0].asnumpy().flatten()
        #db = mx.io.DataBatch(data=(data,), label=(label,))
        db = mx.io.DataBatch(data=(data,))
        model.forward(db, is_train=False)
        net_out = model.get_outputs()[0].asnumpy()
        time_now2 = datetime.datetime.now()
        diff = time_now2 - time_now
        stat.append(diff.total_seconds())
        if len(stat)==args.param1:
            break
    stat = stat[10:]
    print('avg infer time', np.mean(stat))


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='do network benchmark')
    # general
    parser.add_argument('--gpu', default=0, type=int, help='')
    parser.add_argument('--data', default='', type=str, help='')
    parser.add_argument('--model', default='../model/softmax,50', help='path to load model.')
    parser.add_argument('--batch-size', default=1, type=int, help='')
    parser.add_argument('--param1', default=1010, type=int, help='')
    args = parser.parse_args()
    main(args)