Merge branch 'master' of https://github.com/deepinsight/insightface

2026-05-17 14:26:08 +00:00 · 2018-01-24 10:51:04 +08:00
parent a21b9e0428 17d1ab11e7
commit b39f3e7c04
2 changed files with 273 additions and 56 deletions
--- a/README.md
+++ b/README.md
@@ -1,72 +1,188 @@
-# InsightFace
-Face Recognition Project
+
+# *InsightFace*: Implementation of the paper 'Additive Angular Margin Loss for Deep Face Recognition'
+
+  Paper by Jiankang Deng, Jia Guo, and Stefanos Zafeiriou
+
+### License
+
+   InsightFace is released under the MIT License.
+
+### Contents
+0. [Introduction](#introduction)
+0. [Citation](#citation)
+0. [Requirements](#requirements)
+0. [Installation](#installation)
+0. [Usage](#usage)
+0. [Models](#models)
+0. [Results](#results)
+0. [Contribution](#contribution)
+0. [Contact](#contact)
+
+
+### Introduction
+
+   Paper link: [here](https://arxiv.org/abs/1801.07698). 
+   
+   This repository contains the entire pipeline for deep face recognition with **`InsightFace`** and other popular methods, including Softmax, Triplet Loss, SphereFace and AMSoftmax/CosineFace.
+
+   **InsightFace** is a recently proposed face recognition method. It was initially described in an [arXiv technical report](https://arxiv.org/abs/1801.07698). With InsightFace and this repository, you can reach LFW 99.80%+ and MegaFace 98%+ with a single model.
+
+   We provide a refined MS1M dataset for training here, already packed in MXNet binary format. It lets researchers and industrial engineers develop a deep face recognizer quickly in only two steps: 1. Download the binary dataset; 2. Run the training script.
+
+   InsightFace supports several popular network backbones, each selectable with a single parameter. The list so far:
+
+- ResNet
+
+- MobileNet
+
+- InceptionResNetV2
+
+- DPN
+
+- DenseNet
+
+ In our paper, we found overlapping identities between the FaceScrub dataset and the MegaFace distractors, which greatly affect identification performance. Removing these overlaps can sometimes improve results by more than 10 percent. The list of overlaps will be made public soon in this repository.
+
+
+   We achieve state-of-the-art identification performance in the MegaFace Challenge, at 98%+.
+
+
+### Citation
+
+   If you find **InsightFace** useful in your research, please consider citing our paper.
+   
+```
+@misc{insightface2018,
+  author =       {Jiankang Deng and Jia Guo and Stefanos Zafeiriou},
+  title =        {Additive Angular Margin Loss for Deep Face Recognition},
+  journal =      {arXiv preprint arXiv:1801.07698},
+  year =         {2018}
+}
+```
+
+### Requirements
+   1. The only requirement is `MXNet` with GPU support (Python 2.7).
+
+### Installation
+   1. Install MXNet with GPU support via pip:
+
+       ```
+       pip install mxnet-cu80
+       ```
+
+     
+
+   2. Clone the InsightFace repository. We'll refer to the directory into which you cloned InsightFace as **`INSIGHTFACE_ROOT`**.
+
+       ```Shell
+       git clone --recursive https://github.com/deepinsight/insightface.git
+       ```
+      
+
+
+### Usage
+
+   *After successfully completing the [installation](#installation)*, you are ready to run all the following experiments.
+
+   #### Part 1: Dataset Downloading.
+   **Note:** In this part, we assume you are in the directory **`$INSIGHTFACE_ROOT/`**
+   1. Download the training set (`MS1M`) from [here] and place it in **`datasets/`**. Each training dataset includes the following 7 files:
+
+      ```Shell
+      	- train.idx
+      	- train.rec
+      	- property
+      	- lfw.bin
+      	- cfp_ff.bin
+      	- cfp_fp.bin
+      	- agedb_30.bin
+      ```
+       The first three files are the dataset itself, while the last four are binary verification sets.
+
+   #### Part 2: Train
+   **Note:** In this part, we assume you are in the directory **`$INSIGHTFACE_ROOT/src/`**. Before starting any training procedure, make sure you set the correct environment variables for MXNet to get good performance.
+
+```
+export MXNET_CPU_WORKER_NTHREADS=24
+export MXNET_ENGINE_TYPE=ThreadedEnginePerDevice
+```
+
+ Some examples follow. All our experiments were run on Tesla P40 GPUs.
+
+   1. Train our method with LResNet100E-IR.
+
+      ```Shell
+      CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train_softmax.py --network r100 --loss-type 4 --margin-m 0.5 --data-dir ../datasets/faces_ms1m_112x112  --prefix ../model-r100
+      ```
+      It will output verification results of *LFW*, *CFP-FF*, *CFP-FP* and *AgeDB-30* every 2000 batches. You can check all command line options in **train\_softmax.py**.
+
+      This model can achieve **LFW 99.80+ and MegaFace 98.0%+**.
+
+   2. Train AMSoftmax/CosineFace with LResNet50E-IR.
+
+      ```Shell
+      CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train_softmax.py --network r50 --loss-type 2 --margin-m 0.35 --data-dir ../datasets/faces_ms1m_112x112 --prefix ../model-r50-amsoftmax
+      ```
+
+   3. Train Softmax with LMobileNetE.
+
+      ```Shell
+      CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train_softmax.py --network m1 --loss-type 0 --data-dir ../datasets/faces_ms1m_112x112 --prefix ../model-m1-softmax
+      ```
+
+   4. Re-train with Triplet loss on the above Softmax model.
+
+      ```Shell
+      CUDA_VISIBLE_DEVICES='0,1,2,3' python -u train_softmax.py --network m1 --loss-type 12 --lr 0.005 --mom 0.0 --per-batch-size 150 --data-dir ../datasets/faces_ms1m_112x112 --pretrained ../model-m1-softmax,50 --prefix ../model-m1-triplet
+      ```
+
+   5. Train Softmax with LDPN107E.
+
+      ```Shell
+      CUDA_VISIBLE_DEVICES='0,1,2,3,4,5,6,7' python -u train_softmax.py --network p107 --loss-type 0 --per-batch-size 64 --data-dir ../datasets/faces_vgg_112x112 --prefix ../model-p107-softmax
+      ```



-### How to use
+   #### Part 3: MegaFace Test

-1. Download pre-aligned training dataset from our data repo which is a large binary file in MXnet .rec format(maybe ready soon), or align your dataset by yourself and then pack them to prevent random small files accessing. Check those scripts under src/common and src/align.
-2. Run src/train_softmax.py to train your model and set proper parameters. For example, loss-type=0 means pure softmax while loss-type=1 means SphereLoss. It will output LFW accuracy every 2000 batches and save the model if necessary.
+   **Note:** In this part, we assume you are in the directory **`$INSIGHTFACE_ROOT/src/megaface/`**

-### Notes
+   1. Align all face images of the FaceScrub dataset and the MegaFace distractors. Please check the alignment scripts under **`$INSIGHTFACE_ROOT/src/align/`**. (We may release this data later, but this is not certain.)

-Default image size is 112x96 if not specified, all face images are aligned.
+   2. Next, generate feature files for both facescrub and megaface images.

-In ResNet setting, \_v1 means original residual units.  \_v2 means pre-activation units.  \_v3 means BCBACB residual units.  LResNet means we use conv33+stride11 in its first convolution layer instead of common conv77+stride22 to preserve high image resolution.   For ResNet50, we do not use bottleneck layers. For ResNet101 or ResNeXt101, we use.  
+      ```Shell
+      python -u gen_megaface.py
+      ```

-In last several layers, some different options can be tried to determine how embedding layer looks like and it may affect the performance. The whole network architecture can be thought as {ConvLayers(->GlobalPool)->EmbeddingLayer->Softmax}. Embedding size is set to 512 except for optionA, as embedding size in optionA is determined by the filter size of last convolution group.
+   3. Remove the MegaFace noise, which generates new feature files.

- Option\*X: Same with Option\* but use dropout after GP.  OptionAX is the default setting for inceptions.
- OptionA: Use global pooling layer(GP). This is the default setting for all networks except inceptions.
- OptionB: Use one FC layer after GP.
- OptionC: Use FC->BN after GP.
- OptionD: Use FC->BN->PRelu after GP.
- OptionE: Use BN->Dropout->FC->BN after last conv layer.
- OptionF: Use BN->PRelu->Dropout->FC->BN after last conv layer.
+      ```Shell
+      python -u remove_noises.py
+      ```
+   4. Run the MegaFace development kit to produce the final result.
+
+### Models
+   1. We plan to make some models public soon.
+
+### Results
+   
+   We simply report the performance of **LResNet100E-IR** network trained on **MS1M** dataset with our method.
+
+| Method  | LFW(%) | CFP-FF(%) | CFP-FP(%) | AgeDB-30(%) | MegaFace1M(%) |
+| ------- | ------ | --------- | --------- | ----------- | ------------- |
+|  Ours   | 99.80+ | 99.85+    | 94.0+     | 97.90+      | **98.0+**     |



+### Contribution
+   - Any type of PR or third-party contribution is welcome.

-### Experiments
-
-
-
-
- **Softmax only on VGG2@112x112**
-
-|    Network/Dataset     |       LFW        |      ------      |      ------      |
-| :--------------------: | :--------------: | :--------------: | :--------------: |
-|      ResNet50D_v1      | 0.99350+-0.00293 |                  |                  |
-|    SE-ResNet50A\_v1    | 0.99367+-0.00233 |                  |                  |
-|    SE-ResNet50B_v1     | 0.99200+-0.00407 |                  |                  |
-|    SE-ResNet50C_v1     | 0.99317+-0.00404 |                  |                  |
-|    SE-ResNet50D_v1     | 0.99383+-0.00259 |                  |                  |
-|    SE-ResNet50E\_v1    | 0.99267+-0.00343 |                  |                  |
-|    SE-ResNet50F\_v1    | 0.99367+-0.00194 |                  |                  |
-|    SE-LResNet50C_v1    | 0.99567+-0.00238 |                  |                  |
-|    SE-LResNet50D_v1    | 0.99600+-0.00281 |                  |                  |
-|    SE-LResNet50E_v1    | 0.99650+-0.00174 |        -         |        -         |
-|    SE-LResNet50A_v3    | 0.99583+-0.00327 |                  |                  |
-|    SE-LResNet50D_v3    | 0.99617+-0.00358 |        -         |        -         |
-|    SE-LResNet50E_v3    | 0.99767+-0.00200 |        -         |        -         |
-|     LResNet50E_v3      | 0.99750+-0.00250 |                  |                  |
-|    SE-LResNet50F_v3    |                  |                  |                  |
-|   SE-LResNet50BX_v3    | 0.99350+-0.00263 |                  |                  |
-|    SE-ResNet101D_v3    | 0.99517+-0.00252 |                  |                  |
-|    SE-ResNet101E_v3    | 0.99467+-0.00221 |                  |                  |
-|    SE-ResNet152E_v3    | 0.99500+-0.00307 |                  |                  |
-|   Inception-ResNetBX   | 0.99417+-0.00375 |        -         |        -         |
-| SE-Inception-ResNet-v2 |        -         |        -         |        -         |
-|       MobileNetD       | 0.99150+-0.00425 |        -         |        -         |
-|      LMobileNetD       | 0.99383+-0.00409 |        -         |        -         |
-|      LMobileNetE       | 0.99633+-0.00314 |        -         |        -         |
-|      LMobileNetF       | 0.99617+-0.00211 |        -         |        -         |
-|    LResNeXt101E_v3     |                  |                  |                  |
-
-
-How weight decay affects:
-
-SE-LResNet50E-v3/vggface2/softmax:
+### Contact

+   Jia Guo (guojia[at]gmail.com) and [Jiankang Deng](https://ibug.doc.ic.ac.uk/people/jdeng)

+   Questions can also be left as issues in the repository. We will be happy to answer them.
+   ```

+   ```
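
The README above trains "our method" with `--margin-m 0.5`, and the paper title names it an additive angular margin. As a rough illustration of that idea only, here is a minimal pure-Python sketch for a single sample. It assumes the embedding and class weights are already unit-normalised, so each logit is a cosine. The function name `angular_margin_loss` and the scale `s=64.0` are assumptions made for this sketch and are not taken from `train_softmax.py`.

```python
import math


def angular_margin_loss(cosines, target, m=0.5, s=64.0):
    """Cross-entropy for one sample after adding margin m to the target angle.

    cosines: cos(theta_j) between the embedding and each class centre.
    target:  index of the ground-truth class.
    m, s:    margin in radians and feature scale (s is an assumed value).
    Note: the corner case theta + m > pi is not handled in this sketch.
    """
    logits = []
    for j, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))          # keep acos in its valid domain
        if j == target:
            c = math.cos(math.acos(c) + m)  # widen the angle to the true class
        logits.append(s * c)
    # Numerically stable log-sum-exp over all classes.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[target]
```

With `m=0.0` the function reduces to ordinary softmax cross-entropy on scaled cosines; a positive margin makes the target class harder to satisfy, so the loss for the same input grows.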
--- a/src/utils/benchmark.py
+++ b/src/utils/benchmark.py
@@ -0,0 +1,101 @@
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+import os
+import sys
+import datetime
+import mxnet as mx
+from mxnet import ndarray as nd
+import random
+import argparse
+import cv2
+import time
+import sklearn
+from sklearn.decomposition import PCA
+from easydict import EasyDict as edict
+from sklearn.cluster import DBSCAN
+import numpy as np
+sys.path.append(os.path.join(os.path.dirname(__file__), '..', 'common'))
+import face_image
+
+# Copy all model parameters to the target device context `ctx`.
+def ch_dev(arg_params, aux_params, ctx):
+  new_args = dict()
+  new_auxs = dict()
+  for k, v in arg_params.items():
+    new_args[k] = v.as_in_context(ctx)
+  for k, v in aux_params.items():
+    new_auxs[k] = v.as_in_context(ctx)
+  return new_args, new_auxs
+
+
+def main(args):
+  ctx = mx.gpu(args.gpu)
+  args.ctx_num = 1
+  prop = face_image.load_property(args.data)
+  image_size = prop.image_size
+  print('image_size', image_size)
+  # --model has the form 'prefix,epoch', e.g. '../model/softmax,50'
+  vec = args.model.split(',')
+  prefix = vec[0]
+  epoch = int(vec[1])
+  print('loading',prefix, epoch)
+  sym, arg_params, aux_params = mx.model.load_checkpoint(prefix, epoch)
+  arg_params, aux_params = ch_dev(arg_params, aux_params, ctx)
+  all_layers = sym.get_internals()
+  sym = all_layers['fc1_output']
+  #model = mx.mod.Module.load(prefix, epoch, context = ctx)
+  model = mx.mod.Module(symbol=sym, context=ctx, label_names = None)
+  #model.bind(data_shapes=[('data', (args.batch_size, 3, image_size[0], image_size[1]))], label_shapes=[('softmax_label', (args.batch_size,))])
+  model.bind(data_shapes=[('data', (args.batch_size, 3, image_size[0], image_size[1]))])
+  model.set_params(arg_params, aux_params)
+  path_imgrec = os.path.join(args.data, 'train.rec')
+  path_imgidx = os.path.join(args.data, 'train.idx')
+  imgrec = mx.recordio.MXIndexedRecordIO(path_imgidx, path_imgrec, 'r')  # pylint: disable=redefined-variable-type
+  s = imgrec.read_idx(0)
+  header, _ = mx.recordio.unpack(s)
+  assert header.flag>0
+  print('header0 label', header.label)
+  header0 = (int(header.label[0]), int(header.label[1]))
+  #assert(header.flag==1)
+  imgidx = range(1, int(header.label[0]))
+  stat = []
+  count = 0
+  data = nd.zeros( (1 ,3, image_size[0], image_size[1]) )
+  label = nd.zeros( (1,) )
+  for idx in imgidx:
+    if len(stat)%100==0:
+      print('processing', len(stat))
+    s = imgrec.read_idx(idx)
+    header, img = mx.recordio.unpack(s)
+    img = mx.image.imdecode(img)
+    img = nd.transpose(img, axes=(2, 0, 1))
+    data[0][:] = img
+    #input_blob = np.expand_dims(img.asnumpy(), axis=0)
+    #arg_params["data"] = mx.nd.array(input_blob, ctx)
+    #arg_params["softmax_label"] = mx.nd.empty((1,), ctx)
+    time_now = datetime.datetime.now()
+    #exe = sym.bind(ctx, arg_params ,args_grad=None, grad_req="null", aux_states=aux_params)
+    #exe.forward(is_train=False)
+    #_embedding = exe.outputs[0].asnumpy().flatten()
+    #db = mx.io.DataBatch(data=(data,), label=(label,))
+    db = mx.io.DataBatch(data=(data,))
+    model.forward(db, is_train=False)
+    net_out = model.get_outputs()[0].asnumpy()
+    time_now2 = datetime.datetime.now()
+    diff = time_now2 - time_now
+    stat.append(diff.total_seconds())
+    if len(stat)==args.param1:
+      break
+  # Discard the first 10 timings as warm-up before averaging.
+  stat = stat[10:]
+  print('avg infer time', np.mean(stat))
+
+if __name__ == '__main__':
+  parser = argparse.ArgumentParser(description='do network benchmark')
+  # general
+  parser.add_argument('--gpu', default=0, type=int, help='GPU device id')
+  parser.add_argument('--data', default='', type=str, help='dataset directory containing train.rec, train.idx and property')
+  parser.add_argument('--model', default='../model/softmax,50', help='path to load model.')
+  parser.add_argument('--batch-size', default=1, type=int, help='batch size used when binding the model')
+  parser.add_argument('--param1', default=1010, type=int, help='number of timed forward passes; the first 10 are discarded')
+  args = parser.parse_args()
+  main(args)
+
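
The measurement pattern in `src/utils/benchmark.py` above (time each forward pass, drop the first 10 samples, average the rest) can be sketched without MXNet. This is a minimal sketch under stated assumptions: `mean_latency` and its parameters are hypothetical names, and `time.perf_counter` is swapped in for the script's `datetime.datetime.now()` calls. The callable passed in is assumed to block until its result is ready, as the `.asnumpy()` call inside the script's timed region does.

```python
import time


def mean_latency(fn, runs=1010, warmup=10):
    """Call fn() `runs` times and average the wall-clock time after warm-up."""
    if runs <= warmup:
        raise ValueError('runs must be greater than warmup')
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()                                 # the operation being measured
        samples.append(time.perf_counter() - start)
    kept = samples[warmup:]                  # same idea as stat[10:] in the script
    return sum(kept) / len(kept)
```

For example, `mean_latency(lambda: sum(range(1000)), runs=110)` returns the average duration in seconds of the last 100 calls. The explicit check on `runs` avoids averaging an empty list, which the script would do if `--param1` were set to 10 or less.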