## Introduction SCRFD is an efficient high accuracy face detection approach which initially described in [Arxiv](https://arxiv.org/abs/2105.04714). prcurve ## Performance Precision, flops and infer time are all evaluated on **VGA resolution**. #### ResNet family | Method | Backbone | Easy | Medium | Hard | \#Params(M) | \#Flops(G) | Infer(ms) | | ------------------- | --------------- | ----- | ------ | ----- | ----------- | ---------- | --------- | | DSFD (CVPR19) | ResNet152 | 94.29 | 91.47 | 71.39 | 120.06 | 259.55 | 55.6 | | RetinaFace (CVPR20) | ResNet50 | 94.92 | 91.90 | 64.17 | 29.50 | 37.59 | 21.7 | | HAMBox (CVPR20) | ResNet50 | 95.27 | 93.76 | 76.75 | 30.24 | 43.28 | 25.9 | | TinaFace (Arxiv20) | ResNet50 | 95.61 | 94.25 | 81.43 | 37.98 | 172.95 | 38.9 | | - | - | - | - | - | - | - | - | | ResNet-34GF | ResNet50 | 95.64 | 94.22 | 84.02 | 24.81 | 34.16 | 11.8 | | **SCRFD-34GF** | Bottleneck Res | 96.06 | 94.92 | 85.29 | 9.80 | 34.13 | 11.7 | | ResNet-10GF | ResNet34x0.5 | 94.69 | 92.90 | 80.42 | 6.85 | 10.18 | 6.3 | | **SCRFD-10GF** | Basic Res | 95.16 | 93.87 | 83.05 | 3.86 | 9.98 | 4.9 | | ResNet-2.5GF | ResNet34x0.25 | 93.21 | 91.11 | 74.47 | 1.62 | 2.57 | 5.4 | | **SCRFD-2.5GF** | Basic Res | 93.78 | 92.16 | 77.87 | 0.67 | 2.53 | 4.2 | #### Mobile family | Method | Backbone | Easy | Medium | Hard | \#Params(M) | \#Flops(G) | Infer(ms) | | ------------------- | --------------- | ----- | ------ | ----- | ----------- | ---------- | --------- | | RetinaFace (CVPR20) | MobileNet0.25 | 87.78 | 81.16 | 47.32 | 0.44 | 0.802 | 7.9 | | FaceBoxes (IJCB17) | - | 76.17 | 57.17 | 24.18 | 1.01 | 0.275 | 2.5 | | - | - | - | - | - | - | - | - | | MobileNet-0.5GF | MobileNetx0.25 | 90.38 | 87.05 | 66.68 | 0.37 | 0.507 | 3.7 | | **SCRFD-0.5GF** | Depth-wise Conv | 90.57 | 88.12 | 68.51 | 0.57 | 0.508 | 3.6 | **X64 CPU Performance of SCRFD-0.5GF:** | Test-Input-Size | CPU Single-Thread | Easy | Medium | Hard | | ----------------------- | ----------------- | ----- | ------ | ----- | | Original-Size(scale1.0) | - | 90.91 | 89.49 | 82.03 | | 640x480 | 28.3ms | 90.57 | 88.12 | 68.51 | | 320x240 | 11.4ms | - | - | - | *precision and infer time are evaluated on AMD Ryzen 9 3950X, using the simple PyTorch CPU inference by setting `OMP_NUM_THREADS=1` (no mkldnn).* ## Installation Please refer to [mmdetection](https://github.com/open-mmlab/mmdetection/blob/master/docs/get_started.md) for installation. 1. Install [mmcv](https://github.com/open-mmlab/mmcv). (mmcv-full==1.2.6 and 1.3.3 was tested) 2. Install build requirements and then install mmdet. ``` pip install -r requirements/build.txt pip install -v -e . # or "python setup.py develop" ``` ## Data preparation ### WIDERFace: 1. Download WIDERFace datasets and put it under `data/retinaface`. 2. Download annotation files from [gdrive](https://drive.google.com/file/d/1UW3KoApOhusyqSHX96yEDRYiNkd3Iv3Z/view?usp=sharing) and put them under `data/retinaface/` ``` data/retinaface/ train/ images/ labelv2.txt val/ images/ labelv2.txt gt/ *.mat ``` #### Annotation Format *please refer to labelv2.txt for detail* For each image: ``` # image_width image_height bbox_x1 bbox_y1 bbox_x2 bbox_y2 (*N) ... ... # image_width image_height bbox_x1 bbox_y1 bbox_x2 bbox_y2 (*N) ... ... ``` Keypoints can be ignored if there is bbox annotation only. ## Training Example training command, with 4 GPUs: ``` CUDA_VISIBLE_DEVICES="0,1,2,3" PORT=29701 bash ./tools/dist_train.sh ./configs/scrfd/scrfd_1g.py 4 ``` ## WIDERFace Evaluation We use a pure python evaluation script without Matlab. ``` GPU=0 GROUP=scrfd TASK=scrfd_2.5g CUDA_VISIBLE_DEVICES="$GPU" python -u tools/test_widerface.py ./configs/"$GROUP"/"$TASK".py ./work_dirs/"$TASK"/model.pth --mode 0 --out wouts ``` ## Pretrained-Models | Name | Easy | Medium | Hard | FLOPs | Params(M) | Infer(ms) | Link | | :------------: | ----- | ------ | ----- | ----- | --------- | --------- | ------------------------------------------------------------ | | SCRFD_500M | 90.57 | 88.12 | 68.51 | 500M | 0.57 | 3.6 | [download](https://1drv.ms/u/s!AswpsDO2toNKqyYWxScdiTITY4TQ?e=DjXof9) | | SCRFD_1G | 92.38 | 90.57 | 74.80 | 1G | 0.64 | 4.1 | [download](https://1drv.ms/u/s!AswpsDO2toNKqyPVLI44ahNBsOMR?e=esPrBL) | | SCRFD_2.5G | 93.78 | 92.16 | 77.87 | 2.5G | 0.67 | 4.2 | [download](https://1drv.ms/u/s!AswpsDO2toNKqyTIXnzB1ujPq4th?e=5t1VNv) | | SCRFD_10G | 95.16 | 93.87 | 83.05 | 10G | 3.86 | 4.9 | [download](https://1drv.ms/u/s!AswpsDO2toNKqyUKwTiwXv2kaa8o?e=umfepO) | | SCRFD_34G | 96.06 | 94.92 | 85.29 | 34G | 9.80 | 11.7 | [download](https://1drv.ms/u/s!AswpsDO2toNKqyKZwFebVlmlOvzz?e=V2rqUy) | | SCRFD_500M_KPS | 90.97 | 88.44 | 69.49 | 500M | 0.57 | 3.6 | [download](https://1drv.ms/u/s!AswpsDO2toNKri_NDM0GIkPpkE2f?e=JkebJo) | | SCRFD_2.5G_KPS | 93.80 | 92.02 | 77.13 | 2.5G | 0.82 | 4.3 | [download](https://1drv.ms/u/s!AswpsDO2toNKqyGlhxnCg3smyQqX?e=A6Hufm) | | SCRFD_10G_KPS | 95.40 | 94.01 | 82.80 | 10G | 4.23 | 5.0 | [download](https://1drv.ms/u/s!AswpsDO2toNKqycsF19UbaCWaLWx?e=F6i5Vm) | mAP, FLOPs and inference latency are all evaluated on VGA resolution. ``_KPS`` means the model includes 5 keypoints prediction. ## Convert to ONNX Please refer to `tools/scrfd2onnx.py` Generated onnx model can accept dynamic input as default. You can also set specific input shape by pass ``--shape 640 640``, then output onnx model can be optimized by onnx-simplifier. ## Inference Please refer to `tools/scrfd.py` which uses onnxruntime to do inference. ## Network Search For two-steps search as we described in paper, we target hard mAP on how we select best candidate models. We provide an example for searching SCRFD-2.5GF in this repo as below. 1. For searching backbones: ``` python search_tools/generate_configs_2.5g.py --mode 1 ``` Where ``mode==1`` means searching backbone only. For other parameters, please check the code. 2. After step-1 done, there will be ``configs/scrfdgen2.5g/scrfdgen2.5g_1.py`` to ``configs/scrfdgen2.5g/scrfdgen2.5g_64.py`` if ``num_configs`` is set to 64. 3. Do training for every generated configs for 80 epochs, please check ``search_tools/search_train.sh`` 4. Test WIDERFace precision for every generated configs, using ``search_tools/search_test.sh``. 5. Select the top accurate config as the base template(assume the 10-th config is the best), then do the overall network search. ``` python search_tools/generate_configs_2.5g.py --mode 2 --template 10 ``` 6. Test these new generated configs again and select the top accurate one(s). ## Demo 1. [https://github.com/nihui/ncnn-android-scrfd](https://github.com/nihui/ncnn-android-scrfd)