diff --git a/recognition/arcface_paddle/README.md b/recognition/arcface_paddle/README.md deleted file mode 120000 index 13c4f96..0000000 --- a/recognition/arcface_paddle/README.md +++ /dev/null @@ -1 +0,0 @@ -README_en.md \ No newline at end of file diff --git a/recognition/arcface_paddle/README.md b/recognition/arcface_paddle/README.md new file mode 100644 index 0000000..74f94a6 --- /dev/null +++ b/recognition/arcface_paddle/README.md @@ -0,0 +1,260 @@ +# PLSC
+
+## 1. Introduction
+
+[PLSC](https://github.com/PaddlePaddle/PLSC) is an open-source large-scale classification toolkit based on PaddlePaddle, which supports training with up to 60 million classes on 8 NVIDIA V100 (32G) GPUs.
+
+## 2. Environment preparation
+
+### 2.1 Install Paddle from source code
+
+```shell
+
+git clone https://github.com/PaddlePaddle/Paddle.git
+
+cd /path/to/Paddle/
+
+mkdir build && cd build
+
+cmake .. -DWITH_TESTING=ON -DWITH_GPU=ON -DWITH_GOLANG=OFF -DWITH_STYLE_CHECK=ON -DCMAKE_INSTALL_PREFIX=$PWD/output -DWITH_DISTRIBUTE=ON -DCMAKE_BUILD_TYPE=Release -DPY_VERSION=3.7 -DCUDA_ARCH_NAME=All -DPADDLE_VERSION=2.2.0
+
+make -j20 && make install -j20
+
+pip install output/opt/paddle/share/wheels/paddlepaddle_gpu-2.2.0-cp37-cp37m-linux_x86_64.whl
+
+```
+
+### 2.2 Download PLSC
+
+```shell
+git clone https://github.com/PaddlePaddle/PLSC.git
+
+cd /path/to/PLSC/
+```
+
+
+## 3. Data preparation
+
+### 3.1 Download dataset
+
+Download the dataset from [insightface datasets](https://github.com/deepinsight/insightface/tree/master/recognition/_datasets_).
+
+### 3.2 Extract the MXNet dataset to images
+
+```shell
+python tools/mx_recordio_2_images.py --root_dir ms1m-retinaface-t1/ --output_dir MS1M_v3/
+```
+
+After extracting the dataset, the folder structure is as follows.
+
+```
+arcface_paddle/MS1M_v3
+|_ images
+|  |_ 00000001.jpg
+|  |_ ...
+|  |_ 05179510.jpg
+|_ label.txt
+|_ agedb_30.bin
+|_ cfp_ff.bin
+|_ cfp_fp.bin
+|_ lfw.bin
+```
+
+The label file format is as follows. 
+
+```
+# delimiter: "\t"
+# the following is the content of label.txt
+images/00000001.jpg 0
+...
+```
+
+If you want to use a custom dataset, you can arrange your data according to the above format.
+
+### 3.3 Transform between original image files and bin files
+
+If you want to convert original image files to `bin` files that can be used directly in the training process, you can use the following command to finish the conversion.
+
+```shell
+python tools/convert_image_bin.py --image_path="your/input/image/path" --bin_path="your/output/bin/path" --mode="image2bin"
+```
+
+If you want to convert `bin` files back to original image files, you can use the following command to finish the conversion.
+
+```shell
+python tools/convert_image_bin.py --image_path="your/input/bin/path" --bin_path="your/output/image/path" --mode="bin2image"
+```
+
+## 4. Training
+
+### 4.1 Single node, 8 GPUs:
+
+#### Static Mode
+
+```bash
+sh scripts/train_static.sh
+```
+
+#### Dynamic Mode
+
+```bash
+sh scripts/train_dynamic.sh
+```
+
+
+During training, you can view loss changes in real time through `VisualDL`. For more information, please refer to [VisualDL](https://github.com/PaddlePaddle/VisualDL/).
+
+
+## 5. Model evaluation
+
+The model evaluation process can be started as follows.
+
+#### Static Mode
+
+```bash
+sh scripts/validation_static.sh
+```
+
+#### Dynamic Mode
+
+```bash
+sh scripts/validation_dynamic.sh
+```
+
+## 6. Export model
+PaddlePaddle supports inference using prediction engines. First, you should export the inference model.
+
+#### Static Mode
+
+```bash
+sh scripts/export_static.sh
+```
+
+#### Dynamic Mode
+
+```bash
+sh scripts/export_dynamic.sh
+```
+
+We also support exporting to an ONNX model; you only need to set `--export_type onnx`.
+
+## 7. Model inference
+
+The model inference process supports both the Paddle inference model and the ONNX model.
+
+```bash
+sh scripts/inference.sh
+```
+
+## 8. 
Model performance
+
+### 8.1 Accuracy on Verification Datasets
+
+**Configuration:**
+  * GPU: 8 NVIDIA Tesla V100 32G
+  * Precision: Pure FP16
+  * BatchSize: 128/1024
+
+| Mode    | Datasets | backbone | Ratio | agedb30 | cfp_fp | lfw  | log  | checkpoint |
+| ------- | :------: | :------- | ----- | :------ | :----- | :--- | :--- | :--- |
+| Static  | MS1MV3   | r50      | 0.1   | 0.98317 | 0.98943| 0.99850 | [log](https://raw.githubusercontent.com/GuoxiaWang/plsc_log/master/static/ms1mv3_r50_static_128_fp16_0.1/training.log) | [checkpoint](https://paddle-model-ecology.bj.bcebos.com/model/insight-face/distributed/ms1mv3_r50_static_128_fp16_0.1_epoch_24.tgz) |
+| Static  | MS1MV3   | r50      | 1.0   | 0.98283 | 0.98843| 0.99850 | [log](https://raw.githubusercontent.com/GuoxiaWang/plsc_log/master/static/ms1mv3_r50_static_128_fp16_1.0/training.log) | [checkpoint](https://paddle-model-ecology.bj.bcebos.com/model/insight-face/distributed/ms1mv3_r50_static_128_fp16_1.0_epoch_24.tgz) |
+| Dynamic | MS1MV3   | r50      | 0.1   | 0.98333 | 0.98900| 0.99833 | [log](https://raw.githubusercontent.com/GuoxiaWang/plsc_log/master/dynamic/ms1mv3_r50_dynamic_128_fp16_0.1/training.log) | [checkpoint](https://paddle-model-ecology.bj.bcebos.com/model/insight-face/distributed/ms1mv3_r50_dynamic_128_fp16_0.1_eopch_24.tgz) |
+| Dynamic | MS1MV3   | r50      | 1.0   | 0.98317 | 0.98900| 0.99833 | [log](https://raw.githubusercontent.com/GuoxiaWang/plsc_log/master/dynamic/ms1mv3_r50_dynamic_128_fp16_1.0/training.log) | [checkpoint](https://paddle-model-ecology.bj.bcebos.com/model/insight-face/distributed/ms1mv3_r50_dynamic_128_fp16_1.0_eopch_24.tgz) |
+
+
+### 8.2 Maximum Number of Identities
+
+**Configuration:**
+  * GPU: 8 NVIDIA Tesla V100 32G
+  * BatchSize: 64/512
+  * SampleRatio: 0.1
+
+| Mode                      | Precision | Res50    | Res100   |
+| ------------------------- | --------- | -------- | -------- |
+| Framework1 (static)       | AMP       | 42000000 | 39000000 |
+| Framework2 (dynamic)      | AMP       | 30000000 | 29000000 |
+| Paddle (static)           | Pure FP16 | 
60000000 | 60000000 |
+| Paddle (dynamic)          | Pure FP16 | 59000000 | 59000000 |
+
+**Note:** Set the environment variable ``export FLAGS_allocator_strategy=naive_best_fit``.
+
+### 8.3 Throughput
+
+**Configuration:**
+  * BatchSize: 128/1024
+  * SampleRatio: 0.1
+  * Datasets: MS1MV3
+
+![insightface_throughtput](https://github.com/GuoxiaWang/plsc_log/blob/master/insightface_throughtput.png)
+
+## 9. Demo
+
+Combined with a face detection model, we can complete the whole face recognition process.
+
+First, use the following commands to download the models.
+
+```bash
+# Create models directory
+mkdir -p models
+
+# Download blazeface face detection model and extract it
+wget https://paddle-model-ecology.bj.bcebos.com/model/insight-face/blazeface_fpn_ssh_1000e_v1.0_infer.tar -P models/
+tar -xf models/blazeface_fpn_ssh_1000e_v1.0_infer.tar -C models/
+rm -rf models/blazeface_fpn_ssh_1000e_v1.0_infer.tar
+
+# Download static ResNet50 PartialFC 0.1 model and extract it
+wget https://paddle-model-ecology.bj.bcebos.com/model/insight-face/distributed/ms1mv3_r50_static_128_fp16_0.1_epoch_24.tgz -P models/
+tar -xf models/ms1mv3_r50_static_128_fp16_0.1_epoch_24.tgz -C models/
+rm -rf models/ms1mv3_r50_static_128_fp16_0.1_epoch_24.tgz
+
+# Export the static model to an inference model
+python tools/export.py --is_static True --export_type paddle --backbone FresResNet50 --embedding_size 512 --checkpoint_dir models/ms1mv3_r50_static_128_fp16_0.1_epoch_24 --output_dir models/ms1mv3_r50_static_128_fp16_0.1_epoch_24_infer
+rm -rf models/ms1mv3_r50_static_128_fp16_0.1_epoch_24
+```
+
+Then, use the following commands to download the gallery, the demo image, and the font file for visualization, and to build the gallery feature index. 
+
+```bash
+# Download gallery, query and font file
+mkdir -p images/
+git clone https://github.com/littletomatodonkey/insight-face-paddle /tmp/insight-face-paddle
+cp -r /tmp/insight-face-paddle/demo/friends/gallery/ images/
+cp -r /tmp/insight-face-paddle/demo/friends/query/ images/
+mkdir -p assets
+cp /tmp/insight-face-paddle/SourceHanSansCN-Medium.otf assets/
+rm -rf /tmp/insight-face-paddle
+
+# Build index file
+python tools/test_recognition.py \
+    --rec \
+    --rec_model_file_path models/ms1mv3_r50_static_128_fp16_0.1_epoch_24_infer/FresResNet50.pdmodel \
+    --rec_params_file_path models/ms1mv3_r50_static_128_fp16_0.1_epoch_24_infer/FresResNet50.pdiparams \
+    --build_index=images/gallery/index.bin \
+    --img_dir=images/gallery \
+    --label=images/gallery/label.txt
+```
+
+Use the following command to run the whole face recognition demo.
+
+```bash
+# detection + recognition process
+python tools/test_recognition.py \
+    --det \
+    --det_model_file_path models/blazeface_fpn_ssh_1000e_v1.0_infer/inference.pdmodel \
+    --det_params_file_path models/blazeface_fpn_ssh_1000e_v1.0_infer/inference.pdiparams \
+    --rec \
+    --rec_model_file_path models/ms1mv3_r50_static_128_fp16_0.1_epoch_24_infer/FresResNet50.pdmodel \
+    --rec_params_file_path models/ms1mv3_r50_static_128_fp16_0.1_epoch_24_infer/FresResNet50.pdiparams \
+    --index=images/gallery/index.bin \
+    --input=images/query/friends2.jpg \
+    --cdd_num 10 \
+    --rec_thresh 0.4 \
+    --output="./output"
+```
+
+The final result is saved in the folder `output/`, as shown below.
+
+ +
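As a side note for users preparing a custom dataset: the tab-separated `label.txt` format described in Section 3.2 can be generated with a short script. The sketch below assumes a hypothetical one-folder-per-identity layout (`image_root/<identity>/<image>.jpg`); the folder names and helper name are illustrative, not part of PLSC.

```python
import os


def build_label_file(image_root, output_path):
    """Write a label.txt in the format from Section 3.2:
    one "<relative/image/path>\t<class_id>" line per image.

    Assumed layout (hypothetical): image_root/<identity>/<image>.jpg,
    where each identity folder becomes one class id.
    """
    # each subfolder of image_root is one identity / class
    identities = sorted(
        d for d in os.listdir(image_root)
        if os.path.isdir(os.path.join(image_root, d)))
    lines = []
    for class_id, identity in enumerate(identities):
        identity_dir = os.path.join(image_root, identity)
        for fname in sorted(os.listdir(identity_dir)):
            # paths in label.txt are relative to the dataset root
            lines.append("images/%s/%s\t%d" % (identity, fname, class_id))
    with open(output_path, "w") as f:
        f.write("\n".join(lines) + "\n")
```
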
diff --git a/recognition/arcface_paddle/README_ch.md b/recognition/arcface_paddle/README_ch.md deleted file mode 100644 index 88df669..0000000 --- a/recognition/arcface_paddle/README_ch.md +++ /dev/null @@ -1,243 +0,0 @@ -简体中文 | [English](README_en.md) - -# Arcface-Paddle - -## 1. 简介 - -`Arcface-Paddle`是基于PaddlePaddle实现的,开源深度人脸检测、识别工具。`Arcface-Paddle`目前提供了三个预训练模型,包括用于人脸检测的 `BlazeFace`、用于人脸识别的 `ArcFace` 和 `MobileFace`。 - -- 本部分内容为人脸识别部分。 -- 人脸检测相关内容可以参考:[基于BlazeFace的人脸检测](../../detection/blazeface_paddle/README_ch.md)。 -- 基于PaddleInference的Whl包预测部署内容可以参考:[Whl包预测部署](https://github.com/littletomatodonkey/insight-face-paddle)。 - -Note: 在此非常感谢 [GuoQuanhao](https://github.com/GuoQuanhao) 基于PaddlePaddle复现了 [Arcface的基线模型](https://github.com/GuoQuanhao/arcface-Paddle)。 - -## 2. 环境准备 - -请参照 [Installation](./install_ch.md) 配置实验所需环境。 - -## 3. 数据准备 - -### 3.1 进入 repo 目录。 - -``` -cd arcface_paddle/ -``` - -### 3.2 下载与解压数据集 - -使用下面的命令下载并解压 MS1M 数据集。 - -```shell -# 下载数据集 -wget https://paddle-model-ecology.bj.bcebos.com/data/insight-face/MS1M_bin.tar -# 解压数据集 -tar -xf MS1M_bin.tar -``` - -注意: -* 如果希望在windows环境下安装wget,可以参考:[链接](https://www.cnblogs.com/jeshy/p/10518062.html);如果希望在windows环境中安装tar命令,可以参考:[链接](https://www.cnblogs.com/chooperman/p/14190107.html)。 -* 如果macOS环境下没有安装wget命令,可以运行下面的命令进行安装。 - -```shell -# 安装 homebrew -ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"; -# 安装wget -brew install wget -``` - - -解压完成之后,文件夹目录结构如下。 - -``` -Arcface-Paddle/MSiM_bin -|_ images -| |_ 00000000.bin -| |_ ... -| |_ 05822652.bin -|_ label.txt -|_ agedb_30.bin -|_ cfp_ff.bin -|_ cfp_fp.bin -|_ lfw.bin -``` - -* 标签文件格式: - - ``` - # delimiter: "\t" - # the following the content of label.txt - images/00000000.bin 0 - ... 
- ``` - -如果需要使用自定义数据集,请按照上述格式进行整理,并替换配置文件中的数据集目录。 - -注意: -* 这里为了更加方便`Dataloader`读取数据,将原始的`train.rec`文件转化为很多`bin文件`,每个`bin文件`都唯一对应一张原始图像。如果您采集得到的文件均为原始的图像文件,那么可以参考`3.3节`中的内容完成原始图像文件到bin文件的转换。 -* 如果你的训练数据为原始的图像文件列表格式,那么在训练的时候,只需要将`is_bin`修改为`False`即可,下面的训练脚本中也会有具体的使用说明。 - -### 3.3 原始图像文件与bin文件的转换 - -如果希望将原始的图像文件转换为本文用于训练的bin文件,那么可以使用下面的命令进行转换。 - -```shell -python3.7 tools/convert_image_bin.py --image_path="your/input/image/path" --bin_path="your/output/bin/path" --mode="image2bin" -``` - -如果希望将bin文件转化为原始的图像文件,那么可以使用下面的命令进行转换。 - -```shell -python3.7 tools/convert_image_bin.py --image_path="your/input/bin/path" --bin_path="your/output/image/path" --mode="bin2image" -``` - -## 4. 模型训练 - -准备好配置文件后,可以通过以下方式开始训练过程。 - -```bash -# 如果你的训练数据为bin文件格式的图像文件,可以使用下面的命令进行训练 -python3.7 train.py \ - --network 'MobileFaceNet_128' \ - --lr=0.1 \ - --batch_size 512 \ - --weight_decay 2e-4 \ - --embedding_size 128 \ - --logdir="log" \ - --output "emore_arcface" \ - --resume 0 - -# 如果你的训练数据为原始图像文件,可以将`is_bin`指定为False,进行训练 -python3.7 train.py \ - --network 'MobileFaceNet_128' \ - --lr=0.1 \ - --batch_size 512 \ - --weight_decay 2e-4 \ - --embedding_size 128 \ - --logdir="log" \ - --output "emore_arcface" \ - --resume 0 \ - --is_bin False -``` - -上述命令中,需要传入如下参数: - -+ `network`: 模型名称, 默认值为 `MobileFaceNet_128`; -+ `lr`: 初始学习率, 默认值为 `0.1`; -+ `batch_size`: Batch size 的大小, 默认值为 `512`; -+ `weight_decay`: 正则化策略, 默认值为 `2e-4`; -+ `embedding_size`: 人脸 embedding 的长度, 默认值为 `128`; -+ `logdir`: VDL 输出 log 的存储路径, 默认值为 `"log"`; -+ `output`: 训练过程中的模型文件存储路径, 默认值为 `"emore_arcface"`; -+ `resume`: 是否恢复分类层的模型权重。 `1` 表示使用之前好的权重文件进行初始化, `0` 代表重新初始化。 如果想要恢复分类层的模型权重, 需要保证 `output` 目录下包含: `rank:0_softmax_weight_mom.pkl` 和 `rank:0_softmax_weight.pkl` 两个文件。 -+ `is_bin`: 训练数据是否为bin文件格式,默认为True。 - -* 训练过程中的输出 log 示例如下: - - ``` - ... - Speed 500.89 samples/sec Loss 55.5692 Epoch: 0 Global Step: 200 Required: 104 hours, lr_backbone_value: 0.000000, lr_pfc_value: 0.000000 - ... 
- [lfw][2000]XNorm: 9.890562 - [lfw][2000]Accuracy-Flip: 0.59017+-0.02031 - [lfw][2000]Accuracy-Highest: 0.59017 - [cfp_fp][2000]XNorm: 12.920007 - [cfp_fp][2000]Accuracy-Flip: 0.53329+-0.01262 - [cfp_fp][2000]Accuracy-Highest: 0.53329 - [agedb_30][2000]XNorm: 12.188049 - [agedb_30][2000]Accuracy-Flip: 0.51967+-0.02316 - [agedb_30][2000]Accuracy-Highest: 0.51967 - ... - ``` - - -在训练过程中,可以通过 `VisualDL` 实时查看loss变化,更多信息请参考 [VisualDL](https://github.com/PaddlePaddle/VisualDL/)。 - - -## 5. 模型评估 - -可以通过以下方式开始模型评估过程。 - -```bash -python3.7 valid.py - --network MobileFaceNet_128 \ - --checkpoint emore_arcface \ -``` - -上述命令中,需要传入如下参数: - -+ `network`: 模型名称, 默认值为 `MobileFaceNet_128`; -+ `checkpoint`: 保存模型权重的目录, 默认值为 `emore_arcface`; - -**注意:** 上面的命令将评估模型文件 `./emore_arcface/MobileFaceNet_128.pdparams` .您也可以通过同时修改 `network` 和 `checkpoint` 来修改要评估的模型文件。 - -## 6. 模型导出 -PaddlePaddle支持使用预测引擎进行预测推理,通过导出inference模型将模型固化: - -```bash -python export_inference_model.py --network MobileFaceNet_128 --output ./inference_model/ --pretrained_model ./emore_arcface/MobileFaceNet_128.pdparams -``` - -导出模型后,在 `./inference_model/` 目录下有: - -``` -./inference_model/ -|_ inference.pdmodel -|_ inference.pdiparams -``` - -## 7. 
模型精度与速度benchmark - -在MS1M训练集上进行模型训练,最终得到的模型指标在lfw、cfp_fp、agedb30三个数据集上的精度指标以及CPU、GPU的预测耗时如下。 - -| 模型结构 | lfw | cfp_fp | agedb30 | CPU 耗时 | GPU 耗时 | 模型下载地址 | -| ------------------------- | ----- | ------ | ------- |-------| -------- | ---- | -| MobileFaceNet-Paddle | 0.9945 | 0.9343 | 0.9613 | 4.3ms | 2.3ms | [下载地址](https://paddle-model-ecology.bj.bcebos.com/model/insight-face/mobileface_v1.0_infer.tar) | -| MobileFaceNet-mxnet | 0.9950 | 0.8894 | 0.9591 | 7.3ms | 4.7ms | - | -| ArcFace-Paddle | 0.9973 | 0.9743 | 0.9788 | - | - | [下载地址](https://paddle-model-ecology.bj.bcebos.com/model/insight-face/arcface_iresnet50_v1.0_infer.tar) | - -* 注:这里`ArcFace-Paddle`的backbone为iResNet50,模型相对较大,在CPU设备或者移动端设备上不推荐使用,因此没有给出具体的预测时间。 - -**测试环境:** - * CPU: Intel(R) Xeon(R) Gold 6184 CPU @ 2.40GHz - * GPU: a single NVIDIA Tesla V100 - - -## 8. 模型预测 - - -融合人脸检测过程,可以完成"检测+识别"的人脸识别过程。 - -首先下载索引库、待识别图像与字体文件。 - -```bash -# 下载用于人脸识别的索引库,这里因为示例图像是老友记中的图像,所以使用老友记中角色的人脸图像构建的底库。 -wget https://raw.githubusercontent.com/littletomatodonkey/insight-face-paddle/main/demo/friends/index.bin -# 下载用于人脸识别的示例图像 -wget https://raw.githubusercontent.com/littletomatodonkey/insight-face-paddle/main/demo/friends/query/friends2.jpg -# 下载字体,用于可视化 -wget https://raw.githubusercontent.com/littletomatodonkey/insight-face-paddle/main/SourceHanSansCN-Medium.otf -``` - -示例图像如下所示。 - -
- -
- - -`检测+识别`串联预测的示例脚本如下。 - -```shell -# 同时使用检测+识别 -python3.7 test_recognition.py --det --rec --index=index.bin --input=friends2.jpg --output="./output" -``` - -最终可视化结果保存在`output`目录下,可视化结果如下所示。 - -
- -
- - -更多关于参数解释,索引库构建、whl包预测部署的内容可以参考:[Whl包预测部署](https://github.com/littletomatodonkey/insight-face-paddle)。 diff --git a/recognition/arcface_paddle/README_en.md b/recognition/arcface_paddle/README_en.md deleted file mode 100644 index 8e8ac0d..0000000 --- a/recognition/arcface_paddle/README_en.md +++ /dev/null @@ -1,246 +0,0 @@ -[简体中文](README_ch.md) | English - -# Arcface-Paddle - -## 1. Introduction - -`Arcface-Paddle` is an open source deep face detection and recognition toolkit, powered by PaddlePaddle. `Arcface-Paddle` provides three related pretrained models now, include `BlazeFace` for face detection, `ArcFace` and `MobileFace` for face recognition. - -- This tutorial is mainly about face recognition. -- For face detection task, please refer to: [Face detection tuturial](../../detection/blazeface_paddle/README_en.md). -- For Whl package inference using PaddleInference, please refer to [whl package inference](https://github.com/littletomatodonkey/insight-face-paddle). - - -Note: Many thanks to [GuoQuanhao](https://github.com/GuoQuanhao) for the reproduction of the [Arcface basline using PaddlePaddle](https://github.com/GuoQuanhao/arcface-Paddle). - -## 2. Environment preparation - -Please refer to [Installation](./install_en.md) to setup environment at first. - - -## 3. Data preparation - -### 3.1 Enter recognition dir. - -``` -cd arcface_paddle/rec -``` - -### 3.2 Download and unzip dataset - -Use the following command to download and unzip MS1M dataset. - - -```shell -# download dataset -wget https://paddle-model-ecology.bj.bcebos.com/data/insight-face/MS1M_bin.tar -# unzip dataset -tar -xf MS1M_bin.tar -``` - -**Note:** -* If you want to install `wget` on Windows, please refer to [link](https://www.cnblogs.com/jeshy/p/10518062.html). If you want to install `tar` on Windows. please refer to [link](https://www.cnblogs.com/chooperman/p/14190107.html). -* If `wget` is not installed on macOS, you can use the following command to install. 
- -```shell -# install homebrew -ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"; -# install wget -brew install wget -``` - -After finishing unzipping the dataset, the folder structure is as follows. - -``` -Arcface-Paddle/MSiM_bin -|_ images -| |_ 00000000.bin -| |_ ... -| |_ 05822652.bin -|_ label.txt -|_ agedb_30.bin -|_ cfp_ff.bin -|_ cfp_fp.bin -|_ lfw.bin -``` - -* Label file format is as follows. - - ``` - # delimiter: "\t" - # the following the content of label.txt - images/00000000.bin 0 - ... - ``` - -If you want to use customed dataset, you can arrange your data according to the above format. And should replace data folder in the configuration using yours. - - - -**Note:** -* For using `Dataloader` api for reading data, we convert `train.rec` into many little `bin` files, each `bin` file denotes a single image. If your dataset just contains origin image files. You can either rewrite the dataloader file or refer to section 3.3 to convert the original image files to `bin` files. -* If you train data is image format rather than `bin` format. For the training process, you just need to set the parameter `is_bin` as `False`. More details can be seen in the following training script. - -### 3.3 Transform between original image files and bin files - -If you want to convert original image files to `bin` files used directly for training process, you can use the following command to finish the conversion. - -```shell -python3.7 tools/convert_image_bin.py --image_path="your/input/image/path" --bin_path="your/output/bin/path" --mode="image2bin" -``` - -If you want to convert `bin` files to original image files, you can use the following command to finish the conversion. - -```shell -python3.7 tools/convert_image_bin.py --image_path="your/input/bin/path" --bin_path="your/output/image/path" --mode="bin2image" -``` - -## 4. 
Model training - -After preparing the configuration file, The training process can be started in the following way. - -```bash -# for the bin format training data -python3.7 train.py \ - --network 'MobileFaceNet_128' \ - --lr=0.1 \ - --batch_size 512 \ - --weight_decay 2e-4 \ - --embedding_size 128 \ - --logdir="log" \ - --output "emore_arcface" \ - --resume 0 - -# for the original image format training data -python3.7 train.py \ - --network 'MobileFaceNet_128' \ - --lr=0.1 \ - --batch_size 512 \ - --weight_decay 2e-4 \ - --embedding_size 128 \ - --logdir="log" \ - --output "emore_arcface" \ - --resume 0 \ - --is_bin False -``` - -Among them: - -+ `network`: Model name, such as `MobileFaceNet_128`; -+ `lr`: Initial learning rate, default by `0.1`; -+ `batch_size`: Batch size, default by `512`; -+ `weight_decay`: The strategy of regularization, default by `2e-4`; -+ `embedding_size`: The length of face embedding, default by `128`; -+ `logdir`: VDL log storage directory, default by `"log"`; -+ `output`: Model stored path, default by: `"emore_arcface"`; -+ `resume`: Restore the classification layer parameters. `1` represents recovery parameters, and `0` represents reinitialization. If you need to resume training, you need to ensure that there are `rank:0_softmax_weight_mom.pkl` and `rank:0_softmax_weight.pkl` in the output directory. -+ `is_bin`: Whether the training data is bin format, default as True. - -* The output log examples are as follows: - - ``` - ... - Speed 500.89 samples/sec Loss 55.5692 Epoch: 0 Global Step: 200 Required: 104 hours, lr_backbone_value: 0.000000, lr_pfc_value: 0.000000 - ... - [lfw][2000]XNorm: 9.890562 - [lfw][2000]Accuracy-Flip: 0.59017+-0.02031 - [lfw][2000]Accuracy-Highest: 0.59017 - [cfp_fp][2000]XNorm: 12.920007 - [cfp_fp][2000]Accuracy-Flip: 0.53329+-0.01262 - [cfp_fp][2000]Accuracy-Highest: 0.53329 - [agedb_30][2000]XNorm: 12.188049 - [agedb_30][2000]Accuracy-Flip: 0.51967+-0.02316 - [agedb_30][2000]Accuracy-Highest: 0.51967 - ... 
- ``` - - -During training, you can view loss changes in real time through `VisualDL`, For more information, please refer to [VisualDL](https://github.com/PaddlePaddle/VisualDL/). - - -## 5. Model evaluation - -The model evaluation process can be started as follows. - -```bash -python3.7 valid.py - --network MobileFaceNet_128 \ - --checkpoint emore_arcface \ -``` - -Among them: - -+ `network`: Model name, such as `MobileFaceNet_128`; -+ `checkpoint`: Directory to save model weights, default by `emore_arcface`; - -**Note:** The above command will evaluate the model `./emore_arcface/MobileFaceNet_128.pdparams` .You can also modify the model to be evaluated by modifying the network name and checkpoint at the same time. - - -## 6. Export model -PaddlePaddle supports inference using prediction engines. Firstly, you should export inference model. - -```bash -python export_inference_model.py --network MobileFaceNet_128 --output ./inference_model/ --pretrained_model ./emore_arcface/MobileFaceNet_128.pdparams -``` - -After that, the inference model files are as follow: - -``` -./inference_model/ -|_ inference.pdmodel -|_ inference.pdiparams -``` - -## 7. Model performance - -For Paddle models, we train the models on `MS1M` dataset. Metrics on lfw, cfp_fp and agedb30 of the final models are shown as follows. The CPU/GPU time cost of the final models is as follows. 
- -| Model structure | lfw | cfp_fp | agedb30 | CPU time cost | GPU time cost | Inference model | -| ------------------------- | ----- | ------ | ------- | -------| -------- |---- | -| MobileFaceNet-Paddle | 0.9945 | 0.9343 | 0.9613 | 4.3ms | 2.3ms | [download link](https://paddle-model-ecology.bj.bcebos.com/model/insight-face/mobileface_v1.0_infer.tar) | -| MobileFaceNet-mxnet | 0.9950 | 0.8894 | 0.9591 | 7.3ms | 4.7ms | -| ArcFace-Paddle | 0.9973 | 0.9743 | 0.9788 | - | - | [download link](https://paddle-model-ecology.bj.bcebos.com/model/insight-face/arcface_iresnet50_v1.0_infer.tar) | - -* Note: Backbone of the model `ArcFace-Paddle` is `iResNet50`, which is not suggested to run on CPU or arm device, so the time cost is not listed here. - -**Envrionment:** - * CPU: Intel(R) Xeon(R) Gold 6184 CPU @ 2.40GHz - * GPU: a single NVIDIA Tesla V100 - -## 8. Model inference - -Combined with face detection model, we can complete the face recognition process. - -Firstly, use the following commands to download the index gallery, demo image and font file for visualization. - - -```bash -# Index library for the recognition process -wget https://raw.githubusercontent.com/littletomatodonkey/insight-face-paddle/main/demo/friends/index.bin -# Demo image -wget https://raw.githubusercontent.com/littletomatodonkey/insight-face-paddle/main/demo/friends/query/friends2.jpg -# Font file for visualization -wget https://raw.githubusercontent.com/littletomatodonkey/insight-face-paddle/main/SourceHanSansCN-Medium.otf -``` - -The demo image is shown as follows. - -
- -
- - -Use the following command to run the whole face recognition demo. - -```shell -# detection + recogniotion process -python3.7 test_recognition.py --det --rec --index=index.bin --input=friends2.jpg --output="./output" -``` - -The final result is save in folder `output/`, which is shown as follows. - -
- -
- -For more details about parameter explanations, index gallery construction and whl package inference, please refer to [Whl package inference tutorial](https://github.com/littletomatodonkey/insight-face-paddle). diff --git a/recognition/arcface_paddle/backbones/iresnet.py b/recognition/arcface_paddle/backbones/iresnet.py deleted file mode 100644 index c0d9b3c..0000000 --- a/recognition/arcface_paddle/backbones/iresnet.py +++ /dev/null @@ -1,255 +0,0 @@ -# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -# reference: https://raw.githubusercontent.com/GuoQuanhao/arcface-Paddle/main/backbones/iresnet.py - -import paddle -from paddle import nn - -__all__ = ['iresnet18', 'iresnet34', 'iresnet50', 'iresnet100', 'iresnet200'] - - -def conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1): - """3x3 convolution with padding""" - return nn.Conv2D( - in_planes, - out_planes, - kernel_size=3, - stride=stride, - padding=dilation, - groups=groups, - bias_attr=False, - dilation=dilation) - - -def conv1x1(in_planes, out_planes, stride=1): - """1x1 convolution""" - return nn.Conv2D( - in_planes, out_planes, kernel_size=1, stride=stride, bias_attr=False) - - -class IBasicBlock(nn.Layer): - expansion = 1 - - def __init__(self, - inplanes, - planes, - stride=1, - downsample=None, - groups=1, - base_width=64, - dilation=1): - super(IBasicBlock, self).__init__() - if groups != 1 or base_width != 64: - raise ValueError( - 'BasicBlock only supports groups=1 and base_width=64') - if dilation > 1: - raise NotImplementedError( - "Dilation > 1 not supported in BasicBlock") - self.bn1 = nn.BatchNorm2D(inplanes, epsilon=1e-05, momentum=0.1) - self.conv1 = conv3x3(inplanes, planes) - self.bn2 = nn.BatchNorm2D(planes, epsilon=1e-05, momentum=0.1) - self.prelu = nn.PReLU(planes) - self.conv2 = conv3x3(planes, planes, stride) - self.bn3 = nn.BatchNorm2D(planes, epsilon=1e-05, momentum=0.1) - self.downsample = downsample - self.stride = stride - - def forward(self, x): - identity = x - out = self.bn1(x) - out = self.conv1(out) - out = self.bn2(out) - out = self.prelu(out) - out = self.conv2(out) - out = self.bn3(out) - if self.downsample is not None: - identity = self.downsample(x) - out += identity - return out - - -class IResNet(nn.Layer): - fc_scale = 7 * 7 - - def __init__(self, - block, - layers, - dropout=0, - num_features=512, - zero_init_residual=False, - groups=1, - width_per_group=64, - replace_stride_with_dilation=None, - fp16=False): - super(IResNet, self).__init__() - 
self.fp16 = fp16 - self.inplanes = 64 - self.dilation = 1 - if replace_stride_with_dilation is None: - replace_stride_with_dilation = [False, False, False] - if len(replace_stride_with_dilation) != 3: - raise ValueError("replace_stride_with_dilation should be None " - "or a 3-element tuple, got {}".format( - replace_stride_with_dilation)) - self.groups = groups - self.base_width = width_per_group - self.conv1 = nn.Conv2D( - 3, - self.inplanes, - kernel_size=3, - stride=1, - padding=1, - bias_attr=False) - self.bn1 = nn.BatchNorm2D(self.inplanes, epsilon=1e-05, momentum=0.1) - self.prelu = nn.PReLU(self.inplanes) - self.layer1 = self._make_layer(block, 64, layers[0], stride=2) - self.layer2 = self._make_layer( - block, - 128, - layers[1], - stride=2, - dilate=replace_stride_with_dilation[0]) - self.layer3 = self._make_layer( - block, - 256, - layers[2], - stride=2, - dilate=replace_stride_with_dilation[1]) - self.layer4 = self._make_layer( - block, - 512, - layers[3], - stride=2, - dilate=replace_stride_with_dilation[2]) - self.bn2 = nn.BatchNorm2D( - 512 * block.expansion, epsilon=1e-05, momentum=0.1) - self.dropout = nn.Dropout(p=dropout) - self.fc = nn.Linear(512 * block.expansion * self.fc_scale, - num_features) - self.features = nn.BatchNorm1D( - num_features, momentum=0.1, epsilon=1e-05) - self.features.weight = paddle.create_parameter( - shape=self.features.weight.shape, - dtype='float32', - default_initializer=nn.initializer.Constant(value=1.0)) - # nn.init.constant_(self.features.weight, 1.0) - # 修改了stop_gradient,将True设为False - self.features.weight.stop_gradient = False - #self.features.weight.requires_grad = False - - for m in self.sublayers(): - if isinstance(m, nn.Conv2D): - m.weight = paddle.create_parameter( - shape=m.weight.shape, - dtype='float32', - default_initializer=nn.initializer.Normal( - mean=0.0, std=0.1)) - # nn.init.normal_(m.weight, 0, 0.1) - elif isinstance(m, (nn.BatchNorm2D, nn.GroupNorm)): - m.weight = paddle.create_parameter( - 
shape=m.weight.shape, - dtype='float32', - default_initializer=nn.initializer.Constant(value=1.0)) - m.bias = paddle.create_parameter( - shape=m.bias.shape, - dtype='float32', - default_initializer=nn.initializer.Constant(value=0.0)) - # nn.init.constant_(m.weight, 1) - # nn.init.constant_(m.bias, 0) - - if zero_init_residual: - for m in self.sublayers(): - if isinstance(m, IBasicBlock): - m.bn2.weight = paddle.create_parameter( - shape=m.bn2.weight.shape, - dtype='float32', - default_initializer=nn.initializer.Constant(value=0.0)) - # nn.init.constant_(m.bn2.weight, 0) - - def _make_layer(self, block, planes, blocks, stride=1, dilate=False): - downsample = None - previous_dilation = self.dilation - if dilate: - self.dilation *= stride - stride = 1 - if stride != 1 or self.inplanes != planes * block.expansion: - downsample = nn.Sequential( - conv1x1(self.inplanes, planes * block.expansion, stride), - nn.BatchNorm2D( - planes * block.expansion, epsilon=1e-05, momentum=0.1), ) - layers = [] - layers.append( - block(self.inplanes, planes, stride, downsample, self.groups, - self.base_width, previous_dilation)) - self.inplanes = planes * block.expansion - for _ in range(1, blocks): - layers.append( - block( - self.inplanes, - planes, - groups=self.groups, - base_width=self.base_width, - dilation=self.dilation)) - - return nn.Sequential(*layers) - - def forward(self, x): - with paddle.amp.auto_cast(): - x = self.conv1(x) - x = self.bn1(x) - x = self.prelu(x) - x = self.layer1(x) - x = self.layer2(x) - x = self.layer3(x) - x = self.layer4(x) - x = self.bn2(x) - x = paddle.cast(x, dtype='float32') - x = paddle.flatten(x, 1) - x = self.dropout(x) - x = self.fc(paddle.cast(x, dtype='float16') if self.fp16 else x) - x = self.features(x) - return x - - -def _iresnet(arch, block, layers, pretrained, progress, **kwargs): - model = IResNet(block, layers, **kwargs) - if pretrained: - raise ValueError() - return model - - -def iresnet18(pretrained=False, progress=True, **kwargs): - 
return _iresnet('iresnet18', IBasicBlock, [2, 2, 2, 2], pretrained, - progress, **kwargs) - - -def iresnet34(pretrained=False, progress=True, **kwargs): - return _iresnet('iresnet34', IBasicBlock, [3, 4, 6, 3], pretrained, - progress, **kwargs) - - -def iresnet50(pretrained=False, progress=True, **kwargs): - return _iresnet('iresnet50', IBasicBlock, [3, 4, 14, 3], pretrained, - progress, **kwargs) - - -def iresnet100(pretrained=False, progress=True, **kwargs): - return _iresnet('iresnet100', IBasicBlock, [3, 13, 30, 3], pretrained, - progress, **kwargs) - - -def iresnet200(pretrained=False, progress=True, **kwargs): - return _iresnet('iresnet200', IBasicBlock, [6, 26, 60, 6], pretrained, - progress, **kwargs) diff --git a/recognition/arcface_paddle/eval/__init__.py b/recognition/arcface_paddle/configs/__init__.py similarity index 94% rename from recognition/arcface_paddle/eval/__init__.py rename to recognition/arcface_paddle/configs/__init__.py index 61d5aa2..185a92b 100644 --- a/recognition/arcface_paddle/eval/__init__.py +++ b/recognition/arcface_paddle/configs/__init__.py @@ -10,4 +10,4 @@ # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and -# limitations under the License. \ No newline at end of file +# limitations under the License. diff --git a/recognition/arcface_paddle/configs/argparser.py b/recognition/arcface_paddle/configs/argparser.py new file mode 100644 index 0000000..4ded39d --- /dev/null +++ b/recognition/arcface_paddle/configs/argparser.py @@ -0,0 +1,281 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import logging
+import argparse
+import importlib
+
+
+def print_args(args):
+    logging.info('--------args----------')
+    for k in list(vars(args).keys()):
+        logging.info('%s: %s' % (k, vars(args)[k]))
+    logging.info('------------------------\n')
+
+
+def str2bool(v):
+    return str(v).lower() in ("true", "t", "1")
+
+
+def tostrlist(v):
+    if isinstance(v, list):
+        return v
+    elif isinstance(v, str):
+        return [e.strip() for e in v.split(',')]
+
+
+def tointlist(v):
+    if isinstance(v, list):
+        return v
+    elif isinstance(v, str):
+        return [int(e.strip()) for e in v.split(',')]
+
+
+def get_config(config_file):
+    assert config_file.startswith(
+        'configs/'), 'config file setting must start with configs/'
+    temp_config_name = os.path.basename(config_file)
+    temp_module_name = os.path.splitext(temp_config_name)[0]
+    config = importlib.import_module("configs.config")
+    cfg = config.config
+    config = importlib.import_module("configs.%s" % temp_module_name)
+    job_cfg = config.config
+    cfg.update(job_cfg)
+    if cfg.output is None:
+        cfg.output = os.path.join('work_dirs', temp_module_name)
+    return cfg
+
+
+class UserNamespace(object):
+    pass
+
+
+def parse_args():
+
+    parser = argparse.ArgumentParser(description='Paddle Face Training')
+    user_namespace = UserNamespace()
+    parser.add_argument(
+        '--config_file', type=str, required=True, help='config file path')
+    parser.parse_known_args(namespace=user_namespace)
+    cfg = get_config(user_namespace.config_file)
+
+    # Model setting
+    parser.add_argument(
+        '--is_static',
+        type=str2bool,
+        default=cfg.is_static,
+        help='whether to use static mode')
+    parser.add_argument(
+        '--backbone', type=str, default=cfg.backbone, help='backbone network')
+    parser.add_argument(
+        '--classifier',
+        type=str,
+        default=cfg.classifier,
+        help='classification network')
+    parser.add_argument(
+        '--embedding_size',
+        type=int,
+        default=cfg.embedding_size,
+        help='embedding size')
+    parser.add_argument(
+        '--model_parallel',
+        type=str2bool,
+        default=cfg.model_parallel,
+        help='whether to use model parallel')
+    parser.add_argument(
+        '--sample_ratio',
+        type=float,
+        default=cfg.sample_ratio,
+        help='sample ratio; Partial FC sampling is used if the ratio is less than 1.0')
+    parser.add_argument(
+        '--loss', type=str, default=cfg.loss, help='loss function')
+    parser.add_argument(
+        '--dropout',
+        type=float,
+        default=cfg.dropout,
+        help='probability of dropout')
+
+    # AMP setting
+    parser.add_argument(
+        '--fp16',
+        type=str2bool,
+        default=cfg.fp16,
+        help='whether to use fp16 training')
+    parser.add_argument(
+        '--init_loss_scaling',
+        type=float,
+        default=cfg.init_loss_scaling,
+        help='The initial loss scaling factor.')
+    parser.add_argument(
+        '--max_loss_scaling',
+        type=float,
+        default=cfg.max_loss_scaling,
+        help='The maximum loss scaling factor.')
+    parser.add_argument(
+        '--incr_every_n_steps',
+        type=int,
+        default=cfg.incr_every_n_steps,
+        help='Increases loss scaling every n consecutive steps with finite gradients.'
+    )
+    parser.add_argument(
+        '--decr_every_n_nan_or_inf',
+        type=int,
+        default=cfg.decr_every_n_nan_or_inf,
+        help='Decreases loss scaling every n accumulated steps with nan or inf gradients.'
+    )
+    parser.add_argument(
+        '--incr_ratio',
+        type=float,
+        default=cfg.incr_ratio,
+        help='The multiplier to use when increasing the loss scaling.')
+    parser.add_argument(
+        '--decr_ratio',
+        type=float,
+        default=cfg.decr_ratio,
+        help='The less-than-one multiplier to use when decreasing the loss scaling.'
+    )
+    parser.add_argument(
+        '--use_dynamic_loss_scaling',
+        type=str2bool,
+        default=cfg.use_dynamic_loss_scaling,
+        help='Whether to use dynamic loss scaling.')
+    parser.add_argument(
+        '--custom_white_list',
+        type=tostrlist,
+        default=cfg.custom_white_list,
+        help='fp16 custom white list.')
+    parser.add_argument(
+        '--custom_black_list',
+        type=tostrlist,
+        default=cfg.custom_black_list,
+        help='fp16 custom black list.')
+
+    # Optimizer setting
+    parser.add_argument(
+        '--lr', type=float, default=cfg.lr, help='learning rate')
+    parser.add_argument(
+        '--lr_decay',
+        type=float,
+        default=cfg.lr_decay,
+        help='learning rate decay factor')
+    parser.add_argument(
+        '--weight_decay',
+        type=float,
+        default=cfg.weight_decay,
+        help='weight decay')
+    parser.add_argument(
+        '--momentum', type=float, default=cfg.momentum, help='sgd momentum')
+    parser.add_argument(
+        '--train_unit',
+        type=str,
+        default=cfg.train_unit,
+        help='train unit, "step" or "epoch"')
+    parser.add_argument(
+        '--warmup_num',
+        type=int,
+        default=cfg.warmup_num,
+        help='number of warmup steps or epochs, according to train_unit')
+    parser.add_argument(
+        '--train_num',
+        type=int,
+        default=cfg.train_num,
+        help='number of training steps or epochs, according to train_unit')
+    parser.add_argument(
+        '--decay_boundaries',
+        type=tointlist,
+        default=cfg.decay_boundaries,
+        help='piecewise decay boundaries')
+
+    # Train dataset setting
+    parser.add_argument(
+        '--use_synthetic_dataset',
+        type=str2bool,
+        default=cfg.use_synthetic_dataset,
+        help='whether to use synthetic dataset')
+    parser.add_argument(
+        '--dataset', type=str, default=cfg.dataset, help='train dataset name')
+    parser.add_argument(
+        '--data_dir',
+        type=str,
+        default=cfg.data_dir,
+        help='train dataset directory')
+    parser.add_argument(
+        '--label_file',
+        type=str,
+        default=cfg.label_file,
+        help='train label file; each line is split by "\t"')
+    parser.add_argument(
+        '--is_bin',
+        type=str2bool,
+        default=cfg.is_bin,
+        help='whether the train data is bin or original image file')
+    parser.add_argument(
+        '--num_classes',
+        type=int,
+        default=cfg.num_classes,
+        help='number of classes in train dataset')
+    parser.add_argument(
+        '--batch_size',
+        type=int,
+        default=cfg.batch_size,
+        help='batch size of each rank')
+    parser.add_argument(
+        '--num_workers',
+        type=int,
+        default=cfg.num_workers,
+        help='the number of workers for DataLoader')
+
+    # Validation dataset setting
+    parser.add_argument(
+        '--do_validation_while_train',
+        type=str2bool,
+        default=cfg.do_validation_while_train,
+        help='whether to do validation while training')
+    parser.add_argument(
+        '--validation_interval_step',
+        type=int,
+        default=cfg.validation_interval_step,
+        help='validation interval step')
+    parser.add_argument(
+        '--val_targets',
+        type=tostrlist,
+        default=cfg.val_targets,
+        help='validation targets, a list or a comma-separated string')
+
+    # IO setting
+    parser.add_argument(
+        '--logdir', type=str, default=cfg.logdir, help='log dir')
+    parser.add_argument(
+        '--log_interval_step',
+        type=int,
+        default=cfg.log_interval_step,
+        help='log interval step')
+    parser.add_argument(
+        '--output', type=str, default=cfg.output, help='output dir')
+    parser.add_argument(
+        '--resume', type=str2bool, default=cfg.resume, help='whether to resume training')
+    parser.add_argument(
+        '--checkpoint_dir',
+        type=str,
+        default=cfg.checkpoint_dir,
+        help='checkpoint directory')
+    parser.add_argument(
+        '--max_num_last_checkpoint',
+        type=int,
+        default=cfg.max_num_last_checkpoint,
+        help='the maximum number of latest checkpoints to keep')
+
+    args = parser.parse_args(namespace=user_namespace)
+    return args
diff --git a/recognition/arcface_paddle/configs/config.py b/recognition/arcface_paddle/configs/config.py
new file mode 100644
index 0000000..efd126f
--- /dev/null
+++ b/recognition/arcface_paddle/configs/config.py
@@ -0,0 +1,65 @@
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from easydict import EasyDict as edict + +config = edict() +config.is_static = True +config.backbone = 'FresResNet100' +config.classifier = 'LargeScaleClassifier' +config.embedding_size = 512 +config.model_parallel = True +config.sample_ratio = 0.1 +config.loss = 'ArcFace' +config.dropout = 0.0 + +config.fp16 = True +config.init_loss_scaling = 128.0 +config.max_loss_scaling = 128.0 +config.incr_every_n_steps = 2000 +config.decr_every_n_nan_or_inf = 1 +config.incr_ratio = 2.0 +config.decr_ratio = 0.5 +config.use_dynamic_loss_scaling = True +config.custom_white_list = [] +config.custom_black_list = [] + +config.lr = 0.1 # for global batch size = 512 +config.lr_decay = 0.1 +config.weight_decay = 5e-4 +config.momentum = 0.9 +config.train_unit = 'step' # 'step' or 'epoch' +config.warmup_num = 1000 +config.train_num = 180000 +config.decay_boundaries = [100000, 140000, 160000] + +config.use_synthetic_dataset = False +config.dataset = "MS1M_v3" +config.data_dir = "./MS1M_v3" +config.label_file = "./MS1M_v3/label.txt" +config.is_bin = False +config.num_classes = 93431 # 85742 for MS1M_v2, 93431 for MS1M_v3 +config.batch_size = 64 # global batch size 512 of 8 GPU +config.num_workers = 8 + +config.do_validation_while_train = True +config.validation_interval_step = 2000 +config.val_targets = ["lfw", "cfp_fp", "agedb_30"] + +config.logdir = './log' +config.log_interval_step = 10 +config.output = './MS1M_v3_arcface' +config.resume = 
False +config.checkpoint_dir = None +config.max_num_last_checkpoint = 3 diff --git a/recognition/arcface_paddle/configs/ms1mv3_r100.py b/recognition/arcface_paddle/configs/ms1mv3_r100.py new file mode 100644 index 0000000..75d200d --- /dev/null +++ b/recognition/arcface_paddle/configs/ms1mv3_r100.py @@ -0,0 +1,54 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from easydict import EasyDict as edict + +config = edict() +config.is_static = True +config.backbone = 'FresResNet100' +config.classifier = 'LargeScaleClassifier' +config.embedding_size = 512 +config.model_parallel = True +config.sample_ratio = 0.1 +config.loss = 'ArcFace' +config.dropout = 0.0 + +config.lr = 0.1 # for global batch size = 512 +config.lr_decay = 0.1 +config.weight_decay = 5e-4 +config.momentum = 0.9 +config.train_unit = 'epoch' # 'step' or 'epoch' +config.warmup_num = 0 +config.train_num = 25 +config.decay_boundaries = [10, 16, 22] + +config.use_synthetic_dataset = False +config.dataset = "MS1M_v3" +config.data_dir = "./MS1M_v3" +config.label_file = "./MS1M_v3/label.txt" +config.is_bin = False +config.num_classes = 93431 # 85742 for MS1M_v2, 93431 for MS1M_v3 +config.batch_size = 128 # global batch size 512 of 8 GPU +config.num_workers = 8 + +config.do_validation_while_train = True +config.validation_interval_step = 2000 +config.val_targets = ["lfw", "cfp_fp", "agedb_30"] + +config.logdir = './log' +config.log_interval_step = 
100 +config.output = './MS1M_v3_arcface' +config.resume = False +config.checkpoint_dir = None +config.max_num_last_checkpoint = 1 diff --git a/recognition/arcface_paddle/configs/ms1mv3_r50.py b/recognition/arcface_paddle/configs/ms1mv3_r50.py new file mode 100644 index 0000000..e5f556c --- /dev/null +++ b/recognition/arcface_paddle/configs/ms1mv3_r50.py @@ -0,0 +1,54 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from easydict import EasyDict as edict + +config = edict() +config.is_static = True +config.backbone = 'FresResNet50' +config.classifier = 'LargeScaleClassifier' +config.embedding_size = 512 +config.model_parallel = True +config.sample_ratio = 0.1 +config.loss = 'ArcFace' +config.dropout = 0.0 + +config.lr = 0.1 # for global batch size = 512 +config.lr_decay = 0.1 +config.weight_decay = 5e-4 +config.momentum = 0.9 +config.train_unit = 'epoch' # 'step' or 'epoch' +config.warmup_num = 0 +config.train_num = 25 +config.decay_boundaries = [10, 16, 22] + +config.use_synthetic_dataset = False +config.dataset = "MS1M_v3" +config.data_dir = "./MS1M_v3" +config.label_file = "./MS1M_v3/label.txt" +config.is_bin = False +config.num_classes = 93431 # 85742 for MS1M_v2, 93431 for MS1M_v3 +config.batch_size = 128 # global batch size 512 of 8 GPU +config.num_workers = 8 + +config.do_validation_while_train = True +config.validation_interval_step = 2000 +config.val_targets = ["lfw", "cfp_fp", "agedb_30"] + 
+config.logdir = './log' +config.log_interval_step = 100 +config.output = './MS1M_v3_arcface' +config.resume = False +config.checkpoint_dir = None +config.max_num_last_checkpoint = 1 diff --git a/recognition/arcface_paddle/dataloader/common_dataset.py b/recognition/arcface_paddle/dataloader/common_dataset.py deleted file mode 100644 index 754d4f5..0000000 --- a/recognition/arcface_paddle/dataloader/common_dataset.py +++ /dev/null @@ -1,72 +0,0 @@ -# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -from paddle.io import Dataset -from paddle.vision import transforms -import os -import cv2 -from PIL import Image -import random -import paddle -import numpy as np - -from dataloader.kv_helper import read_img_from_bin - - -class CommonDataset(Dataset): - def __init__(self, root_dir, label_file, is_bin=True): - super(CommonDataset, self).__init__() - self.root_dir = root_dir - self.label_file = label_file - self.full_lines = self.get_file_list(label_file) - self.delimiter = "\t" - self.is_bin = is_bin - self.transform = transforms.Compose([ - transforms.RandomHorizontalFlip(), - transforms.ToTensor(), - transforms.Normalize( - mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]), - ]) - - self.num_samples = len(self.full_lines) - - def get_file_list(self, label_file): - with open(label_file, "r") as fin: - full_lines = fin.readlines() - - print("finish reading file, image num: {}".format(len(full_lines))) - return full_lines - - def __getitem__(self, idx): - try: - line = self.full_lines[idx] - - img_path, label = line.split(self.delimiter) - label = int(label) - label = paddle.to_tensor(label, dtype='int64') - img_path = os.path.join(self.root_dir, img_path) - if self.is_bin: - img = read_img_from_bin(img_path) - else: - img = cv2.imread(img_path) - img = img[:, :, ::-1] - img = self.transform(img) - return img, label - - except Exception as e: - print("data read faild: {}, exception info: {}".format(line, e)) - return self.__getitem__(random.randint(0, len(self))) - - def __len__(self): - return self.num_samples diff --git a/recognition/arcface_paddle/config.py b/recognition/arcface_paddle/datasets/__init__.py similarity index 54% rename from recognition/arcface_paddle/config.py rename to recognition/arcface_paddle/datasets/__init__.py index 49ee19b..c97f8e4 100644 --- a/recognition/arcface_paddle/config.py +++ b/recognition/arcface_paddle/datasets/__init__.py @@ -1,33 +1,15 @@ -# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. 
-# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -from easydict import EasyDict as edict - -config = edict() -config.sample_rate = 1 -config.momentum = 0.9 - -config.data_dir = "./MS1M_bin" -config.file_list = "MS1M_bin/label.txt" -config.num_classes = 85742 -config.num_epoch = 32 -config.warmup_epoch = 1 -config.val_targets = ["lfw", "cfp_fp", "agedb_30"] - -def lr_step_func(epoch): - return ((epoch + 1) / (4 + 1))**2 if epoch < -1 else 0.1**len( - [m for m in [6, 12, 18, 24] if m - 1 <= epoch]) - - -config.lr_func = lr_step_func +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from .common_dataset import CommonDataset, SyntheticDataset, load_bin diff --git a/recognition/arcface_paddle/datasets/common_dataset.py b/recognition/arcface_paddle/datasets/common_dataset.py new file mode 100644 index 0000000..e1d5a15 --- /dev/null +++ b/recognition/arcface_paddle/datasets/common_dataset.py @@ -0,0 +1,134 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import pickle +import paddle +import os +import cv2 +import six +import random +import paddle +import numpy as np +import logging +from PIL import Image +from io import BytesIO + +from datasets.kv_helper import read_img_from_bin + + +def transform(img): + # random horizontal flip + if random.randint(0, 1) == 0: + img = cv2.flip(img, 1) + # normalize to mean 0.5, std 0.5 + img = (img - 127.5) * 0.00784313725 + # BGR2RGB + img = img[:, :, ::-1] + img = img.transpose((2, 0, 1)) + return img + + +class CommonDataset(paddle.io.Dataset): + def __init__(self, root_dir, label_file, fp16=False, is_bin=True): + super(CommonDataset, self).__init__() + self.root_dir = root_dir + self.label_file = label_file + self.fp16 = fp16 + with open(label_file, "r") as fin: + self.full_lines = fin.readlines() + + self.delimiter = "\t" + self.is_bin = is_bin + + self.num_samples = len(self.full_lines) + logging.info("read label file finished, total num: {}" + .format(self.num_samples)) + + def __getitem__(self, idx): + + line = self.full_lines[idx] + + 
        img_path, label = line.strip().split(self.delimiter)
+        img_path = os.path.join(self.root_dir, img_path)
+        if self.is_bin:
+            img = read_img_from_bin(img_path)
+        else:
+            img = cv2.imread(img_path)
+
+        img = transform(img)
+
+        img = img.astype('float16' if self.fp16 else 'float32')
+        label = np.int32(label)
+
+        return img, label
+
+    def __len__(self):
+        return self.num_samples
+
+
+class SyntheticDataset(paddle.io.Dataset):
+    def __init__(self, num_classes, fp16=False):
+        super(SyntheticDataset, self).__init__()
+        self.num_classes = num_classes
+        self.fp16 = fp16
+        self.label_list = np.random.randint(
+            0, num_classes, (5179510, ), dtype=np.int32)
+        self.num_samples = len(self.label_list)
+
+    def __getitem__(self, idx):
+        label = self.label_list[idx]
+        img = np.random.randint(0, 255, size=(112, 112, 3), dtype=np.uint8)
+        img = transform(img)
+
+        img = img.astype('float16' if self.fp16 else 'float32')
+        label = np.int32(label)
+
+        return img, label
+
+    def __len__(self):
+        return self.num_samples
+
+
+# returns numpy data
+def load_bin(path, image_size):
+    if six.PY2:
+        bins, issame_list = pickle.load(open(path, 'rb'))
+    else:
+        bins, issame_list = pickle.load(open(path, 'rb'), encoding='bytes')
+    data_list = []
+    for flip in [0, 1]:
+        data = np.empty(
+            (len(issame_list) * 2, 3, image_size[0], image_size[1]))
+        data_list.append(data)
+    for i in range(len(issame_list) * 2):
+        _bin = bins[i]
+        if six.PY2:
+            if not isinstance(_bin, six.string_types):
+                _bin = _bin.tostring()
+            img_ori = Image.open(BytesIO(_bin))
+        else:
+            img_ori = Image.open(BytesIO(_bin))
+        for flip in [0, 1]:
+            img = img_ori.copy()
+            if flip == 1:
+                img = img.transpose(Image.FLIP_LEFT_RIGHT)
+            if img.mode != 'RGB':
+                img = img.convert('RGB')
+            img = np.array(img).astype('float32').transpose((2, 0, 1))
+            img = (img - 127.5) * 0.00784313725
+            data_list[flip][i][:] = img
+        if i % 1000 == 0:
+            print('loading bin', i)
+    print(data_list[0].shape)
+    return data_list, issame_list
diff --git
a/recognition/arcface_paddle/dataloader/kv_helper.py b/recognition/arcface_paddle/datasets/kv_helper.py similarity index 99% rename from recognition/arcface_paddle/dataloader/kv_helper.py rename to recognition/arcface_paddle/datasets/kv_helper.py index 4be675d..43bff7c 100644 --- a/recognition/arcface_paddle/dataloader/kv_helper.py +++ b/recognition/arcface_paddle/datasets/kv_helper.py @@ -65,4 +65,4 @@ def read_img_from_bin(input_path): value = pickle.loads(value) value = np.frombuffer(value, dtype='uint8') img = cv2.imdecode(value, 1) - return img \ No newline at end of file + return img diff --git a/recognition/arcface_paddle/backbones/__init__.py b/recognition/arcface_paddle/dynamic/backbones/__init__.py similarity index 89% rename from recognition/arcface_paddle/backbones/__init__.py rename to recognition/arcface_paddle/dynamic/backbones/__init__.py index 8a69def..4e51edd 100644 --- a/recognition/arcface_paddle/backbones/__init__.py +++ b/recognition/arcface_paddle/dynamic/backbones/__init__.py @@ -13,4 +13,4 @@ # limitations under the License. from .mobilefacenet import MobileFaceNet_128 -from .iresnet import iresnet18, iresnet34, iresnet50, iresnet100, iresnet200 +from .iresnet import FresResNet50, FresResNet100 diff --git a/recognition/arcface_paddle/dynamic/backbones/iresnet.py b/recognition/arcface_paddle/dynamic/backbones/iresnet.py new file mode 100644 index 0000000..9424a5f --- /dev/null +++ b/recognition/arcface_paddle/dynamic/backbones/iresnet.py @@ -0,0 +1,337 @@ +# copyright (c) 2021 PaddlePaddle Authors. All Rights Reserve. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +import numpy as np +import paddle +from paddle import ParamAttr +import paddle.nn as nn +import paddle.nn.functional as F +from paddle.nn import Conv2D, BatchNorm, Linear, Dropout, PReLU +from paddle.nn import AdaptiveAvgPool2D, MaxPool2D, AvgPool2D +from paddle.nn.initializer import XavierNormal, Constant + +import math + +__all__ = ["FresResNet50", "FresResNet100"] + + +class ConvBNLayer(nn.Layer): + def __init__(self, + num_channels, + num_filters, + filter_size, + stride=1, + groups=1, + act=None, + name=None, + data_format="NCHW"): + super(ConvBNLayer, self).__init__() + + self._conv = Conv2D( + in_channels=num_channels, + out_channels=num_filters, + kernel_size=filter_size, + stride=stride, + padding=(filter_size - 1) // 2, + groups=groups, + weight_attr=ParamAttr(name=name + "_weights"), + bias_attr=False, + data_format=data_format) + if name == "conv1": + bn_name = "bn_" + name + else: + bn_name = "bn" + name[3:] + self._batch_norm = BatchNorm( + num_filters, + act=act, + epsilon=1e-05, + param_attr=ParamAttr(name=bn_name + "_scale"), + bias_attr=ParamAttr(bn_name + "_offset"), + moving_mean_name=bn_name + "_mean", + moving_variance_name=bn_name + "_variance", + data_layout=data_format) + + def forward(self, inputs): + y = self._conv(inputs) + y = self._batch_norm(y) + return y + + +class BasicBlock(nn.Layer): + def __init__(self, + num_channels, + num_filters, + stride, + shortcut=True, + name=None, + data_format="NCHW"): + 
super(BasicBlock, self).__init__() + self.stride = stride + bn_name = "bn_" + name[3:] + "_before" + self._batch_norm = BatchNorm( + num_channels, + act=None, + epsilon=1e-05, + param_attr=ParamAttr(name=bn_name + "_scale"), + bias_attr=ParamAttr(bn_name + "_offset"), + moving_mean_name=bn_name + "_mean", + moving_variance_name=bn_name + "_variance", + data_layout=data_format) + + self.conv0 = ConvBNLayer( + num_channels=num_channels, + num_filters=num_filters, + filter_size=3, + stride=1, + act=None, + name=name + "_branch2a", + data_format=data_format) + self.prelu = PReLU(num_parameters=1, name=name + "_branch2a_prelu") + self.conv1 = ConvBNLayer( + num_channels=num_filters, + num_filters=num_filters, + filter_size=3, + stride=stride, + act=None, + name=name + "_branch2b", + data_format=data_format) + + if shortcut: + self.short = ConvBNLayer( + num_channels=num_channels, + num_filters=num_filters, + filter_size=1, + stride=stride, + act=None, + name=name + "_branch1", + data_format=data_format) + + self.shortcut = shortcut + + def forward(self, inputs): + y = self._batch_norm(inputs) + y = self.conv0(y) + y = self.prelu(y) + conv1 = self.conv1(y) + + if self.shortcut: + short = self.short(inputs) + else: + short = inputs + y = paddle.add(x=short, y=conv1) + return y + + +class FC(nn.Layer): + def __init__(self, + bn_channels, + num_channels, + num_classes, + fc_type, + dropout=0.4, + name=None, + data_format="NCHW"): + super(FC, self).__init__() + self.p = dropout + self.fc_type = fc_type + self.num_channels = num_channels + + bn_name = "bn_" + name + if fc_type == "Z": + self._batch_norm_1 = BatchNorm( + bn_channels, + act=None, + epsilon=1e-05, + param_attr=ParamAttr(name=bn_name + "_1_scale"), + bias_attr=ParamAttr(bn_name + "_1_offset"), + moving_mean_name=bn_name + "_1_mean", + moving_variance_name=bn_name + "_1_variance", + data_layout=data_format) + if self.p > 0: + self.dropout = Dropout(p=self.p, name=name + '_dropout') + + elif fc_type == "E": + 
self._batch_norm_1 = BatchNorm( + bn_channels, + act=None, + epsilon=1e-05, + param_attr=ParamAttr(name=bn_name + "_1_scale"), + bias_attr=ParamAttr(bn_name + "_1_offset"), + moving_mean_name=bn_name + "_1_mean", + moving_variance_name=bn_name + "_1_variance", + data_layout=data_format) + if self.p > 0: + self.dropout = Dropout(p=self.p, name=name + '_dropout') + self.fc = Linear( + num_channels, + num_classes, + weight_attr=ParamAttr( + initializer=XavierNormal(fan_in=0.0), name=name + ".w_0"), + bias_attr=ParamAttr( + initializer=Constant(), name=name + ".b_0")) + self._batch_norm_2 = BatchNorm( + num_classes, + act=None, + epsilon=1e-05, + param_attr=ParamAttr(name=bn_name + "_2_scale"), + bias_attr=ParamAttr(bn_name + "_2_offset"), + moving_mean_name=bn_name + "_2_mean", + moving_variance_name=bn_name + "_2_variance", + data_layout=data_format) + + elif fc_type == "FC": + self._batch_norm_1 = BatchNorm( + bn_channels, + act=None, + epsilon=1e-05, + param_attr=ParamAttr(name=bn_name + "_1_scale"), + bias_attr=ParamAttr(bn_name + "_1_offset"), + moving_mean_name=bn_name + "_1_mean", + moving_variance_name=bn_name + "_1_variance", + data_layout=data_format) + self.fc = Linear( + num_channels, + num_classes, + weight_attr=ParamAttr( + initializer=XavierNormal(fan_in=0.0), name=name + ".w_0"), + bias_attr=ParamAttr( + initializer=Constant(), name=name + ".b_0")) + self._batch_norm_2 = BatchNorm( + num_classes, + act=None, + epsilon=1e-05, + param_attr=ParamAttr(name=bn_name + "_2_scale"), + bias_attr=ParamAttr(bn_name + "_2_offset"), + moving_mean_name=bn_name + "_2_mean", + moving_variance_name=bn_name + "_2_variance", + data_layout=data_format) + + def forward(self, inputs): + if self.fc_type == "Z": + y = self._batch_norm_1(inputs) + y = paddle.reshape(y, shape=[-1, self.num_channels]) + if self.p > 0: + y = self.dropout(y) + + elif self.fc_type == "E": + y = self._batch_norm_1(inputs) + y = paddle.reshape(y, shape=[-1, self.num_channels]) + if self.p > 0: + y = 
self.dropout(y) + y = self.fc(y) + y = self._batch_norm_2(y) + + elif self.fc_type == "FC": + y = self._batch_norm_1(inputs) + y = paddle.reshape(y, shape=[-1, self.num_channels]) + y = self.fc(y) + y = self._batch_norm_2(y) + + return y + + +class FresResNet(nn.Layer): + def __init__(self, + layers=50, + num_features=512, + fc_type='E', + dropout=0.4, + input_image_channel=3, + input_image_width=112, + input_image_height=112, + data_format="NCHW"): + + super(FresResNet, self).__init__() + + self.layers = layers + self.data_format = data_format + self.input_image_channel = input_image_channel + + supported_layers = [50, 100] + assert layers in supported_layers, \ + "supported layers are {} but input layer is {}".format( + supported_layers, layers) + + if layers == 50: + units = [3, 4, 14, 3] + elif layers == 100: + units = [3, 13, 30, 3] + + num_channels = [64, 64, 128, 256] + num_filters = [64, 128, 256, 512] + + self.conv = ConvBNLayer( + num_channels=self.input_image_channel, + num_filters=64, + filter_size=3, + stride=1, + act=None, + name="conv1", + data_format=self.data_format) + self.prelu = PReLU(num_parameters=1, name="prelu1") + + self.block_list = paddle.nn.LayerList() + for block in range(len(units)): + shortcut = True + for i in range(units[block]): + conv_name = "res" + str(block + 2) + chr(97 + i) + basic_block = self.add_sublayer( + conv_name, + BasicBlock( + num_channels=num_channels[block] + if i == 0 else num_filters[block], + num_filters=num_filters[block], + stride=2 if shortcut else 1, + shortcut=shortcut, + name=conv_name, + data_format=self.data_format)) + self.block_list.append(basic_block) + shortcut = False + + assert input_image_width % 16 == 0 + assert input_image_height % 16 == 0 + feat_w = input_image_width // 16 + feat_h = input_image_height // 16 + self.fc_channels = num_filters[-1] * feat_w * feat_h + self.fc = FC(num_filters[-1], + self.fc_channels, + num_features, + fc_type, + dropout, + name='fc') + + def forward(self, inputs): 
+ if self.data_format == "NHWC": + inputs = paddle.tensor.transpose(inputs, [0, 2, 3, 1]) + inputs.stop_gradient = True + y = self.conv(inputs) + y = self.prelu(y) + for block in self.block_list: + y = block(y) + y = self.fc(y) + return y + + +def FresResNet50(**args): + model = FresResNet(layers=50, **args) + return model + + +def FresResNet100(**args): + model = FresResNet(layers=100, **args) + return model diff --git a/recognition/arcface_paddle/backbones/mobilefacenet.py b/recognition/arcface_paddle/dynamic/backbones/mobilefacenet.py similarity index 97% rename from recognition/arcface_paddle/backbones/mobilefacenet.py rename to recognition/arcface_paddle/dynamic/backbones/mobilefacenet.py index 696eb85..251223d 100644 --- a/recognition/arcface_paddle/backbones/mobilefacenet.py +++ b/recognition/arcface_paddle/dynamic/backbones/mobilefacenet.py @@ -147,8 +147,8 @@ class MobileFaceNet(nn.Layer): return x -def MobileFaceNet_128(feature_dim=128, **args): - model = MobileFaceNet(feature_dim=feature_dim, **args) +def MobileFaceNet_128(num_features=128, **args): + model = MobileFaceNet(feature_dim=num_features, **args) return model diff --git a/recognition/arcface_paddle/dynamic/classifiers/__init__.py b/recognition/arcface_paddle/dynamic/classifiers/__init__.py new file mode 100644 index 0000000..0c0e52f --- /dev/null +++ b/recognition/arcface_paddle/dynamic/classifiers/__init__.py @@ -0,0 +1,15 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and +# limitations under the License. + +from .lsc import LargeScaleClassifier diff --git a/recognition/arcface_paddle/dynamic/classifiers/lsc.py b/recognition/arcface_paddle/dynamic/classifiers/lsc.py new file mode 100644 index 0000000..bf0c4cb --- /dev/null +++ b/recognition/arcface_paddle/dynamic/classifiers/lsc.py @@ -0,0 +1,163 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import warnings +import math +import os +import paddle +import paddle.nn as nn + + +class LargeScaleClassifier(nn.Layer): + """ + Author: {Xiang An, Yang Xiao, XuHan Zhu} in DeepGlint, + Partial FC: Training 10 Million Identities on a Single Machine + See the original paper: + https://arxiv.org/abs/2010.05222 + """ + + @paddle.no_grad() + def __init__(self, + rank, + world_size, + num_classes, + margin1=1.0, + margin2=0.5, + margin3=0.0, + scale=64.0, + sample_ratio=1.0, + embedding_size=512, + fp16=False, + name=None): + super(LargeScaleClassifier, self).__init__() + self.num_classes: int = num_classes + self.rank: int = rank + self.world_size: int = world_size + self.sample_ratio: float = sample_ratio + self.embedding_size: int = embedding_size + self.fp16 = fp16 + self.num_local: int = (num_classes + world_size - 1) // world_size + if num_classes % world_size != 0 and rank == world_size - 1: + self.num_local = num_classes % self.num_local + self.num_sample: int = int(self.sample_ratio * self.num_local) + self.margin1 = margin1 + self.margin2 = margin2 + self.margin3 = margin3 + self.logit_scale = scale + + self._parameter_list = [] + + if name is None: + name = 'dist@fc@rank@%05d.w' % rank + + stddev = math.sqrt(2.0 / (self.embedding_size + self.num_local)) + param_attr = paddle.ParamAttr( + name=name, initializer=paddle.nn.initializer.Normal(std=stddev)) + + self.index = None + self.weight = self.create_parameter( + shape=[self.embedding_size, self.num_local], + attr=param_attr, + is_bias=False, + dtype='float16' if self.fp16 else 'float32') + self.weight.is_distributed = True + + if int(self.sample_ratio) < 1: + self.weight.stop_gradient = True + + def step(self, optimizer): + warnings.warn( + "Explicitly call the function paddle._C_ops.sparse_momentum is a temporary manner. 
" + "We will merge it to optimizer in the future, please don't follow.") + if int(self.sample_ratio) < 1: + found_inf = paddle.logical_not( + paddle.all(paddle.isfinite(self._parameter_list[0].grad))) + if found_inf: + print('Found inf or nan in classifier') + else: + if self.weight.name not in optimizer._accumulators[ + optimizer._velocity_acc_str]: + optimizer._add_accumulator(optimizer._velocity_acc_str, + self.weight) + + velocity = optimizer._accumulators[ + optimizer._velocity_acc_str][self.weight.name] + _, _ = paddle._C_ops.sparse_momentum( + self.weight, + self._parameter_list[0].grad, + velocity, + self.index, + paddle.to_tensor( + optimizer.get_lr(), dtype='float32'), + self.weight, + velocity, + 'mu', + optimizer._momentum, + 'use_nesterov', + optimizer._use_nesterov, + 'regularization_method', + optimizer._regularization_method, + 'regularization_coeff', + optimizer._regularization_coeff, + 'axis', + 1) + + def clear_grad(self): + self._parameter_list = [] + + def forward(self, feature, label): + + if self.world_size > 1: + feature_list = [] + paddle.distributed.all_gather(feature_list, feature) + total_feature = paddle.concat(feature_list, axis=0) + + label_list = [] + paddle.distributed.all_gather(label_list, label) + total_label = paddle.concat(label_list, axis=0) + total_label.stop_gradient = True + else: + total_feature = feature + total_label = label + + if self.sample_ratio < 1.0: + # partial fc sample process + total_label, self.index = paddle.nn.functional.class_center_sample( + total_label, self.num_local, self.num_sample) + total_label.stop_gradient = True + self.index.stop_gradient = True + self.sub_weight = paddle.gather(self.weight, self.index, axis=1) + self.sub_weight.stop_gradient = False + self._parameter_list.append(self.sub_weight) + else: + self.sub_weight = self.weight + + norm_feature = paddle.fluid.layers.l2_normalize(total_feature, axis=1) + norm_weight = paddle.fluid.layers.l2_normalize(self.sub_weight, axis=0) + + local_logit 
= paddle.matmul(norm_feature, norm_weight) + + loss = paddle.nn.functional.margin_cross_entropy( + local_logit, + total_label, + margin1=self.margin1, + margin2=self.margin2, + margin3=self.margin3, + scale=self.logit_scale, + return_softmax=False, + reduction=None, ) + + loss = paddle.mean(loss) + + return loss diff --git a/recognition/arcface_paddle/dynamic/export.py b/recognition/arcface_paddle/dynamic/export.py new file mode 100644 index 0000000..a41d30a --- /dev/null +++ b/recognition/arcface_paddle/dynamic/export.py @@ -0,0 +1,56 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +import numpy as np +import paddle + +from .utils.io import Checkpoint +from . 
import backbones + + +def export(args): + checkpoint = Checkpoint( + rank=0, + world_size=1, + embedding_size=args.embedding_size, + num_classes=None, + checkpoint_dir=args.checkpoint_dir, ) + + backbone = eval("backbones.{}".format(args.backbone))( + num_features=args.embedding_size) + checkpoint.load(backbone, for_train=False, dtype='float32') + + print("Load checkpoint from '{}'.".format(args.checkpoint_dir)) + backbone.eval() + + path = os.path.join(args.output_dir, args.backbone) + + if args.export_type == 'onnx': + paddle.onnx.export( + backbone, + path, + input_spec=[ + paddle.static.InputSpec( + shape=[None, 3, 112, 112], dtype='float32') + ]) + else: + paddle.jit.save( + backbone, + path, + input_spec=[ + paddle.static.InputSpec( + shape=[None, 3, 112, 112], dtype='float32') + ]) + print("Save exported model to '{}'.".format(args.output_dir)) diff --git a/recognition/arcface_paddle/dynamic/train.py b/recognition/arcface_paddle/dynamic/train.py new file mode 100644 index 0000000..25c409c --- /dev/null +++ b/recognition/arcface_paddle/dynamic/train.py @@ -0,0 +1,226 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import time +import os +import sys +import numpy as np +import logging + +import paddle +from visualdl import LogWriter + +from utils.logging import AverageMeter, init_logging, CallBackLogging +from datasets import CommonDataset, SyntheticDataset +from utils import losses + +from .utils.verification import CallBackVerification +from .utils.io import Checkpoint +from .utils.amp import LSCGradScaler + +from . import classifiers +from . import backbones + +RELATED_FLAGS_SETTING = { + 'FLAGS_cudnn_exhaustive_search': 1, + 'FLAGS_cudnn_batchnorm_spatial_persistent': 1, + 'FLAGS_max_inplace_grad_add': 8, + 'FLAGS_fraction_of_gpu_memory_to_use': 0.9999, +} +paddle.fluid.set_flags(RELATED_FLAGS_SETTING) + + +def train(args): + writer = LogWriter(logdir=args.logdir) + + rank = int(os.getenv("PADDLE_TRAINER_ID", 0)) + world_size = int(os.getenv("PADDLE_TRAINERS_NUM", 1)) + + gpu_id = int(os.getenv("FLAGS_selected_gpus", 0)) + place = paddle.CUDAPlace(gpu_id) + + if world_size > 1: + import paddle.distributed.fleet as fleet + from .utils.data_parallel import sync_gradients, sync_params + + strategy = fleet.DistributedStrategy() + strategy.without_graph_optimization = True + fleet.init(is_collective=True, strategy=strategy) + + if args.use_synthetic_dataset: + trainset = SyntheticDataset(args.num_classes, fp16=args.fp16) + else: + trainset = CommonDataset( + root_dir=args.data_dir, + label_file=args.label_file, + fp16=args.fp16, + is_bin=args.is_bin) + + num_image = len(trainset) + total_batch_size = args.batch_size * world_size + steps_per_epoch = num_image // total_batch_size + if args.train_unit == 'epoch': + warmup_steps = steps_per_epoch * args.warmup_num + total_steps = steps_per_epoch * args.train_num + decay_steps = [x * steps_per_epoch for x in args.decay_boundaries] + total_epoch = args.train_num + else: + warmup_steps = args.warmup_num + total_steps = args.train_num + decay_steps = [x for x in args.decay_boundaries] + total_epoch = (total_steps + steps_per_epoch 
- 1) // steps_per_epoch + + if rank == 0: + logging.info('world_size: {}'.format(world_size)) + logging.info('total_batch_size: {}'.format(total_batch_size)) + logging.info('warmup_steps: {}'.format(warmup_steps)) + logging.info('steps_per_epoch: {}'.format(steps_per_epoch)) + logging.info('total_steps: {}'.format(total_steps)) + logging.info('total_epoch: {}'.format(total_epoch)) + logging.info('decay_steps: {}'.format(decay_steps)) + + base_lr = total_batch_size * args.lr / 512 + lr_scheduler = paddle.optimizer.lr.PiecewiseDecay( + boundaries=decay_steps, + values=[ + base_lr * (args.lr_decay**i) for i in range(len(decay_steps) + 1) + ]) + if warmup_steps > 0: + lr_scheduler = paddle.optimizer.lr.LinearWarmup( + lr_scheduler, warmup_steps, 0, base_lr) + + if args.fp16: + paddle.set_default_dtype("float16") + + margin_loss_params = eval("losses.{}".format(args.loss))() + backbone = eval("backbones.{}".format(args.backbone))( + num_features=args.embedding_size, dropout=args.dropout) + classifier = eval("classifiers.{}".format(args.classifier))( + rank=rank, + world_size=world_size, + num_classes=args.num_classes, + margin1=margin_loss_params.margin1, + margin2=margin_loss_params.margin2, + margin3=margin_loss_params.margin3, + scale=margin_loss_params.scale, + sample_ratio=args.sample_ratio, + embedding_size=args.embedding_size, + fp16=args.fp16) + + backbone.train() + classifier.train() + + optimizer = paddle.optimizer.Momentum( + parameters=[{ + 'params': backbone.parameters(), + }, { + 'params': classifier.parameters(), + }], + learning_rate=lr_scheduler, + momentum=args.momentum, + weight_decay=args.weight_decay) + + if args.fp16: + optimizer._dtype = 'float32' + + if world_size > 1: + # sync backbone params for data parallel + sync_params(backbone.parameters()) + + if args.do_validation_while_train: + callback_verification = CallBackVerification( + args.validation_interval_step, + rank, + args.batch_size, + args.val_targets, + args.data_dir, + fp16=args.fp16, 
) + + callback_logging = CallBackLogging(args.log_interval_step, rank, + world_size, total_steps, + args.batch_size, writer) + + checkpoint = Checkpoint( + rank=rank, + world_size=world_size, + embedding_size=args.embedding_size, + num_classes=args.num_classes, + model_save_dir=os.path.join(args.output, args.backbone), + checkpoint_dir=args.checkpoint_dir, + max_num_last_checkpoint=args.max_num_last_checkpoint) + + start_epoch = 0 + global_step = 0 + loss_avg = AverageMeter() + if args.resume: + extra_info = checkpoint.load( + backbone, classifier, optimizer, for_train=True) + start_epoch = extra_info['epoch'] + 1 + lr_state = extra_info['lr_state'] + # here, last_epoch actually means last_step for PiecewiseDecay, + # since we always step the lr_scheduler per iteration + global_step = lr_state['last_epoch'] + lr_scheduler.set_state_dict(lr_state) + + train_loader = paddle.io.DataLoader( + trainset, + places=place, + num_workers=args.num_workers, + batch_sampler=paddle.io.DistributedBatchSampler( + dataset=trainset, + batch_size=args.batch_size, + shuffle=True, + drop_last=True)) + + scaler = LSCGradScaler( + enable=args.fp16, + init_loss_scaling=args.init_loss_scaling, + incr_ratio=args.incr_ratio, + decr_ratio=args.decr_ratio, + incr_every_n_steps=args.incr_every_n_steps, + decr_every_n_nan_or_inf=args.decr_every_n_nan_or_inf, + use_dynamic_loss_scaling=args.use_dynamic_loss_scaling) + + for epoch in range(start_epoch, total_epoch): + for step, (img, label) in enumerate(train_loader): + global_step += 1 + + with paddle.amp.auto_cast(enable=args.fp16): + features = backbone(img) + loss_v = classifier(features, label) + + scaler.scale(loss_v).backward() + if world_size > 1: + # data parallel sync backbone gradients + sync_gradients(backbone.parameters()) + + scaler.step(optimizer) + classifier.step(optimizer) + optimizer.clear_grad() + classifier.clear_grad() + + lr_value = optimizer.get_lr() + loss_avg.update(loss_v.item(), 1) + callback_logging(global_step, loss_avg, epoch,
lr_value) + if args.do_validation_while_train: + callback_verification(global_step, backbone) + lr_scheduler.step() + + if global_step >= total_steps: + break + sys.stdout.flush() + + checkpoint.save( + backbone, classifier, optimizer, epoch=epoch, for_train=True) + writer.close() diff --git a/recognition/arcface_paddle/dataloader/__init__.py b/recognition/arcface_paddle/dynamic/utils/__init__.py similarity index 93% rename from recognition/arcface_paddle/dataloader/__init__.py rename to recognition/arcface_paddle/dynamic/utils/__init__.py index e37b942..185a92b 100644 --- a/recognition/arcface_paddle/dataloader/__init__.py +++ b/recognition/arcface_paddle/dynamic/utils/__init__.py @@ -11,5 +11,3 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. - -from .common_dataset import CommonDataset diff --git a/recognition/arcface_paddle/dynamic/utils/amp.py b/recognition/arcface_paddle/dynamic/utils/amp.py new file mode 100644 index 0000000..2c506c6 --- /dev/null +++ b/recognition/arcface_paddle/dynamic/utils/amp.py @@ -0,0 +1,103 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +from collections import defaultdict +from paddle.amp import GradScaler +from paddle import _C_ops +import paddle + + +class LSCGradScaler(GradScaler): + def __init__(self, + enable=True, + init_loss_scaling=2.**15, + incr_ratio=2.0, + decr_ratio=0.5, + incr_every_n_steps=1000, + decr_every_n_nan_or_inf=2, + use_dynamic_loss_scaling=True, + max_loss_scaling=32768.0): + super(LSCGradScaler, self).__init__( + enable, init_loss_scaling, incr_ratio, decr_ratio, + incr_every_n_steps, decr_every_n_nan_or_inf, + use_dynamic_loss_scaling) + self.max_loss_scaling = max_loss_scaling + + def step(self, optimizer, classifier=None): + if not self._enable: + if classifier is not None: + classifier.step(optimizer) + return optimizer.step() + +# if self._scale >= self.max_loss_scaling: +# self._scale = paddle.to_tensor([self.max_loss_scaling], dtype='float32') + +# unscale the grad + self._unscale(optimizer) + + if self._found_inf: + self._cache_founf_inf = True + else: + optimizer.step() + if classifier is not None: + classifier.step(optimizer) + + self._cache_founf_inf = False + + if self._use_dynamic_loss_scaling: + # update the scale + self._update() + + def _unscale(self, optimizer): + if not self._enable: + return + + param_grads_dict = defaultdict(list) + dist_param_grads_dict = defaultdict(list) + if getattr(optimizer, '_param_groups', None) and isinstance( + optimizer._param_groups[0], dict): + for group in optimizer._param_groups: + for param in group['params']: + if not param.is_distributed: + if param._grad_ivar() is not None: + param_grads_dict[param._grad_ivar().dtype].append( + param._grad_ivar()) + else: + if param._grad_ivar() is not None: + dist_param_grads_dict[param._grad_ivar( + ).dtype].append(param._grad_ivar()) + else: + for param in optimizer._parameter_list: + if not param.is_distributed: + if param._grad_ivar() is not None: + param_grads_dict[param._grad_ivar().dtype].append( + param._grad_ivar()) + else: + if param._grad_ivar() is not None: + 
dist_param_grads_dict[param._grad_ivar().dtype].append( + param._grad_ivar()) + for dtype in dist_param_grads_dict: + for grad in dist_param_grads_dict[dtype]: + self._found_inf = paddle.logical_not( + paddle.all(paddle.isfinite(grad))) + if self._found_inf: + return + + for dtype in param_grads_dict: + param_grads = param_grads_dict[dtype] + _C_ops.check_finite_and_unscale(param_grads, self._scale, + param_grads, self._found_inf) + if self._found_inf: + print('Found inf or nan in backbone, dtype is', dtype) + break diff --git a/recognition/arcface_paddle/dynamic/utils/data_parallel.py b/recognition/arcface_paddle/dynamic/utils/data_parallel.py new file mode 100644 index 0000000..baa4372 --- /dev/null +++ b/recognition/arcface_paddle/dynamic/utils/data_parallel.py @@ -0,0 +1,56 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import paddle + + +@paddle.no_grad() +def sync_params(parameters): + for param in parameters: + paddle.distributed.broadcast( + param.detach(), src=0, group=None, use_calc_stream=True) + + +@paddle.no_grad() +def sync_gradients(parameters): + grad_var_set = set() + grad_vars = [] + sparse_grad_vars = [] + + for param in parameters: + if param.trainable and (param._grad_ivar() is not None): + g_var = param._grad_ivar() + assert not g_var._is_sparse( + ), "Now, it doesn't support sparse parameters" + grad_vars.append(g_var) + assert g_var not in grad_var_set + grad_var_set.add(g_var) + + coalesced_grads_and_vars = \ + paddle.fluid.dygraph.parallel.build_groups(grad_vars, 128 * 1024 * 1024) + + nranks = paddle.distributed.get_world_size() + for coalesced_grad, _, _ in coalesced_grads_and_vars: + # need to div nranks + div_factor = paddle.to_tensor(nranks, dtype=coalesced_grad.dtype) + paddle.fluid.framework._dygraph_tracer().trace_op( + type="elementwise_div", + inputs={'X': coalesced_grad, + 'Y': div_factor}, + outputs={'Out': coalesced_grad}, + attrs={'axis': -1}) + + paddle.distributed.all_reduce(coalesced_grad) + + paddle.fluid.dygraph.parallel._split_tensors(coalesced_grads_and_vars) diff --git a/recognition/arcface_paddle/dynamic/utils/io.py b/recognition/arcface_paddle/dynamic/utils/io.py new file mode 100644 index 0000000..8135c1a --- /dev/null +++ b/recognition/arcface_paddle/dynamic/utils/io.py @@ -0,0 +1,239 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +import errno +import os +import paddle +import logging +import numpy as np +import shutil +import json +from utils.rearrange_weight import rearrange_weight + + +class Checkpoint(object): + def __init__(self, + rank, + world_size, + embedding_size, + num_classes, + model_save_dir="./", + checkpoint_dir=None, + max_num_last_checkpoint=3): + + self.rank: int = rank + self.world_size: int = world_size + self.embedding_size: int = embedding_size + self.num_classes: int = num_classes + self.model_save_dir: str = model_save_dir + self.checkpoint_dir: str = checkpoint_dir + self.max_num_last_checkpoint: int = max_num_last_checkpoint + + def save(self, + backbone: paddle.nn.Layer, + classifier: paddle.nn.Layer=None, + optimizer=None, + epoch=0, + for_train=True): + + model_save_dir = os.path.join(self.model_save_dir, str(epoch)) + if not os.path.exists(model_save_dir): + # more than one process may be trying + # to create the directory + try: + os.makedirs(model_save_dir) + except OSError as exc: + if exc.errno != errno.EEXIST: + raise + pass + + if self.rank == 0: + # non dist params are only saved at rank 0. + for name, param in backbone.state_dict().items(): + paddle.save( + param, + os.path.join(model_save_dir, param.name + '.pdparam')) + + if classifier is not None: + # dist params need to be saved at all ranks. + for name, param in classifier.state_dict().items(): + paddle.save( + param, + os.path.join(model_save_dir, param.name + '.pdparam')) + + if for_train: + assert optimizer is not None + opt_state_dict = optimizer.state_dict() + lr_state_dict = opt_state_dict['LR_Scheduler'] + for name, opt in opt_state_dict.items(): + if '@GRAD' in name: + continue + # non dist opt vars are only saved at rank 0, + # but dist opt vars need to be saved at all ranks.
+ if 'dist@' in name and '@rank@' in name or self.rank == 0: + paddle.save(opt, + os.path.join(model_save_dir, name + '.pdopt')) + + if self.rank == 0: + # save some extra info for resume + # pretrain_world_size, embedding_size, num_classes are used for + # re-split fc weight when gpu setting changed. + # epoch use to restart. + config_file = os.path.join(model_save_dir, 'meta.json') + extra_info = dict() + extra_info["pretrain_world_size"] = self.world_size + extra_info["embedding_size"] = self.embedding_size + extra_info['num_classes'] = self.num_classes + extra_info['epoch'] = epoch + extra_info['lr_state'] = lr_state_dict + with open(config_file, 'w') as f: + json.dump(extra_info, f) + + logging.info("Save model to {}.".format(model_save_dir)) + if self.rank == 0 and self.max_num_last_checkpoint > 0: + for idx in range(-1, epoch - self.max_num_last_checkpoint + 1): + path = os.path.join(self.model_save_dir, str(idx)) + if os.path.exists(path): + logging.info("Remove checkpoint {}.".format(path)) + shutil.rmtree(path) + + def load(self, + backbone: paddle.nn.Layer, + classifier: paddle.nn.Layer=None, + optimizer=None, + for_train=True, + dtype=None): + + assert os.path.exists(self.checkpoint_dir) + checkpoint_dir = os.path.abspath(self.checkpoint_dir) + + param_state_dict = {} + opt_state_dict = {} + dist_param_state_dict = {} + + dist_weight_state_dict = {} + dist_weight_velocity_state_dict = {} + dist_bias_state_dict = {} + dist_bias_velocity_state_dict = {} + for path in os.listdir(checkpoint_dir): + path = os.path.join(checkpoint_dir, path) + if not os.path.isfile(path): + continue + + basename = os.path.basename(path) + name, ext = os.path.splitext(basename) + + if ext not in ['.pdopt', '.pdparam']: + continue + + if not for_train and ext == '.pdopt': + continue + + tensor = paddle.load(path, return_numpy=True) + if dtype: + assert dtype in ['float32', 'float16'] + tensor = tensor.astype('float32') + + if 'dist@' in name and '@rank@' in name: + if '.w' in 
name and 'velocity' not in name: + dist_weight_state_dict[name] = tensor + elif '.w' in name and 'velocity' in name: + dist_weight_velocity_state_dict[name] = tensor + elif '.b' in name and 'velocity' not in name: + dist_bias_state_dict[name] = tensor + elif '.b' in name and 'velocity' in name: + dist_bias_velocity_state_dict[name] = tensor + + else: + if ext == '.pdparam': + param_state_dict[name] = tensor + else: + opt_state_dict[name] = tensor + + if for_train: + meta_file = os.path.join(checkpoint_dir, 'meta.json') + if not os.path.exists(meta_file): + logging.error( + "Please make sure the checkpoint dir {} exists, and " + "that the parameters in it are valid.".format( + checkpoint_dir)) + exit() + + with open(meta_file, 'r') as handle: + extra_info = json.load(handle) + + # Preprocess distributed parameters. + if self.world_size > 1: + pretrain_world_size = extra_info['pretrain_world_size'] + assert pretrain_world_size > 0 + embedding_size = extra_info['embedding_size'] + assert embedding_size == self.embedding_size + num_classes = extra_info['num_classes'] + assert num_classes == self.num_classes + + logging.info( + "Parameters for pre-training: pretrain_world_size ({}), " + "embedding_size ({}), and num_classes ({}).".format( + pretrain_world_size, embedding_size, num_classes)) + logging.info("Parameters for inference or fine-tuning: " + "world_size ({}).".format(self.world_size)) + + rank_str = '%05d' % self.rank + + dist_weight_state_dict = rearrange_weight( + dist_weight_state_dict, pretrain_world_size, self.world_size) + dist_bias_state_dict = rearrange_weight( + dist_bias_state_dict, pretrain_world_size, self.world_size) + for name, value in dist_weight_state_dict.items(): + if rank_str in name: + dist_param_state_dict[name] = value + for name, value in dist_bias_state_dict.items(): + if rank_str in name: + dist_param_state_dict[name] = value + + if for_train: + dist_weight_velocity_state_dict = rearrange_weight( + dist_weight_velocity_state_dict,
pretrain_world_size, + self.world_size) + dist_bias_velocity_state_dict = rearrange_weight( + dist_bias_velocity_state_dict, pretrain_world_size, + self.world_size) + for name, value in dist_weight_velocity_state_dict.items(): + if rank_str in name: + opt_state_dict[name] = value + for name, value in dist_bias_velocity_state_dict.items(): + if rank_str in name: + opt_state_dict[name] = value + + def map_actual_param_name(state_dict, load_state_dict): + for name, param in state_dict.items(): + state_dict[name] = load_state_dict[param.name] + return state_dict + + logging.info("Load checkpoint from '{}'. ".format(checkpoint_dir)) + param_state_dict = map_actual_param_name(backbone.state_dict(), + param_state_dict) + backbone.set_state_dict(param_state_dict) + if classifier is not None: + dist_param_state_dict = map_actual_param_name( + classifier.state_dict(), dist_param_state_dict) + classifier.set_state_dict(dist_param_state_dict) + if for_train: + assert optimizer is not None + optimizer.set_state_dict(opt_state_dict) + + if for_train: + return extra_info + else: + return {} diff --git a/recognition/arcface_paddle/dynamic/utils/verification.py b/recognition/arcface_paddle/dynamic/utils/verification.py new file mode 100644 index 0000000..a971baa --- /dev/null +++ b/recognition/arcface_paddle/dynamic/utils/verification.py @@ -0,0 +1,133 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import time +import os +import numpy as np +import sklearn +import paddle +import logging +from typing import List + +from utils.verification import evaluate +from datasets import load_bin + + +@paddle.no_grad() +def test(data_set, backbone, batch_size, fp16=False, nfolds=10): + print('testing verification..') + data_list = data_set[0] + issame_list = data_set[1] + embeddings_list = [] + time_consumed = 0.0 + for i in range(len(data_list)): + data = data_list[i] + embeddings = None + ba = 0 + while ba < data.shape[0]: + bb = min(ba + batch_size, data.shape[0]) + count = bb - ba + _data = data[bb - batch_size:bb] + # convert the numpy array to a paddle Tensor + img = paddle.to_tensor( + _data, dtype='float16' if fp16 else 'float32') + net_out: paddle.Tensor = backbone(img) + _embeddings = net_out.detach().cpu().numpy() + if embeddings is None: + embeddings = np.zeros((data.shape[0], _embeddings.shape[1])) + embeddings[ba:bb, :] = _embeddings[(batch_size - count):, :] + ba = bb + embeddings_list.append(embeddings) + + _xnorm = 0.0 + _xnorm_cnt = 0 + for embed in embeddings_list: + for i in range(embed.shape[0]): + _em = embed[i] + _norm = np.linalg.norm(_em) + _xnorm += _norm + _xnorm_cnt += 1 + _xnorm /= _xnorm_cnt + + embeddings = embeddings_list[0].copy() + try: + embeddings = sklearn.preprocessing.normalize(embeddings) + except Exception: + print(embeddings) + acc1 = 0.0 + std1 = 0.0 + embeddings = embeddings_list[0] + embeddings_list[1] + embeddings = sklearn.preprocessing.normalize(embeddings) + _, _, accuracy, val, val_std, far = evaluate( + embeddings, issame_list, nrof_folds=nfolds) + acc2, std2 = np.mean(accuracy), np.std(accuracy) + return acc1, std1, acc2, std2, _xnorm, embeddings_list + + +class CallBackVerification(object): + def __init__(self, + frequent, + rank, + batch_size, + val_targets, + rec_prefix, + fp16=False, + image_size=(112, 112)): + self.frequent: int = frequent + self.rank: int = rank + self.batch_size: int = batch_size + self.fp16 = fp16 + self.highest_acc_list:
List[float] = [0.0] * len(val_targets) + self.ver_list: List[object] = [] + self.ver_name_list: List[str] = [] + if self.rank == 0: + self.init_dataset( + val_targets=val_targets, + data_dir=rec_prefix, + image_size=image_size) + + def ver_test(self, backbone: paddle.nn.Layer, global_step: int): + for i in range(len(self.ver_list)): + test_start = time.time() + acc1, std1, acc2, std2, xnorm, embeddings_list = test( + self.ver_list[i], + backbone, + self.batch_size, + fp16=self.fp16, + nfolds=10) + logging.info('[%s][%d]XNorm: %f' % + (self.ver_name_list[i], global_step, xnorm)) + logging.info('[%s][%d]Accuracy-Flip: %1.5f+-%1.5f' % + (self.ver_name_list[i], global_step, acc2, std2)) + if acc2 > self.highest_acc_list[i]: + self.highest_acc_list[i] = acc2 + logging.info('[%s][%d]Accuracy-Highest: %1.5f' % ( + self.ver_name_list[i], global_step, self.highest_acc_list[i])) + test_end = time.time() + logging.info("test time: {:.4f}".format(test_end - test_start)) + + def init_dataset(self, val_targets, data_dir, image_size): + for name in val_targets: + path = os.path.join(data_dir, name + ".bin") + if os.path.exists(path): + data_set = load_bin(path, image_size) + self.ver_list.append(data_set) + self.ver_name_list.append(name) + + def __call__(self, num_update, backbone: paddle.nn.Layer): + if self.rank == 0 and num_update > 0 and num_update % self.frequent == 0: + backbone.eval() + with paddle.no_grad(): + self.ver_test(backbone, num_update) + backbone.train() diff --git a/recognition/arcface_paddle/dynamic/validation.py b/recognition/arcface_paddle/dynamic/validation.py new file mode 100644 index 0000000..52dc9bd --- /dev/null +++ b/recognition/arcface_paddle/dynamic/validation.py @@ -0,0 +1,40 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+import numpy as np
+import paddle
+
+from .utils.verification import CallBackVerification
+from .utils.io import Checkpoint
+from . import backbones
+
+
+def validation(args):
+    checkpoint = Checkpoint(
+        rank=0,
+        world_size=1,
+        embedding_size=args.embedding_size,
+        num_classes=None,
+        checkpoint_dir=args.checkpoint_dir, )
+
+    backbone = eval("backbones.{}".format(args.backbone))(
+        num_features=args.embedding_size)
+    checkpoint.load(backbone, for_train=False)
+    backbone.eval()
+
+    callback_verification = CallBackVerification(
+        1, 0, args.batch_size, args.val_targets, args.data_dir)
+
+    callback_verification(1, backbone)
diff --git a/recognition/arcface_paddle/export_inference_model.py b/recognition/arcface_paddle/export_inference_model.py
deleted file mode 100644
index 2cb1ea2..0000000
--- a/recognition/arcface_paddle/export_inference_model.py
+++ /dev/null
@@ -1,54 +0,0 @@
-# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
- -import os -import argparse - -import paddle -import paddle.nn.functional as F -from paddle.jit import to_static - -import backbones - - -def parse_args(): - parser = argparse.ArgumentParser() - parser.add_argument("--network", type=str) - parser.add_argument("--pretrained_model", type=str) - parser.add_argument("--output_path", type=str, default="./inference") - - return parser.parse_args() - - -def load_dygraph_pretrain(model, path=None): - if not os.path.exists(path): - raise ValueError(f"The path of pretrained model file does not exists: {path}.") - param_state_dict = paddle.load(path) - model.set_dict(param_state_dict) - return - - -def main(): - args = parse_args() - - net = eval("backbones.{}".format(args.network))() - load_dygraph_pretrain(net, path=args.pretrained_model) - net.eval() - - net = to_static(net, input_spec=[paddle.static.InputSpec(shape=[None, 3, 112, 112], dtype='float32')]) - paddle.jit.save(net, os.path.join(args.output_path, "inference")) - - -if __name__ == "__main__": - main() \ No newline at end of file diff --git a/recognition/arcface_paddle/infer.py b/recognition/arcface_paddle/infer.py deleted file mode 100644 index d4d6692..0000000 --- a/recognition/arcface_paddle/infer.py +++ /dev/null @@ -1,69 +0,0 @@ -# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import os -import argparse -import numpy as np -import cv2 -import paddle -import backbones - - -def read_img(img_path=None): - if img_path is None: - img = np.random.randint(0, 255, size=(112, 112, 3), dtype=np.uint8) - else: - img = cv2.imread(img_path) - img = cv2.resize(img, (112, 112)) - img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) - scale = 1. / 255. - mean = [0.5, 0.5, 0.5] - std = [0.5, 0.5, 0.5] - mean = np.array(mean).reshape((1, 1, 3)).astype('float32') - std = np.array(std).reshape((1, 1, 3)).astype('float32') - img = (img.astype('float32') * scale - mean) / std - img = img.transpose((2, 0, 1)) - img = np.expand_dims(img, 0) - return img - - -def main(args): - backbone = eval("backbones.{}".format(args.network))() - model_params = args.network + '.pdparams' - print('INFO:' + args.network + ' chose! ' + model_params + ' loaded!') - state_dict = paddle.load(os.path.join(args.checkpoint, model_params)) - backbone.set_state_dict(state_dict) - backbone.eval() - img = read_img(args.img) - input_tensor = paddle.to_tensor(img) - feat = backbone(input_tensor).numpy() - return feat - - -if __name__ == '__main__': - parser = argparse.ArgumentParser(description='Paddle ArcFace Testing') - parser.add_argument( - '--network', - type=str, - default='MobileFaceNet_128', - help='backbone network') - parser.add_argument( - '--img', type=str, default='None', help='backbone network') - parser.add_argument( - '--checkpoint', - type=str, - default='emore_arcface', - help='checkpoint dir') - args = parser.parse_args() - main(args) diff --git a/recognition/arcface_paddle/install_ch.md b/recognition/arcface_paddle/install_ch.md deleted file mode 100644 index c5f9225..0000000 --- a/recognition/arcface_paddle/install_ch.md +++ /dev/null @@ -1,115 +0,0 @@ -简体中文 | [English](install_en.md) - -# 安装说明 - ---- -本章将介绍如何安装ArcFace-paddle及其依赖项。 - - -## 1. 
安装PaddlePaddle - -运行ArcFace-paddle需要`PaddlePaddle 2.1`或更高版本。可以参考下面的步骤安装PaddlePaddle。 - -### 1.1 环境要求 - -- python 3.x -- cuda >= 10.1 (如果使用paddlepaddle-gpu) -- cudnn >= 7.6.4 (如果使用paddlepaddle-gpu) -- nccl >= 2.1.2 (如果使用分布式训练/评估) -- gcc >= 8.2 - -建议使用我们提供的docker运行ArcFace-paddle,有关docker、nvidia-docker使用请参考[链接](https://www.runoob.com/docker/docker-tutorial.html)。 - -在cuda10.1时,建议显卡驱动版本大于等于418.39;在使用cuda10.2时,建议显卡驱动版本大于440.33,更多cuda版本与要求的显卡驱动版本可以参考[链接](https://docs.nvidia.com/deploy/cuda-compatibility/index.html)。 - - -如果不使用docker,可以直接跳过1.2部分内容,从1.3部分开始执行。 - - -### 1.2 (建议)准备docker环境。第一次使用这个镜像,会自动下载该镜像,请耐心等待。 - -``` -# 切换到工作目录下 -cd /home/Projects -# 首次运行需创建一个docker容器,再次运行时不需要运行当前命令 -# 创建一个名字为face_paddle的docker容器,并将当前目录映射到容器的/paddle目录下 - -如果您希望在CPU环境下使用docker,使用docker而不是nvidia-docker创建docker,设置docker容器共享内存shm-size为8G,建议设置8G以上 -sudo docker run --name face_paddle -v $PWD:/paddle --shm-size=8G --network=host -it paddlepaddle/paddle:2.1.0 /bin/bash - -如果希望使用GPU版本的容器,请运行以下命令创建容器。 -sudo nvidia-docker run --name face_paddle -v $PWD:/paddle --shm-size=8G --network=host -it paddlepaddle/paddle:2.1.0-gpu-cuda10.2-cudnn7 /bin/bash -``` - - -您也可以访问[DockerHub](https://hub.docker.com/r/paddlepaddle/paddle/tags/)获取与您机器适配的镜像。 - -``` -# ctrl+P+Q可退出docker 容器,重新进入docker 容器使用如下命令 -sudo docker exec -it face_paddle /bin/bash -``` - -### 1.3 通过pip安装PaddlePaddle - -运行下面的命令,通过pip安装最新GPU版本PaddlePaddle - -```bash -pip3 install paddlepaddle-gpu --upgrade -i https://mirror.baidu.com/pypi/simple -``` - -如果希望在CPU环境中使用PaddlePaddle,可以运行下面的命令安装PaddlePaddle。 - -```bash -pip3 install paddlepaddle --upgrade -i https://mirror.baidu.com/pypi/simple -``` - -**注意:** -* 如果先安装了CPU版本的paddlepaddle,之后想切换到GPU版本,那么需要首先卸载CPU版本的paddle,再安装GPU版本的paddle,否则容易导致使用的paddle版本混乱。 -* 您也可以从源码编译安装PaddlePaddle,请参照[PaddlePaddle 安装文档](http://www.paddlepaddle.org.cn/install/quick)中的说明进行操作。 - - -### 1.4 验证是否安装成功 - -使用以下命令可以验证PaddlePaddle是否安装成功。 - -```python -import paddle -paddle.utils.run_check() -``` - -查看PaddlePaddle版本的命令如下: - 
-```bash -python3 -c "import paddle; print(paddle.__version__)" -``` - -注意: -- 从源码编译的PaddlePaddle版本号为0.0.0,请确保使用了PaddlePaddle 2.0及之后的源码编译。 -- ArcFace-paddle基于PaddlePaddle高性能的分布式训练能力,若您从源码编译,请确保打开编译选项,**WITH_DISTRIBUTE=ON**。具体编译选项参考[编译选项表](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/install/Tables.html#id3)。 -- 在docker中运行时,为保证docker容器有足够的共享内存用于Paddle的数据读取加速,在创建docker容器时,请设置参数`--shm_size=8g`,条件允许的话可以设置为更大的值 -- 如果只希望使用识别模块,则可以跳过下面的第3部分;如果只希望使用检测模块,则可以跳过下面的第2部分。 - - -## 2. 准备识别模块的环境 - -安装`requiremnts`,命令如下。 - -```shell -pip3 install -r requirement.txt -``` - -## 3. 准备检测模块的环境 - -检测模块依赖于PaddleDetection,需要首先下载PaddleDetection的代码,并安装`requiremnts`。具体命令如下。 - -```bash -# 克隆PaddleDetection仓库 -cd -git clone https://github.com/PaddlePaddle/PaddleDetection.git - -cd PaddleDetection -# 安装其他依赖 -pip3 install -r requirements.txt -``` - -更多安装教程,请参考: [Install tutorial](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/INSTALL_cn.md)。 diff --git a/recognition/arcface_paddle/install_en.md b/recognition/arcface_paddle/install_en.md deleted file mode 100644 index 269eece..0000000 --- a/recognition/arcface_paddle/install_en.md +++ /dev/null @@ -1,109 +0,0 @@ -[简体中文](install_ch.md) | English - -# Installation - ---- -This tutorial introduces how to install ArcFace-paddle and its requirements. - -## 1. Install PaddlePaddle - -`PaddlePaddle 2.1` or later is required for ArcFace-paddle. You can use the following steps to install PaddlePaddle. - -### 1.1 Environment requirements - -- python 3.x -- cuda >= 10.1 (necessary if you want to use paddlepaddle-gpu) -- cudnn >= 7.6.4 (necessary if you want to use paddlepaddle-gpu) -- nccl >= 2.1.2 (necessary if you want the use distributed training/eval) -- gcc >= 8.2 - -Docker is recomended to run ArcFace-paddle, for more detailed information about docker and nvidia-docker, you can refer to the [tutorial](https://www.runoob.com/docker/docker-tutorial.html). 
- -When you use cuda10.1, the driver version needs to be larger or equal than 418.39. When you use cuda10.2, the driver version needs to be larger or equal than 440.33. For more cuda versions and specific driver versions, you can refer to the [link](https://docs.nvidia.com/deploy/cuda-compatibility/index.html). - -If you do not want to use docker, you can skip section 1.2 and go into section 1.3 directly. - - -### 1.2 (Recommended) Prepare for a docker environment. The first time you use this docker image, it will be downloaded automatically. Please be patient. - - -``` -# Switch to the working directory -cd /home/Projects -# You need to create a docker container for the first run, and do not need to run the current command when you run it again -# Create a docker container named face_paddle and map the current directory to the /paddle directory of the container -# It is recommended to set a shared memory greater than or equal to 8G through the --shm-size parameter -sudo docker run --name face_paddle -v $PWD:/paddle --shm-size=8G --network=host -it paddlepaddle/paddle:2.1.0 /bin/bash - -# Use the following command to create a container if you want to use GPU in the container -sudo nvidia-docker run --name face_paddle -v $PWD:/paddle --shm-size=8G --network=host -it paddlepaddle/paddle:2.1.0-gpu-cuda10.2-cudnn7 /bin/bash -``` - -You can also visit [DockerHub](https://hub.docker.com/r/paddlepaddle/paddle/tags/) to get more docker images. - -``` -# use ctrl+P+Q to exit docker, to re-enter docker using the following command: -sudo docker exec -it face_paddle /bin/bash -``` - -### 1.3 Install PaddlePaddle using pip - -If you want to use PaddlePaddle on GPU, you can use the following command to install PaddlePaddle. - -```bash -pip3 install paddlepaddle-gpu --upgrade -i https://mirror.baidu.com/pypi/simple -``` - -If you want to use PaddlePaddle on CPU, you can use the following command to install PaddlePaddle. 
- -```bash -pip3 install paddlepaddle --upgrade -i https://mirror.baidu.com/pypi/simple -``` - -**Note:** -* If you have already installed CPU version of PaddlePaddle and want to use GPU version now, you should uninstall CPU version of PaddlePaddle and then install GPU version to avoid package confusion. -* You can also compile PaddlePaddle from source code, please refer to [PaddlePaddle Installation tutorial](http://www.paddlepaddle.org.cn/install/quick) to more compilation options. - -### 1.4 Verify Installation process - -```python -import paddle -paddle.utils.run_check() -``` - -Check PaddlePaddle version: - -```bash -python3 -c "import paddle; print(paddle.__version__)" -``` - -Note: -- Make sure the compiled source code is later than PaddlePaddle2.0. -- If you want to enable distribution ability, you should assign **WITH_DISTRIBUTE=ON** when compiling. For more compilation options, please refer to [Instruction](https://www.paddlepaddle.org.cn/documentation/docs/zh/develop/install/Tables.html#id3) for more details. -- When running in docker, in order to ensure that the container has enough shared memory for dataloader acceleration of Paddle, please set the parameter `--shm_size=8g` at creating a docker container, if conditions permit, you can set it to a larger value. -- If you just want to use recognition module, you can skip section 3. If you just want to use detection module, you can skip section 2. - -## 2. Prepare for the environment of recognition - -Run the following command to install `requiremnts`. - -```shell -pip3 install -r requirement.txt -``` - -## 3. Prepare for the environment of detection - -The detection module depends on PaddleDetection. You need to download PaddleDetection and install `requiremnts`, the command is as follows. 
- - -```bash -# clone PaddleDetection repo -cd -git clone https://github.com/PaddlePaddle/PaddleDetection.git - -cd PaddleDetection -# install requiremnts -pip3 install -r requirements.txt -``` - -For more installation tutorials, please refer to [Install tutorial](https://github.com/PaddlePaddle/PaddleDetection/blob/release/2.1/docs/tutorials/INSTALL.md). \ No newline at end of file diff --git a/recognition/arcface_paddle/losses.py b/recognition/arcface_paddle/losses.py deleted file mode 100644 index 03040bc..0000000 --- a/recognition/arcface_paddle/losses.py +++ /dev/null @@ -1,45 +0,0 @@ -# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import paddle -from paddle import nn - - -class CosFace(nn.Layer): - def __init__(self, s=64.0, m=0.40): - super(CosFace, self).__init__() - self.s = s - self.m = m - - def forward(self, cosine, label): - m_hot = paddle.nn.functional.one_hot( - label.astype('long'), num_classes=85742) * self.m - cosine -= m_hot - ret = cosine * self.s - return ret - - -class ArcFace(nn.Layer): - def __init__(self, s=64.0, m=0.50): - super(ArcFace, self).__init__() - self.s = s - self.m = m - - def forward(self, cosine: paddle.Tensor, label): - m_hot = paddle.nn.functional.one_hot( - label.astype('long'), num_classes=85742) * self.m - cosine = cosine.acos() - cosine += m_hot - cosine = cosine.cos() * self.s - return cosine diff --git a/recognition/arcface_paddle/partial_fc.py b/recognition/arcface_paddle/partial_fc.py deleted file mode 100644 index c3ac122..0000000 --- a/recognition/arcface_paddle/partial_fc.py +++ /dev/null @@ -1,168 +0,0 @@ -# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -import os -import paddle -import paddle.nn as nn -from paddle.nn.functional import normalize, linear -import pickle - - -class PartialFC(nn.Layer): - """ - Author: {Xiang An, Yang Xiao, XuHan Zhu} in DeepGlint, - Partial FC: Training 10 Million Identities on a Single Machine - See the original paper: - https://arxiv.org/abs/2010.05222 - """ - - @paddle.no_grad() - def __init__(self, - rank, - world_size, - batch_size, - resume, - margin_softmax, - num_classes, - sample_rate=1.0, - embedding_size=512, - prefix="./"): - super(PartialFC, self).__init__() - self.num_classes: int = num_classes - self.rank: int = rank - self.world_size: int = world_size - self.batch_size: int = batch_size - self.margin_softmax: callable = margin_softmax - self.sample_rate: float = sample_rate - self.embedding_size: int = embedding_size - self.prefix: str = prefix - self.num_local: int = num_classes // world_size + int( - rank < num_classes % world_size) - self.class_start: int = num_classes // world_size * rank + min( - rank, num_classes % world_size) - self.num_sample: int = int(self.sample_rate * self.num_local) - - self.weight_name = os.path.join( - self.prefix, "rank:{}_softmax_weight.pkl".format(self.rank)) - self.weight_mom_name = os.path.join( - self.prefix, "rank:{}_softmax_weight_mom.pkl".format(self.rank)) - - if resume: - try: - self.weight: paddle.Tensor = paddle.load(self.weight_name) - print("softmax weight resume successfully!") - except (FileNotFoundError, KeyError, IndexError): - self.weight = paddle.normal(0, 0.01, (self.num_local, - self.embedding_size)) - print("softmax weight resume fail!") - - try: - self.weight_mom: paddle.Tensor = paddle.load( - self.weight_mom_name) - print("softmax weight mom resume successfully!") - except (FileNotFoundError, KeyError, IndexError): - self.weight_mom: paddle.Tensor = paddle.zeros_like(self.weight) - print("softmax weight mom resume fail!") - else: - self.weight = paddle.normal(0, 0.01, - (self.num_local, self.embedding_size)) 
- self.weight_mom: paddle.Tensor = paddle.zeros_like(self.weight) - print("softmax weight init successfully!") - print("softmax weight mom init successfully!") - - self.index = None - if int(self.sample_rate) == 1: - self.update = lambda: 0 - self.sub_weight = paddle.create_parameter( - shape=self.weight.shape, - dtype='float32', - default_initializer=paddle.nn.initializer.Assign(self.weight)) - self.sub_weight_mom = self.weight_mom - else: - self.sub_weight = paddle.create_parameter( - shape=[1, 1], - dtype='float32', - default_initializer=paddle.nn.initializer.Assign( - paddle.empty((1, 1)))) - - def save_params(self): - with open(self.weight_name, 'wb') as file: - pickle.dump(self.weight.numpy(), file) - with open(self.weight_mom_name, 'wb') as file: - pickle.dump(self.weight_mom.numpy(), file) - - @paddle.no_grad() - def sample(self, total_label): - index_positive = (self.class_start <= total_label).numpy() & ( - total_label < self.class_start + self.num_local).numpy() - total_label = total_label.numpy() - total_label[~index_positive] = -1 - total_label[index_positive] -= self.class_start - total_label = paddle.to_tensor(total_label) - - def forward(self, total_features, norm_weight): - logits = linear(total_features, paddle.t(norm_weight)) - return logits - - @paddle.no_grad() - def update(self): - self.weight_mom[self.index] = self.sub_weight_mom - self.weight[self.index] = self.sub_weight - - def prepare(self, label, optimizer): - # label [64, 1] - total_label = label.detach() - self.sample(total_label) - optimizer._parameter_list[0] = self.sub_weight - norm_weight = normalize(self.sub_weight) - return total_label, norm_weight - - def forward_backward(self, label, features, optimizer): - total_label, norm_weight = self.prepare(label, optimizer) - total_features = features.detach() - total_features.stop_gradient = False - - logits = self.forward(total_features, norm_weight) - logits = self.margin_softmax(logits, total_label) - - with paddle.no_grad(): - 
max_fc = paddle.max(logits, axis=1, keepdim=True) - - # calculate exp(logits) and all-reduce - logits_exp = paddle.exp(logits - max_fc) - logits_sum_exp = logits_exp.sum(axis=1, keepdim=True) - - # calculate prob - logits_exp = logits_exp.divide(logits_sum_exp) - - # get one-hot - grad = logits_exp - one_hot = paddle.nn.functional.one_hot( - total_label.astype('long'), num_classes=85742) - - # calculate loss - loss = paddle.nn.functional.one_hot( - total_label.astype('long'), - num_classes=85742).multiply(grad).sum(axis=1) - loss_v = paddle.clip(loss, 1e-30).log().mean() * (-1) - - # calculate grad - grad -= one_hot - grad = grad.divide( - paddle.to_tensor( - self.batch_size * self.world_size, dtype='float32')) - (logits.multiply(grad)).backward() - - x_grad = paddle.to_tensor(total_features.grad, stop_gradient=False) - return x_grad, loss_v diff --git a/recognition/arcface_paddle/requirement.txt b/recognition/arcface_paddle/requirement.txt index 6dc6196..25de974 100644 --- a/recognition/arcface_paddle/requirement.txt +++ b/recognition/arcface_paddle/requirement.txt @@ -1,4 +1,3 @@ -paddlepaddle-gpu==2.0.2 visualdl opencv-python pillow @@ -12,3 +11,6 @@ tqdm Pillow scikit-learn==0.23.2 opencv-python==4.4.0.46 +onnxruntime +onnx +paddle2onnx diff --git a/recognition/arcface_paddle/scripts/export_dynamic.sh b/recognition/arcface_paddle/scripts/export_dynamic.sh new file mode 100644 index 0000000..cdc0e14 --- /dev/null +++ b/recognition/arcface_paddle/scripts/export_dynamic.sh @@ -0,0 +1,21 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+python tools/export.py \
+    --is_static False \
+    --export_type paddle \
+    --backbone FresResNet50 \
+    --embedding_size 512 \
+    --checkpoint_dir MS1M_v3_arcface_dynamic_128_fp16_0.1/FresResNet50/24 \
+    --output_dir MS1M_v3_arcface_dynamic_128_fp16_0.1/FresResNet50/exported_model
diff --git a/recognition/arcface_paddle/scripts/export_static.sh b/recognition/arcface_paddle/scripts/export_static.sh
new file mode 100644
index 0000000..11c1d8e
--- /dev/null
+++ b/recognition/arcface_paddle/scripts/export_static.sh
@@ -0,0 +1,21 @@
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+ +python tools/export.py \ + --is_static True \ + --export_type paddle \ + --backbone FresResNet50 \ + --embedding_size 512 \ + --checkpoint_dir MS1M_v3_arcface_static_128_fp16_0.1/FresResNet50/24 \ + --output_dir MS1M_v3_arcface_static_128_fp16_0.1/FresResNet50/exported_model diff --git a/recognition/arcface_paddle/scripts/inference.sh b/recognition/arcface_paddle/scripts/inference.sh new file mode 100644 index 0000000..2d89f20 --- /dev/null +++ b/recognition/arcface_paddle/scripts/inference.sh @@ -0,0 +1,24 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +python tools/inference.py \ + --export_type paddle \ + --model_file MS1M_v3_arcface_static_128_fp16_0.1/FresResNet50/exported_model/FresResNet50.pdmodel \ + --params_file MS1M_v3_arcface_static_128_fp16_0.1/FresResNet50/exported_model/FresResNet50.pdiparams \ + --image_path /wangguoxia/plsc/MS1M_v3/images/00000001.jpg + +python tools/inference.py \ + --export_type onnx \ + --onnx_file MS1M_v3_arcface_static_128_fp16_0.1/FresResNet50/exported_model/FresResNet50.onnx \ + --image_path /wangguoxia/plsc/MS1M_v3/images/00000001.jpg diff --git a/recognition/arcface_paddle/scripts/kill_train_process.sh b/recognition/arcface_paddle/scripts/kill_train_process.sh new file mode 100644 index 0000000..76dcf50 --- /dev/null +++ b/recognition/arcface_paddle/scripts/kill_train_process.sh @@ -0,0 +1,15 @@ +# Copyright (c) 2021 PaddlePaddle Authors. 
All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +ps -ef | grep "train.py" | grep -v grep | awk '{print $2}' | xargs kill -9 diff --git a/recognition/arcface_paddle/scripts/perf_dynamic.sh b/recognition/arcface_paddle/scripts/perf_dynamic.sh new file mode 100644 index 0000000..fa1ce24 --- /dev/null +++ b/recognition/arcface_paddle/scripts/perf_dynamic.sh @@ -0,0 +1,40 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+
+set -e
+
+num_test=5
+num_nodes=1
+
+configs=(configs/ms1mv3_r50.py configs/ms1mv3_r100.py)
+dtypes=(fp16 fp32)
+gpus=("0" "0,1,2,3" "0,1,2,3,4,5,6,7")
+
+for config in "${configs[@]}"
+do
+    for dtype in "${dtypes[@]}"
+    do
+        for gpu in "${gpus[@]}"
+        do
+            i=1
+            while [ $i -le ${num_test} ]
+            do
+                bash scripts/perf_runner.sh $gpu $config dynamic 93431 $dtype $num_nodes 128 0.1 ${i}
+                echo " >>>>>>Finished Test Case $config, $dtype, $gpu, ${i} <<<<<<<"
+                let i++
+                sleep 20s
+            done
+        done
+    done
+done
diff --git a/recognition/arcface_paddle/scripts/perf_runner.sh b/recognition/arcface_paddle/scripts/perf_runner.sh
new file mode 100644
index 0000000..9e0b705
--- /dev/null
+++ b/recognition/arcface_paddle/scripts/perf_runner.sh
@@ -0,0 +1,60 @@
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+set -ex
+
+gpus=${1:-0,1,2,3,4,5,6,7}
+config_file=${2:-configs/ms1mv3_r50.py}
+mode=${3:-static}
+num_classes=${4:-93431}
+dtype=${5:-fp16}
+num_nodes=${6:-1}
+batch_size_per_device=${7:-128}
+sample_ratio=${8:-0.1}
+test_id=${9:-1}
+
+if [ $mode = "static" ]; then
+    is_static=True
+else
+    is_static=False
+fi
+
+if [ $dtype = "fp16" ]; then
+    fp16=True
+else
+    fp16=False
+fi
+
+if [[ $config_file =~ r50 ]]; then
+    backbone=r50
+else
+    backbone=r100
+fi
+
+gpu_num_per_node=`expr ${#gpus} / 2 + 1`
+
+log_dir=./logs/arcface_paddle_${backbone}_${mode}_${dtype}_r${sample_ratio}_bz${batch_size_per_device}_${num_nodes}n${gpu_num_per_node}g_id${test_id}
+
+python -m paddle.distributed.launch --gpus=${gpus} --log_dir=${log_dir} tools/train.py \
+    --config_file ${config_file} \
+    --is_static ${is_static} \
+    --num_classes ${num_classes} \
+    --fp16 ${fp16} \
+    --sample_ratio ${sample_ratio} \
+    --log_interval_step 1 \
+    --train_unit 'step' \
+    --train_num 200 \
+    --warmup_num 0 \
+    --use_synthetic_dataset True \
+    --do_validation_while_train False
diff --git a/recognition/arcface_paddle/scripts/perf_static.sh b/recognition/arcface_paddle/scripts/perf_static.sh
new file mode 100644
index 0000000..5e9c53b
--- /dev/null
+++ b/recognition/arcface_paddle/scripts/perf_static.sh
@@ -0,0 +1,40 @@
+# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+ +set -e + +num_test=5 +num_nodes=1 + +configs=(configs/ms1mv3_r50.py configs/ms1mv3_r100.py) +dtypes=(fp16 fp32) +gpus=("0" "0,1,2,3" "0,1,2,3,4,5,6,7") + +for config in "${configs[@]}" +do + for dtype in "${dtypes[@]}" + do + for gpu in "${gpus[@]}" + do + i=1 + while [ $i -le ${num_test} ] + do + bash scripts/perf_runner.sh $gpu $config static 93431 $dtype $num_nodes 128 0.1 ${i} + echo " >>>>>>Finished Test Case $config, $dtype, $gpu, ${i} <<<<<<<" + let i++ + sleep 20s + done + done + done +done diff --git a/recognition/arcface_paddle/scripts/train_dynamic.sh b/recognition/arcface_paddle/scripts/train_dynamic.sh new file mode 100644 index 0000000..c0774de --- /dev/null +++ b/recognition/arcface_paddle/scripts/train_dynamic.sh @@ -0,0 +1,41 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +python -m paddle.distributed.launch --gpus=0,1,2,3,4,5,6,7 tools/train.py \ + --config_file configs/ms1mv3_r50.py \ + --is_static False \ + --backbone FresResNet50 \ + --classifier LargeScaleClassifier \ + --embedding_size 512 \ + --model_parallel True \ + --dropout 0.0 \ + --sample_ratio 0.1 \ + --loss ArcFace \ + --batch_size 128 \ + --dataset MS1M_v3 \ + --num_classes 93431 \ + --data_dir /wangguoxia/plsc/MS1M_v3/ \ + --label_file /wangguoxia/plsc/MS1M_v3/label.txt \ + --is_bin False \ + --log_interval_step 100 \ + --validation_interval_step 2000 \ + --fp16 True \ + --use_dynamic_loss_scaling True \ + --init_loss_scaling 27648.0 \ + --num_workers 8 \ + --train_unit 'epoch' \ + --warmup_num 0 \ + --train_num 25 \ + --decay_boundaries "10,16,22" \ + --output MS1M_v3_arcface_dynamic_0.1 diff --git a/recognition/arcface_paddle/scripts/train_static.sh b/recognition/arcface_paddle/scripts/train_static.sh new file mode 100644 index 0000000..d3f8e36 --- /dev/null +++ b/recognition/arcface_paddle/scripts/train_static.sh @@ -0,0 +1,41 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +python -m paddle.distributed.launch --gpus=0,1,2,3,4,5,6,7 tools/train.py \ + --config_file configs/ms1mv3_r50.py \ + --is_static True \ + --backbone FresResNet50 \ + --classifier LargeScaleClassifier \ + --embedding_size 512 \ + --model_parallel True \ + --dropout 0.0 \ + --sample_ratio 0.1 \ + --loss ArcFace \ + --batch_size 128 \ + --dataset MS1M_v3 \ + --num_classes 93431 \ + --data_dir /wangguoxia/plsc/MS1M_v3/ \ + --label_file /wangguoxia/plsc/MS1M_v3/label.txt \ + --is_bin False \ + --log_interval_step 100 \ + --validation_interval_step 2000 \ + --fp16 True \ + --use_dynamic_loss_scaling True \ + --init_loss_scaling 27648.0 \ + --num_workers 8 \ + --train_unit 'epoch' \ + --warmup_num 0 \ + --train_num 25 \ + --decay_boundaries "10,16,22" \ + --output MS1M_v3_arcface_static_0.1 diff --git a/recognition/arcface_paddle/scripts/validation_dynamic.sh b/recognition/arcface_paddle/scripts/validation_dynamic.sh new file mode 100644 index 0000000..0ac0942 --- /dev/null +++ b/recognition/arcface_paddle/scripts/validation_dynamic.sh @@ -0,0 +1,22 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +python tools/validation.py \ + --is_static False \ + --backbone FresResNet50 \ + --embedding_size 512 \ + --checkpoint_dir MS1M_v3_arcface_dynamic_128_fp16_0.1/FresResNet50/24 \ + --data_dir /wangguoxia/plsc/MS1M_v3/ \ + --val_targets lfw,cfp_fp,agedb_30 \ + --batch_size 128 diff --git a/recognition/arcface_paddle/scripts/validation_static.sh b/recognition/arcface_paddle/scripts/validation_static.sh new file mode 100644 index 0000000..86dbacc --- /dev/null +++ b/recognition/arcface_paddle/scripts/validation_static.sh @@ -0,0 +1,22 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +python tools/validation.py \ + --is_static True \ + --backbone FresResNet50 \ + --embedding_size 512 \ + --checkpoint_dir MS1M_v3_arcface_static_128_fp16_0.1/FresResNet50/24 \ + --data_dir /wangguoxia/plsc/MS1M_v3/ \ + --val_targets lfw,cfp_fp,agedb_30 \ + --batch_size 128 diff --git a/recognition/arcface_paddle/shell/export_inference_model.sh b/recognition/arcface_paddle/shell/export_inference_model.sh deleted file mode 100644 index f8a592f..0000000 --- a/recognition/arcface_paddle/shell/export_inference_model.sh +++ /dev/null @@ -1 +0,0 @@ -python export_inference_model.py --network MobileFaceNet_128 --output ./inference_model/ --pretrained_model ./emore_arcface/MobileFaceNet_128.pdparams \ No newline at end of file diff --git a/recognition/arcface_paddle/shell/infer.sh b/recognition/arcface_paddle/shell/infer.sh deleted file mode 100644 index 53cdb5b..0000000 --- a/recognition/arcface_paddle/shell/infer.sh +++ /dev/null @@ -1,6 +0,0 @@ -export CUDA_VISIBLE_DEVICES=1 - -nohup python3.7 infer.py \ - --network 'MobileFaceNet_128' \ - --img='00000000.jpg' \ - --checkpoint 'emore_arcface' > "infer_log.log" 2>&1 & \ No newline at end of file diff --git a/recognition/arcface_paddle/shell/train.sh b/recognition/arcface_paddle/shell/train.sh deleted file mode 100644 index cd86540..0000000 --- a/recognition/arcface_paddle/shell/train.sh +++ /dev/null @@ -1,16 +0,0 @@ -export CUDA_VISIBLE_DEVICES=1 - -log_name="log" - - -# If you want to reduce batchsize because of GPU memory, -# you can reduce batch size and lr proportionally. 
-python3.7 train.py \ - --network 'MobileFaceNet_128' \ - --lr=0.1 \ - --batch_size 16 \ - --weight_decay 2e-4 \ - --embedding_size 128 \ - --logdir="${log_name}" \ - --output "emore_arcface" \ - --is_bin=False diff --git a/recognition/arcface_paddle/shell/val.sh b/recognition/arcface_paddle/shell/val.sh deleted file mode 100644 index 2f0b95d..0000000 --- a/recognition/arcface_paddle/shell/val.sh +++ /dev/null @@ -1,6 +0,0 @@ -export CUDA_VISIBLE_DEVICES=0 - - -nohup python3.7 valid.py \ - --network 'MobileFaceNet_128' \ - --checkpoint='emore_arcface' > "valid_log.log" 2>&1 & \ No newline at end of file diff --git a/recognition/arcface_paddle/static/backbones/__init__.py b/recognition/arcface_paddle/static/backbones/__init__.py new file mode 100644 index 0000000..1db1164 --- /dev/null +++ b/recognition/arcface_paddle/static/backbones/__init__.py @@ -0,0 +1,15 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from .iresnet import FresResNet50, FresResNet100 diff --git a/recognition/arcface_paddle/static/backbones/iresnet.py b/recognition/arcface_paddle/static/backbones/iresnet.py new file mode 100644 index 0000000..73e98a9 --- /dev/null +++ b/recognition/arcface_paddle/static/backbones/iresnet.py @@ -0,0 +1,249 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
+# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import paddle +from collections import OrderedDict + +__all__ = [ + "FresResNet", "FresResNet50", "FresResNet100", "FresResNet101", + "FresResNet152" +] + + +class FresResNet(object): + def __init__(self, + layers=50, + num_features=512, + is_train=True, + fp16=False, + fc_type='E', + dropout=0.4): + super(FresResNet, self).__init__() + self.layers = layers + self.num_features = num_features + self.fc_type = fc_type + + self.input_dict = OrderedDict() + self.output_dict = OrderedDict() + + image = paddle.static.data( + name='image', + shape=[-1, 3, 112, 112], + dtype='float16' if fp16 else 'float32') + self.input_dict['image'] = image + if is_train: + label = paddle.static.data(name='label', shape=[-1], dtype='int32') + self.input_dict['label'] = label + + supported_layers = [50, 100, 101, 152] + assert layers in supported_layers, \ + "supported layers {}, but given {}".format(supported_layers, layers) + + if layers == 50: + units = [3, 4, 14, 3] + elif layers == 100: + units = [3, 13, 30, 3] + elif layers == 101: + units = [3, 4, 23, 3] + elif layers == 152: + units = [3, 8, 36, 3] + filter_list = [64, 64, 128, 256, 512] + num_stages = 4 + + input_blob = paddle.static.nn.conv2d( + input=image, + num_filters=filter_list[0], + filter_size=3, + stride=1, + padding=1, + groups=1, + param_attr=paddle.ParamAttr(), + bias_attr=False) + input_blob = paddle.static.nn.batch_norm( + input=input_blob, + act=None, + epsilon=1e-05, 
+ momentum=0.9, + is_test=False if is_train else True) + # input_blob = paddle.nn.functional.relu6(input_blob) + input_blob = paddle.static.nn.prelu( + input_blob, + mode="all", + param_attr=paddle.ParamAttr( + initializer=paddle.nn.initializer.Constant(0.25))) + + for i in range(num_stages): + for j in range(units[i]): + input_blob = self.residual_unit_v3( + input_blob, + filter_list[i + 1], + 3, + 2 if j == 0 else 1, + 1, + is_train, ) + fc1 = self.get_fc1(input_blob, is_train, dropout) + + self.output_dict['feature'] = fc1 + + def residual_unit_v3(self, in_data, num_filter, filter_size, stride, pad, + is_train): + + bn1 = paddle.static.nn.batch_norm( + input=in_data, + act=None, + epsilon=1e-05, + momentum=0.9, + is_test=False if is_train else True) + conv1 = paddle.static.nn.conv2d( + input=bn1, + num_filters=num_filter, + filter_size=filter_size, + stride=1, + padding=1, + groups=1, + param_attr=paddle.ParamAttr(), + bias_attr=False) + bn2 = paddle.static.nn.batch_norm( + input=conv1, + act=None, + epsilon=1e-05, + momentum=0.9, + is_test=False if is_train else True) + # prelu = paddle.nn.functional.relu6(bn2) + prelu = paddle.static.nn.prelu( + bn2, + mode="all", + param_attr=paddle.ParamAttr( + initializer=paddle.nn.initializer.Constant(0.25))) + conv2 = paddle.static.nn.conv2d( + input=prelu, + num_filters=num_filter, + filter_size=filter_size, + stride=stride, + padding=pad, + groups=1, + param_attr=paddle.ParamAttr(), + bias_attr=False) + bn3 = paddle.static.nn.batch_norm( + input=conv2, + act=None, + epsilon=1e-05, + momentum=0.9, + is_test=False if is_train else True) + + if stride == 1: + input_blob = in_data + else: + input_blob = paddle.static.nn.conv2d( + input=in_data, + num_filters=num_filter, + filter_size=1, + stride=stride, + padding=0, + groups=1, + param_attr=paddle.ParamAttr(), + bias_attr=False) + + input_blob = paddle.static.nn.batch_norm( + input=input_blob, + act=None, + epsilon=1e-05, + momentum=0.9, + is_test=False if is_train else 
True) + + identity = paddle.add(bn3, input_blob) + return identity + + def get_fc1(self, last_conv, is_train, dropout=0.4): + body = last_conv + if self.fc_type == "Z": + body = paddle.static.nn.batch_norm( + input=body, + act=None, + epsilon=1e-05, + is_test=False if is_train else True) + if dropout > 0: + body = paddle.nn.functional.dropout( + x=body, + p=dropout, + training=is_train, + mode='upscale_in_train') + fc1 = body + elif self.fc_type == "E": + body = paddle.static.nn.batch_norm( + input=body, + act=None, + epsilon=1e-05, + is_test=False if is_train else True) + if dropout > 0: + body = paddle.nn.functional.dropout( + x=body, + p=dropout, + training=is_train, + mode='upscale_in_train') + fc1 = paddle.static.nn.fc( + x=body, + size=self.num_features, + weight_attr=paddle.ParamAttr( + initializer=paddle.nn.initializer.XavierNormal( + fan_in=0.0)), + bias_attr=paddle.ParamAttr( + initializer=paddle.nn.initializer.Constant())) + fc1 = paddle.static.nn.batch_norm( + input=fc1, + act=None, + epsilon=1e-05, + is_test=False if is_train else True) + + elif self.fc_type == "FC": + body = paddle.static.nn.batch_norm( + input=body, + act=None, + epsilon=1e-05, + is_test=False if is_train else True) + fc1 = paddle.static.nn.fc( + x=body, + size=self.num_features, + weight_attr=paddle.ParamAttr( + initializer=paddle.nn.initializer.XavierNormal( + fan_in=0.0)), + bias_attr=paddle.ParamAttr( + initializer=paddle.nn.initializer.Constant())) + fc1 = paddle.static.nn.batch_norm( + input=fc1, + act=None, + epsilon=1e-05, + is_test=False if is_train else True) + + return fc1 + + +def FresResNet50(**args): + model = FresResNet(layers=50, **args) + return model + + +def FresResNet100(**args): + model = FresResNet(layers=100, **args) + return model + + +def FresResNet101(**args): + model = FresResNet(layers=101, **args) + return model + + +def FresResNet152(**args): + model = FresResNet(layers=152, **args) + return model diff --git 
a/recognition/arcface_paddle/static/classifiers/__init__.py b/recognition/arcface_paddle/static/classifiers/__init__.py new file mode 100644 index 0000000..0c0e52f --- /dev/null +++ b/recognition/arcface_paddle/static/classifiers/__init__.py @@ -0,0 +1,15 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from .lsc import LargeScaleClassifier diff --git a/recognition/arcface_paddle/static/classifiers/lsc.py b/recognition/arcface_paddle/static/classifiers/lsc.py new file mode 100644 index 0000000..0b968d2 --- /dev/null +++ b/recognition/arcface_paddle/static/classifiers/lsc.py @@ -0,0 +1,127 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License.
+ +import math +from six.moves import reduce +from collections import OrderedDict + +import paddle + +__all__ = ["LargeScaleClassifier"] + + +class LargeScaleClassifier(object): + """ + Author: {Xiang An, Yang Xiao, XuHan Zhu} in DeepGlint, + Partial FC: Training 10 Million Identities on a Single Machine + See the original paper: + https://arxiv.org/abs/2010.05222 + """ + + def __init__(self, + feature, + label, + rank, + world_size, + num_classes, + margin1=1.0, + margin2=0.5, + margin3=0.0, + scale=64.0, + sample_ratio=1.0, + embedding_size=512, + name=None): + super(LargeScaleClassifier, self).__init__() + self.num_classes: int = num_classes + self.rank: int = rank + self.world_size: int = world_size + self.sample_ratio: float = sample_ratio + self.embedding_size: int = embedding_size + self.num_local: int = (num_classes + world_size - 1) // world_size + if num_classes % world_size != 0 and rank == world_size - 1: + self.num_local = num_classes % self.num_local + self.num_sample: int = int(self.sample_ratio * self.num_local) + self.margin1 = margin1 + self.margin2 = margin2 + self.margin3 = margin3 + self.logit_scale = scale + + self.input_dict = OrderedDict() + self.input_dict['feature'] = feature + self.input_dict['label'] = label + + self.output_dict = OrderedDict() + + if name is None: + name = 'dist@fc@rank@%05d' % rank + + stddev = math.sqrt(2.0 / (self.embedding_size + self.num_local)) + param_attr = paddle.ParamAttr( + initializer=paddle.nn.initializer.Normal(std=stddev)) + + weight_dtype = 'float16' if feature.dtype == paddle.float16 else 'float32' + weight = paddle.static.create_parameter( + shape=[self.embedding_size, self.num_local], + dtype=weight_dtype, + name=name, + attr=param_attr, + is_bias=False) + + # avoid allreducing gradients for distributed parameters + weight.is_distributed = True + # avoid broadcasting distributed parameters in startup program + paddle.static.default_startup_program().global_block().vars[ + weight.name].is_distributed = 
True + + if self.world_size > 1: + feature_list = [] + paddle.distributed.all_gather(feature_list, feature) + total_feature = paddle.concat(feature_list, axis=0) + + label_list = [] + paddle.distributed.all_gather(label_list, label) + total_label = paddle.concat(label_list, axis=0) + total_label.stop_gradient = True + else: + total_feature = feature + total_label = label + + total_label.stop_gradient = True + + if self.sample_ratio < 1.0: + # partial fc sample process + total_label, sampled_class_index = paddle.nn.functional.class_center_sample( + total_label, self.num_local, self.num_sample) + sampled_class_index.stop_gradient = True + weight = paddle.gather(weight, sampled_class_index, axis=1) + + norm_feature = paddle.fluid.layers.l2_normalize(total_feature, axis=1) + norm_weight = paddle.fluid.layers.l2_normalize(weight, axis=0) + + local_logit = paddle.matmul(norm_feature, norm_weight) + + loss = paddle.nn.functional.margin_cross_entropy( + local_logit, + total_label, + margin1=self.margin1, + margin2=self.margin2, + margin3=self.margin3, + scale=self.logit_scale, + return_softmax=False, + reduction=None, ) + + loss.desc.set_dtype(paddle.fluid.core.VarDesc.VarType.FP32) + loss = paddle.mean(loss) + + self.output_dict['loss'] = loss diff --git a/recognition/arcface_paddle/static/export.py b/recognition/arcface_paddle/static/export.py new file mode 100644 index 0000000..e1f3e07 --- /dev/null +++ b/recognition/arcface_paddle/static/export.py @@ -0,0 +1,94 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+# See the License for the specific language governing permissions and +# limitations under the License. + +import errno +import os +import numpy as np +import paddle + +from .utils.io import Checkpoint +from . import backbones +from .static_model import StaticModel + + +def export_onnx(path_prefix, feed_vars, fetch_vars, executor, program): + + from paddle2onnx.graph import PaddleGraph, ONNXGraph + from paddle2onnx.passes import PassManager + + opset_version = 10 + enable_onnx_checker = True + verbose = False + + paddle_graph = PaddleGraph.build_from_program(program, feed_vars, + fetch_vars, + paddle.fluid.global_scope()) + + onnx_graph = ONNXGraph.build(paddle_graph, opset_version, verbose) + onnx_graph = PassManager.run_pass(onnx_graph, ['inplace_node_pass']) + + onnx_proto = onnx_graph.export_proto(enable_onnx_checker) + + try: + # mkdir may conflict if pserver and trainer are running on the same machine + dirname = os.path.dirname(path_prefix) + os.makedirs(dirname) + except OSError as e: + if e.errno != errno.EEXIST: + raise + model_path = path_prefix + ".onnx" + if os.path.isdir(model_path): + raise ValueError("'{}' is an existing directory.".format(model_path)) + + with open(model_path, 'wb') as f: + f.write(onnx_proto.SerializeToString()) + + +def export(args): + checkpoint = Checkpoint( + rank=0, + world_size=1, + embedding_size=args.embedding_size, + num_classes=None, + checkpoint_dir=args.checkpoint_dir, ) + + test_program = paddle.static.Program() + startup_program = paddle.static.Program() + + test_model = StaticModel( + main_program=test_program, + startup_program=startup_program, + backbone_class_name=args.backbone, + embedding_size=args.embedding_size, + mode='test', ) + + gpu_id = int(os.getenv("FLAGS_selected_gpus", 0)) + place = paddle.CUDAPlace(gpu_id) + exe = paddle.static.Executor(place) + exe.run(startup_program) + + checkpoint.load(program=test_program, for_train=False, dtype='float32') + print("Load checkpoint from 
'{}'.".format(args.checkpoint_dir)) + + path = os.path.join(args.output_dir, args.backbone) + if args.export_type == 'onnx': + feed_vars = [test_model.backbone.input_dict['image'].name] + fetch_vars = [test_model.backbone.output_dict['feature']] + export_onnx(path, feed_vars, fetch_vars, exe, program=test_program) + else: + feed_vars = [test_model.backbone.input_dict['image']] + fetch_vars = [test_model.backbone.output_dict['feature']] + paddle.static.save_inference_model( + path, feed_vars, fetch_vars, exe, program=test_program) + print("Save exported model to '{}'.".format(args.output_dir)) diff --git a/recognition/arcface_paddle/static/static_model.py b/recognition/arcface_paddle/static/static_model.py new file mode 100644 index 0000000..fe8f82a --- /dev/null +++ b/recognition/arcface_paddle/static/static_model.py @@ -0,0 +1,159 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import time +import os +import sys +import numpy as np + +import paddle +from visualdl import LogWriter + +from utils.logging import AverageMeter, init_logging, CallBackLogging +from utils import losses + +from .utils.optimization_pass import gather_optimization_pass, amp_pass + +from . import classifiers +from . 
import backbones + + +class StaticModel(object): + def __init__(self, + main_program, + startup_program, + backbone_class_name, + embedding_size, + classifier_class_name=None, + num_classes=None, + sample_ratio=0.1, + lr_scheduler=None, + momentum=0.9, + weight_decay=2e-4, + dropout=0.4, + mode='train', + fp16=False, + fp16_configs=None, + margin_loss_params=None): + + rank = int(os.getenv("PADDLE_TRAINER_ID", 0)) + world_size = int(os.getenv("PADDLE_TRAINERS_NUM", 1)) + if world_size > 1: + import paddle.distributed.fleet as fleet + + self.main_program = main_program + self.startup_program = startup_program + self.backbone_class_name = backbone_class_name + self.embedding_size = embedding_size + self.classifier_class_name = classifier_class_name + self.num_classes = num_classes + self.sample_ratio = sample_ratio + self.lr_scheduler = lr_scheduler + self.momentum = momentum + self.weight_decay = weight_decay + self.mode = mode + self.fp16 = fp16 + self.fp16_configs = fp16_configs + self.margin_loss_params = margin_loss_params + + if self.mode == 'train': + assert self.classifier_class_name is not None + assert self.num_classes is not None + assert self.lr_scheduler is not None + assert self.margin_loss_params is not None + with paddle.static.program_guard(self.main_program, + self.startup_program): + with paddle.utils.unique_name.guard(): + self.backbone = eval("backbones.{}".format( + self.backbone_class_name))( + num_features=self.embedding_size, + is_train=True, + fp16=self.fp16, + dropout=dropout) + assert 'label' in self.backbone.input_dict + assert 'feature' in self.backbone.output_dict + self.classifier = eval("classifiers.{}".format( + self.classifier_class_name))( + feature=self.backbone.output_dict['feature'], + label=self.backbone.input_dict['label'], + rank=rank, + world_size=world_size, + num_classes=self.num_classes, + margin1=self.margin_loss_params.margin1, + margin2=self.margin_loss_params.margin2, + margin3=self.margin_loss_params.margin3, + 
scale=self.margin_loss_params.scale, + sample_ratio=self.sample_ratio, + embedding_size=self.embedding_size) + assert 'loss' in self.classifier.output_dict + + self.optimizer = paddle.optimizer.Momentum( + learning_rate=self.lr_scheduler, + momentum=self.momentum, + weight_decay=paddle.regularizer.L2Decay( + self.weight_decay)) + if self.fp16: + assert self.fp16_configs is not None + self.optimizer = paddle.static.amp.decorate( + optimizer=self.optimizer, + init_loss_scaling=self.fp16_configs[ + 'init_loss_scaling'], + incr_every_n_steps=self.fp16_configs[ + 'incr_every_n_steps'], + decr_every_n_nan_or_inf=self.fp16_configs[ + 'decr_every_n_nan_or_inf'], + incr_ratio=self.fp16_configs['incr_ratio'], + decr_ratio=self.fp16_configs['decr_ratio'], + use_dynamic_loss_scaling=self.fp16_configs[ + 'use_dynamic_loss_scaling'], + use_pure_fp16=self.fp16_configs['use_pure_fp16'], + amp_lists=paddle.static.amp. + AutoMixedPrecisionLists( + custom_white_list=self.fp16_configs[ + 'custom_white_list'], + custom_black_list=self.fp16_configs[ + 'custom_black_list'], ), + use_fp16_guard=False) + + if world_size > 1: + dist_optimizer = fleet.distributed_optimizer( + self.optimizer) + dist_optimizer.minimize(self.classifier.output_dict[ + 'loss']) + else: + self.optimizer.minimize(self.classifier.output_dict[ + 'loss']) + if self.fp16: + self.optimizer = self.optimizer._optimizer + if self.sample_ratio < 1.0: + gather_optimization_pass(self.main_program, + 'dist@fc@rank') + if self.fp16: + amp_pass(self.main_program, 'dist@fc@rank') + + elif self.mode == 'test': + with paddle.static.program_guard(self.main_program, + self.startup_program): + with paddle.utils.unique_name.guard(): + self.backbone = eval("backbones.{}".format( + self.backbone_class_name))( + num_features=self.embedding_size, + is_train=False, + fp16=self.fp16, + dropout=dropout) + assert 'feature' in self.backbone.output_dict + + else: + raise ValueError( + "mode is invalid; only 'train' and 'test' are supported.") diff 
--git a/recognition/arcface_paddle/static/train.py b/recognition/arcface_paddle/static/train.py new file mode 100644 index 0000000..27b0482 --- /dev/null +++ b/recognition/arcface_paddle/static/train.py @@ -0,0 +1,220 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import time +import os +import sys +import numpy as np +import logging + +import paddle +from visualdl import LogWriter + +from utils.logging import AverageMeter, CallBackLogging +from datasets import CommonDataset, SyntheticDataset +from utils import losses + +from .utils.verification import CallBackVerification +from .utils.io import Checkpoint + +from . import classifiers +from . 
import backbones +from .static_model import StaticModel + +RELATED_FLAGS_SETTING = { + 'FLAGS_cudnn_exhaustive_search': 1, + 'FLAGS_cudnn_batchnorm_spatial_persistent': 1, + 'FLAGS_max_inplace_grad_add': 8, + 'FLAGS_fraction_of_gpu_memory_to_use': 0.9999, +} +paddle.fluid.set_flags(RELATED_FLAGS_SETTING) + + +def train(args): + + writer = LogWriter(logdir=args.logdir) + + rank = int(os.getenv("PADDLE_TRAINER_ID", 0)) + world_size = int(os.getenv("PADDLE_TRAINERS_NUM", 1)) + + gpu_id = int(os.getenv("FLAGS_selected_gpus", 0)) + place = paddle.CUDAPlace(gpu_id) + + if world_size > 1: + import paddle.distributed.fleet as fleet + strategy = fleet.DistributedStrategy() + strategy.without_graph_optimization = True + fleet.init(is_collective=True, strategy=strategy) + + if args.use_synthetic_dataset: + trainset = SyntheticDataset(args.num_classes, fp16=args.fp16) + else: + trainset = CommonDataset( + root_dir=args.data_dir, + label_file=args.label_file, + fp16=args.fp16, + is_bin=args.is_bin) + + num_image = len(trainset) + total_batch_size = args.batch_size * world_size + steps_per_epoch = num_image // total_batch_size + if args.train_unit == 'epoch': + warmup_steps = steps_per_epoch * args.warmup_num + total_steps = steps_per_epoch * args.train_num + decay_steps = [x * steps_per_epoch for x in args.decay_boundaries] + total_epoch = args.train_num + else: + warmup_steps = args.warmup_num + total_steps = args.train_num + decay_steps = [x for x in args.decay_boundaries] + total_epoch = (total_steps + steps_per_epoch - 1) // steps_per_epoch + + if rank == 0: + logging.info('world_size: {}'.format(world_size)) + logging.info('total_batch_size: {}'.format(total_batch_size)) + logging.info('warmup_steps: {}'.format(warmup_steps)) + logging.info('steps_per_epoch: {}'.format(steps_per_epoch)) + logging.info('total_steps: {}'.format(total_steps)) + logging.info('total_epoch: {}'.format(total_epoch)) + logging.info('decay_steps: {}'.format(decay_steps)) + + base_lr = 
total_batch_size * args.lr / 512 + lr_scheduler = paddle.optimizer.lr.PiecewiseDecay( + boundaries=decay_steps, + values=[ + base_lr * (args.lr_decay**i) for i in range(len(decay_steps) + 1) + ]) + if warmup_steps > 0: + lr_scheduler = paddle.optimizer.lr.LinearWarmup( + lr_scheduler, warmup_steps, 0, base_lr) + + train_program = paddle.static.Program() + test_program = paddle.static.Program() + startup_program = paddle.static.Program() + + margin_loss_params = eval("losses.{}".format(args.loss))() + train_model = StaticModel( + main_program=train_program, + startup_program=startup_program, + backbone_class_name=args.backbone, + embedding_size=args.embedding_size, + classifier_class_name=args.classifier, + num_classes=args.num_classes, + sample_ratio=args.sample_ratio, + lr_scheduler=lr_scheduler, + momentum=args.momentum, + weight_decay=args.weight_decay, + dropout=args.dropout, + mode='train', + fp16=args.fp16, + fp16_configs={ + 'init_loss_scaling': args.init_loss_scaling, + 'incr_every_n_steps': args.incr_every_n_steps, + 'decr_every_n_nan_or_inf': args.decr_every_n_nan_or_inf, + 'incr_ratio': args.incr_ratio, + 'decr_ratio': args.decr_ratio, + 'use_dynamic_loss_scaling': args.use_dynamic_loss_scaling, + 'use_pure_fp16': args.fp16, + 'custom_white_list': args.custom_white_list, + 'custom_black_list': args.custom_black_list, + }, + margin_loss_params=margin_loss_params, ) + + if rank == 0: + with open(os.path.join(args.output, 'main_program.txt'), 'w') as f: + f.write(str(train_program)) + + if rank == 0 and args.do_validation_while_train: + test_model = StaticModel( + main_program=test_program, + startup_program=startup_program, + backbone_class_name=args.backbone, + embedding_size=args.embedding_size, + dropout=args.dropout, + mode='test', + fp16=args.fp16, ) + + callback_verification = CallBackVerification( + args.validation_interval_step, rank, args.batch_size, test_program, + list(test_model.backbone.input_dict.values()), + 
list(test_model.backbone.output_dict.values()), args.val_targets, + args.data_dir) + + callback_logging = CallBackLogging(args.log_interval_step, rank, + world_size, total_steps, + args.batch_size, writer) + checkpoint = Checkpoint( + rank=rank, + world_size=world_size, + embedding_size=args.embedding_size, + num_classes=args.num_classes, + model_save_dir=os.path.join(args.output, args.backbone), + checkpoint_dir=args.checkpoint_dir, + max_num_last_checkpoint=args.max_num_last_checkpoint) + + exe = paddle.static.Executor(place) + exe.run(startup_program) + + start_epoch = 0 + global_step = 0 + loss_avg = AverageMeter() + if args.resume: + extra_info = checkpoint.load(program=train_program, for_train=True) + start_epoch = extra_info['epoch'] + 1 + lr_state = extra_info['lr_state'] + # here last_epoch means last_step for PiecewiseDecay, + # since we always use step style for lr_scheduler + global_step = lr_state['last_epoch'] + train_model.lr_scheduler.set_state_dict(lr_state) + + train_loader = paddle.io.DataLoader( + trainset, + feed_list=list(train_model.backbone.input_dict.values()), + places=place, + return_list=False, + num_workers=args.num_workers, + batch_sampler=paddle.io.DistributedBatchSampler( + dataset=trainset, + batch_size=args.batch_size, + shuffle=True, + drop_last=True)) + + max_loss_scaling = np.array([args.max_loss_scaling]).astype(np.float32) + for epoch in range(start_epoch, total_epoch): + for step, data in enumerate(train_loader): + global_step += 1 + + loss_v = exe.run( + train_program, + feed=data, + fetch_list=[train_model.classifier.output_dict['loss']], + use_program_cache=True) + + loss_avg.update(np.array(loss_v)[0], 1) + lr_value = train_model.optimizer.get_lr() + callback_logging(global_step, loss_avg, epoch, lr_value) + if rank == 0 and args.do_validation_while_train: + callback_verification(global_step) + train_model.lr_scheduler.step() + + if global_step >= total_steps: + break + sys.stdout.flush() + + checkpoint.save( + 
train_program, + lr_scheduler=train_model.lr_scheduler, + epoch=epoch, + for_train=True) + writer.close() diff --git a/recognition/arcface_paddle/static/utils/__init__.py b/recognition/arcface_paddle/static/utils/__init__.py new file mode 100644 index 0000000..185a92b --- /dev/null +++ b/recognition/arcface_paddle/static/utils/__init__.py @@ -0,0 +1,13 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. diff --git a/recognition/arcface_paddle/static/utils/io.py b/recognition/arcface_paddle/static/utils/io.py new file mode 100644 index 0000000..29f70b6 --- /dev/null +++ b/recognition/arcface_paddle/static/utils/io.py @@ -0,0 +1,198 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
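The `Checkpoint` class added below writes sharded distributed-FC parameters (names containing both `dist@` and `@rank@`) on every rank, and all other parameters on rank 0 only. A minimal sketch of that save predicate (the helper name and the example parameter names are illustrative, not part of the diff):

```python
def should_save(name, rank):
    """Return True if this rank should persist the parameter `name`.

    Sharded distributed-FC params (named like 'dist@arcface@rank@00000.w_0')
    live on every rank, so each rank writes its own shard; everything else is
    replicated and is written by rank 0 only.
    """
    is_dist_shard = 'dist@' in name and '@rank@' in name
    return is_dist_shard or rank == 0
```

This mirrors the condition `'dist@' in name and '@rank@' in name or self.rank == 0` used in `save()`; note that `and` binds tighter than `or`, so the rank-0 clause only applies to non-sharded parameters.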
+ +import errno +import os +import paddle +import logging +import numpy as np +import shutil +import json +from utils.rearrange_weight import rearrange_weight + + +class Checkpoint(object): + def __init__(self, + rank, + world_size, + embedding_size, + num_classes, + model_save_dir="./", + checkpoint_dir=None, + max_num_last_checkpoint=3): + + self.rank: int = rank + self.world_size: int = world_size + self.embedding_size: int = embedding_size + self.num_classes: int = num_classes + self.model_save_dir: str = model_save_dir + self.checkpoint_dir: str = checkpoint_dir + self.max_num_last_checkpoint: int = max_num_last_checkpoint + + def save(self, program, lr_scheduler=None, epoch=0, for_train=True): + model_save_dir = os.path.join(self.model_save_dir, str(epoch)) + if not os.path.exists(model_save_dir): + # more than one process may be trying + # to create the directory + try: + os.makedirs(model_save_dir) + except OSError as exc: + if exc.errno != errno.EEXIST: + raise + pass + + param_state_dict = program.state_dict(mode='param') + for name, param in param_state_dict.items(): + # non-dist params are saved only at rank 0, + # but dist params need to be saved at all ranks. + if 'dist@' in name and '@rank@' in name or self.rank == 0: + paddle.save(param, + os.path.join(model_save_dir, name + '.pdparam')) + + if for_train: + opt_state_dict = program.state_dict(mode='opt') + for name, opt in opt_state_dict.items(): + if '@GRAD' in name: + continue + # non-dist opt vars are saved only at rank 0, + # but dist opt vars need to be saved at all ranks. + if 'dist@' in name and '@rank@' in name or self.rank == 0: + paddle.save(opt, + os.path.join(model_save_dir, name + '.pdopt')) + + if self.rank == 0: + # save some extra info for resume: + # pretrain_world_size, embedding_size and num_classes are used to + # re-split the fc weight when the gpu setting changes; + # epoch is used to restart. 
+ config_file = os.path.join(model_save_dir, 'meta.json') + extra_info = dict() + extra_info["pretrain_world_size"] = self.world_size + extra_info["embedding_size"] = self.embedding_size + extra_info['num_classes'] = self.num_classes + extra_info['epoch'] = epoch + extra_info['lr_state'] = lr_scheduler.state_dict() + with open(config_file, 'w') as f: + json.dump(extra_info, f) + + logging.info("Save model to {}.".format(model_save_dir)) + if self.rank == 0 and self.max_num_last_checkpoint > 0: + for idx in range(-1, epoch - self.max_num_last_checkpoint + 1): + path = os.path.join(self.model_save_dir, str(idx)) + if os.path.exists(path): + logging.info("Remove checkpoint {}.".format(path)) + shutil.rmtree(path) + + def load(self, program, for_train=True, dtype=None): + assert os.path.exists(self.checkpoint_dir) + checkpoint_dir = os.path.abspath(self.checkpoint_dir) + + state_dict = {} + dist_weight_state_dict = {} + dist_weight_velocity_state_dict = {} + dist_bias_state_dict = {} + dist_bias_velocity_state_dict = {} + for path in os.listdir(checkpoint_dir): + path = os.path.join(checkpoint_dir, path) + if not os.path.isfile(path): + continue + + basename = os.path.basename(path) + name, ext = os.path.splitext(basename) + + if ext not in ['.pdopt', '.pdparam']: + continue + + if not for_train and ext == '.pdopt': + continue + + tensor = paddle.load(path, return_numpy=True) + if dtype: + assert dtype in ['float32', 'float16'] + tensor = tensor.astype('float32') + + if 'dist@' in name and '@rank@' in name: + if '.w' in name and 'velocity' not in name: + dist_weight_state_dict[name] = tensor + elif '.w' in name and 'velocity' in name: + dist_weight_velocity_state_dict[name] = tensor + elif '.b' in name and 'velocity' not in name: + dist_bias_state_dict[name] = tensor + elif '.b' in name and 'velocity' in name: + dist_bias_velocity_state_dict[name] = tensor + + else: + state_dict[name] = tensor + + if for_train: + meta_file = os.path.join(checkpoint_dir, 'meta.json') + 
if not os.path.exists(meta_file): + logging.error( + "Please make sure the checkpoint dir {} exists, and " + "parameters in that dir are valid.".format( + checkpoint_dir)) + exit() + + with open(meta_file, 'r') as handle: + extra_info = json.load(handle) + + # Preprocess distributed parameters. + if self.world_size > 1: + pretrain_world_size = extra_info['pretrain_world_size'] + assert pretrain_world_size > 0 + embedding_size = extra_info['embedding_size'] + assert embedding_size == self.embedding_size + num_classes = extra_info['num_classes'] + assert num_classes == self.num_classes + + logging.info( + "Parameters for pre-training: pretrain_world_size ({}), " + "embedding_size ({}), and num_classes ({}).".format( + pretrain_world_size, embedding_size, num_classes)) + logging.info("Parameters for inference or fine-tuning: " + "world_size ({}).".format(self.world_size)) + + rank_str = '%05d' % self.rank + + dist_weight_state_dict = rearrange_weight( + dist_weight_state_dict, pretrain_world_size, self.world_size) + dist_bias_state_dict = rearrange_weight( + dist_bias_state_dict, pretrain_world_size, self.world_size) + for name, value in dist_weight_state_dict.items(): + if rank_str in name: + state_dict[name] = value + for name, value in dist_bias_state_dict.items(): + if rank_str in name: + state_dict[name] = value + + if for_train: + dist_weight_velocity_state_dict = rearrange_weight( + dist_weight_velocity_state_dict, pretrain_world_size, + self.world_size) + dist_bias_velocity_state_dict = rearrange_weight( + dist_bias_velocity_state_dict, pretrain_world_size, + self.world_size) + for name, value in dist_weight_velocity_state_dict.items(): + if rank_str in name: + state_dict[name] = value + for name, value in dist_bias_velocity_state_dict.items(): + if rank_str in name: + state_dict[name] = value + + program.set_state_dict(state_dict) + logging.info("Load checkpoint from '{}'. 
".format(checkpoint_dir)) + if for_train: + return extra_info + else: + return {} diff --git a/recognition/arcface_paddle/static/utils/optimization_pass.py b/recognition/arcface_paddle/static/utils/optimization_pass.py new file mode 100644 index 0000000..817fb6f --- /dev/null +++ b/recognition/arcface_paddle/static/utils/optimization_pass.py @@ -0,0 +1,124 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +def check_contains(name, name_list): + for n in name_list: + if name in n: + return True + return False + + +def gather_optimization_pass(program, weight_name): + op_idxs = [] + gather_grad_op = None + momentum_op = None + for idx, op in enumerate(program.global_block().ops): + if (op.type == 'gather_grad' or + op.type == 'momentum') and check_contains(weight_name, + op.input_arg_names): + op_idxs.append(idx) + if op.type == 'momentum': + momentum_op = op + if op.type == 'gather_grad': + gather_grad_op = op + + if gather_grad_op is not None and momentum_op is not None: + inputs = { + 'Param': momentum_op.input('Param'), + 'Velocity': momentum_op.input('Velocity'), + 'LearningRate': momentum_op.input('LearningRate'), + 'Grad': gather_grad_op.input('Out@GRAD'), + 'Index': gather_grad_op.input('Index'), + 'Axis': gather_grad_op.input('Axis'), + } + outputs = { + 'ParamOut': momentum_op.output('ParamOut'), + 'VelocityOut': momentum_op.output('VelocityOut'), + } + if 'MasterParam' in 
momentum_op.input_names and len( + momentum_op.input('MasterParam')) > 0: + inputs['MasterParam'] = momentum_op.input('MasterParam') + if 'MasterParamOut' in momentum_op.output_names and len( + momentum_op.output('MasterParamOut')) > 0: + outputs['MasterParamOut'] = momentum_op.output('MasterParamOut') + + attrs = { + 'mu': momentum_op.attr('mu'), + 'use_nesterov': momentum_op.attr('use_nesterov'), + 'regularization_method': momentum_op.attr('regularization_method'), + 'regularization_coeff': momentum_op.attr('regularization_coeff'), + 'multi_precision': momentum_op.attr('multi_precision'), + 'rescale_grad': momentum_op.attr('rescale_grad'), + 'op_device': momentum_op.attr('op_device'), + 'op_namescope': momentum_op.attr('op_namescope'), + 'op_role': momentum_op.attr('op_role'), + 'op_role_var': momentum_op.input('Param'), + 'axis': gather_grad_op.attr('axis'), + } + program.global_block()._insert_op( + op_idxs[-1] + 1, + type='sparse_momentum', + inputs=inputs, + outputs=outputs, + attrs=attrs) + + for idx in reversed(op_idxs): + program.global_block()._remove_op(idx, sync=False) + + var_names = [] + for idx, name in enumerate(program.global_block().vars): + if '@GRAD' in name and weight_name in name: + var_names.append(name) + for name in var_names: + program.global_block()._remove_var(name, sync=False) + program.global_block()._sync_with_cpp() + + +def amp_pass(program, weight_name): + for idx, op in enumerate(program.global_block().ops): + if (op.type == 'update_loss_scaling' or + op.type == 'check_finite_and_unscale'): + input_idxs = [] + input_arg_names = op.input("X") + # input_arg_names.append(gather_grad_op.input('Out@GRAD')[0]) + for i, name in enumerate(input_arg_names): + if '@GRAD' in name and weight_name in name: + input_idxs.append(i) + if len(input_idxs) > 0: + for i in reversed(input_idxs): + input_arg_names.pop(i) + op.desc.set_input("X", input_arg_names) + + output_idxs = [] + output_arg_names = op.output("Out") + # 
output_arg_names.append(gather_grad_op.input('Out@GRAD')[0]) + for i, name in enumerate(output_arg_names): + if '@GRAD' in name and weight_name in name: + output_idxs.append(i) + if len(output_idxs) > 0: + for i in reversed(output_idxs): + output_arg_names.pop(i) + op.desc.set_output("Out", output_arg_names) + + if op.type == 'check_finite_and_unscale': + op_role_idxs = [] + op_role_var = op.attr("op_role_var") + for i, name in enumerate(op_role_var): + if '@GRAD' in name and weight_name in name: + op_role_idxs.append(i) + if len(op_role_idxs) > 0: + for i in reversed(op_role_idxs): + op_role_var.pop(i) + op.desc._set_attr("op_role_var", op_role_var) diff --git a/recognition/arcface_paddle/static/utils/verification.py b/recognition/arcface_paddle/static/utils/verification.py new file mode 100644 index 0000000..102ebf9 --- /dev/null +++ b/recognition/arcface_paddle/static/utils/verification.py @@ -0,0 +1,130 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
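The `amp_pass` above detaches the sharded-FC gradients from the loss-scaling ops (`check_finite_and_unscale`, `update_loss_scaling`) purely by name matching. A hedged sketch of that name filter (the helper name is illustrative, not part of the diff):

```python
def filter_fc_grads(arg_names, weight_name):
    """Split an op's tensor names into (kept, dropped), dropping gradients
    that belong to the sharded FC weight, the way amp_pass prunes the
    input/output lists of the loss-scaling ops."""
    dropped = [n for n in arg_names if '@GRAD' in n and weight_name in n]
    kept = [n for n in arg_names if n not in dropped]
    return kept, dropped
```

For example, with `weight_name='dist@fc@rank@00000.w'`, the gradient `'dist@fc@rank@00000.w@GRAD'` is dropped from loss scaling while an ordinary `'conv1.w_0@GRAD'` is kept.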
+ +import time +import os +import numpy as np +import sklearn.preprocessing +import paddle +import logging +from typing import List + +from utils.verification import evaluate +from datasets import load_bin + + +def test(rank, batch_size, data_set, executor, test_program, data_feeder, + fetch_list): + + data_list = data_set[0] + issame_list = data_set[1] + embeddings_list = [] + + # data_list[0] for normalize + # data_list[1] for flip_left_right + for i in range(len(data_list)): + data = data_list[i] + embeddings = None + ba = 0 + while ba < data.shape[0]: + bb = min(ba + batch_size, data.shape[0]) + count = bb - ba + _data = [] + for k in range(bb - batch_size, bb): + _data.append((data[k], )) + [_embeddings] = executor.run(test_program, + fetch_list=fetch_list, + feed=data_feeder.feed(_data), + use_program_cache=True) + if embeddings is None: + embeddings = np.zeros((data.shape[0], _embeddings.shape[1])) + embeddings[ba:bb, :] = _embeddings[(batch_size - count):, :] + ba = bb + embeddings_list.append(embeddings) + + xnorm = 0.0 + xnorm_cnt = 0 + for embed in embeddings_list: + xnorm += np.sqrt((embed * embed).sum(axis=1)).sum(axis=0) + xnorm_cnt += embed.shape[0] + xnorm /= xnorm_cnt + + embeddings = embeddings_list[0] + embeddings_list[1] + embeddings = sklearn.preprocessing.normalize(embeddings) + _, _, accuracy, val, val_std, far = evaluate( + embeddings, issame_list, nrof_folds=10) + acc, std = np.mean(accuracy), np.std(accuracy) + return acc, std, xnorm + + +class CallBackVerification(object): + def __init__(self, + frequent, + rank, + batch_size, + test_program, + feed_list, + fetch_list, + val_targets, + rec_prefix, + image_size=(112, 112)): + self.frequent: int = frequent + self.rank: int = rank + self.batch_size: int = batch_size + + self.test_program: paddle.static.Program = test_program + self.feed_list: List[paddle.fluid.framework.Variable] = feed_list + self.fetch_list: List[paddle.fluid.framework.Variable] = fetch_list + + self.highest_acc_list: List[float] = [0.0] * len(val_targets) + 
self.ver_list: List[object] = [] + self.ver_name_list: List[str] = [] + self.init_dataset( + val_targets=val_targets, + data_dir=rec_prefix, + image_size=image_size) + + gpu_id = int(os.getenv("FLAGS_selected_gpus", 0)) + place = paddle.CUDAPlace(gpu_id) + self.executor = paddle.static.Executor(place) + self.data_feeder = paddle.fluid.DataFeeder( + place=place, feed_list=self.feed_list, program=self.test_program) + + def ver_test(self, global_step: int): + for i in range(len(self.ver_list)): + test_start = time.time() + acc2, std2, xnorm = test( + self.rank, self.batch_size, self.ver_list[i], self.executor, + self.test_program, self.data_feeder, self.fetch_list) + logging.info('[%s][%d]XNorm: %f' % + (self.ver_name_list[i], global_step, xnorm)) + logging.info('[%s][%d]Accuracy-Flip: %1.5f+-%1.5f' % + (self.ver_name_list[i], global_step, acc2, std2)) + if acc2 > self.highest_acc_list[i]: + self.highest_acc_list[i] = acc2 + logging.info('[%s][%d]Accuracy-Highest: %1.5f' % ( + self.ver_name_list[i], global_step, self.highest_acc_list[i])) + test_end = time.time() + logging.info("test time: {:.4f}".format(test_end - test_start)) + + def init_dataset(self, val_targets, data_dir, image_size): + for name in val_targets: + path = os.path.join(data_dir, name + ".bin") + if os.path.exists(path): + data_set = load_bin(path, image_size) + self.ver_list.append(data_set) + self.ver_name_list.append(name) + + def __call__(self, num_update): + if self.rank == 0 and num_update > 0 and num_update % self.frequent == 0: + self.ver_test(num_update) diff --git a/recognition/arcface_paddle/static/validation.py b/recognition/arcface_paddle/static/validation.py new file mode 100644 index 0000000..0ab6ac6 --- /dev/null +++ b/recognition/arcface_paddle/static/validation.py @@ -0,0 +1,58 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import errno +import os +import numpy as np +import paddle + +from .utils.verification import CallBackVerification +from .utils.io import Checkpoint +from .static_model import StaticModel + +from . import backbones + + +def validation(args): + checkpoint = Checkpoint( + rank=0, + world_size=1, + embedding_size=args.embedding_size, + num_classes=None, + checkpoint_dir=args.checkpoint_dir, ) + + test_program = paddle.static.Program() + startup_program = paddle.static.Program() + + test_model = StaticModel( + main_program=test_program, + startup_program=startup_program, + backbone_class_name=args.backbone, + embedding_size=args.embedding_size, + mode='test', ) + + gpu_id = int(os.getenv("FLAGS_selected_gpus", 0)) + place = paddle.CUDAPlace(gpu_id) + exe = paddle.static.Executor(place) + exe.run(startup_program) + + checkpoint.load(program=test_program, for_train=False) + + callback_verification = CallBackVerification( + 1, 0, args.batch_size, test_program, + list(test_model.backbone.input_dict.values()), + list(test_model.backbone.output_dict.values()), args.val_targets, + args.data_dir) + + callback_verification(1) diff --git a/recognition/arcface_paddle/test_time.py b/recognition/arcface_paddle/tools/benchmark_speed.py similarity index 100% rename from recognition/arcface_paddle/test_time.py rename to recognition/arcface_paddle/tools/benchmark_speed.py diff --git a/recognition/arcface_paddle/tools/export.py b/recognition/arcface_paddle/tools/export.py new file mode 100644 index 0000000..ace61aa --- /dev/null +++ 
b/recognition/arcface_paddle/tools/export.py @@ -0,0 +1,72 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import sys +import os +sys.path.insert(0, os.path.abspath('.')) + +import argparse + + +def str2bool(v): + return str(v).lower() in ("true", "t", "1") + + +def parse_args(): + parser = argparse.ArgumentParser(description='Paddle Face Exporter') + + # Model setting + parser.add_argument( + '--is_static', + type=str2bool, + default='False', + help='whether to use static mode') + parser.add_argument( + '--export_type', + type=str, + default='paddle', + help='export type, paddle or onnx') + parser.add_argument( + '--backbone', + type=str, + default='FresResNet50', + help='backbone network') + parser.add_argument( + '--embedding_size', type=int, default=512, help='embedding size') + parser.add_argument( + '--checkpoint_dir', + type=str, + default='MS1M_v3_arcface/FresResNet50/24/', + help='checkpoint directory') + parser.add_argument( + '--output_dir', + type=str, + default='MS1M_v3_arcface/FresResNet50/exported_model', + help='export output directory') + + args = parser.parse_args() + return args + + +if __name__ == '__main__': + args = parse_args() + if args.is_static: + import paddle + paddle.enable_static() + from static.export import export + else: + from dynamic.export import export + + assert args.export_type in ['paddle', 'onnx'] + export(args) diff --git 
a/recognition/arcface_paddle/tools/extract_perf_logs.py b/recognition/arcface_paddle/tools/extract_perf_logs.py new file mode 100644 index 0000000..933880b --- /dev/null +++ b/recognition/arcface_paddle/tools/extract_perf_logs.py @@ -0,0 +1,153 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +import re +import sys +import glob +import json +import argparse +import pprint + +import numpy as np + +pp = pprint.PrettyPrinter(indent=1) + +parser = argparse.ArgumentParser(description="flags for benchmark") +parser.add_argument("--log_dir", type=str, default="./logs/", required=True) +parser.add_argument( + "--output_dir", type=str, default="./logs/", required=False) +parser.add_argument('--warmup_batches', type=int, default=50) +parser.add_argument('--train_batches', type=int, default=150) + +args = parser.parse_args() + + +class AutoVivification(dict): + """Implementation of perl's autovivification feature.""" + + def __getitem__(self, item): + try: + return dict.__getitem__(self, item) + except KeyError: + value = self[item] = type(self)() + return value + + +def compute_median(iter_dict): + speed_list = [i for i in iter_dict.values()] + return round(np.median(speed_list), 2) + + +def compute_average(iter_dict): + i = 0 + total_speed = 0 + for iter in iter_dict: + i += 1 + total_speed += iter_dict[iter] + return round(total_speed / i, 2) + + +def extract_info_from_file(log_file, result_dict, 
speed_dict): + # extract info from file name + exp_config = log_file.split("/")[-2] + model = exp_config.split("_")[2] + mode = exp_config.split("_")[3] + precision = exp_config.split("_")[4] + batch_size_per_device = exp_config.split("_")[6] + run_case = exp_config.split("_")[7] # eg: 1n1g + test_iter = int(exp_config.split("_")[8][2:]) + node_num = int(run_case[0]) + if len(run_case) == 4: + card_num = int(run_case[-2]) + elif len(run_case) == 5: + card_num = int(run_case[-3:-1]) + + avg_speed_list = [] + # extract info from file content + with open(log_file) as f: + lines = f.readlines() + for line in lines: + if "throughput:" in line: + p1 = re.compile(r" throughput: ([0-9]+\.[0-9]+)", re.S) + item = re.findall(p1, line) + a = float(item[0].strip()) + avg_speed_list.append(a) + + # compute avg throughput + avg_speed = round( + np.mean(avg_speed_list[args.warmup_batches:args.train_batches]), 2) + + speed_dict[mode][model][run_case][precision][batch_size_per_device][ + test_iter] = avg_speed + average_speed = compute_average(speed_dict[mode][model][run_case][ + precision][batch_size_per_device]) + median_speed = compute_median(speed_dict[mode][model][run_case][precision][ + batch_size_per_device]) + + result_dict[mode][model][run_case][precision][batch_size_per_device][ + 'average_speed'] = average_speed + result_dict[mode][model][run_case][precision][batch_size_per_device][ + 'median_speed'] = median_speed + + # print(log_file, speed_dict[mode][model][run_case]) + + +def compute_speedup(result_dict, speed_dict): + mode_list = [key for key in result_dict] # eg. 
['static', 'dynamic'] + for md in mode_list: + model_list = [key for key in result_dict[md]] # eg.['vgg16', 'r50'] + for m in model_list: + run_case = [key for key in result_dict[md][m] + ] # eg.['4n8g', '2n8g', '1n8g', '1n4g', '1n1g'] + for d in run_case: + precision = [key for key in result_dict[md][m][d]] + for p in precision: + batch_size_per_device = [ + key for key in result_dict[md][m][d][p] + ] + for b in batch_size_per_device: + speed_up = 1.0 + if result_dict[md][m]['1n1g'][p][b]['median_speed']: + speed_up = result_dict[md][m][d][p][b][ + 'median_speed'] / result_dict[md][m]['1n1g'][ + p][b]['median_speed'] + result_dict[md][m][d][p][b]['speedup'] = round( + speed_up, 2) + + +def extract_result(): + result_dict = AutoVivification() + speed_dict = AutoVivification() + logs_list = glob.glob(os.path.join(args.log_dir, "*/workerlog.0")) + for l in logs_list: + extract_info_from_file(l, result_dict, speed_dict) + + # compute speedup + compute_speedup(result_dict, speed_dict) + + # print result + pp.pprint(result_dict) + + # write to file as JSON format + os.makedirs(args.output_dir, exist_ok=True) + result_file_name = os.path.join(args.output_dir, + "arcface_paddle_result.json") + print("Saving result to {}".format(result_file_name)) + with open(result_file_name, 'w') as f: + json.dump(result_dict, f) + + +if __name__ == "__main__": + extract_result() diff --git a/recognition/arcface_paddle/tools/inference.py b/recognition/arcface_paddle/tools/inference.py new file mode 100644 index 0000000..e9f3365 --- /dev/null +++ b/recognition/arcface_paddle/tools/inference.py @@ -0,0 +1,107 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import sys +import os +import cv2 +import argparse +import numpy as np + +sys.path.insert(0, os.path.abspath('.')) + + +def parse_args(): + parser = argparse.ArgumentParser(description='Paddle Face Predictor') + + parser.add_argument( + '--export_type', type=str, help='export type, paddle or onnx') + parser.add_argument( + "--model_file", + type=str, + required=False, + help="paddle save inference model filename") + parser.add_argument( + "--params_file", + type=str, + required=False, + help="paddle save inference parameter filename") + parser.add_argument( + "--onnx_file", type=str, required=False, help="onnx model filename") + parser.add_argument("--image_path", type=str, help="path to test image") + + args = parser.parse_args() + return args + + +def paddle_inference(args): + import paddle.inference as paddle_infer + + config = paddle_infer.Config(args.model_file, args.params_file) + predictor = paddle_infer.create_predictor(config) + + input_names = predictor.get_input_names() + input_handle = predictor.get_input_handle(input_names[0]) + + img = cv2.imread(args.image_path) + # normalize to mean 0.5, std 0.5 + img = (img - 127.5) * 0.00784313725 + # BGR2RGB + img = img[:, :, ::-1] + img = img.transpose((2, 0, 1)) + img = np.expand_dims(img, 0) + img = img.astype('float32') + + input_handle.copy_from_cpu(img) + + predictor.run() + + output_names = predictor.get_output_names() + output_handle = predictor.get_output_handle(output_names[0]) + output_data = output_handle.copy_to_cpu() + + print('paddle inference result: ', output_data.shape) + + +def 
onnx_inference(args): + import onnxruntime + + ort_sess = onnxruntime.InferenceSession(args.onnx_file) + + img = cv2.imread(args.image_path) + # normalize to mean 0.5, std 0.5 + img = (img - 127.5) * 0.00784313725 + # BGR2RGB + img = img[:, :, ::-1] + img = img.transpose((2, 0, 1)) + img = np.expand_dims(img, 0) + img = img.astype('float32') + + ort_inputs = {ort_sess.get_inputs()[0].name: img} + ort_outs = ort_sess.run(None, ort_inputs) + + print('onnx inference result: ', ort_outs[0].shape) + + +if __name__ == '__main__': + + args = parse_args() + + assert args.export_type in ['paddle', 'onnx'] + if args.export_type == 'onnx': + assert os.path.exists(args.onnx_file) + onnx_inference(args) + else: + assert os.path.exists(args.model_file) + assert os.path.exists(args.params_file) + paddle_inference(args) diff --git a/recognition/arcface_paddle/tools/mx_recordio_2_images.py b/recognition/arcface_paddle/tools/mx_recordio_2_images.py new file mode 100644 index 0000000..fba25b9 --- /dev/null +++ b/recognition/arcface_paddle/tools/mx_recordio_2_images.py @@ -0,0 +1,82 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
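Both inference paths above repeat the same preprocessing (scale by 1/127.5 around mean 127.5, BGR to RGB, HWC to CHW, add a batch axis). It could be factored into one shared helper; a sketch:

```python
import numpy as np

def preprocess(img):
    """Shared preprocessing for both the Paddle and ONNX paths above:
    normalize to mean 0.5 / std 0.5 (1/127.5 ~= 0.00784313725),
    convert BGR to RGB, HWC to CHW, and add a batch dimension."""
    img = (img.astype('float32') - 127.5) * 0.00784313725
    img = img[:, :, ::-1]            # BGR -> RGB
    img = img.transpose((2, 0, 1))   # HWC -> CHW
    return np.expand_dims(img, 0)    # NCHW

# a dummy 112x112 mid-gray BGR image normalizes to all zeros
dummy = np.full((112, 112, 3), 127.5, dtype=np.float32)
print(preprocess(dummy).shape)  # (1, 3, 112, 112)
```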
+ +import os +import argparse +import numpy as np +import numbers +import mxnet as mx +import cv2 +import tqdm +import shutil + + +def main(args): + path_imgrec = os.path.join(args.root_dir, 'train.rec') + path_imgidx = os.path.join(args.root_dir, 'train.idx') + imgrec = mx.recordio.MXIndexedRecordIO(path_imgidx, path_imgrec, 'r') + s = imgrec.read_idx(0) + header, _ = mx.recordio.unpack(s) + if header.flag > 0: + header0 = (int(header.label[0]), int(header.label[1])) + imgidx = np.array(range(1, int(header.label[0]))) + else: + imgidx = np.array(list(imgrec.keys)) + + classes = set() + os.makedirs(os.path.join(args.output_dir, 'images'), exist_ok=True) + fp = open(os.path.join(args.output_dir, 'label.txt'), 'w') + for idx in tqdm.tqdm(imgidx): + s = imgrec.read_idx(idx) + header, img = mx.recordio.unpack(s) + label = header.label + if not isinstance(label, numbers.Number): + label = label[0] + img = mx.image.imdecode(img).asnumpy()[..., ::-1] + label = int(label) + classes.add(label) + + filename = 'images/%08d.jpg' % idx + fp.write('%s\t%d\n' % (filename, label)) + cv2.imwrite( + os.path.join(args.output_dir, filename), img, + [int(cv2.IMWRITE_JPEG_QUALITY), 100]) + fp.close() + shutil.copy( + os.path.join(args.root_dir, 'agedb_30.bin'), + os.path.join(args.output_dir, 'agedb_30.bin')) + shutil.copy( + os.path.join(args.root_dir, 'cfp_fp.bin'), + os.path.join(args.output_dir, 'cfp_fp.bin')) + shutil.copy( + os.path.join(args.root_dir, 'lfw.bin'), + os.path.join(args.output_dir, 'lfw.bin')) + print('num_image: ', len(imgidx), 'num_classes: ', len(classes)) + with open(os.path.join(args.output_dir, 'README.md'), 'w') as f: + f.write('num_image: {}\n'.format(len(imgidx))) + f.write('num_classes: {}\n'.format(len(classes))) + + +if __name__ == '__main__': + parser = argparse.ArgumentParser() + parser.add_argument( + "--root_dir", + type=str, + help="Root directory to mxnet dataset.", ) + parser.add_argument( + "--output_dir", + type=str, + help="Path to output.", ) + 
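The converter above writes `label.txt` with one tab-separated `<relative path>` / `<label>` record per image, matching the format documented in the README. A small reader for that format, as a sketch of how downstream code can consume it:

```python
# Reader for the "<relative path>\t<label>" lines that the converter
# above writes into label.txt.
def read_label_file(lines):
    samples = []
    for line in lines:
        path, label = line.rstrip('\n').split('\t')
        samples.append((path, int(label)))
    return samples

lines = ["images/00000001.jpg\t0\n", "images/00000002.jpg\t0\n",
         "images/00000003.jpg\t1\n"]
samples = read_label_file(lines)
num_classes = len({label for _, label in samples})
print(len(samples), num_classes)  # 3 2
```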
args = parser.parse_args() + main(args) diff --git a/recognition/arcface_paddle/test_recognition.py b/recognition/arcface_paddle/tools/test_recognition.py similarity index 82% rename from recognition/arcface_paddle/test_recognition.py rename to recognition/arcface_paddle/tools/test_recognition.py index abe8304..db730a5 100644 --- a/recognition/arcface_paddle/test_recognition.py +++ b/recognition/arcface_paddle/tools/test_recognition.py @@ -32,8 +32,6 @@ from paddle.inference import Config from paddle.inference import create_predictor __all__ = ["InsightFace", "parser"] -BASE_INFERENCE_MODEL_DIR = os.path.expanduser("~/.insightface/ppmodels/") -BASE_DOWNLOAD_URL = "https://paddle-model-ecology.bj.bcebos.com/model/insight-face/{}.tar" def parser(add_help=True): @@ -45,17 +43,27 @@ def parser(add_help=True): "--det", action="store_true", help="Whether to detect.") parser.add_argument( "--rec", action="store_true", help="Whether to recognize.") - + parser.add_argument( - "--det_model", + "--det_model_file_path", type=str, - default="BlazeFace", - help="The detection model.") + default="models/blazeface_fpn_ssh_1000e_v1.0_infer/inference.pdmodel", + help="The detection model file path.") parser.add_argument( - "--rec_model", + "--det_params_file_path", type=str, - default="MobileFace", - help="The recognition model.") + default="models/blazeface_fpn_ssh_1000e_v1.0_infer/inference.pdiparams", + help="The detection params file path.") + parser.add_argument( + "--rec_model_file_path", + type=str, + default="models/ms1mv3_r50_static_128_fp16_0.1_epoch_24_infer/FresResNet50.pdmodel", + help="The recognition model file path.") + parser.add_argument( + "--rec_params_file_path", + type=str, + default="models/ms1mv3_r50_static_128_fp16_0.1_epoch_24_infer/FresResNet50.pdiparams", + help="The recognition params file path.") parser.add_argument( "--use_gpu", type=str2bool, @@ -78,7 +86,10 @@ def parser(add_help=True): type=str, help="The path or directory of image(s) or video to be
predicted.") parser.add_argument( - "--output", type=str, default="./output/", help="The directory of prediction result.") + "--output", + type=str, + default="./output/", + help="The directory of prediction result.") parser.add_argument( "--det_thresh", type=float, @@ -89,7 +100,7 @@ def parser(add_help=True): parser.add_argument( "--cdd_num", type=int, - default=5, + default=10, help="The number of candidates in the recognition retrieval. Default by 10." ) parser.add_argument( @@ -102,6 +113,21 @@ def parser(add_help=True): type=int, default=1, help="The maxium of batch_size to recognize. Default by 1.") + parser.add_argument( + "--build_index", + type=str, + default=None, + help="The path of index to be build.") + parser.add_argument( + "--img_dir", + type=str, + default=None, + help="The img(s) dir used to build index.") + parser.add_argument( + "--label", + type=str, + default=None, + help="The label file path used to build index.") return parser @@ -119,86 +145,6 @@ def print_config(args): print("{}".format("-" * width)) -def download_with_progressbar(url, save_path): - """Download from url with progressbar. - """ - if os.path.isfile(save_path): - os.remove(save_path) - response = requests.get(url, stream=True) - total_size_in_bytes = int(response.headers.get("content-length", 0)) - block_size = 1024 # 1 Kibibyte - progress_bar = tqdm(total=total_size_in_bytes, unit="iB", unit_scale=True) - with open(save_path, "wb") as file: - for data in response.iter_content(block_size): - progress_bar.update(len(data)) - file.write(data) - progress_bar.close() - if total_size_in_bytes == 0 or progress_bar.n != total_size_in_bytes or not os.path.isfile( - save_path): - raise Exception( - f"Something went wrong while downloading model/image from {url}") - - -def check_model_file(model): - """Check the model files exist and download and untar when no exist. 
- """ - model_map = { - "ArcFace": "arcface_iresnet50_v1.0_infer", - "BlazeFace": "blazeface_fpn_ssh_1000e_v1.0_infer", - "MobileFace": "mobileface_v1.0_infer" - } - - if os.path.isdir(model): - model_file_path = os.path.join(model, "inference.pdmodel") - params_file_path = os.path.join(model, "inference.pdiparams") - if not os.path.exists(model_file_path) or not os.path.exists( - params_file_path): - raise Exception( - f"The specifed model directory error. The drectory must include 'inference.pdmodel' and 'inference.pdiparams'." - ) - - elif model in model_map: - storage_directory = partial(os.path.join, BASE_INFERENCE_MODEL_DIR, - model) - url = BASE_DOWNLOAD_URL.format(model_map[model]) - - tar_file_name_list = [ - "inference.pdiparams", "inference.pdiparams.info", - "inference.pdmodel" - ] - model_file_path = storage_directory("inference.pdmodel") - params_file_path = storage_directory("inference.pdiparams") - if not os.path.exists(model_file_path) or not os.path.exists( - params_file_path): - tmp_path = storage_directory(url.split("/")[-1]) - logging.info(f"Download {url} to {tmp_path}") - os.makedirs(storage_directory(), exist_ok=True) - download_with_progressbar(url, tmp_path) - with tarfile.open(tmp_path, "r") as tarObj: - for member in tarObj.getmembers(): - filename = None - for tar_file_name in tar_file_name_list: - if tar_file_name in member.name: - filename = tar_file_name - if filename is None: - continue - file = tarObj.extractfile(member) - with open(storage_directory(filename), "wb") as f: - f.write(file.read()) - os.remove(tmp_path) - if not os.path.exists(model_file_path) or not os.path.exists( - params_file_path): - raise Exception( - f"Something went wrong while downloading and unzip the model[{model}] files!" - ) - else: - raise Exception( - f"The specifed model name error. Support 'BlazeFace' for detection and 'ArcFace' and 'MobileFace' for recognition. 
And support local directory that include model files ('inference.pdmodel' and 'inference.pdiparams')." - ) - - return model_file_path, params_file_path - - def normalize_image(img, scale=None, mean=None, std=None, order='chw'): if isinstance(scale, str): scale = eval(scale) @@ -570,9 +516,7 @@ class InsightFace(object): if print_info: print_config(args) - self.font_path = os.path.join( - os.path.abspath(os.path.dirname(__file__)), - "SourceHanSansCN-Medium.otf") + self.font_path = "assets/SourceHanSansCN-Medium.otf" self.args = args predictor_config = { @@ -581,17 +525,13 @@ class InsightFace(object): "cpu_threads": args.cpu_threads } if args.det: - model_file_path, params_file_path = check_model_file( - args.det_model) det_config = {"thresh": args.det_thresh, "target_size": [640, 640]} - predictor_config["model_file"] = model_file_path - predictor_config["params_file"] = params_file_path + predictor_config["model_file"] = args.det_model_file_path + predictor_config["params_file"] = args.det_params_file_path self.det_predictor = Detector(det_config, predictor_config) self.color_map = ColorMap(100) if args.rec: - model_file_path, params_file_path = check_model_file( - args.rec_model) rec_config = { "max_batch_size": args.max_batch_size, "resize": 112, @@ -599,8 +539,8 @@ class InsightFace(object): "index": args.index, "cdd_num": args.cdd_num } - predictor_config["model_file"] = model_file_path - predictor_config["params_file"] = params_file_path + predictor_config["model_file"] = args.rec_model_file_path + predictor_config["params_file"] = args.rec_params_file_path self.rec_predictor = Recognizer(rec_config, predictor_config) def preprocess(self, img): @@ -704,6 +644,34 @@ class InsightFace(object): } logging.info(f"Predict complete!") + def build_index(self): + img_dir = self.args.img_dir + label_path = self.args.label + with open(label_path, "r") as f: + sample_list = f.readlines() + + feature_list = [] + label_list = [] + + for idx, sample in 
enumerate(sample_list): + name, label = sample.strip().split("\t") + img = cv2.imread(os.path.join(img_dir, name)) + if img is None: + logging.warning(f"Error in reading img {name}! Ignored.") + continue + box_list, np_feature = self.predict_np_img(img) + feature_list.append(np_feature[0]) + label_list.append(label) + + if idx % 100 == 0: + logging.info(f"Build idx: {idx}") + + with open(self.args.build_index, "wb") as f: + pickle.dump({"label": label_list, "feature": feature_list}, f) + logging.info( + f"Build done. Total {len(label_list)}. Index file has been saved in \"{self.args.build_index}\"" + ) + # for CLI def main(args=None): @@ -711,9 +679,12 @@ def main(args=None): args = parser().parse_args() predictor = InsightFace(args) - res = predictor.predict(args.input, print_info=True) - for _ in res: - pass + if args.build_index: + predictor.build_index() + else: + res = predictor.predict(args.input, print_info=True) + for _ in res: + pass if __name__ == "__main__": diff --git a/recognition/arcface_paddle/tools/train.py b/recognition/arcface_paddle/tools/train.py new file mode 100644 index 0000000..55102e3 --- /dev/null +++ b/recognition/arcface_paddle/tools/train.py @@ -0,0 +1,35 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
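`build_index` above pickles a dict with parallel `label` and `feature` lists. The sketch below shows how such an index can be queried with plain cosine similarity; the `search` helper and its signature are illustrative assumptions, not the library's actual retrieval API:

```python
import numpy as np

# Illustrative query against an index shaped like the one build_index()
# above pickles: {"label": [...], "feature": [...]}. The real retrieval
# code lives in the Recognizer; this is only a cosine-similarity sketch.
def search(index, query, top_k=1):
    feats = np.asarray(index["feature"], dtype=np.float32)
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    q = np.asarray(query, dtype=np.float32)
    q = q / np.linalg.norm(q)
    scores = feats @ q
    order = np.argsort(-scores)[:top_k]
    return [(index["label"][i], float(scores[i])) for i in order]

index = {"label": ["alice", "bob"],
         "feature": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]}
print(search(index, [0.9, 0.1, 0.0]))  # top match: alice
```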
+ +import sys +import os +sys.path.insert(0, os.path.abspath('.')) + +import paddle +from configs import argparser as parser +from utils.logging import init_logging + +if __name__ == '__main__': + args = parser.parse_args() + if args.is_static: + from static.train import train + paddle.enable_static() + else: + from dynamic.train import train + + rank = int(os.getenv("PADDLE_TRAINER_ID", 0)) + os.makedirs(args.output, exist_ok=True) + init_logging(rank, args.output) + parser.print_args(args) + train(args) diff --git a/recognition/arcface_paddle/tools/validation.py b/recognition/arcface_paddle/tools/validation.py new file mode 100644 index 0000000..1e34372 --- /dev/null +++ b/recognition/arcface_paddle/tools/validation.py @@ -0,0 +1,84 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ +import sys +import os +import logging +sys.path.insert(0, os.path.abspath('.')) + +import argparse + + +def str2bool(v): + return str(v).lower() in ("true", "t", "1") + + +def tostrlist(v): + if isinstance(v, list): + return v + elif isinstance(v, str): + return [e.strip() for e in v.split(',')] + + +def parse_args(): + parser = argparse.ArgumentParser(description='Paddle Face Validation') + + # Model setting + parser.add_argument( + '--is_static', + type=str2bool, + default='False', + help='whether to use static mode') + parser.add_argument( + '--backbone', + type=str, + default='FresResNet50', + help='backbone network') + parser.add_argument( + '--embedding_size', type=int, default=512, help='embedding size') + parser.add_argument( + '--checkpoint_dir', + type=str, + default='MS1M_v3_arcface/FresResNet50/24/', + help='checkpoint directory') + parser.add_argument( + '--data_dir', + type=str, + default='./MS1M_v3_bin', + help='train dataset directory') + parser.add_argument( + '--val_targets', + type=tostrlist, + default=["lfw", "cfp_fp", "agedb_30"], + help='val targets, list or str split by comma') + parser.add_argument( + '--batch_size', type=int, default=128, help='test batch size') + + args = parser.parse_args() + return args + + +if __name__ == '__main__': + logging.basicConfig( + level=logging.INFO, format="Validation: %(asctime)s - %(message)s") + + args = parse_args() + if args.is_static: + import paddle + paddle.enable_static() + from static.validation import validation + else: + from dynamic.validation import validation + + validation(args) diff --git a/recognition/arcface_paddle/train.py b/recognition/arcface_paddle/train.py deleted file mode 100644 index c150cc9..0000000 --- a/recognition/arcface_paddle/train.py +++ /dev/null @@ -1,171 +0,0 @@ -# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License.
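The `tostrlist` helper in the validation argparser above lets `--val_targets` accept either a ready-made list (the default) or a comma-separated string on the command line:

```python
# tostrlist, reproduced from the validation argparser above: it passes
# lists through unchanged and splits comma-separated strings.
def tostrlist(v):
    if isinstance(v, list):
        return v
    elif isinstance(v, str):
        return [e.strip() for e in v.split(',')]

print(tostrlist("lfw, cfp_fp,agedb_30"))  # ['lfw', 'cfp_fp', 'agedb_30']
print(tostrlist(["lfw"]))                 # ['lfw']
```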
-# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. - -from dataloader import CommonDataset - -from paddle.io import DataLoader -from config import config as cfg -from partial_fc import PartialFC -from utils.utils_callbacks import CallBackVerification, CallBackLogging, CallBackModelCheckpoint -from utils.utils_logging import AverageMeter -import paddle.nn.functional as F -from paddle.nn import ClipGradByNorm -from visualdl import LogWriter -import paddle -import backbones -import argparse -import losses -import time -import os -import sys - - -def main(args): - world_size = int(1.0) - rank = int(0.0) - - if not os.path.exists(args.output): - os.makedirs(args.output) - else: - time.sleep(2) - - writer = LogWriter(logdir=args.logdir) - trainset = CommonDataset(root_dir=cfg.data_dir, label_file=cfg.file_list, is_bin=args.is_bin) - train_loader = DataLoader( - dataset=trainset, - batch_size=args.batch_size, - shuffle=True, - drop_last=True, - num_workers=0) - - backbone = eval("backbones.{}".format(args.network))() - backbone.train() - - clip_by_norm = ClipGradByNorm(5.0) - margin_softmax = eval("losses.{}".format(args.loss))() - - module_partial_fc = PartialFC( - rank=0, - world_size=1, - resume=0, - batch_size=args.batch_size, - margin_softmax=margin_softmax, - num_classes=cfg.num_classes, - sample_rate=cfg.sample_rate, - embedding_size=args.embedding_size, - prefix=args.output) - - scheduler_backbone_decay = paddle.optimizer.lr.LambdaDecay( - learning_rate=args.lr, lr_lambda=cfg.lr_func, verbose=True) - scheduler_backbone = paddle.optimizer.lr.LinearWarmup( - learning_rate=scheduler_backbone_decay, - 
warmup_steps=cfg.warmup_epoch, - start_lr=0, - end_lr=args.lr / 512 * args.batch_size, - verbose=True) - opt_backbone = paddle.optimizer.Momentum( - parameters=backbone.parameters(), - learning_rate=scheduler_backbone, - momentum=0.9, - weight_decay=args.weight_decay, - grad_clip=clip_by_norm) - - scheduler_pfc_decay = paddle.optimizer.lr.LambdaDecay( - learning_rate=args.lr, lr_lambda=cfg.lr_func, verbose=True) - scheduler_pfc = paddle.optimizer.lr.LinearWarmup( - learning_rate=scheduler_pfc_decay, - warmup_steps=cfg.warmup_epoch, - start_lr=0, - end_lr=args.lr / 512 * args.batch_size, - verbose=True) - opt_pfc = paddle.optimizer.Momentum( - parameters=module_partial_fc.parameters(), - learning_rate=scheduler_pfc, - momentum=0.9, - weight_decay=args.weight_decay, - grad_clip=clip_by_norm) - - start_epoch = 0 - total_step = int( - len(trainset) / args.batch_size / world_size * cfg.num_epoch) - if rank == 0: - print("Total Step is: %d" % total_step) - - callback_verification = CallBackVerification(2000, rank, cfg.val_targets, - cfg.data_dir) - callback_logging = CallBackLogging(10, rank, total_step, args.batch_size, - world_size, writer) - callback_checkpoint = CallBackModelCheckpoint(rank, args.output, - args.network) - - loss = AverageMeter() - global_step = 0 - for epoch in range(start_epoch, cfg.num_epoch): - for step, (img, label) in enumerate(train_loader): - label = label.flatten() - global_step += 1 - sys.stdout.flush() - features = F.normalize(backbone(img)) - x_grad, loss_v = module_partial_fc.forward_backward( - label, features, opt_pfc) - sys.stdout.flush() - (features.multiply(x_grad)).backward() - sys.stdout.flush() - opt_backbone.step() - opt_pfc.step() - module_partial_fc.update() - opt_backbone.clear_gradients() - opt_pfc.clear_gradients() - sys.stdout.flush() - - lr_backbone_value = opt_backbone._global_learning_rate().numpy()[0] - lr_pfc_value = opt_backbone._global_learning_rate().numpy()[0] - - loss.update(loss_v, 1) - 
callback_logging(global_step, loss, epoch, lr_backbone_value, - lr_pfc_value) - sys.stdout.flush() - callback_verification(global_step, backbone) - callback_checkpoint(global_step, backbone, module_partial_fc) - scheduler_backbone.step() - scheduler_pfc.step() - writer.close() - - -if __name__ == '__main__': - def str2bool(v): - return v.lower() in ("true", "t", "1") - - parser = argparse.ArgumentParser(description='Paddle ArcFace Training') - parser.add_argument( - '--network', - type=str, - default='MobileFaceNet_128', - help='backbone network') - parser.add_argument( - '--loss', type=str, default='ArcFace', help='loss function') - parser.add_argument('--lr', type=float, default=0.1, help='learning rate') - parser.add_argument( - '--batch_size', type=int, default=512, help='batch size') - parser.add_argument( - '--weight_decay', type=float, default=2e-4, help='weight decay') - parser.add_argument( - '--embedding_size', type=int, default=128, help='embedding size') - parser.add_argument('--logdir', type=str, default='./log', help='log dir') - parser.add_argument( - '--output', type=str, default='emore_arcface', help='output dir') - parser.add_argument('--resume', type=int, default=0, help='model resuming') - parser.add_argument('--is_bin', type=str2bool, default=True, help='whether the train data is bin or original image file') - args = parser.parse_args() - main(args) diff --git a/recognition/arcface_paddle/utils/__init__.py b/recognition/arcface_paddle/utils/__init__.py index 61d5aa2..185a92b 100644 --- a/recognition/arcface_paddle/utils/__init__.py +++ b/recognition/arcface_paddle/utils/__init__.py @@ -10,4 +10,4 @@ # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and -# limitations under the License. \ No newline at end of file +# limitations under the License. 
diff --git a/recognition/arcface_paddle/utils/logging.py b/recognition/arcface_paddle/utils/logging.py new file mode 100644 index 0000000..6ac1534 --- /dev/null +++ b/recognition/arcface_paddle/utils/logging.py @@ -0,0 +1,101 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import logging +import os +import sys +import time + + +class AverageMeter(object): + """Computes and stores the average and current value + """ + + def __init__(self): + self.val = None + self.avg = None + self.sum = None + self.count = None + self.reset() + + def reset(self): + self.val = 0 + self.avg = 0 + self.sum = 0 + self.count = 0 + + def update(self, val, n=1): + self.val = val + self.sum += val * n + self.count += n + self.avg = self.sum / self.count + + +def init_logging(rank, models_root): + if rank == 0: + log_root = logging.getLogger() + log_root.setLevel(logging.INFO) + formatter = logging.Formatter("Training: %(asctime)s - %(message)s") + handler_file = logging.FileHandler( + os.path.join(models_root, "training.log")) + handler_stream = logging.StreamHandler(sys.stdout) + handler_file.setFormatter(formatter) + handler_stream.setFormatter(formatter) + log_root.addHandler(handler_file) + log_root.addHandler(handler_stream) + log_root.info('rank: %d' % rank) + + +class CallBackLogging(object): + def __init__(self, + frequent, + rank, + world_size, + total_step, + batch_size, + writer=None): + self.frequent: int =
frequent + self.rank: int = rank + self.world_size: int = world_size + self.time_start = time.time() + self.total_step: int = total_step + self.batch_size: int = batch_size + self.writer = writer + + self.tic = time.time() + + def __call__(self, global_step, loss: AverageMeter, epoch: int, lr_value): + + if self.rank == 0 and global_step > 0 and global_step % self.frequent == 0: + try: + speed: float = self.frequent * self.batch_size / ( + time.time() - self.tic) + speed_total = speed * self.world_size + + except ZeroDivisionError: + speed_total = float('inf') + + time_now = (time.time() - self.time_start) / 3600 + time_total = time_now / ((global_step + 1) / self.total_step) + time_for_end = time_total - time_now + if self.writer is not None: + self.writer.add_scalar('time_for_end', time_for_end, + global_step) + self.writer.add_scalar('loss', loss.avg, global_step) + msg = "loss %.4f, lr: %f, epoch: %d, step: %d, eta: %1.2f hours, throughput: %.2f imgs/sec" % ( + loss.avg, lr_value, epoch, global_step, time_for_end, + speed_total) + logging.info(msg) + loss.reset() + self.tic = time.time() diff --git a/recognition/arcface_paddle/utils/losses.py b/recognition/arcface_paddle/utils/losses.py new file mode 100644 index 0000000..297f7e8 --- /dev/null +++ b/recognition/arcface_paddle/utils/losses.py @@ -0,0 +1,40 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License.
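The ETA arithmetic in `CallBackLogging.__call__` above projects total runtime from elapsed time and the fraction of steps completed, then subtracts the elapsed part. Isolated as a function:

```python
# ETA rule from CallBackLogging above: projected total time is elapsed time
# divided by the fraction of steps done; the remainder is the ETA.
def eta_hours(elapsed_hours, global_step, total_step):
    time_total = elapsed_hours / ((global_step + 1) / total_step)
    return time_total - elapsed_hours

# After 2 hours at step 4999 of 20000 (25% done), 6 hours remain.
print(eta_hours(2.0, 4999, 20000))  # 6.0
```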
+ + +class CosFace(object): + def __init__(self, m1=1.0, m2=0.0, m3=0.35, s=64.0): + super(CosFace, self).__init__() + self.margin1 = m1 + self.margin2 = m2 + self.margin3 = m3 + self.scale = s + + +class ArcFace(object): + def __init__(self, m1=1.0, m2=0.5, m3=0.0, s=64.0): + super(ArcFace, self).__init__() + self.margin1 = m1 + self.margin2 = m2 + self.margin3 = m3 + self.scale = s + + +class SphereFace(object): + def __init__(self, m1=1.35, m2=0.0, m3=0.0, s=64.0): + super(SphereFace, self).__init__() + self.margin1 = m1 + self.margin2 = m2 + self.margin3 = m3 + self.scale = s diff --git a/recognition/arcface_paddle/utils/rearrange_weight.py b/recognition/arcface_paddle/utils/rearrange_weight.py new file mode 100644 index 0000000..eb1d7a7 --- /dev/null +++ b/recognition/arcface_paddle/utils/rearrange_weight.py @@ -0,0 +1,133 @@ +# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +def rearrange_weight(weight_dict, init_num_rank, new_num_rank): + """ + A help function to convert pre-trained distributed fc parameters for + inference or fine-tuning. Note that the number of ranks or GPUs for + inference or fine-tuning can be different from that for pre-training. + + Args: + weight_dict(dict): the dict store distributed parameters, + key: eg. dist@fc@rank@00000.w_0 + value: numpy.ndarray + init_num_rank(int) : pre-trained weight at init_num_rank gpu device. 
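The `CosFace`/`ArcFace`/`SphereFace` holders above only store the margins (m1, m2, m3) and scale s of the insightface-style combined-margin formulation; the training code (not shown here) consumes them. As an assumption about that convention, the target-class logit is modified roughly as s·(cos(m1·θ + m2) − m3):

```python
import math

# Sketch of how the (m1, m2, m3, s) values stored above are conventionally
# applied to the target-class logit: cos(m1*theta + m2) - m3, scaled by s.
# This application step is an assumption; it lives in the training code,
# not in these parameter holders.
def margin_logit(cos_theta, m1, m2, m3, s):
    theta = math.acos(cos_theta)
    return s * (math.cos(m1 * theta + m2) - m3)

cos_theta = 0.8
arcface = margin_logit(cos_theta, 1.0, 0.5, 0.0, 64.0)   # additive angle margin
cosface = margin_logit(cos_theta, 1.0, 0.0, 0.35, 64.0)  # additive cosine margin
print(round(cosface, 2))  # 28.8, i.e. 64 * (0.8 - 0.35)
```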
+ new_num_rank(int) : want to rearrange weight to new_num_rank gpu device. + + Returns: + dict: rearranged weight for new_num_rank gpu device. + """ + + ret_dict = {} + if init_num_rank == new_num_rank: + return weight_dict + + if len(weight_dict) == 0: + return weight_dict + + # generate name format + name_format = list(weight_dict.keys())[0] + name_format = name_format.split('.') + name_format[0] = name_format[0].split('@') + name_format[0][-1] = '%05d' + name_format[0] = '@'.join(name_format[0]) + name_format = '.'.join(name_format) + + # calculate num class of pretrain shard + # num class of new shard + num_class = sum([ + w.shape[1] if len(w.shape) == 2 else len(w) + for _, w in weight_dict.items() + ]) + init_nshard = (num_class + init_num_rank - 1) // init_num_rank + new_nshard = (num_class + new_num_rank - 1) // new_num_rank + + if new_nshard * (new_num_rank - 1) >= num_class: + raise ValueError( + "num class {} can't be rationally split by num rank {}".format( + num_class, new_num_rank)) + + if init_num_rank > new_num_rank: + for new_idx in range(new_num_rank): + start = new_idx * new_nshard + end = min((new_idx + 1) * new_nshard - 1, num_class - 1) + init_shard_idx_start = start // init_nshard + init_shard_idx_end = end // init_nshard + + weight_list = [] + for init_idx in range(init_shard_idx_start, + init_shard_idx_end + 1): + name = name_format % init_idx + init_weight = weight_dict[name] + s = max(start - init_idx * init_nshard, 0) + if init_idx == init_shard_idx_end: + e = min(end - init_idx * init_nshard + 1, init_nshard) + else: + e = init_nshard + if len(init_weight.shape) == 2: + weight_list.append(init_weight[:, s:e]) + else: + weight_list.append(init_weight[s:e]) + + name = name_format % new_idx + # for 2-dimension, we concat at axis=1, + # else for 1-dimension, we concat at axis=0 + ret_dict[name] = np.concatenate( + weight_list, axis=len(weight_list[0].shape) - 1) + else: + for new_idx in range(new_num_rank): + start = new_idx * new_nshard
+ end = min((new_idx + 1) * new_nshard - 1, num_class - 1) + init_shard_idx_start = start // init_nshard + init_shard_idx_end = end // init_nshard + + if init_shard_idx_start == init_shard_idx_end: + name = name_format % init_shard_idx_start + init_weight = weight_dict[name] + init_start = init_shard_idx_start * init_nshard + s = max(start - init_start, 0) + e = min((init_shard_idx_start + 1) * init_nshard, + end) - init_start + 1 + if len(init_weight.shape) == 2: + new_weight = init_weight[:, s:e] + else: + new_weight = init_weight[s:e] + else: + # init_shard_idx_start + 1 == init_shard_idx_end + name = name_format % init_shard_idx_start + init_weight = weight_dict[name] + init_start = init_shard_idx_start * init_nshard + s = max(start - init_start, 0) + if len(init_weight.shape) == 2: + new_weight = init_weight[:, s:] + else: + new_weight = init_weight[s:] + + e = end - (init_shard_idx_end * init_nshard) + 1 + if e > 0: + name = name_format % init_shard_idx_end + init_weight = weight_dict[name] + if len(init_weight.shape) == 2: + new_weight2 = init_weight[:, :e] + else: + new_weight2 = init_weight[:e] + + new_weight = np.concatenate( + [new_weight, new_weight2], + axis=len(new_weight.shape) - 1) + name = name_format % new_idx + ret_dict[name] = new_weight + + return ret_dict diff --git a/recognition/arcface_paddle/utils/utils_callbacks.py b/recognition/arcface_paddle/utils/utils_callbacks.py deleted file mode 100644 index 566dcab..0000000 --- a/recognition/arcface_paddle/utils/utils_callbacks.py +++ /dev/null @@ -1,144 +0,0 @@ -# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
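`rearrange_weight` above re-shards distributed-fc parameters along the class dimension (note it calls `np.concatenate`, so the module also needs an `import numpy as np` alongside its license header). For evenly divisible 2-D weights, the core idea reduces to concatenate-then-resplit, sketched here as a simplified illustration; the real function additionally handles uneven shards, 1-D parameters, and the `dist@fc@rank@%05d` naming:

```python
import numpy as np

# Simplified illustration of what rearrange_weight() above does: a
# distributed-fc weight is sharded along the class dimension (axis 1 for
# 2-D weights), so converting between rank counts is concatenate + resplit.
def reshard(shards, new_num_rank):
    full = np.concatenate(shards, axis=1)          # (embedding, num_class)
    num_class = full.shape[1]
    nshard = (num_class + new_num_rank - 1) // new_num_rank
    return [full[:, i * nshard:(i + 1) * nshard] for i in range(new_num_rank)]

# 4 ranks of (512, 250) -> 2 ranks of (512, 500)
shards = [np.zeros((512, 250)) for _ in range(4)]
new = reshard(shards, 2)
print([w.shape for w in new])  # [(512, 500), (512, 500)]
```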
-# You may obtain a copy of the License at
-#
-#    http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import os
-from typing import List
-import paddle
-import logging
-from eval import verification
-from utils.utils_logging import AverageMeter
-from partial_fc import PartialFC
-import time
-
-
-class CallBackVerification(object):
-    def __init__(self,
-                 frequent,
-                 rank,
-                 val_targets,
-                 rec_prefix,
-                 image_size=(112, 112)):
-        self.frequent: int = frequent
-        self.rank: int = rank
-        self.highest_acc: float = 0.0
-        self.highest_acc_list: List[float] = [0.0] * len(val_targets)
-        self.ver_list: List[object] = []
-        self.ver_name_list: List[str] = []
-        if self.rank == 0:
-            self.init_dataset(
-                val_targets=val_targets,
-                data_dir=rec_prefix,
-                image_size=image_size)
-
-    def ver_test(self,
-                 backbone: paddle.nn.Layer,
-                 global_step: int,
-                 batch_size: int):
-        results = []
-        for i in range(len(self.ver_list)):
-            acc1, std1, acc2, std2, xnorm, embeddings_list = verification.test(
-                self.ver_list[i], backbone, batch_size, 10)
-            logging.info('[%s][%d]XNorm: %f' %
-                         (self.ver_name_list[i], global_step, xnorm))
-            logging.info('[%s][%d]Accuracy-Flip: %1.5f+-%1.5f' %
-                         (self.ver_name_list[i], global_step, acc2, std2))
-            if acc2 > self.highest_acc_list[i]:
-                self.highest_acc_list[i] = acc2
-            logging.info('[%s][%d]Accuracy-Highest: %1.5f' % (
-                self.ver_name_list[i], global_step, self.highest_acc_list[i]))
-            results.append(acc2)
-
-    def init_dataset(self, val_targets, data_dir, image_size):
-        for name in val_targets:
-            path = os.path.join(data_dir, name + ".bin")
-            if os.path.exists(path):
-                data_set = verification.load_bin(path, image_size)
-                self.ver_list.append(data_set)
-                self.ver_name_list.append(name)
-
-    def __call__(self, num_update, backbone: paddle.nn.Layer, batch_size=10):
-        if self.rank == 0 and num_update > 0 and num_update % self.frequent == 0:
-            backbone.eval()
-            self.ver_test(backbone, num_update, batch_size)
-            backbone.train()
-
-
-class CallBackLogging(object):
-    def __init__(self,
-                 frequent,
-                 rank,
-                 total_step,
-                 batch_size,
-                 world_size,
-                 writer=None):
-        self.frequent: int = frequent
-        self.rank: int = rank
-        self.time_start = time.time()
-        self.total_step: int = total_step
-        self.batch_size: int = batch_size
-        self.world_size: int = world_size
-        self.writer = writer
-
-        self.init = False
-        self.tic = 0
-
-    def __call__(self,
-                 global_step,
-                 loss: AverageMeter,
-                 epoch: int,
-                 lr_backbone_value,
-                 lr_pfc_value):
-        if self.rank is 0 and global_step > 0 and global_step % self.frequent == 0:
-            if self.init:
-                try:
-                    speed: float = self.frequent * self.batch_size / (
-                        time.time() - self.tic)
-                    speed_total = speed * self.world_size
-                except ZeroDivisionError:
-                    speed_total = float('inf')
-
-                time_now = (time.time() - self.time_start) / 3600
-                time_total = time_now / ((global_step + 1) / self.total_step)
-                time_for_end = time_total - time_now
-                if self.writer is not None:
-                    self.writer.add_scalar('time_for_end', time_for_end,
-                                           global_step)
-                    self.writer.add_scalar('loss', loss.avg, global_step)
-                msg = "Speed %.2f samples/sec Loss %.4f Epoch: %d Global Step: %d Required: %1.f hours, lr_backbone_value: %f, lr_pfc_value: %f" % (
-                    speed_total, loss.avg, epoch, global_step, time_for_end,
-                    lr_backbone_value, lr_pfc_value)
-                logging.info(msg)
-                loss.reset()
-                self.tic = time.time()
-            else:
-                self.init = True
-                self.tic = time.time()
-
-
-class CallBackModelCheckpoint(object):
-    def __init__(self, rank, output="./", model_name="mobilefacenet"):
-        self.rank: int = rank
-        self.output: str = output
-        self.model_name: str = model_name
-
-    def __call__(self,
-                 global_step,
-                 backbone: paddle.nn.Layer,
-                 partial_fc: PartialFC=None):
-        if global_step > 100 and self.rank is 0:
-            paddle.save(backbone.state_dict(),
-                        os.path.join(self.output,
-                                     self.model_name + ".pdparams"))
-        if global_step > 100 and partial_fc is not None:
-            partial_fc.save_params()
diff --git a/recognition/arcface_paddle/utils/utils_logging.py b/recognition/arcface_paddle/utils/utils_logging.py
deleted file mode 100644
index b2c2da7..0000000
--- a/recognition/arcface_paddle/utils/utils_logging.py
+++ /dev/null
@@ -1,55 +0,0 @@
-# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#    http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import logging
-import os
-import sys
-
-
-class AverageMeter(object):
-    """Computes and stores the average and current value
-    """
-
-    def __init__(self):
-        self.val = None
-        self.avg = None
-        self.sum = None
-        self.count = None
-        self.reset()
-
-    def reset(self):
-        self.val = 0
-        self.avg = 0
-        self.sum = 0
-        self.count = 0
-
-    def update(self, val, n=1):
-        self.val = val
-        self.sum += val * n
-        self.count += n
-        self.avg = self.sum / self.count
-
-
-def init_logging(log_root, rank, models_root):
-    if rank is 0:
-        log_root.setLevel(logging.INFO)
-        formatter = logging.Formatter("Training: %(asctime)s-%(message)s")
-        handler_file = logging.FileHandler(
-            os.path.join(models_root, "training.log"))
-        handler_stream = logging.StreamHandler(sys.stdout)
-        handler_file.setFormatter(formatter)
-        handler_stream.setFormatter(formatter)
-        log_root.addHandler(handler_file)
-        log_root.addHandler(handler_stream)
-        log_root.info('rank_id: %d' % rank)
diff --git a/recognition/arcface_paddle/eval/verification.py b/recognition/arcface_paddle/utils/verification.py
similarity index 66%
rename from recognition/arcface_paddle/eval/verification.py
rename to recognition/arcface_paddle/utils/verification.py
index fa0ff41..ab2e8b7 100644
--- a/recognition/arcface_paddle/eval/verification.py
+++ b/recognition/arcface_paddle/utils/verification.py
@@ -14,13 +14,8 @@ import datetime
 import os
-import pickle
-from io import BytesIO
-from PIL import Image
-import cv2
 import numpy as np
 import sklearn
-import paddle
 from scipy import interpolate
 from sklearn.decomposition import PCA
 from sklearn.model_selection import KFold
@@ -185,91 +180,3 @@ def evaluate(embeddings, actual_issame, nrof_folds=10, pca=0):
                                       1e-3,
                                       nrof_folds=nrof_folds)
     return tpr, fpr, accuracy, val, val_std, far
-
-
-# returns numpy arrays
-@paddle.no_grad()
-def load_bin(path, image_size):
-    try:
-        with open(path, 'rb') as f:
-            bins, issame_list = pickle.load(f)  # py2
-    except UnicodeDecodeError as e:
-        with open(path, 'rb') as f:
-            bins, issame_list = pickle.load(f, encoding='bytes')  # py3
-    data_list = []
-    for flip in [0, 1]:
-        data = np.empty(
-            shape=[len(issame_list) * 2, 3, image_size[0], image_size[1]],
-            dtype=np.float32)
-        data_list.append(data)
-    for idx in range(len(issame_list) * 2):
-        _bin = bins[idx]
-        img = np.array(Image.open(BytesIO(_bin)), dtype=np.float32)
-        if img.shape[1] != image_size[0]:
-            img = cv2.resize(img, (image_size[0], image_size[0]))
-        img = img.transpose(2, 0, 1)
-        for flip in [0, 1]:
-            if flip == 1:
-                img = np.flip(img, 2)
-            data_list[flip][idx][:] = img
-        if idx % 1000 == 0:
-            print('loading bin', idx)
-    print(data_list[0].shape)
-    return data_list, issame_list
-
-
-@paddle.no_grad()
-def test(data_set, backbone, batch_size, nfolds=10):
-    print('testing verification..')
-    data_list = data_set[0]
-    issame_list = data_set[1]
-    embeddings_list = []
-    time_consumed = 0.0
-    for i in range(len(data_list)):
-        data = data_list[i]
-        embeddings = None
-        ba = 0
-        while ba < data.shape[0]:
-            bb = min(ba + batch_size, data.shape[0])
-            count = bb - ba
-            _data = data[bb - batch_size:bb]
-            time0 = datetime.datetime.now()
-            img = ((_data / 255) - 0.5) / 0.5
-            # convert the numpy array to a Tensor
-            img = paddle.to_tensor(img)
-            net_out: paddle.Tensor = backbone(img)
-            _embeddings = net_out.detach().cpu().numpy()
-            time_now = datetime.datetime.now()
-            diff = time_now - time0
-            time_consumed += diff.total_seconds()
-            if embeddings is None:
-                embeddings = np.zeros((data.shape[0], _embeddings.shape[1]))
-            embeddings[ba:bb, :] = _embeddings[(batch_size - count):, :]
-            ba = bb
-        embeddings_list.append(embeddings)
-
-    _xnorm = 0.0
-    _xnorm_cnt = 0
-    for embed in embeddings_list:
-        for i in range(embed.shape[0]):
-            _em = embed[i]
-            _norm = np.linalg.norm(_em)
-            _xnorm += _norm
-            _xnorm_cnt += 1
-    _xnorm /= _xnorm_cnt
-
-    embeddings = embeddings_list[0].copy()
-    try:
-        embeddings = sklearn.preprocessing.normalize(embeddings)
-    except:
-        print(embeddings)
-    acc1 = 0.0
-    std1 = 0.0
-    embeddings = embeddings_list[0] + embeddings_list[1]
-    embeddings = sklearn.preprocessing.normalize(embeddings)
-    print(embeddings.shape)
-    print('infer time', time_consumed)
-    _, _, accuracy, val, val_std, far = evaluate(
-        embeddings, issame_list, nrof_folds=nfolds)
-    acc2, std2 = np.mean(accuracy), np.std(accuracy)
-    return acc1, std1, acc2, std2, _xnorm, embeddings_list
diff --git a/recognition/arcface_paddle/valid.py b/recognition/arcface_paddle/valid.py
deleted file mode 100644
index 76a8739..0000000
--- a/recognition/arcface_paddle/valid.py
+++ /dev/null
@@ -1,53 +0,0 @@
-# Copyright (c) 2021 PaddlePaddle Authors. All Rights Reserved.
-#
-# Licensed under the Apache License, Version 2.0 (the "License");
-# you may not use this file except in compliance with the License.
-# You may obtain a copy of the License at
-#
-#    http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-import os
-import paddle
-import argparse
-import backbones
-from utils.utils_callbacks import CallBackVerification
-
-
-def main(args):
-    '''
-    For the CallBackVerification class, you can place you val_dataset,
-    like ["lfw"], also you can use ["lfw", "cplfw", "calfw"].
-
-    For the callback_verification function, the batch_size must be divisible by 12000!
-    Cause the length of dataset is 12000.
-    '''
-    backbone = eval("backbones.{}".format(args.network))()
-    model_params = args.network + '.pdparams'
-    print('INFO:' + args.network + ' chose! ' + model_params + ' loaded!')
-    state_dict = paddle.load(os.path.join(args.checkpoint, model_params))
-    backbone.set_state_dict(state_dict)
-    callback_verification = CallBackVerification(
-        1, 0, ["lfw", "cfp_fp", "agedb_30"], "MS1M_v2")
-    callback_verification(1, backbone, batch_size=50)
-
-
-if __name__ == '__main__':
-    parser = argparse.ArgumentParser(description='Paddle ArcFace Testing')
-    parser.add_argument(
-        '--network',
-        type=str,
-        default='MobileFaceNet_128',
-        help='backbone network')
-    parser.add_argument(
-        '--checkpoint',
-        type=str,
-        default='emore_arcface',
-        help='checkpoint dir')
-    args = parser.parse_args()
-    main(args)