mirror of
https://github.com/deepinsight/insightface.git
synced 2026-05-17 22:27:54 +00:00
Merge pull request #1814 from olojuwin/arcface_oneflow
Refactoring with new APIs for arcface_oneflow
@@ -27,16 +27,11 @@ It introduces how to train InsightFace in OneFlow, and do verification over the
|
||||
|
||||
- [2. Transformation from MS1M recordio to OFRecord](#2-transformation-from-ms1m-recordio-to-ofrecord)
- [Pretrained model](#Pretrained-model)
- [Training and verification](#training-and-verification)
- [Training](#training)
- [Varification](#varification)
- [Benchmark](#benchmark)
- [OneFLow2ONNX](#OneFLow2ONNX)
|
||||
|
||||
## Background
|
||||
|
||||
@@ -109,7 +104,7 @@ First of all, before execution, please make sure that:
|
||||
Follow the steps in [Install OneFlow](https://github.com/Oneflow-Inc/oneflow#install-oneflow) to install the latest master whl package.
|
||||
|
||||
```
|
||||
python3 -m pip install --find-links https://release.oneflow.info oneflow_cu102 --user
|
||||
python3 -m pip install oneflow -f https://oneflow-staging.oss-cn-beijing.aliyuncs.com/branch/master/cu102/6aa719d70119b65837b25cc5f186eb19ef2b7891/index.html --user
|
||||
```
|
||||
|
||||
|
||||
@@ -160,7 +155,7 @@ Only need to execute 2.1 or 2.2
|
||||
|
||||
Run
|
||||
```
|
||||
python tools/mx_recordio_2_ofrecord_shuffled_npart.py --data_dir datasets/faces_emore --output_filepath faces_emore/ofrecord/train --part_num 16
|
||||
python tools/mx_recordio_2_ofrecord_shuffled_npart.py --data_dir datasets/faces_emore --output_filepath faces_emore/ofrecord/train --num_part 16
|
||||
```
|
||||
This produces the requested number of OFRecord parts (16 in this example), laid out like this:
|
||||
```
|
||||
@@ -237,29 +232,6 @@ ofrecord/test/
|
||||
```
|
||||
|
||||
|
||||
|
||||
## Pretrained model
|
||||
|
||||
The 1:1 verification accuracy of the OneFlow and MXNet pretrained models on the InsightFace Recognition Test (IFRT) set is compared below:
|
||||
|
||||
| **Framework** | **African** | **Caucasian** | **Indian** | **Asian** | **All** |
| ------------- | ----------- | ------------- | ---------- | --------- | ------- |
| OneFlow       | 90.4076     | 94.583        | 93.702     | 68.754    | 89.684  |
| MXNet         | 90.45       | 94.60         | 93.96      | 63.91     | 88.23   |
|
||||
|
||||
Download the OneFlow pretrained model: [of_005_model.tar.gz](http://oneflow-public.oss-cn-beijing.aliyuncs.com/face_dataset/pretrained_model/of_glint360k_partial_fc/of_005_model.tar.gz)

We also provide an MXNet model converted from OneFlow: [of_to_mxnet_model_005.tar.gz](http://oneflow-public.oss-cn-beijing.aliyuncs.com/face_dataset/pretrained_model/of_2_mxnet_glint360k_partial_fc/of_to_mxnet_model_005.tar.gz)
|
||||
|
||||
|
||||
|
||||
## OneFLow2ONNX
|
||||
|
||||
```
|
||||
pip install oneflow-onnx==0.3.4
|
||||
./convert.sh
|
||||
```
|
||||
|
||||
## Training and verification
|
||||
|
||||
|
||||
@@ -268,9 +240,15 @@ pip install oneflow-onnx==0.3.4
|
||||
|
||||
To lower the barrier for users, the OneFlow scripts follow PyTorch conventions; you can adjust parameters directly in `configs/*.py`.
|
||||
|
||||
#### eager
|
||||
```
|
||||
./run.sh
|
||||
./train_ddp.sh
|
||||
```
|
||||
#### Graph
|
||||
```
|
||||
train_graph_distributed.sh
|
||||
```
|
||||
|
||||
|
||||
### Varification
|
||||
|
||||
@@ -280,70 +258,9 @@ Moreover, OneFlow offers a validation script to do verification separately, val.
|
||||
./val.sh
|
||||
|
||||
```
|
||||
## OneFLow2ONNX
|
||||
|
||||
## Benchmark
|
||||
|
||||
### Training Speed Benchmark
|
||||
|
||||
#### Face_emore Dataset & FP32
|
||||
|
||||
| Backbone | GPU                      | model_parallel | partial_fc | BatchSize / it | Throughput (img/sec) |
| -------- | ------------------------ | -------------- | ---------- | -------------- | -------------------- |
| R100     | 8 * Tesla V100-SXM2-16GB | False          | False      | 64             | 1836.8               |
| R100     | 8 * Tesla V100-SXM2-16GB | True           | False      | 64             | 1854.15              |
| R100     | 8 * Tesla V100-SXM2-16GB | True           | True       | 64             | 1872.81              |
| R100     | 8 * Tesla V100-SXM2-16GB | False          | False      | 96(Max)        | 1931.76              |
| R100     | 8 * Tesla V100-SXM2-16GB | True           | False      | 115(Max)       | 1921.87              |
| R100     | 8 * Tesla V100-SXM2-16GB | True           | True       | 120(Max)       | 1962.76              |
| Y1       | 8 * Tesla V100-SXM2-16GB | False          | False      | 256            | 14298.02             |
| Y1       | 8 * Tesla V100-SXM2-16GB | True           | False      | 256            | 14049.75             |
| Y1       | 8 * Tesla V100-SXM2-16GB | False          | False      | 350(Max)       | 14756.03             |
| Y1       | 8 * Tesla V100-SXM2-16GB | True           | True       | 400(Max)       | 14436.38             |
|
||||
|
||||
#### Glint360k Dataset & FP32
|
||||
|
||||
| Backbone | GPU                      | partial_fc sample_ratio | BatchSize / it | Throughput (img/sec) |
| -------- | ------------------------ | ----------------------- | -------------- | -------------------- |
| R100     | 8 * Tesla V100-SXM2-16GB | 0.1                     | 64             | 1858.57              |
| R100     | 8 * Tesla V100-SXM2-16GB | 0.1                     | 115            | 1933.88              |
|
||||
|
||||
|
||||
|
||||
### Evaluation on Lfw, Cfp_fp, Agedb_30
|
||||
|
||||
- Data Parallelism
|
||||
|
||||
| Backbone      | Dataset | Lfw    | Cfp_fp | Agedb_30 |
| ------------- | ------- | ------ | ------ | -------- |
| R100          | MS1M    | 99.717 | 98.643 | 98.150   |
| MobileFaceNet | MS1M    | 99.5   | 92.657 | 95.6     |
|
||||
|
||||
- Model Parallelism
|
||||
|
||||
| Backbone      | Dataset | Lfw    | Cfp_fp | Agedb_30 |
| ------------- | ------- | ------ | ------ | -------- |
| R100          | MS1M    | 99.733 | 98.329 | 98.033   |
| MobileFaceNet | MS1M    | 99.483 | 93.457 | 95.7     |
|
||||
|
||||
- Partial FC
|
||||
|
||||
| Backbone | Dataset | Lfw    | Cfp_fp | Agedb_30 |
| -------- | ------- | ------ | ------ | -------- |
| R100     | MS1M    | 99.817 | 98.443 | 98.217   |
|
||||
|
||||
### Evaluation on IFRT
|
||||
|
||||
r denotes the sampling rate of negative class centers.
|
||||
|
||||
| Backbone | Dataset              | African | Caucasian | Indian | Asian  | ALL    |
| -------- | -------------------- | ------- | --------- | ------ | ------ | ------ |
| R100     | **Glint360k**(r=0.1) | 90.4076 | 94.583    | 93.702 | 68.754 | 89.684 |
|
||||
|
||||
### Max num_classses
|
||||
|
||||
| node_num | gpu_num_per_node | batch_size_per_device | fp16 | Model Parallel | Partial FC | num_classes |
| -------- | ---------------- | --------------------- | ---- | -------------- | ---------- | ----------- |
| 1        | 1                | 64                    | True | True           | True       | 2000000     |
| 1        | 8                | 64                    | True | True           | True       | 13500000    |
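
The growth from 2,000,000 to 13,500,000 classes when moving from 1 to 8 GPUs is consistent with model parallelism sharding the final classification weight across devices. The sketch below is back-of-the-envelope arithmetic only. It assumes a 512-dimensional embedding (the `num_features=512` default of the backbone), fp16 weights at 2 bytes each, and an even split across GPUs. It ignores gradients, optimizer state, and activations, so it gives a lower bound rather than a measurement.

```python
def fc_weight_gib(num_classes, embedding_size=512, bytes_per_value=2, num_gpus=1):
    """Approximate per-GPU size of the classification weight, in GiB.

    Assumes the weight has shape (num_classes, embedding_size) and is
    split evenly over num_gpus devices (an illustrative assumption).
    """
    total_bytes = num_classes * embedding_size * bytes_per_value
    return total_bytes / num_gpus / 2**30


# The two rows of the "Max num_classses" table.
for gpus, classes in [(1, 2_000_000), (8, 13_500_000)]:
    print(gpus, classes, round(fc_weight_gib(classes, num_gpus=gpus), 2))
```

Under these assumptions the per-GPU weight stays in the same range for both rows, which is what one would expect if per-device memory is the limiting factor.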
|
||||
|
||||
For more test details, see [OneFlow DLPerf](https://github.com/Oneflow-Inc/DLPerf#insightface).
|
||||
```
|
||||
pip install oneflow-onnx==0.5.1
|
||||
./convert.sh
|
||||
```
|
||||
@@ -17,17 +17,11 @@
|
||||
- [1. Download the datasets](#1-下载数据集)
- [2. Convert the MS1M training dataset from recordio to OFRecord format](#2-将训练数据集-ms1m-从-recordio-格式转换为-ofrecord-格式)
- [Pretrained model](#预训练模型)
- [Training and verification](#训练和验证)
- [Training](#训练)
- [Verification](#验证)
- [Benchmark](#基准测试)
- [Training Speed Benchmark](#训练速度基准)
- [Face_emore Dataset & FP32](#face_emore-数据集--fp32)
- [Glint360k Dataset & FP32](#glint360k-数据集--fp32)
|
||||
- [Evaluation on Lfw, Cfp_fp, Agedb_30](#evaluation-on-lfw-cfp_fp-agedb_30)
|
||||
- [Evaluation on IFRT](#evaluation-on-ifrt)
|
||||
- [Max num_classses](#max-num_classses)
|
||||
- [OneFLow2ONNX](#OneFLow2ONNX)
|
||||
|
||||
|
||||
## Background
|
||||
|
||||
@@ -80,7 +74,7 @@
|
||||
Follow the steps in [Install OneFlow](https://github.com/Oneflow-Inc/oneflow#install-oneflow) to install the latest master whl package.
|
||||
|
||||
```
|
||||
python3 -m pip install --find-links https://release.oneflow.info oneflow_cu102 --user
|
||||
python3 -m pip install oneflow -f https://oneflow-staging.oss-cn-beijing.aliyuncs.com/branch/master/cu102/6aa719d70119b65837b25cc5f186eb19ef2b7891/index.html --user
|
||||
```
|
||||
|
||||
### Prepare the datasets
|
||||
@@ -199,24 +193,7 @@ ofrecord/test/
|
||||
```
|
||||
|
||||
|
||||
## Pretrained model
|
||||
|
||||
The 1:1 verification accuracy of the OneFlow face recognition model on the InsightFace Recognition Test (IFRT) set, compared with the MXNet pretrained model, is as follows:
|
||||
|
||||
| **Framework** | **African** | **Caucasian** | **Indian** | **Asian** | **All** |
| ------------- | ----------- | ------------- | ---------- | --------- | ------- |
| OneFlow       | 90.4076     | 94.583        | 93.702     | 68.754    | 89.684  |
| MXNet         | 90.45       | 94.60         | 93.96      | 63.91     | 88.23   |
|
||||
|
||||
Download the OneFlow pretrained face model: [of_005_model.tar.gz](http://oneflow-public.oss-cn-beijing.aliyuncs.com/face_dataset/pretrained_model/of_glint360k_partial_fc/of_005_model.tar.gz)

We also provide the model converted to MXNet: [of_to_mxnet_model_005.tar.gz](http://oneflow-public.oss-cn-beijing.aliyuncs.com/face_dataset/pretrained_model/of_2_mxnet_glint360k_partial_fc/of_to_mxnet_model_005.tar.gz)
|
||||
|
||||
## Model conversion
|
||||
```
|
||||
pip install oneflow-onnx==0.3.4
|
||||
./convert.sh
|
||||
```
|
||||
|
||||
## Training and verification
|
||||
|
||||
@@ -227,10 +204,14 @@ pip install oneflow-onnx==0.3.4
|
||||
|
||||
Run the script:
|
||||
|
||||
#### eager
|
||||
```
|
||||
./run.sh
|
||||
./train_ddp.sh
|
||||
```
|
||||
#### Graph
|
||||
```
|
||||
train_graph_distributed.sh
|
||||
```
|
||||
|
||||
|
||||
### Verification
|
||||
|
||||
@@ -242,70 +223,9 @@ pip install oneflow-onnx==0.3.4
|
||||
./val.sh
|
||||
```
|
||||
|
||||
## OneFLow2ONNX
|
||||
|
||||
## Benchmark

### Training Speed Benchmark

#### Face_emore Dataset & FP32
|
||||
|
||||
| Backbone | GPU                      | model_parallel | partial_fc | BatchSize / it | Throughput (img/sec) |
| -------- | ------------------------ | -------------- | ---------- | -------------- | -------------------- |
| R100     | 8 * Tesla V100-SXM2-16GB | False          | False      | 64             | 1832.02              |
| R100     | 8 * Tesla V100-SXM2-16GB | True           | False      | 64             | 1851.63              |
| R100     | 8 * Tesla V100-SXM2-16GB | True           | True       | 64             | 1854.25              |
| R100     | 8 * Tesla V100-SXM2-16GB | True           | True       | 96(Max)        | 1925.6               |
| R100     | 8 * Tesla V100-SXM2-16GB | True           | False      | 115(Max)       | 1925.59              |
| R100     | 8 * Tesla V100-SXM2-16GB | True           | True       | 128(Max)       | 1953.46              |
| Y1       | 8 * Tesla V100-SXM2-16GB | False          | False      | 256            | 14298.02             |
| Y1       | 8 * Tesla V100-SXM2-16GB | True           | False      | 256            | 14049.75             |
| Y1       | 8 * Tesla V100-SXM2-16GB | False          | False      | 350(Max)       | 14756.03             |
| Y1       | 8 * Tesla V100-SXM2-16GB | True           | True       | 400(Max)       | 14436.38             |
|
||||
|
||||
#### Glint360k Dataset & FP32
|
||||
|
||||
| Backbone | GPU                      | partial_fc sample_ratio | BatchSize / it | Throughput (img/sec) |
| -------- | ------------------------ | ----------------------- | -------------- | -------------------- |
| R100     | 8 * Tesla V100-SXM2-16GB | 1                       | 64             | 1808.27              |
| R100     | 8 * Tesla V100-SXM2-16GB | 0.1                     | 64             | 1858.57              |
|
||||
|
||||
|
||||
|
||||
### Evaluation on Lfw, Cfp_fp, Agedb_30
|
||||
|
||||
- Data Parallelism
|
||||
|
||||
| Backbone      | Dataset | Lfw    | Cfp_fp | Agedb_30 |
| ------------- | ------- | ------ | ------ | -------- |
| R100          | MS1M    | 99.717 | 98.643 | 98.150   |
| MobileFaceNet | MS1M    | 99.5   | 92.657 | 95.6     |
|
||||
|
||||
- Model Parallelism
|
||||
|
||||
| Backbone      | Dataset | Lfw    | Cfp_fp | Agedb_30 |
| ------------- | ------- | ------ | ------ | -------- |
| R100          | MS1M    | 99.733 | 98.329 | 98.033   |
| MobileFaceNet | MS1M    | 99.483 | 93.457 | 95.7     |
|
||||
|
||||
- Partial FC
|
||||
|
||||
| Backbone | Dataset | Lfw    | Cfp_fp | Agedb_30 |
| -------- | ------- | ------ | ------ | -------- |
| R100     | MS1M    | 99.817 | 98.443 | 98.217   |
|
||||
|
||||
### Evaluation on IFRT
|
||||
|
||||
r denotes the sampling rate of negative class centers.
|
||||
|
||||
| Backbone | Dataset              | African | Caucasian | Indian | Asian  | ALL    |
| -------- | -------------------- | ------- | --------- | ------ | ------ | ------ |
| R100     | **Glint360k**(r=0.1) | 90.4076 | 94.583    | 93.702 | 68.754 | 89.684 |
|
||||
|
||||
### Max num_classses
|
||||
|
||||
| node_num | gpu_num_per_node | batch_size_per_device | fp16 | Model Parallel | Partial FC | num_classes |
| -------- | ---------------- | --------------------- | ---- | -------------- | ---------- | ----------- |
| 1        | 1                | 64                    | True | True           | True       | 2000000     |
| 1        | 8                | 64                    | True | True           | True       | 13500000    |
|
||||
|
||||
For more details, see [OneFlow DLPerf](https://github.com/Oneflow-Inc/DLPerf#insightface).
|
||||
```
|
||||
pip install oneflow-onnx==0.5.1
|
||||
./convert.sh
|
||||
```
|
||||
@@ -1,19 +1,16 @@
|
||||
from .ir_resnet import iresnet18, iresnet34, iresnet50, iresnet100, iresnet200
|
||||
from .fmobilefacenet import mobilefacenet
|
||||
|
||||
|
||||
def get_model(name, input_blob, cfg):
|
||||
def get_model(name, **kwargs):
|
||||
if name == "r18":
|
||||
return iresnet18(input_blob, cfg)
|
||||
return iresnet18(False, **kwargs)
|
||||
elif name == "r34":
|
||||
return iresnet34(input_blob, cfg)
|
||||
return iresnet34(False, **kwargs)
|
||||
elif name == "r50":
|
||||
return iresnet50(input_blob, cfg)
|
||||
return iresnet50(False, **kwargs)
|
||||
elif name == "r100":
|
||||
return iresnet100(input_blob, cfg)
|
||||
return iresnet100(False, **kwargs)
|
||||
elif name == "r200":
|
||||
return iresnet200(input_blob, cfg)
|
||||
elif name == "mbf":
|
||||
return mobilefacenet(input_blob, cfg)
|
||||
return iresnet200(False, **kwargs)
|
||||
else:
|
||||
raise ValueError()
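
The `if`/`elif` chain in the refactored `get_model` maps a short backbone name to a constructor and forwards keyword arguments to it. A table-driven variant of the same idea is sketched below. The stub constructors (`_make_stub`), the `flag` parameter name, and the returned dictionaries are hypothetical stand-ins so the example runs without OneFlow; they are not part of this PR, and the meaning of the leading `False` argument is not shown in this excerpt.

```python
def _make_stub(depth):
    """Hypothetical stand-in for iresnet18/34/50/100/200; returns a plain dict."""
    def build(flag=False, **kwargs):
        return {"depth": depth, "flag": flag, **kwargs}
    return build


# Name-to-constructor table covering the names handled by the new get_model.
_BACKBONES = {
    "r18": _make_stub(18),
    "r34": _make_stub(34),
    "r50": _make_stub(50),
    "r100": _make_stub(100),
    "r200": _make_stub(200),
}


def get_model(name, **kwargs):
    try:
        builder = _BACKBONES[name]
    except KeyError:
        # Unlike a bare ValueError(), say which name was rejected.
        raise ValueError("unknown backbone name: {!r}".format(name)) from None
    # Mirrors the PR: the first positional argument is always False.
    return builder(False, **kwargs)


print(get_model("r50", dropout=0.4))
```

A lookup table keeps the supported names in one place, so adding a backbone is a one-line change, and the error message tells the caller which name failed.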
|
||||
|
||||
@@ -1,242 +0,0 @@
|
||||
import oneflow as flow
|
||||
|
||||
|
||||
# same as torch
|
||||
def _get_initializer():
|
||||
return flow.random_normal_initializer(mean=0.0, stddev=0.1)
|
||||
|
||||
|
||||
def _get_initializer_FC():
|
||||
return flow.random_normal_initializer(mean=0.0, stddev=0.01)
|
||||
|
||||
|
||||
def _get_regularizer(name):
|
||||
return flow.regularizers.l2(0.0005)
|
||||
|
||||
|
||||
def _dropout(input_blob, dropout_prob):
|
||||
return flow.nn.dropout(input_blob, rate=dropout_prob)
|
||||
|
||||
|
||||
def _prelu(inputs, data_format="NCHW", name=None):
|
||||
return flow.layers.prelu(
|
||||
inputs,
|
||||
alpha_initializer=flow.constant_initializer(0.25),
|
||||
alpha_regularizer=_get_regularizer("alpha"),
|
||||
shared_axes=[2, 3] if data_format == "NCHW" else [1, 2],
|
||||
name=name,
|
||||
)
|
||||
|
||||
|
||||
def _relu(inputs, data_format="NCHW", name=None):
|
||||
return flow.nn.relu(
|
||||
inputs,
|
||||
name=name,
|
||||
)
|
||||
|
||||
|
||||
def _avg_pool(inputs, pool_size, strides, padding, data_format="NCHW", name=None):
|
||||
return flow.nn.avg_pool2d(
|
||||
input=inputs, ksize=pool_size, strides=strides, padding=padding, data_format=data_format, name=name
|
||||
)
|
||||
|
||||
|
||||
def _batch_norm(
|
||||
inputs,
|
||||
epsilon,
|
||||
center=True,
|
||||
scale=True,
|
||||
trainable=True,
|
||||
is_training=True,
|
||||
data_format="NCHW",
|
||||
name=None,
|
||||
):
|
||||
|
||||
return flow.layers.batch_normalization(
|
||||
inputs=inputs,
|
||||
axis=3 if data_format == "NHWC" and len(inputs.shape) == 4 else 1,
|
||||
momentum=0.9,
|
||||
epsilon=epsilon,
|
||||
center=center,
|
||||
scale=scale,
|
||||
beta_initializer=flow.zeros_initializer(),
|
||||
gamma_initializer=flow.ones_initializer(),
|
||||
beta_regularizer=_get_regularizer("beta"),
|
||||
gamma_regularizer=_get_regularizer("gamma"),
|
||||
moving_mean_initializer=flow.zeros_initializer(),
|
||||
moving_variance_initializer=flow.ones_initializer(),
|
||||
trainable=trainable,
|
||||
training=is_training,
|
||||
name=name,
|
||||
)
|
||||
|
||||
|
||||
def _conv2d_layer(
|
||||
name,
|
||||
input,
|
||||
filters,
|
||||
kernel_size=3,
|
||||
strides=1,
|
||||
padding="SAME",
|
||||
group_num=1,
|
||||
data_format="NCHW",
|
||||
dilation_rate=1,
|
||||
activation=None,
|
||||
use_bias=False,
|
||||
weight_initializer=_get_initializer(),
|
||||
bias_initializer=flow.zeros_initializer(),
|
||||
weight_regularizer=_get_regularizer("weight"),
|
||||
bias_regularizer=_get_regularizer("bias"),
|
||||
):
|
||||
return flow.layers.conv2d(inputs=input, filters=filters, kernel_size=kernel_size, strides=strides, padding=padding, data_format=data_format, dilation_rate=dilation_rate, groups=group_num, activation=activation, use_bias=use_bias, kernel_initializer=weight_initializer, bias_initializer=bias_initializer, kernel_regularizer=weight_regularizer, bias_regularizer=bias_regularizer, name=name)
|
||||
|
||||
|
||||
def Linear(
|
||||
input_blob,
|
||||
num_filter=1,
|
||||
kernel=None,
|
||||
stride=None,
|
||||
pad="valid",
|
||||
num_group=1,
|
||||
bn_is_training=True,
|
||||
data_format="NCHW",
|
||||
name=None,
|
||||
suffix="",
|
||||
):
|
||||
conv = _conv2d_layer(
|
||||
name="%s%s_conv2d" % (name, suffix),
|
||||
input=input_blob,
|
||||
filters=num_filter,
|
||||
kernel_size=kernel,
|
||||
strides=stride,
|
||||
padding=pad,
|
||||
data_format=data_format,
|
||||
group_num=num_group,
|
||||
use_bias=False,
|
||||
dilation_rate=1,
|
||||
activation=None,
|
||||
)
|
||||
|
||||
bn = _batch_norm(
|
||||
conv,
|
||||
epsilon=0.001,
|
||||
is_training=bn_is_training,
|
||||
data_format=data_format,
|
||||
name="%s%s_batchnorm" % (name, suffix),
|
||||
)
|
||||
return bn
|
||||
|
||||
|
||||
def get_fc1(last_conv, num_classes, fc_type, input_channel=512):
|
||||
body = last_conv
|
||||
if fc_type == "Z":
|
||||
body = _batch_norm(
|
||||
body,
|
||||
epsilon=2e-5,
|
||||
scale=False,
|
||||
center=True,
|
||||
is_training=True,
|
||||
data_format="NCHW",
|
||||
name="bn2"
|
||||
)
|
||||
body = _dropout(body, 0.4)
|
||||
fc1 = body
|
||||
elif fc_type == "E":
|
||||
body = _batch_norm(
|
||||
body,
|
||||
epsilon=2e-5,
|
||||
is_training=True,
|
||||
data_format="NCHW",
|
||||
name="bn2"
|
||||
)
|
||||
body = _dropout(body, dropout_prob=0.4)
|
||||
|
||||
body = flow.flatten(body, 1)
|
||||
fc1 = flow.layers.dense(
|
||||
inputs=body,
|
||||
units=num_classes,
|
||||
activation=None,
|
||||
use_bias=True,
|
||||
kernel_initializer=_get_initializer(),
|
||||
bias_initializer=flow.zeros_initializer(),
|
||||
kernel_regularizer=_get_regularizer("weight"),
|
||||
bias_regularizer=_get_regularizer("bias"),
|
||||
trainable=True,
|
||||
name="pre_fc1",
|
||||
)
|
||||
fc1 = _batch_norm(
|
||||
fc1,
|
||||
epsilon=2e-5,
|
||||
scale=False,
|
||||
center=True,
|
||||
is_training=True,
|
||||
data_format="NCHW",
|
||||
name="fc1",
|
||||
)
|
||||
elif fc_type == "FC":
|
||||
body = _batch_norm(
|
||||
body,
|
||||
epsilon=2e-5,
|
||||
is_training=True,
|
||||
data_format="NCHW",
|
||||
name="bn2"
|
||||
)
|
||||
|
||||
|
||||
body = flow.flatten(body, 1)
|
||||
fc1 = flow.layers.dense(
|
||||
inputs=body,
|
||||
units=num_classes,
|
||||
activation=None,
|
||||
use_bias=True,
|
||||
kernel_initializer=_get_initializer(),
|
||||
bias_initializer=flow.zeros_initializer(),
|
||||
kernel_regularizer=_get_regularizer("weight"),
|
||||
bias_regularizer=_get_regularizer("bias"),
|
||||
trainable=True,
|
||||
name="fc"
|
||||
)
|
||||
fc1 = _batch_norm(
|
||||
fc1,
|
||||
epsilon=2e-5,
|
||||
scale=False,
|
||||
center=True,
|
||||
is_training=True,
|
||||
data_format="NCHW",
|
||||
name="features"
|
||||
)
|
||||
elif fc_type == "GDC":
|
||||
conv_6_dw = Linear(
|
||||
last_conv,
|
||||
num_filter=input_channel, # 512
|
||||
num_group=input_channel, # 512
|
||||
kernel=7,
|
||||
pad="valid",
|
||||
stride=[1, 1],
|
||||
bn_is_training=True,
|
||||
data_format="NCHW",
|
||||
name="conv_6dw7_7",
|
||||
)
|
||||
conv_6_dw = flow.reshape(conv_6_dw, (body.shape[0], -1))
|
||||
conv_6_f = flow.layers.dense(
|
||||
inputs=conv_6_dw,
|
||||
units=num_classes,
|
||||
activation=None,
|
||||
use_bias=True,
|
||||
kernel_initializer=_get_initializer(),
|
||||
bias_initializer=flow.zeros_initializer(),
|
||||
kernel_regularizer=_get_regularizer("weight"),
|
||||
bias_regularizer=_get_regularizer("bias"),
|
||||
trainable=True,
|
||||
name="pre_fc1",
|
||||
)
|
||||
fc1 = _batch_norm(
|
||||
conv_6_f,
|
||||
epsilon=2e-5,
|
||||
scale=False,
|
||||
center=True,
|
||||
is_training=True,
|
||||
data_format="NCHW",
|
||||
name="fc1",
|
||||
)
|
||||
return fc1
|
||||
@@ -1,257 +0,0 @@
|
||||
import oneflow as flow
|
||||
import oneflow.core.operator.op_conf_pb2 as op_conf_util
|
||||
from .common import _get_initializer, _conv2d_layer, _batch_norm, _prelu, Linear, get_fc1
|
||||
|
||||
|
||||
"""
|
||||
References:
|
||||
https://github.com/deepinsight/insightface/blob/master/recognition/symbol/fmobilefacenet.py
|
||||
"""
|
||||
|
||||
|
||||
def Conv(
|
||||
input_blob,
|
||||
num_filter=1,
|
||||
kernel=None,
|
||||
stride=None,
|
||||
pad="valid",
|
||||
data_format="NCHW",
|
||||
num_group=1,
|
||||
bn_is_training=True,
|
||||
name=None,
|
||||
suffix="",
|
||||
):
|
||||
conv = _conv2d_layer(
|
||||
name="%s%s_conv2d" % (name, suffix),
|
||||
input=input_blob,
|
||||
filters=num_filter,
|
||||
kernel_size=kernel,
|
||||
strides=stride,
|
||||
padding=pad,
|
||||
data_format=data_format,
|
||||
group_num=num_group,
|
||||
dilation_rate=1,
|
||||
activation=None,
|
||||
use_bias=False,
|
||||
)
|
||||
|
||||
bn = _batch_norm(
|
||||
conv,
|
||||
epsilon=0.001,
|
||||
is_training=bn_is_training,
|
||||
data_format=data_format,
|
||||
name="%s%s_batchnorm" % (name, suffix),
|
||||
)
|
||||
prelu = _prelu(bn, data_format, name="%s%s_relu" % (name, suffix))
|
||||
|
||||
return prelu
|
||||
|
||||
|
||||
def DResidual_v1(
|
||||
input_blob,
|
||||
num_out=1,
|
||||
kernel=None,
|
||||
stride=None,
|
||||
pad="same",
|
||||
num_group=1,
|
||||
bn_is_training=True,
|
||||
data_format="NCHW",
|
||||
name=None,
|
||||
suffix="",
|
||||
):
|
||||
conv = Conv(
|
||||
input_blob=input_blob,
|
||||
num_filter=num_group,
|
||||
kernel=1,
|
||||
pad="valid",
|
||||
data_format=data_format,
|
||||
stride=[1, 1],
|
||||
bn_is_training=bn_is_training,
|
||||
name="%s%s_conv_sep" % (name, suffix),
|
||||
)
|
||||
conv_dw = Conv(
|
||||
input_blob=conv,
|
||||
num_filter=num_group,
|
||||
num_group=num_group,
|
||||
kernel=kernel,
|
||||
pad=pad,
|
||||
data_format=data_format,
|
||||
stride=stride,
|
||||
bn_is_training=bn_is_training,
|
||||
name="%s%s_conv_dw" % (name, suffix),
|
||||
)
|
||||
proj = Linear(
|
||||
input_blob=conv_dw,
|
||||
num_filter=num_out,
|
||||
kernel=1,
|
||||
pad="valid",
|
||||
data_format=data_format,
|
||||
stride=[1, 1],
|
||||
bn_is_training=bn_is_training,
|
||||
name="%s%s_conv_proj" % (name, suffix),
|
||||
)
|
||||
return proj
|
||||
|
||||
|
||||
def Residual(
|
||||
input_blob,
|
||||
num_block=1,
|
||||
num_out=1,
|
||||
kernel=None,
|
||||
stride=None,
|
||||
pad="same",
|
||||
data_format="NCHW",
|
||||
num_group=1,
|
||||
bn_is_training=True,
|
||||
name=None,
|
||||
suffix="",
|
||||
):
|
||||
identity = input_blob
|
||||
for i in range(num_block):
|
||||
shortcut = identity
|
||||
conv = DResidual_v1(
|
||||
input_blob=identity,
|
||||
num_out=num_out,
|
||||
kernel=kernel,
|
||||
stride=stride,
|
||||
pad=pad,
|
||||
data_format=data_format,
|
||||
num_group=num_group,
|
||||
|
||||
name="%s%s_block" % (name, suffix),
|
||||
suffix="%d" % i,
|
||||
)
|
||||
identity = flow.math.add(conv, shortcut)
|
||||
return identity
|
||||
|
||||
|
||||
def get_symbol(input_blob, net_blocks, config):
|
||||
num_classes = config.embedding_size
|
||||
fc_type = 'GDC'
|
||||
data_format = "NCHW"
|
||||
bn_is_training = True
|
||||
|
||||
conv_1 = Conv(
|
||||
input_blob,
|
||||
num_filter=64,
|
||||
kernel=3,
|
||||
stride=[2, 2],
|
||||
pad="same",
|
||||
data_format=data_format,
|
||||
bn_is_training=bn_is_training,
|
||||
name="conv_1",
|
||||
)
|
||||
|
||||
if net_blocks[0] == 1:
|
||||
conv_2_dw = Conv(
|
||||
conv_1,
|
||||
num_filter=64,
|
||||
kernel=3,
|
||||
stride=[1, 1],
|
||||
pad="same",
|
||||
data_format=data_format,
|
||||
num_group=64,
|
||||
bn_is_training=bn_is_training,
|
||||
name="conv_2_dw",
|
||||
)
|
||||
else:
|
||||
conv_2_dw = Residual(
|
||||
conv_1,
|
||||
num_block=net_blocks[0],
|
||||
num_out=64,
|
||||
kernel=3,
|
||||
stride=[1, 1],
|
||||
pad="same",
|
||||
data_format=data_format,
|
||||
num_group=64,
|
||||
bn_is_training=bn_is_training,
|
||||
name="res_2",
|
||||
)
|
||||
|
||||
conv_23 = DResidual_v1(
|
||||
conv_2_dw,
|
||||
num_out=64,
|
||||
kernel=3,
|
||||
stride=[2, 2],
|
||||
pad="same",
|
||||
data_format=data_format,
|
||||
num_group=128,
|
||||
bn_is_training=bn_is_training,
|
||||
name="dconv_23",
|
||||
)
|
||||
conv_3 = Residual(
|
||||
conv_23,
|
||||
num_block=net_blocks[1],
|
||||
num_out=64,
|
||||
kernel=3,
|
||||
stride=[1, 1],
|
||||
pad="same",
|
||||
data_format=data_format,
|
||||
num_group=128,
|
||||
bn_is_training=bn_is_training,
|
||||
name="res_3",
|
||||
)
|
||||
|
||||
conv_34 = DResidual_v1(
|
||||
conv_3,
|
||||
num_out=128,
|
||||
kernel=3,
|
||||
stride=[2, 2],
|
||||
pad="same",
|
||||
data_format=data_format,
|
||||
num_group=256,
|
||||
bn_is_training=bn_is_training,
|
||||
name="dconv_34",
|
||||
)
|
||||
conv_4 = Residual(
|
||||
conv_34,
|
||||
num_block=net_blocks[2],
|
||||
num_out=128,
|
||||
kernel=3,
|
||||
stride=[1, 1],
|
||||
pad="same",
|
||||
data_format=data_format,
|
||||
num_group=256,
|
||||
bn_is_training=bn_is_training,
|
||||
name="res_4",
|
||||
)
|
||||
|
||||
conv_45 = DResidual_v1(
|
||||
conv_4,
|
||||
num_out=128,
|
||||
kernel=3,
|
||||
stride=[2, 2],
|
||||
pad="same",
|
||||
data_format=data_format,
|
||||
num_group=512,
|
||||
bn_is_training=bn_is_training,
|
||||
name="dconv_45",
|
||||
)
|
||||
conv_5 = Residual(
|
||||
conv_45,
|
||||
num_block=net_blocks[3],
|
||||
num_out=128,
|
||||
kernel=3,
|
||||
stride=[1, 1],
|
||||
pad="same",
|
||||
data_format=data_format,
|
||||
num_group=256,
|
||||
bn_is_training=bn_is_training,
|
||||
name="res_5",
|
||||
)
|
||||
conv_6_sep = Conv(
|
||||
conv_5,
|
||||
num_filter=512,
|
||||
kernel=1,
|
||||
pad="valid",
|
||||
data_format=data_format,
|
||||
stride=[1, 1],
|
||||
bn_is_training=bn_is_training,
|
||||
name="conv_6sep",
|
||||
)
|
||||
fc1 = get_fc1(conv_6_sep, num_classes, fc_type, input_channel=512)
|
||||
return fc1
|
||||
|
||||
|
||||
def mobilefacenet(input_blob, cfg):
|
||||
return get_symbol(input_blob, [1, 4, 6, 2], cfg)
|
||||
@@ -1,189 +1,219 @@
|
||||
import oneflow as flow
|
||||
from .common import _batch_norm, _conv2d_layer, _avg_pool, _prelu, get_fc1
|
||||
import oneflow.nn as nn
|
||||
from typing import Type, Any, Callable, Union, List, Optional
|
||||
|
||||
|
||||
def residual_unit_v3(
|
||||
in_data, num_filter, stride, dim_match, bn_is_training, data_format, name
|
||||
):
|
||||
|
||||
suffix = ""
|
||||
use_se = 0
|
||||
bn1 = _batch_norm(
|
||||
in_data,
|
||||
epsilon=2e-5,
|
||||
is_training=bn_is_training,
|
||||
data_format=data_format,
|
||||
name="%s%s.bn1" % (name, suffix),
|
||||
)
|
||||
conv1 = _conv2d_layer(
|
||||
name="%s%s.conv1" % (name, suffix),
|
||||
input=bn1,
|
||||
filters=num_filter,
|
||||
def conv3x3(
|
||||
in_planes: int, out_planes: int, stride: int = 1, groups: int = 1, dilation: int = 1
|
||||
) -> nn.Conv2d:
|
||||
"""3x3 convolution with padding"""
|
||||
return nn.Conv2d(
|
||||
in_planes,
|
||||
out_planes,
|
||||
kernel_size=3,
|
||||
strides=[1, 1],
|
||||
padding="same",
|
||||
data_format=data_format,
|
||||
use_bias=False,
|
||||
dilation_rate=1,
|
||||
activation=None,
|
||||
)
|
||||
bn2 = _batch_norm(
|
||||
conv1,
|
||||
epsilon=2e-5,
|
||||
is_training=bn_is_training,
|
||||
data_format=data_format,
|
||||
name="%s%s.bn2" % (name, suffix),
|
||||
)
|
||||
prelu = _prelu(bn2, data_format=data_format,
|
||||
name="%s%s_relu1" % (name, suffix))
|
||||
conv2 = _conv2d_layer(
|
||||
name="%s%s.conv2" % (name, suffix),
|
||||
input=prelu,
|
||||
filters=num_filter,
|
||||
kernel_size=3,
|
||||
strides=stride,
|
||||
padding="same",
|
||||
data_format=data_format,
|
||||
use_bias=False,
|
||||
dilation_rate=1,
|
||||
activation=None,
|
||||
)
|
||||
bn3 = _batch_norm(
|
||||
conv2,
|
||||
epsilon=2e-5,
|
||||
is_training=bn_is_training,
|
||||
data_format=data_format,
|
||||
name="%s%s.bn3" % (name, suffix),
|
||||
stride=stride,
|
||||
padding=dilation,
|
||||
groups=groups,
|
||||
bias=False,
|
||||
dilation=dilation,
|
||||
)
|
||||
|
||||
if use_se:
|
||||
# se begin
|
||||
input_blob = _avg_pool(
|
||||
bn3, pool_size=[7, 7], strides=[1, 1], padding="VALID"
|
||||
)
|
||||
input_blob = _conv2d_layer(
|
||||
name="%s%s_se_conv1" % (name, suffix),
|
||||
input=input_blob,
|
||||
filters=num_filter // 16,
|
||||
kernel_size=1,
|
||||
strides=[1, 1],
|
||||
padding="valid",
|
||||
data_format=data_format,
|
||||
use_bias=True,
|
||||
dilation_rate=1,
|
||||
activation=None,
|
||||
)
|
||||
input_blob = _prelu(input_blob, name="%s%s_se_relu1" % (name, suffix))
|
||||
input_blob = _conv2d_layer(
|
||||
name="%s%s_se_conv2" % (name, suffix),
|
||||
input=input_blob,
|
||||
filters=num_filter,
|
||||
kernel_size=1,
|
||||
strides=[1, 1],
|
||||
padding="valid",
|
||||
data_format=data_format,
|
||||
use_bias=True,
|
||||
dilation_rate=1,
|
||||
activation=None,
|
||||
)
|
||||
input_blob = flow.math.sigmoid(input=input_blob)
|
||||
bn3 = flow.math.multiply(x=input_blob, y=bn3)
|
||||
# se end
|
||||
|
||||
if dim_match:
|
||||
input_blob = in_data
|
||||
else:
|
||||
input_blob = _conv2d_layer(
|
||||
name="%s%s.downsample.0" % (name, suffix),
|
||||
input=in_data,
|
||||
filters=num_filter,
|
||||
kernel_size=1,
|
||||
strides=stride,
|
||||
padding="valid",
|
||||
data_format=data_format,
|
||||
use_bias=False,
|
||||
dilation_rate=1,
|
||||
activation=None,
|
||||
)
|
||||
input_blob = _batch_norm(
|
||||
input_blob,
|
||||
epsilon=2e-5,
|
||||
is_training=bn_is_training,
|
||||
data_format=data_format,
|
||||
name="%s%s.downsample.1" % (name, suffix),
|
||||
)
|
||||
|
||||
identity = flow.math.add(x=bn3, y=input_blob)
|
||||
return identity
|
||||
def conv1x1(in_planes: int, out_planes: int, stride: int = 1) -> nn.Conv2d:
|
||||
"""1x1 convolution"""
|
||||
return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride, bias=False)
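
The new `conv3x3` helper passes `padding=dilation`, which keeps the spatial size unchanged at stride 1. The sketch below checks this with the usual convolution output-size formula in plain Python. The 112x112 input resolution is an assumption (it is not stated in this diff). With it, the four stride-2 stages of `IResNet` end at 7x7, which would match `fc_scale = 7 * 7` in that class.

```python
def conv_out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of one spatial dimension of a 2-D convolution."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# conv3x3 uses padding == dilation, so stride 1 preserves the size.
assert conv_out_size(112, 3, stride=1, padding=1, dilation=1) == 112
assert conv_out_size(112, 3, stride=1, padding=2, dilation=2) == 112

# conv1x1 with stride 2 (the shortcut path) halves the size as well.
assert conv_out_size(112, 1, stride=2) == 56

size = 112  # assumed input resolution, not taken from the diff
for _ in range(4):  # layer1..layer4 each start with a stride-2 block
    size = conv_out_size(size, 3, stride=2, padding=1)
print(size)  # prints 7
```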
|
||||
|
||||
|
||||
def get_symbol(input_blob, units, cfg):
|
||||
filter_list = [64, 64, 128, 256, 512]
|
||||
num_stages = 4
|
||||
units = units
|
||||
class IBasicBlock(nn.Module):
|
||||
expansion = 1
|
||||
|
||||
num_classes = cfg.embedding_size
|
||||
def __init__(
|
||||
self,
|
||||
inplanes,
|
||||
planes,
|
||||
stride=1,
|
||||
downsample=None,
|
||||
groups=1,
|
||||
base_width=64,
|
||||
dilation=1,
|
||||
):
|
||||
super(IBasicBlock, self).__init__()
|
||||
if groups != 1 or base_width != 64:
|
||||
raise ValueError("BasicBlock only supports groups=1 and base_width=64")
|
||||
if dilation > 1:
|
||||
raise NotImplementedError("Dilation > 1 not supported in BasicBlock")
|
||||
self.bn1 = nn.BatchNorm2d(inplanes, eps=1e-05,)
|
||||
self.conv1 = conv3x3(inplanes, planes)
|
||||
self.bn2 = nn.BatchNorm2d(planes, eps=1e-05,)
|
||||
self.prelu = nn.ReLU(planes)
|
||||
self.conv2 = conv3x3(planes, planes, stride)
|
||||
self.bn3 = nn.BatchNorm2d(planes, eps=1e-05,)
|
||||
self.downsample = downsample
|
||||
self.stride = stride
|
||||
|
||||
fc_type = cfg.fc_type
|
||||
bn_is_training = True
|
||||
data_format = "NCHW"
|
||||
def forward(self, x):
|
||||
identity = x
|
||||
out = self.bn1(x)
|
||||
out = self.conv1(out)
|
||||
out = self.bn2(out)
|
||||
out = self.prelu(out)
|
||||
out = self.conv2(out)
|
||||
out = self.bn3(out)
|
||||
if self.downsample is not None:
|
||||
identity = self.downsample(x)
|
||||
out += identity
|
||||
return out
|
||||
|
||||
input_blob = _conv2d_layer(
|
||||
name="conv1",
|
||||
input=input_blob,
|
||||
filters=filter_list[0],
|
||||
kernel_size=3,
|
||||
strides=[1, 1],
|
||||
padding="same",
|
||||
data_format=data_format,
|
||||
use_bias=False,
|
||||
dilation_rate=1,
|
||||
activation=None,
|
||||
)
|
||||
input_blob = _batch_norm(
|
||||
input_blob, epsilon=2e-5, is_training=bn_is_training, data_format=data_format, name="bn1"
|
||||
)
|
||||
input_blob = _prelu(input_blob, data_format=data_format, name="relu0")
|
||||
|
||||
for i in range(num_stages):
|
||||
input_blob = residual_unit_v3(
|
||||
input_blob,
|
||||
filter_list[i + 1],
|
||||
[2, 2],
|
||||
False,
|
||||
bn_is_training=bn_is_training,
|
||||
data_format=data_format,
|
||||
name="layer%d.%d" % (i + 1, 0),
|
||||
)
|
||||
for j in range(units[i] - 1):
|
||||
input_blob = residual_unit_v3(
|
||||
input_blob,
|
||||
filter_list[i + 1],
|
||||
[1, 1],
|
||||
True,
|
||||
bn_is_training=bn_is_training,
|
||||
data_format=data_format,
|
||||
name="layer%d.%d" % (i + 1, j + 1),
|
||||
class IResNet(nn.Module):
|
||||
fc_scale = 7 * 7
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
block,
|
||||
layers,
|
||||
dropout=0,
|
||||
num_features=512,
|
||||
zero_init_residual=False,
|
||||
groups=1,
|
||||
width_per_group=64,
|
||||
replace_stride_with_dilation=None,
|
||||
fp16=False,
|
||||
):
|
||||
super(IResNet, self).__init__()
|
||||
self.fp16 = fp16
|
||||
self.inplanes = 64
|
||||
self.dilation = 1
|
||||
if replace_stride_with_dilation is None:
|
||||
replace_stride_with_dilation = [False, False, False]
|
||||
if len(replace_stride_with_dilation) != 3:
|
||||
raise ValueError(
|
||||
"replace_stride_with_dilation should be None "
|
||||
"or a 3-element tuple, got {}".format(replace_stride_with_dilation)
|
||||
)
|
||||
fc1 = get_fc1(input_blob, num_classes, fc_type)
|
||||
return fc1
|
||||
self.groups = groups
|
||||
self.base_width = width_per_group
|
||||
self.conv1 = nn.Conv2d(
|
||||
3, self.inplanes, kernel_size=3, stride=1, padding=1, bias=False
|
||||
)
|
||||
self.bn1 = nn.BatchNorm2d(self.inplanes, eps=1e-05)
|
||||
self.prelu = nn.ReLU(self.inplanes)
|
||||
self.layer1 = self._make_layer(block, 64, layers[0], stride=2)
|
||||
self.layer2 = self._make_layer(
|
||||
block, 128, layers[1], stride=2, dilate=replace_stride_with_dilation[0]
|
||||
)
|
||||
self.layer3 = self._make_layer(
|
||||
block, 256, layers[2], stride=2, dilate=replace_stride_with_dilation[1]
|
||||
)
|
||||
self.layer4 = self._make_layer(
|
||||
block, 512, layers[3], stride=2, dilate=replace_stride_with_dilation[2]
|
||||
)
|
||||
self.bn2 = nn.BatchNorm2d(512 * block.expansion, eps=1e-05,)
|
||||
self.dropout = nn.Dropout(p=dropout, inplace=True)
|
||||
self.fc = nn.Linear(512 * block.expansion * self.fc_scale, num_features)
|
||||
self.features = nn.BatchNorm1d(num_features, eps=1e-05)
|
||||
nn.init.constant_(self.features.weight, 1.0)
|
||||
self.features.weight.requires_grad = False
|
||||
|
||||
for m in self.modules():
|
||||
if isinstance(m, nn.Conv2d):
|
||||
nn.init.normal_(m.weight, 0, 0.1)
|
||||
elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
|
||||
nn.init.constant_(m.weight, 1)
|
||||
nn.init.constant_(m.bias, 0)
|
||||
|
||||
if zero_init_residual:
|
||||
for m in self.modules():
|
||||
if isinstance(m, IBasicBlock):
|
||||
nn.init.constant_(m.bn2.weight, 0)
|
||||
|
||||
def _make_layer(self, block, planes, blocks, stride=1, dilate=False):
|
||||
downsample = None
|
||||
previous_dilation = self.dilation
|
||||
if dilate:
|
||||
self.dilation *= stride
|
||||
stride = 1
|
||||
if stride != 1 or self.inplanes != planes * block.expansion:
|
||||
downsample = nn.Sequential(
|
||||
conv1x1(self.inplanes, planes * block.expansion, stride),
|
||||
nn.BatchNorm2d(planes * block.expansion, eps=1e-05,),
|
||||
)
|
||||
layers = []
|
||||
layers.append(
|
||||
block(
|
||||
self.inplanes,
|
||||
planes,
|
||||
stride,
|
||||
downsample,
|
||||
self.groups,
|
||||
self.base_width,
|
||||
previous_dilation,
|
||||
)
|
||||
)
|
||||
self.inplanes = planes * block.expansion
|
||||
for _ in range(1, blocks):
|
||||
layers.append(
|
||||
block(
|
||||
self.inplanes,
|
||||
planes,
|
||||
groups=self.groups,
|
||||
base_width=self.base_width,
|
||||
dilation=self.dilation,
|
||||
)
|
||||
)
|
||||
|
||||
return nn.Sequential(*layers)
|
||||
|
||||
def forward(self, x):
|
||||
|
||||
x = self.conv1(x)
|
||||
x = self.bn1(x)
|
||||
x = self.prelu(x)
|
||||
x = self.layer1(x)
|
||||
x = self.layer2(x)
|
||||
x = self.layer3(x)
|
||||
x = self.layer4(x)
|
||||
x = self.bn2(x)
|
||||
x = flow.flatten(x, 1)
|
||||
x = self.dropout(x)
|
||||
x = self.fc(x)
|
||||
x = self.features(x)
|
||||
|
||||
return x
|
||||
|
||||
|
||||
def iresnet18(input_blob, cfg):
|
||||
return get_symbol([2, 2, 2, 2], cfg)
|
||||
def _iresnet(arch, block, layers, pretrained, progress, **kwargs):
|
||||
model = IResNet(block, layers, **kwargs)
|
||||
if pretrained:
|
||||
raise ValueError()
|
||||
return model
|
||||
|
||||
|
||||
def iresnet34(input_blob, cfg):
|
||||
return get_symbol(input_blob, [3, 4, 6, 3], cfg)
|
||||
def iresnet18(pretrained=False, progress=True, **kwargs):
|
||||
return _iresnet(
|
||||
"iresnet18", IBasicBlock, [2, 2, 2, 2], pretrained, progress, **kwargs
|
||||
)
|
||||
|
||||
|
||||
def iresnet50(input_blob, cfg):
|
||||
return get_symbol(input_blob, [3, 4, 14, 3], cfg)
|
||||
def iresnet34(pretrained=False, progress=True, **kwargs):
|
||||
return _iresnet(
|
||||
"iresnet34", IBasicBlock, [3, 4, 6, 3], pretrained, progress, **kwargs
|
||||
)
|
||||
|
||||
|
||||
def iresnet100(input_blob, cfg):
|
||||
return get_symbol(input_blob, [3, 13, 30, 3], cfg)
|
||||
def iresnet50(pretrained=False, progress=True, **kwargs):
|
||||
return _iresnet(
|
||||
"iresnet50", IBasicBlock, [3, 4, 14, 3], pretrained, progress, **kwargs
|
||||
)
|
||||
|
||||
|
||||
def iresnet200(input_blob, cfg):
|
||||
return get_symbol(input_blob, [6, 26, 60, 6], cfg)
|
||||
def iresnet100(pretrained=False, progress=True, **kwargs):
|
||||
return _iresnet(
|
||||
"iresnet100", IBasicBlock, [3, 13, 30, 3], pretrained, progress, **kwargs
|
||||
)
|
||||
|
||||
|
||||
def iresnet200(pretrained=False, progress=True, **kwargs):
|
||||
return _iresnet(
|
||||
"iresnet200", IBasicBlock, [6, 26, 60, 6], pretrained, progress, **kwargs
|
||||
)
|
||||
|
||||
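Note on `fc_scale = 7 * 7` in `IResNet`: the constant appears to match the spatial size of the last feature map for 112x112 aligned face crops (the same size used as `input_size = (112, 112)` in `ArcFaceORT`). The following is a minimal sketch of that arithmetic, not code from this commit. It assumes `conv3x3` uses `padding=1` (its definition falls outside this excerpt), and `conv_out_size` and `final_feature_map` are hypothetical helper names introduced only for illustration.

```python
def conv_out_size(size, kernel=3, stride=1, padding=1):
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def final_feature_map(size=112, num_stages=4):
    # Stem: 3x3 conv with stride 1 and padding 1 keeps the resolution.
    size = conv_out_size(size, stride=1)
    # Each of the four stages begins with one stride-2 block
    # (assumption: conv3x3 pads by 1, as in the stem).
    for _ in range(num_stages):
        size = conv_out_size(size, stride=2)
    return size


side = final_feature_map()
# Mirrors 512 * block.expansion * fc_scale with expansion = 1.
fc_in_features = 512 * 1 * side * side
print(side, fc_in_features)
```

Under these assumptions the resolution halves four times (112, 56, 28, 14, 7), which is why the `nn.Linear` layer takes `512 * 7 * 7` input features. A different input resolution would need a different `fc_scale`.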
@@ -1,68 +1,53 @@
|
||||
from pickle import TRUE
|
||||
from easydict import EasyDict as edict
|
||||
import math
|
||||
import numpy as np
|
||||
|
||||
# make training faster
|
||||
# our RAM is 256G
|
||||
# mount -t tmpfs -o size=140G tmpfs /train_tmp
|
||||
|
||||
config = edict()
|
||||
config.loss = "cosface"
|
||||
config.loss = "arcface"
|
||||
config.network = "r50"
|
||||
config.resume = False
|
||||
config.output = "ms1mv3_arcface_r50"
|
||||
|
||||
config.dataset = "ms1m-retinaface-t1"
|
||||
config.embedding_size = 512
|
||||
config.sample_rate = 1
|
||||
config.fp16 = True
|
||||
config.fp16 = False
|
||||
config.model_parallel = False
|
||||
config.sample_rate = 1.0
|
||||
config.partial_fc = False
|
||||
config.graph = True
|
||||
config.synthetic = False
|
||||
config.scale_grad = False
|
||||
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.1 # batch size is 512
|
||||
config.model_load_dir = ''
|
||||
|
||||
config.val_batch_size = 10
|
||||
|
||||
config.node_ips = ["192.168.1.13"]
|
||||
config.num_nodes = 1
|
||||
config.device_num_per_node = 1
|
||||
config.model_parallel = 1
|
||||
config.partial_fc = 0
|
||||
|
||||
config.use_synthetic_data = False
|
||||
|
||||
|
||||
config.fc_type = "FC"
|
||||
config.nccl_fusion_threshold_mb = 16
|
||||
config.nccl_fusion_max_ops = 64
|
||||
config.val_dataset_dir = "/train_tmp/glint360k/val"
|
||||
|
||||
|
||||
config.part_name_prefix = "part-"
|
||||
config.part_name_suffix_length = 5
|
||||
config.train_data_part_num = 16
|
||||
config.shuffle = True
|
||||
|
||||
|
||||
config.val_image_num = {"lfw": 12000, "cfp_fp": 14000, "agedb_30": 12000}
|
||||
if config.dataset == "emore":
|
||||
config.ofrecord_path = "/train_tmp/faces_emore"
|
||||
config.num_classes = 85742
|
||||
config.num_image = 5822653
|
||||
config.num_epoch = 16
|
||||
config.warmup_epoch = -1
|
||||
config.decay_epoch = [8, 14, ]
|
||||
config.val_targets = ["lfw", ]
|
||||
config.train_data_part_num = 32
|
||||
config.decay_epoch = [
|
||||
8,
|
||||
14,
|
||||
]
|
||||
config.val_targets = [
|
||||
"lfw",
|
||||
]
|
||||
|
||||
elif config.dataset == "ms1m-retinaface-t1":
|
||||
config.ofrecord_path = "/dev/shm/ms1m-retinaface-t1/ofrecord"
|
||||
config.num_classes = 93432
|
||||
config.num_classes = 93431
|
||||
config.num_image = 5179510
|
||||
config.num_epoch = 25
|
||||
config.warmup_epoch = -1
|
||||
config.decay_epoch = [11, 17, 22]
|
||||
config.val_targets = ["lfw", "cfp_fp", "agedb_30"]
|
||||
config.train_data_part_num = 32
|
||||
|
||||
elif config.dataset == "glint360k":
|
||||
config.ofrecord_path = "/train_tmp/glint360k"
|
||||
|
||||
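Note on the epoch-based settings: `config.batch_size = 128` together with the comment `# batch size is 512` suggests a per-device batch of 128 on 4 devices. The sketch below shows how `num_image`, `num_epoch` and `decay_epoch` would translate into iteration counts under that reading. The per-device interpretation, the 4-device count, and the helper names `steps_per_epoch` and `decay_steps` are assumptions for illustration, not taken from the training script.

```python
def steps_per_epoch(num_image, batch_size_per_device, num_devices):
    # Number of full global batches in one pass over the data.
    global_batch = batch_size_per_device * num_devices
    return num_image // global_batch


def decay_steps(decay_epoch, per_epoch):
    # Iteration indices at which an epoch-based LR decay would trigger.
    return [epoch * per_epoch for epoch in decay_epoch]


# Values from the "ms1m-retinaface-t1" branch of the config.
per_epoch = steps_per_epoch(num_image=5179510, batch_size_per_device=128, num_devices=4)
boundaries = decay_steps([11, 17, 22], per_epoch)
total_steps = 25 * per_epoch
print(per_epoch, boundaries, total_steps)
```

If the run uses a different number of devices, only `num_devices` changes. The learning rate of 0.1 is stated for a global batch of 512, so it may need rescaling for other global batch sizes.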
@@ -1,35 +0,0 @@
|
||||
from easydict import EasyDict as edict
|
||||
|
||||
# make training faster
|
||||
# our RAM is 256G
|
||||
# mount -t tmpfs -o size=140G tmpfs /train_tmp
|
||||
|
||||
config = edict()
|
||||
config.loss = "cosface"
|
||||
config.network = "r100"
|
||||
config.resume = False
|
||||
config.output = "lazy_r100"
|
||||
config.embedding_size = 512
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.1 # batch size is 512
|
||||
config.model_parallel = True
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 1.0
|
||||
config.device_num_per_node = 8
|
||||
|
||||
|
||||
config.ofrecord_path = "/dev/shm/faces_emore/ofrecord/train"
|
||||
config.eval_ofrecord_path = "/dev/shm/faces_emore/ofrecord/val"
|
||||
config.num_classes = 85742
|
||||
config.num_image = 5822653
|
||||
config.num_epoch = 16
|
||||
config.warmup_epoch = -1
|
||||
config.decay_epoch = [8, 14, ]
|
||||
config.val_targets = ["lfw", "cfp_fp", "agedb_30"]
|
||||
|
||||
|
||||
config.node_ips = ["192.168.1.13"]
|
||||
config.num_nodes = 1
|
||||
@@ -8,26 +8,20 @@ config = edict()
|
||||
config.loss = "cosface"
|
||||
config.network = "mbf"
|
||||
config.resume = False
|
||||
config.output = "lazy_mbf"
|
||||
config.embedding_size = 128
|
||||
config.output = None
|
||||
config.embedding_size = 512
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 0.1
|
||||
config.model_parallel = True
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.weight_decay = 2e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.1 # batch size is 512
|
||||
config.model_parallel = True
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 1.0
|
||||
config.device_num_per_node = 8
|
||||
|
||||
|
||||
config.ofrecord_path = "/train_tmp/glint360k/train"
|
||||
config.eval_ofrecord_path = "/train_tmp/glint360k/val"
|
||||
config.num_classes = 93432
|
||||
config.num_image = 5179510
|
||||
config.train_data_part_num = 200
|
||||
|
||||
config.ofrecord_path = "/train_tmp/glint360k"
|
||||
config.dataset = "glint360k"
|
||||
config.ofrecord_path = "/train_tmp/glint360k/"
|
||||
config.ofrecord_part_num = 200
|
||||
config.num_classes = 360232
|
||||
config.num_image = 17091657
|
||||
config.num_epoch = 20
|
||||
|
||||
@@ -8,22 +8,21 @@ config = edict()
|
||||
config.loss = "cosface"
|
||||
config.network = "r100"
|
||||
config.resume = False
|
||||
config.output = "lazy_r100"
|
||||
config.output = None
|
||||
config.embedding_size = 512
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 0.1
|
||||
config.model_parallel = True
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.1 # batch size is 512
|
||||
config.model_parallel = True
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 1.0
|
||||
config.device_num_per_node = 8
|
||||
|
||||
|
||||
config.ofrecord_path = "/train_tmp/glint360k/train"
|
||||
config.eval_ofrecord_path = "/train_tmp/glint360k/val"
|
||||
config.train_data_part_num = 200
|
||||
config.dataset = "glint360k"
|
||||
config.ofrecord_path = "/train_tmp/glint360k/"
|
||||
config.ofrecord_part_num = 200
|
||||
config.num_classes = 360232
|
||||
config.num_image = 17091657
|
||||
config.num_epoch = 20
|
||||
|
||||
@@ -8,26 +8,20 @@ config = edict()
|
||||
config.loss = "cosface"
|
||||
config.network = "r18"
|
||||
config.resume = False
|
||||
config.output = "lazy_r18"
|
||||
config.output = None
|
||||
config.embedding_size = 512
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 0.1
|
||||
config.model_parallel = True
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.1 # batch size is 512
|
||||
config.model_parallel = True
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 1.0
|
||||
config.device_num_per_node = 8
|
||||
|
||||
|
||||
config.ofrecord_path = "/train_tmp/glint360k/train"
|
||||
config.eval_ofrecord_path = "/train_tmp/glint360k/val"
|
||||
config.num_classes = 93432
|
||||
config.num_image = 5179510
|
||||
config.train_data_part_num = 200
|
||||
|
||||
config.ofrecord_path = "/train_tmp/glint360k"
|
||||
config.dataset = "glint360k"
|
||||
config.ofrecord_path = "/train_tmp/glint360k/"
|
||||
config.ofrecord_part_num = 200
|
||||
config.num_classes = 360232
|
||||
config.num_image = 17091657
|
||||
config.num_epoch = 20
|
||||
|
||||
@@ -8,26 +8,20 @@ config = edict()
|
||||
config.loss = "cosface"
|
||||
config.network = "r34"
|
||||
config.resume = False
|
||||
config.output = "lazy_r34"
|
||||
config.output = None
|
||||
config.embedding_size = 512
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 0.1
|
||||
config.model_parallel = True
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.1 # batch size is 512
|
||||
config.model_parallel = True
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 1.0
|
||||
config.device_num_per_node = 8
|
||||
|
||||
|
||||
config.ofrecord_path = "/train_tmp/glint360k/train"
|
||||
config.eval_ofrecord_path = "/train_tmp/glint360k/val"
|
||||
config.num_classes = 93432
|
||||
config.num_image = 5179510
|
||||
config.train_data_part_num = 200
|
||||
|
||||
config.ofrecord_path = "/train_tmp/glint360k"
|
||||
config.dataset = "glint360k"
|
||||
config.ofrecord_path = "/train_tmp/glint360k/"
|
||||
config.ofrecord_part_num = 200
|
||||
config.num_classes = 360232
|
||||
config.num_image = 17091657
|
||||
config.num_epoch = 20
|
||||
|
||||
@@ -8,26 +8,20 @@ config = edict()
|
||||
config.loss = "cosface"
|
||||
config.network = "r50"
|
||||
config.resume = False
|
||||
config.output = "lazy_r50"
|
||||
config.output = None
|
||||
config.embedding_size = 512
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 0.1
|
||||
config.model_parallel = True
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.1 # batch size is 512
|
||||
config.model_parallel = True
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 1.0
|
||||
config.device_num_per_node = 8
|
||||
|
||||
|
||||
config.ofrecord_path = "/train_tmp/glint360k/train"
|
||||
config.eval_ofrecord_path = "/train_tmp/glint360k/val"
|
||||
config.num_classes = 93432
|
||||
config.num_image = 5179510
|
||||
config.train_data_part_num = 200
|
||||
|
||||
config.ofrecord_path = "/train_tmp/glint360k"
|
||||
config.dataset = "glint360k"
|
||||
config.ofrecord_path = "/train_tmp/glint360k/"
|
||||
config.ofrecord_part_num = 200
|
||||
config.num_classes = 360232
|
||||
config.num_image = 17091657
|
||||
config.num_epoch = 20
|
||||
|
||||
@@ -1,37 +0,0 @@
|
||||
from easydict import EasyDict as edict
|
||||
|
||||
# make training faster
|
||||
# our RAM is 256G
|
||||
# mount -t tmpfs -o size=140G tmpfs /train_tmp
|
||||
|
||||
config = edict()
|
||||
config.loss = "cosface"
|
||||
config.network = "r100"
|
||||
config.resume = False
|
||||
config.output = "lazy_r100"
|
||||
config.embedding_size = 512
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.1 # batch size is 512
|
||||
config.model_parallel = True
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 1.0
|
||||
config.device_num_per_node = 8
|
||||
|
||||
|
||||
config.ofrecord_path = "/dev/shm/ms1m-retinaface-t1/ofrecord/train"
|
||||
config.eval_ofrecord_path = "/dev/shm/ms1m-retinaface-t1/ofrecord/val"
|
||||
config.num_classes = 93432
|
||||
config.num_image = 5179510
|
||||
config.train_data_part_num = 32
|
||||
|
||||
config.num_epoch = 25
|
||||
config.warmup_epoch = -1
|
||||
config.decay_epoch = [10, 16, 22]
|
||||
config.val_targets = ["lfw", "cfp_fp", "agedb_30"]
|
||||
#config.val_targets = []
|
||||
|
||||
config.node_ips = ["192.168.1.13"]
|
||||
config.num_nodes = 1
|
||||
@@ -5,33 +5,25 @@ from easydict import EasyDict as edict
|
||||
# mount -t tmpfs -o size=140G tmpfs /train_tmp
|
||||
|
||||
config = edict()
|
||||
config.loss = "cosface"
|
||||
config.loss = "arcface"
|
||||
config.network = "mbf"
|
||||
config.resume = False
|
||||
config.output = "lazy_mbf"
|
||||
config.embedding_size = 128
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.1 # batch size is 512
|
||||
config.output = None
|
||||
config.embedding_size = 512
|
||||
config.model_parallel = True
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 1.0
|
||||
config.device_num_per_node = 8
|
||||
config.sample_rate = 0.1
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 2e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.1 # batch size is 512
|
||||
|
||||
|
||||
config.ofrecord_path = "/dev/shm/ms1m-retinaface-t1/ofrecord/train"
|
||||
config.eval_ofrecord_path = "/dev/shm/ms1m-retinaface-t1/ofrecord/val"
|
||||
config.num_classes = 93432
|
||||
config.ofrecord_path = "/train_tmp/ms1m-retinaface-t1"
|
||||
config.ofrecord_part_num = 8
|
||||
config.num_classes = 93431
|
||||
config.num_image = 5179510
|
||||
config.train_data_part_num = 32
|
||||
|
||||
config.num_epoch = 25
|
||||
config.num_epoch = 30
|
||||
config.warmup_epoch = -1
|
||||
config.decay_epoch = [10, 16, 22]
|
||||
config.decay_epoch = [10, 20, 25]
|
||||
config.val_targets = ["lfw", "cfp_fp", "agedb_30"]
|
||||
|
||||
|
||||
config.node_ips = ["192.168.1.13"]
|
||||
config.num_nodes = 1
|
||||
|
||||
@@ -5,33 +5,25 @@ from easydict import EasyDict as edict
|
||||
# mount -t tmpfs -o size=140G tmpfs /train_tmp
|
||||
|
||||
config = edict()
|
||||
config.loss = "cosface"
|
||||
config.loss = "arcface"
|
||||
config.network = "r18"
|
||||
config.resume = False
|
||||
config.output = "lazy_r18"
|
||||
config.output = None
|
||||
config.embedding_size = 512
|
||||
config.model_parallel = True
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 0.1
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.1 # batch size is 512
|
||||
config.model_parallel = True
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 1.0
|
||||
config.device_num_per_node = 8
|
||||
|
||||
|
||||
config.ofrecord_path = "/dev/shm/ms1m-retinaface-t1/ofrecord/train"
|
||||
config.eval_ofrecord_path = "/dev/shm/ms1m-retinaface-t1/ofrecord/val"
|
||||
config.num_classes = 93432
|
||||
config.ofrecord_path = "/train_tmp/ms1m-retinaface-t1"
|
||||
config.ofrecord_part_num = 8
|
||||
config.num_classes = 93431
|
||||
config.num_image = 5179510
|
||||
config.train_data_part_num = 32
|
||||
|
||||
config.num_epoch = 25
|
||||
config.warmup_epoch = -1
|
||||
config.decay_epoch = [10, 16, 22]
|
||||
config.val_targets = ["lfw", "cfp_fp", "agedb_30"]
|
||||
#config.val_targets = []
|
||||
|
||||
config.node_ips = ["192.168.1.13"]
|
||||
config.num_nodes = 1
|
||||
|
||||
@@ -5,33 +5,26 @@ from easydict import EasyDict as edict
|
||||
# mount -t tmpfs -o size=140G tmpfs /train_tmp
|
||||
|
||||
config = edict()
|
||||
config.loss = "cosface"
|
||||
config.loss = "arcface"
|
||||
config.network = "r34"
|
||||
config.resume = False
|
||||
config.output = "lazy_r34"
|
||||
config.output = None
|
||||
config.embedding_size = 512
|
||||
config.model_parallel = True
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 0.1
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.1 # batch size is 512
|
||||
config.model_parallel = True
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 1.0
|
||||
config.device_num_per_node = 8
|
||||
|
||||
|
||||
config.ofrecord_path = "/dev/shm/ms1m-retinaface-t1/ofrecord/train"
|
||||
config.eval_ofrecord_path = "/dev/shm/ms1m-retinaface-t1/ofrecord/val"
|
||||
config.num_classes = 93432
|
||||
config.ofrecord_path = "/train_tmp/ms1m-retinaface-t1"
|
||||
config.ofrecord_part_num = 8
|
||||
config.num_classes = 93431
|
||||
config.num_image = 5179510
|
||||
config.train_data_part_num = 32
|
||||
|
||||
config.num_epoch = 25
|
||||
config.warmup_epoch = -1
|
||||
config.decay_epoch = [10, 16, 22]
|
||||
config.val_targets = ["lfw", "cfp_fp", "agedb_30"]
|
||||
#config.val_targets = []
|
||||
|
||||
config.node_ips = ["192.168.1.13"]
|
||||
config.num_nodes = 1
|
||||
|
||||
@@ -8,30 +8,23 @@ config = edict()
|
||||
config.loss = "cosface"
|
||||
config.network = "r50"
|
||||
config.resume = False
|
||||
config.output = "lazy_r50"
|
||||
config.output = "partial_fc"
|
||||
config.embedding_size = 512
|
||||
config.model_parallel = True
|
||||
config.partial_fc = 0
|
||||
config.sample_rate = 0.1
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.1 # batch size is 512
|
||||
config.model_parallel = True
|
||||
config.partial_fc = 1
|
||||
config.sample_rate = 1.0
|
||||
config.device_num_per_node = 8
|
||||
|
||||
|
||||
config.ofrecord_path = "/dev/shm/ms1m-retinaface-t1/ofrecord/train"
|
||||
config.eval_ofrecord_path = "/dev/shm/ms1m-retinaface-t1/ofrecord/val"
|
||||
config.ofrecord_path = "/train_tmp/ms1m-retinaface-t1/ofrecord/"
|
||||
config.ofrecord_part_num = 8
|
||||
config.num_classes = 93432
|
||||
config.num_image = 5179510
|
||||
config.train_data_part_num = 32
|
||||
|
||||
config.num_epoch = 25
|
||||
config.warmup_epoch = -1
|
||||
config.decay_epoch = [10, 16, 22]
|
||||
config.val_targets = ["lfw", "cfp_fp", "agedb_30"]
|
||||
|
||||
|
||||
config.node_ips = ["192.168.1.13"]
|
||||
config.num_nodes = 1
|
||||
|
||||
@@ -8,17 +8,17 @@ config.network = "r50"
|
||||
config.resume = False
|
||||
config.output = None
|
||||
config.embedding_size = 512
|
||||
config.model_parallel = True
|
||||
config.sample_rate = 1.0
|
||||
config.fp16 = True
|
||||
config.fp16 = False
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.1 # batch size is 512
|
||||
|
||||
config.rec = "synthetic"
|
||||
config.num_classes = 100 * 10000
|
||||
config.synthetic = True
|
||||
config.num_classes = 100000
|
||||
config.num_epoch = 30
|
||||
config.warmup_epoch = -1
|
||||
config.decay_epoch = [10, 16, 22]
|
||||
config.val_targets = []
|
||||
config.use_synthetic_data = True
|
||||
|
||||
@@ -1 +1,2 @@
|
||||
python oneflow2onnx.py configs/ms1mv3_r50 --model_path work_dir/lazy_r50/snapshot_0
|
||||
|
||||
python3 oneflow2onnx.py configs/ms1mv3_r50 --model_path /workdir/epoch_0
|
||||
|
||||
@@ -25,13 +25,13 @@ class ArcFaceORT:
|
||||
return "model_path should be directory"
|
||||
onnx_files = []
|
||||
for _file in os.listdir(self.model_path):
|
||||
print('file_:', _file)
|
||||
if _file.endswith('.onnx'):
|
||||
print("file_:", _file)
|
||||
if _file.endswith(".onnx"):
|
||||
onnx_files.append(osp.join(self.model_path, _file))
|
||||
if len(onnx_files) == 0:
|
||||
return "do not have onnx files"
|
||||
self.model_file = sorted(onnx_files)[-1]
|
||||
print('use onnx-model:', self.model_file)
|
||||
print("use onnx-model:", self.model_file)
|
||||
try:
|
||||
session = onnxruntime.InferenceSession(self.model_file, None)
|
||||
except:
|
||||
@@ -39,18 +39,18 @@ class ArcFaceORT:
|
||||
|
||||
input_cfg = session.get_inputs()[0]
|
||||
input_shape = input_cfg.shape
|
||||
print('input-shape:', input_shape)
|
||||
print("input-shape:", input_shape)
|
||||
if len(input_shape) != 4:
|
||||
return "length of input_shape should be 4"
|
||||
if not isinstance(input_shape[0], str):
|
||||
# return "input_shape[0] should be str to support batch-inference"
|
||||
print('reset input-shape[0] to None')
|
||||
print("reset input-shape[0] to None")
|
||||
model = onnx.load(self.model_file)
|
||||
model.graph.input[0].type.tensor_type.shape.dim[0].dim_param = 'None'
|
||||
new_model_file = osp.join(self.model_path, 'zzzzrefined.onnx')
|
||||
model.graph.input[0].type.tensor_type.shape.dim[0].dim_param = "None"
|
||||
new_model_file = osp.join(self.model_path, "zzzzrefined.onnx")
|
||||
onnx.save(model, new_model_file)
|
||||
self.model_file = new_model_file
|
||||
print('use new onnx-model:', self.model_file)
|
||||
print("use new onnx-model:", self.model_file)
|
||||
try:
|
||||
session = onnxruntime.InferenceSession(self.model_file, None)
|
||||
except:
|
||||
@@ -58,7 +58,7 @@ class ArcFaceORT:
|
||||
|
||||
input_cfg = session.get_inputs()[0]
|
||||
input_shape = input_cfg.shape
|
||||
print('new-input-shape:', input_shape)
|
||||
print("new-input-shape:", input_shape)
|
||||
|
||||
self.image_size = tuple(input_shape[2:4][::-1])
|
||||
|
||||
@@ -82,28 +82,30 @@ class ArcFaceORT:
|
||||
input_size = (112, 112)
|
||||
self.crop = None
|
||||
if True:
|
||||
crop_file = osp.join(self.model_path, 'crop.txt')
|
||||
crop_file = osp.join(self.model_path, "crop.txt")
|
||||
if osp.exists(crop_file):
|
||||
lines = open(crop_file, 'r').readlines()
|
||||
lines = open(crop_file, "r").readlines()
|
||||
if len(lines) != 6:
|
||||
return "crop.txt should contain 6 lines"
|
||||
lines = [int(x) for x in lines]
|
||||
self.crop = lines[:4]
|
||||
input_size = tuple(lines[4:6])
|
||||
if input_size != self.image_size:
|
||||
return "input-size is inconsistant with onnx model input, %s vs %s" % (input_size, self.image_size)
|
||||
return "input-size is inconsistant with onnx model input, %s vs %s" % (
|
||||
input_size,
|
||||
self.image_size,
|
||||
)
|
||||
|
||||
self.model_size_mb = os.path.getsize(
|
||||
self.model_file) / float(1024 * 1024)
|
||||
self.model_size_mb = os.path.getsize(self.model_file) / float(1024 * 1024)
|
||||
if self.model_size_mb > max_model_size_mb:
|
||||
return "max model size exceed, given %.3f-MB" % self.model_size_mb
|
||||
|
||||
input_mean = None
|
||||
input_std = None
|
||||
if True:
|
||||
pn_file = osp.join(self.model_path, 'pixel_norm.txt')
|
||||
pn_file = osp.join(self.model_path, "pixel_norm.txt")
|
||||
if osp.exists(pn_file):
|
||||
lines = open(pn_file, 'r').readlines()
|
||||
lines = open(pn_file, "r").readlines()
|
||||
if len(lines) != 2:
|
||||
return "pixel_norm.txt should contain 2 lines"
|
||||
input_mean = float(lines[0])
|
||||
@@ -116,9 +118,9 @@ class ArcFaceORT:
|
||||
find_mul = False
|
||||
for nid, node in enumerate(graph.node[:8]):
|
||||
print(nid, node.name)
|
||||
if node.name.startswith('Sub') or node.name.startswith('_minus'):
|
||||
if node.name.startswith("Sub") or node.name.startswith("_minus"):
|
||||
find_sub = True
|
||||
if node.name.startswith('Mul') or node.name.startswith('_mul'):
|
||||
if node.name.startswith("Mul") or node.name.startswith("_mul"):
|
||||
find_mul = True
|
||||
if find_sub and find_mul:
|
||||
# mxnet arcface model
|
||||
@@ -134,10 +136,11 @@ class ArcFaceORT:
|
||||
|
||||
dt = weight_array.dtype
|
||||
if dt.itemsize < 4:
|
||||
return 'invalid weight type - (%s:%s)' % (initn.name, dt.name)
|
||||
return "invalid weight type - (%s:%s)" % (initn.name, dt.name)
|
||||
if test_img is None:
|
||||
test_img = np.random.randint(0, 255, size=(
|
||||
self.image_size[1], self.image_size[0], 3), dtype=np.uint8)
|
||||
test_img = np.random.randint(
|
||||
0, 255, size=(self.image_size[1], self.image_size[0], 3), dtype=np.uint8
|
||||
)
|
||||
else:
|
||||
test_img = cv2.resize(test_img, self.image_size)
|
||||
feat, cost = self.benchmark(test_img)
|
||||
@@ -149,12 +152,23 @@ class ArcFaceORT:
|
||||
return "max time cost exceed, given %.4f" % cost_ms
|
||||
self.cost_ms = cost_ms
|
||||
print(
|
||||
'check stat:, model-size-mb: %.4f, feat-dim: %d, time-cost-ms: %.4f, input-mean: %.3f, input-std: %.3f' % (
|
||||
self.model_size_mb, self.feat_dim, self.cost_ms, self.input_mean, self.input_std))
|
||||
"check stat:, model-size-mb: %.4f, feat-dim: %d, time-cost-ms: %.4f, input-mean: %.3f, input-std: %.3f"
|
||||
% (
|
||||
self.model_size_mb,
|
||||
self.feat_dim,
|
||||
self.cost_ms,
|
||||
self.input_mean,
|
||||
self.input_std,
|
||||
)
|
||||
)
|
||||
return None
|
||||
|
||||
def meta_info(self):
|
||||
return {'model-size-mb': self.model_size_mb, 'feature-dim': self.feat_dim, 'infer': self.cost_ms}
|
||||
return {
|
||||
"model-size-mb": self.model_size_mb,
|
||||
"feature-dim": self.feat_dim,
|
||||
"infer": self.cost_ms,
|
||||
}
|
||||
|
||||
def forward(self, imgs):
|
||||
if not isinstance(imgs, list):
|
||||
@@ -163,32 +177,39 @@ class ArcFaceORT:
|
||||
if self.crop is not None:
|
||||
nimgs = []
|
||||
for img in imgs:
|
||||
nimg = img[self.crop[1]:self.crop[3],
|
||||
self.crop[0]:self.crop[2], :]
|
||||
nimg = img[self.crop[1] : self.crop[3], self.crop[0] : self.crop[2], :]
|
||||
if nimg.shape[0] != input_size[1] or nimg.shape[1] != input_size[0]:
|
||||
nimg = cv2.resize(nimg, input_size)
|
||||
nimgs.append(nimg)
|
||||
imgs = nimgs
|
||||
blob = cv2.dnn.blobFromImages(imgs, 1.0 / self.input_std, input_size,
|
||||
(self.input_mean, self.input_mean, self.input_mean), swapRB=True)
|
||||
net_out = self.session.run(
|
||||
self.output_names, {self.input_name: blob})[0]
|
||||
blob = cv2.dnn.blobFromImages(
|
||||
imgs,
|
||||
1.0 / self.input_std,
|
||||
input_size,
|
||||
(self.input_mean, self.input_mean, self.input_mean),
|
||||
swapRB=True,
|
||||
)
|
||||
net_out = self.session.run(self.output_names, {self.input_name: blob})[0]
|
||||
return net_out
|
||||
|
||||
def benchmark(self, img):
|
||||
input_size = self.image_size
|
||||
if self.crop is not None:
|
||||
nimg = img[self.crop[1]:self.crop[3], self.crop[0]:self.crop[2], :]
|
||||
nimg = img[self.crop[1] : self.crop[3], self.crop[0] : self.crop[2], :]
|
||||
if nimg.shape[0] != input_size[1] or nimg.shape[1] != input_size[0]:
|
||||
nimg = cv2.resize(nimg, input_size)
|
||||
img = nimg
|
||||
blob = cv2.dnn.blobFromImage(img, 1.0 / self.input_std, input_size,
|
||||
(self.input_mean, self.input_mean, self.input_mean), swapRB=True)
|
||||
blob = cv2.dnn.blobFromImage(
|
||||
img,
|
||||
1.0 / self.input_std,
|
||||
input_size,
|
||||
(self.input_mean, self.input_mean, self.input_mean),
|
||||
swapRB=True,
|
||||
)
|
||||
costs = []
|
||||
for _ in range(50):
|
||||
ta = datetime.datetime.now()
|
||||
net_out = self.session.run(
|
||||
self.output_names, {self.input_name: blob})[0]
|
||||
net_out = self.session.run(self.output_names, {self.input_name: blob})[0]
|
||||
tb = datetime.datetime.now()
|
||||
cost = (tb - ta).total_seconds()
|
||||
costs.append(cost)
|
||||
@@ -197,9 +218,10 @@ class ArcFaceORT:
|
||||
return net_out, cost
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
if __name__ == "__main__":
|
||||
parser = argparse.ArgumentParser()
|
||||
parser.add_argument(
|
||||
"--model_root", help="onnx model root, default is './'", default="./")
|
||||
"--model_root", help="onnx model root, default is './'", default="./"
|
||||
)
|
||||
args = parser.parse_args()
|
||||
ArcFaceORT(args.model_root).check()
|
||||
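Note on `crop.txt`: `ArcFaceORT.check` reads six integer lines, keeps the first four as the crop box and the last two as the input size, and `forward` slices images as `img[crop[1]:crop[3], crop[0]:crop[2], :]`. The sketch below restates that layout in plain Python. It is an illustration only: `parse_crop_spec` and `crop_rows` are hypothetical names, and it raises an exception where the original returns an error string.

```python
def parse_crop_spec(text):
    """Parse six integer lines: x1, y1, x2, y2, width, height."""
    lines = text.splitlines()
    if len(lines) != 6:
        raise ValueError("crop.txt should contain 6 lines")
    values = [int(x) for x in lines]
    crop = values[:4]
    input_size = tuple(values[4:6])
    return crop, input_size


def crop_rows(image_rows, crop):
    # Same slicing order as img[y1:y2, x1:x2] on an H x W array:
    # rows are selected by y, columns by x.
    x1, y1, x2, y2 = crop
    return [row[x1:x2] for row in image_rows[y1:y2]]


# Example: a 112-row, 128-column image cropped to 112 x 112.
crop, input_size = parse_crop_spec("8\n0\n120\n112\n112\n112\n")
image = [[(r, c) for c in range(128)] for r in range(112)]
patch = crop_rows(image, crop)
print(crop, input_size, len(patch), len(patch[0]))
```

The crop box is given as corner coordinates, not as width and height, so the cropped patch is `x2 - x1` columns by `y2 - y1` rows and is resized only when it differs from `input_size`.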
@@ -20,8 +20,10 @@ SRC = np.array(
|
||||
[65.5318, 51.5014],
|
||||
[48.0252, 71.7366],
|
||||
[33.5493, 92.3655],
|
||||
[62.7299, 92.2041]]
|
||||
, dtype=np.float32)
|
||||
[62.7299, 92.2041],
|
||||
],
|
||||
dtype=np.float32,
|
||||
)
|
||||
SRC[:, 0] += 8.0
|
||||
|
||||
|
||||
@@ -36,10 +38,12 @@ class AlignedDataSet(mx.gluon.data.Dataset):
|
||||
|
||||
def __getitem__(self, idx):
|
||||
each_line = self.lines[idx]
|
||||
name_lmk_score = each_line.strip().split(' ')
|
||||
name_lmk_score = each_line.strip().split(" ")
|
||||
name = os.path.join(self.root, name_lmk_score[0])
|
||||
img = cv2.cvtColor(cv2.imread(name), cv2.COLOR_BGR2RGB)
|
||||
landmark5 = np.array([float(x) for x in name_lmk_score[1:-1]], dtype=np.float32).reshape((5, 2))
|
||||
landmark5 = np.array(
|
||||
[float(x) for x in name_lmk_score[1:-1]], dtype=np.float32
|
||||
).reshape((5, 2))
|
||||
st = skimage.transform.SimilarityTransform()
|
||||
st.estimate(landmark5, SRC)
|
||||
img = cv2.warpAffine(img, st.params[0:2, :], (112, 112), borderValue=0.0)
|
||||
@@ -60,15 +64,21 @@ def extract(model_root, dataset):
|
||||
return mx.nd.concat(*data, dim=0)
|
||||
|
||||
data_loader = mx.gluon.data.DataLoader(
|
||||
dataset, 128, last_batch='keep', num_workers=4,
|
||||
thread_pool=True, prefetch=16, batchify_fn=batchify_fn)
|
||||
dataset,
|
||||
128,
|
||||
last_batch="keep",
|
||||
num_workers=4,
|
||||
thread_pool=True,
|
||||
prefetch=16,
|
||||
batchify_fn=batchify_fn,
|
||||
)
|
||||
num_iter = 0
|
||||
for batch in data_loader:
|
||||
batch = batch.asnumpy()
|
||||
batch = (batch - model.input_mean) / model.input_std
|
||||
feat = model.session.run(model.output_names, {model.input_name: batch})[0]
|
||||
feat = np.reshape(feat, (-1, model.feat_dim * 2))
|
||||
feat_mat[128 * num_iter: 128 * num_iter + feat.shape[0], :] = feat
|
||||
feat_mat[128 * num_iter : 128 * num_iter + feat.shape[0], :] = feat
|
||||
num_iter += 1
|
||||
if num_iter % 50 == 0:
|
||||
print(num_iter)
|
||||
@@ -76,14 +86,14 @@ def extract(model_root, dataset):
|
||||
|
||||
|
||||
def read_template_media_list(path):
|
||||
ijb_meta = pd.read_csv(path, sep=' ', header=None).values
|
||||
ijb_meta = pd.read_csv(path, sep=" ", header=None).values
|
||||
templates = ijb_meta[:, 1].astype(int)
|
||||
medias = ijb_meta[:, 2].astype(int)
|
||||
return templates, medias
|
||||
|
||||
|
||||
def read_template_pair_list(path):
|
||||
pairs = pd.read_csv(path, sep=' ', header=None).values
|
||||
pairs = pd.read_csv(path, sep=" ", header=None).values
|
||||
t1 = pairs[:, 0].astype(int)
|
||||
t2 = pairs[:, 1].astype(int)
|
||||
label = pairs[:, 2].astype(int)
|
||||
@@ -91,14 +101,12 @@ def read_template_pair_list(path):
|
||||
|
||||
|
||||
def read_image_feature(path):
|
||||
with open(path, 'rb') as fid:
|
||||
with open(path, "rb") as fid:
|
||||
img_feats = pickle.load(fid)
|
||||
return img_feats
|
||||
|
||||
|
||||
def image2template_feature(img_feats=None,
|
||||
templates=None,
|
||||
medias=None):
|
||||
def image2template_feature(img_feats=None, templates=None, medias=None):
|
||||
unique_templates = np.unique(templates)
|
||||
template_feats = np.zeros((len(unique_templates), img_feats.shape[1]))
|
||||
for count_template, uqt in enumerate(unique_templates):
|
||||
@@ -112,27 +120,25 @@ def image2template_feature(img_feats=None,
|
||||
if ct == 1:
|
||||
media_norm_feats += [face_norm_feats[ind_m]]
|
||||
else: # image features from the same video will be aggregated into one feature
|
||||
media_norm_feats += [np.mean(face_norm_feats[ind_m], axis=0, keepdims=True), ]
|
||||
media_norm_feats += [
|
||||
np.mean(face_norm_feats[ind_m], axis=0, keepdims=True),
|
||||
]
|
||||
media_norm_feats = np.array(media_norm_feats)
|
||||
template_feats[count_template] = np.sum(media_norm_feats, axis=0)
|
||||
if count_template % 2000 == 0:
|
||||
print('Finish Calculating {} template features.'.format(
|
||||
count_template))
|
||||
print("Finish Calculating {} template features.".format(count_template))
|
||||
template_norm_feats = normalize(template_feats)
|
||||
return template_norm_feats, unique_templates
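
Note on `image2template_feature`: part of the function is hidden by the hunk boundary, so the following is only a sketch of the aggregation the visible lines suggest. Features that share a media id are averaged, the per-media vectors are summed per template, and each template vector is L2-normalised. `pool_templates` is a hypothetical name, and plain NumPy stands in for the `normalize` helper used in the diff.

```python
import numpy as np


def pool_templates(img_feats, templates, medias):
    """Aggregate per-image features into one unit-length vector per template."""
    unique_templates = np.unique(templates)
    template_feats = np.zeros((len(unique_templates), img_feats.shape[1]))
    for row, template_id in enumerate(unique_templates):
        in_template = templates == template_id
        feats = img_feats[in_template]
        media_ids = medias[in_template]
        for media_id in np.unique(media_ids):
            # Frames from the same media (e.g. one video) collapse to their mean.
            template_feats[row] += feats[media_ids == media_id].mean(axis=0)
    norms = np.linalg.norm(template_feats, axis=1, keepdims=True)
    # The small floor avoids dividing by zero for an all-zero template (assumption).
    return template_feats / np.maximum(norms, 1e-12), unique_templates
```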
|
||||
|
||||
|
||||
def verification(template_norm_feats=None,
|
||||
unique_templates=None,
|
||||
p1=None,
|
||||
p2=None):
|
||||
def verification(template_norm_feats=None, unique_templates=None, p1=None, p2=None):
|
||||
template2id = np.zeros((max(unique_templates) + 1, 1), dtype=int)
|
||||
for count_template, uqt in enumerate(unique_templates):
|
||||
template2id[uqt] = count_template
|
||||
score = np.zeros((len(p1),))
|
||||
total_pairs = np.array(range(len(p1)))
|
||||
batchsize = 100000
|
||||
sublists = [total_pairs[i: i + batchsize] for i in range(0, len(p1), batchsize)]
|
||||
sublists = [total_pairs[i : i + batchsize] for i in range(0, len(p1), batchsize)]
|
||||
total_sublists = len(sublists)
|
||||
for c, s in enumerate(sublists):
|
||||
feat1 = template_norm_feats[template2id[p1[s]]]
|
||||
@@ -140,21 +146,19 @@ def verification(template_norm_feats=None,
|
||||
similarity_score = np.sum(feat1 * feat2, -1)
|
||||
score[s] = similarity_score.flatten()
|
||||
if c % 10 == 0:
|
||||
print('Finish {}/{} pairs.'.format(c, total_sublists))
|
||||
print("Finish {}/{} pairs.".format(c, total_sublists))
|
||||
return score
|
||||
|
||||
|
||||
def verification2(template_norm_feats=None,
|
||||
unique_templates=None,
|
||||
p1=None,
|
||||
p2=None):
|
||||
def verification2(template_norm_feats=None, unique_templates=None, p1=None, p2=None):
|
||||
template2id = np.zeros((max(unique_templates) + 1, 1), dtype=int)
|
||||
for count_template, uqt in enumerate(unique_templates):
|
||||
template2id[uqt] = count_template
|
||||
score = np.zeros((len(p1),)) # save cosine distance between pairs
|
||||
total_pairs = np.array(range(len(p1)))
|
||||
batchsize = 100000 # small batchsize instead of all pairs in one batch due to the memory limitation
|
||||
sublists = [total_pairs[i:i + batchsize] for i in range(0, len(p1), batchsize)]
|
||||
# small batchsize instead of all pairs in one batch due to the memory limitation
|
||||
batchsize = 100000
|
||||
sublists = [total_pairs[i : i + batchsize] for i in range(0, len(p1), batchsize)]
|
||||
total_sublists = len(sublists)
|
||||
for c, s in enumerate(sublists):
|
||||
feat1 = template_norm_feats[template2id[p1[s]]]
|
||||
@@ -162,7 +166,7 @@ def verification2(template_norm_feats=None,
|
||||
similarity_score = np.sum(feat1 * feat2, -1)
|
||||
score[s] = similarity_score.flatten()
|
||||
if c % 10 == 0:
|
||||
print('Finish {}/{} pairs.'.format(c, total_sublists))
|
||||
print("Finish {}/{} pairs.".format(c, total_sublists))
|
||||
return score
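
Note on `verification` / `verification2`: both map template ids to row indices and score pairs in chunks of 100000 to bound memory. A compact sketch of that idea follows, assuming the template features are already L2-normalised so the dot product equals cosine similarity. `pair_scores` and `template2row` are hypothetical names, and the `p2` lookup mirrors the `p1` lookup because the hunk hides the original line.

```python
import numpy as np


def pair_scores(template_feats, unique_templates, p1, p2, batch_size=100000):
    """Cosine similarity for each (p1[i], p2[i]) template pair, computed in chunks."""
    # Lookup table from template id to its row in template_feats.
    template2row = np.zeros(unique_templates.max() + 1, dtype=int)
    template2row[unique_templates] = np.arange(len(unique_templates))
    scores = np.zeros(len(p1))
    for start in range(0, len(p1), batch_size):
        stop = start + batch_size
        feat1 = template_feats[template2row[p1[start:stop]]]
        feat2 = template_feats[template2row[p2[start:stop]]]
        # Row-wise dot product of unit vectors.
        scores[start:stop] = np.sum(feat1 * feat2, axis=-1)
    return scores
```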
|
||||
|
||||
|
||||
@@ -170,24 +174,33 @@ def main(args):
|
||||
use_norm_score = True # if True, TestMode(N1)
|
||||
use_detector_score = True # if True, TestMode(D1)
|
||||
use_flip_test = True # if True, TestMode(F1)
|
||||
assert args.target == 'IJBC' or args.target == 'IJBB'
|
||||
assert args.target == "IJBC" or args.target == "IJBB"
|
||||
|
||||
start = timeit.default_timer()
|
||||
templates, medias = read_template_media_list(
|
||||
os.path.join('%s/meta' % args.image_path, '%s_face_tid_mid.txt' % args.target.lower()))
|
||||
os.path.join(
|
||||
"%s/meta" % args.image_path, "%s_face_tid_mid.txt" % args.target.lower()
|
||||
)
|
||||
)
|
||||
stop = timeit.default_timer()
|
||||
print('Time: %.2f s. ' % (stop - start))
|
||||
print("Time: %.2f s. " % (stop - start))
|
||||
|
||||
start = timeit.default_timer()
|
||||
p1, p2, label = read_template_pair_list(
|
||||
os.path.join('%s/meta' % args.image_path,
|
||||
'%s_template_pair_label.txt' % args.target.lower()))
|
||||
os.path.join(
|
||||
"%s/meta" % args.image_path,
|
||||
"%s_template_pair_label.txt" % args.target.lower(),
|
||||
)
|
||||
)
|
||||
stop = timeit.default_timer()
|
||||
print('Time: %.2f s. ' % (stop - start))
|
||||
print("Time: %.2f s. " % (stop - start))
|
||||
|
||||
start = timeit.default_timer()
|
||||
img_path = '%s/loose_crop' % args.image_path
|
||||
img_list_path = '%s/meta/%s_name_5pts_score.txt' % (args.image_path, args.target.lower())
|
||||
img_path = "%s/loose_crop" % args.image_path
|
||||
img_list_path = "%s/meta/%s_name_5pts_score.txt" % (
|
||||
args.image_path,
|
||||
args.target.lower(),
|
||||
)
|
||||
img_list = open(img_list_path)
|
||||
files = img_list.readlines()
|
||||
dataset = AlignedDataSet(root=img_path, lines=files, align=True)
|
||||
@@ -199,19 +212,24 @@ def main(args):
|
||||
faceness_scores.append(name_lmk_score[-1])
|
||||
faceness_scores = np.array(faceness_scores).astype(np.float32)
|
||||
stop = timeit.default_timer()
|
||||
print('Time: %.2f s. ' % (stop - start))
|
||||
print('Feature Shape: ({} , {}) .'.format(img_feats.shape[0], img_feats.shape[1]))
|
||||
print("Time: %.2f s. " % (stop - start))
|
||||
print("Feature Shape: ({} , {}) .".format(img_feats.shape[0], img_feats.shape[1]))
|
||||
start = timeit.default_timer()
|
||||
|
||||
if use_flip_test:
|
||||
img_input_feats = img_feats[:, 0:img_feats.shape[1] // 2] + img_feats[:, img_feats.shape[1] // 2:]
|
||||
img_input_feats = (
|
||||
img_feats[:, 0 : img_feats.shape[1] // 2]
|
||||
+ img_feats[:, img_feats.shape[1] // 2 :]
|
||||
)
|
||||
else:
|
||||
img_input_feats = img_feats[:, 0:img_feats.shape[1] // 2]
|
||||
img_input_feats = img_feats[:, 0 : img_feats.shape[1] // 2]
|
||||
|
||||
if use_norm_score:
|
||||
img_input_feats = img_input_feats
|
||||
else:
|
||||
img_input_feats = img_input_feats / np.sqrt(np.sum(img_input_feats ** 2, -1, keepdims=True))
|
||||
img_input_feats = img_input_feats / np.sqrt(
|
||||
np.sum(img_input_feats ** 2, -1, keepdims=True)
|
||||
)
|
||||
|
||||
if use_detector_score:
|
||||
print(img_input_feats.shape, faceness_scores.shape)
|
||||
@@ -220,14 +238,15 @@ def main(args):
|
||||
img_input_feats = img_input_feats
|
||||
|
||||
template_norm_feats, unique_templates = image2template_feature(
|
||||
img_input_feats, templates, medias)
|
||||
img_input_feats, templates, medias
|
||||
)
|
||||
stop = timeit.default_timer()
|
||||
print('Time: %.2f s. ' % (stop - start))
|
||||
print("Time: %.2f s. " % (stop - start))
|
||||
|
||||
start = timeit.default_timer()
|
||||
score = verification(template_norm_feats, unique_templates, p1, p2)
|
||||
stop = timeit.default_timer()
|
||||
print('Time: %.2f s. ' % (stop - start))
|
||||
print("Time: %.2f s. " % (stop - start))
|
||||
save_path = os.path.join(args.result_dir, "{}_result".format(args.target))
|
||||
if not os.path.exists(save_path):
|
||||
os.makedirs(save_path)
|
||||
@@ -242,7 +261,7 @@ def main(args):
|
||||
methods = np.array(methods)
|
||||
scores = dict(zip(methods, scores))
|
||||
x_labels = [10 ** -6, 10 ** -5, 10 ** -4, 10 ** -3, 10 ** -2, 10 ** -1]
|
||||
tpr_fpr_table = prettytable.PrettyTable(['Methods'] + [str(x) for x in x_labels])
|
||||
tpr_fpr_table = prettytable.PrettyTable(["Methods"] + [str(x) for x in x_labels])
|
||||
for method in methods:
|
||||
fpr, tpr, _ = roc_curve(label, scores[method])
|
||||
fpr = np.flipud(fpr)
|
||||
@@ -251,17 +270,20 @@ def main(args):
|
||||
tpr_fpr_row.append("%s-%s" % (method, args.target))
|
||||
for fpr_iter in np.arange(len(x_labels)):
|
||||
_, min_index = min(
|
||||
list(zip(abs(fpr - x_labels[fpr_iter]), range(len(fpr)))))
|
||||
tpr_fpr_row.append('%.2f' % (tpr[min_index] * 100))
|
||||
list(zip(abs(fpr - x_labels[fpr_iter]), range(len(fpr))))
|
||||
)
|
||||
tpr_fpr_row.append("%.2f" % (tpr[min_index] * 100))
|
||||
tpr_fpr_table.add_row(tpr_fpr_row)
|
||||
print(tpr_fpr_table)
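
Note on the TPR/FPR table: for each target false-positive rate in `x_labels`, the loop picks the ROC point whose FPR is closest and reports its TPR. A NumPy-only sketch of that lookup follows. `tpr_at_fpr` is a hypothetical name, and reversing `tpr` together with `fpr` is an assumption, because the hunk only shows the `np.flipud(fpr)` line.

```python
import numpy as np


def tpr_at_fpr(fpr, tpr, targets):
    """Return the TPR at the ROC point whose FPR is nearest to each target."""
    # Reverse both curves so that ties resolve toward the larger original index.
    fpr = np.flipud(np.asarray(fpr, dtype=float))
    tpr = np.flipud(np.asarray(tpr, dtype=float))
    # argmin returns the first minimum, matching min() over (distance, index) tuples.
    return [float(tpr[np.argmin(np.abs(fpr - target))]) for target in targets]
```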
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
parser = argparse.ArgumentParser(description='do ijb test')
|
||||
if __name__ == "__main__":
|
||||
parser = argparse.ArgumentParser(description="do ijb test")
|
||||
# general
|
||||
parser.add_argument('--model-root', default='', help='path to load model.')
|
||||
parser.add_argument('--image-path', default='', type=str, help='')
|
||||
parser.add_argument('--result-dir', default='.', type=str, help='')
|
||||
parser.add_argument('--target', default='IJBC', type=str, help='target, set to IJBC or IJBB')
|
||||
parser.add_argument("--model-root", default="", help="path to load model.")
|
||||
parser.add_argument("--image-path", default="", type=str, help="")
|
||||
parser.add_argument("--result-dir", default=".", type=str, help="")
|
||||
parser.add_argument(
|
||||
"--target", default="IJBC", type=str, help="target, set to IJBC or IJBB"
|
||||
)
|
||||
main(parser.parse_args())
|
||||
@@ -1,4 +1,4 @@
|
||||
"""Helper for evaluation on the Labeled Faces in the Wild dataset
|
||||
"""Helper for evaluation on the Labeled Faces in the Wild dataset
|
||||
"""
|
||||
|
||||
# MIT License
|
||||
@@ -28,6 +28,8 @@ import datetime
|
||||
import os
|
||||
import pickle
|
||||
|
||||
|
||||
import numpy as np
|
||||
import sklearn
|
||||
import oneflow as flow
|
||||
|
||||
@@ -35,8 +37,7 @@ from scipy import interpolate
|
||||
from sklearn.decomposition import PCA
|
||||
from sklearn.model_selection import KFold
|
||||
import cv2 as cv
|
||||
import numpy as np
|
||||
import sklearn
|
||||
import logging
|
||||
|
||||
|
||||
class LFold:
|
||||
@@ -52,14 +53,11 @@ class LFold:
|
||||
return [(indices, indices)]
|
||||
|
||||
|
||||
def calculate_roc(thresholds,
|
||||
embeddings1,
|
||||
embeddings2,
|
||||
actual_issame,
|
||||
nrof_folds=10,
|
||||
pca=0):
|
||||
assert (embeddings1.shape[0] == embeddings2.shape[0])
|
||||
assert (embeddings1.shape[1] == embeddings2.shape[1])
|
||||
def calculate_roc(
|
||||
thresholds, embeddings1, embeddings2, actual_issame, nrof_folds=10, pca=0
|
||||
):
|
||||
assert embeddings1.shape[0] == embeddings2.shape[0]
|
||||
assert embeddings1.shape[1] == embeddings2.shape[1]
|
||||
nrof_pairs = min(len(actual_issame), embeddings1.shape[0])
|
||||
nrof_thresholds = len(thresholds)
|
||||
k_fold = LFold(n_splits=nrof_folds, shuffle=False)
|
||||
@@ -75,7 +73,7 @@ def calculate_roc(thresholds,
|
||||
|
||||
for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)):
|
||||
if pca > 0:
|
||||
print('doing pca on', fold_idx)
|
||||
print("doing pca on", fold_idx)
|
||||
embed1_train = embeddings1[train_set]
|
||||
embed2_train = embeddings2[train_set]
|
||||
_embed_train = np.concatenate((embed1_train, embed2_train), axis=0)
|
||||
@@ -92,15 +90,18 @@ def calculate_roc(thresholds,
|
||||
acc_train = np.zeros((nrof_thresholds))
|
||||
for threshold_idx, threshold in enumerate(thresholds):
|
||||
_, _, acc_train[threshold_idx] = calculate_accuracy(
|
||||
threshold, dist[train_set], actual_issame[train_set])
|
||||
threshold, dist[train_set], actual_issame[train_set]
|
||||
)
|
||||
best_threshold_index = np.argmax(acc_train)
|
||||
for threshold_idx, threshold in enumerate(thresholds):
|
||||
tprs[fold_idx, threshold_idx], fprs[fold_idx, threshold_idx], _ = calculate_accuracy(
|
||||
threshold, dist[test_set],
|
||||
actual_issame[test_set])
|
||||
(
|
||||
tprs[fold_idx, threshold_idx],
|
||||
fprs[fold_idx, threshold_idx],
|
||||
_,
|
||||
) = calculate_accuracy(threshold, dist[test_set], actual_issame[test_set])
|
||||
_, _, accuracy[fold_idx] = calculate_accuracy(
|
||||
thresholds[best_threshold_index], dist[test_set],
|
||||
actual_issame[test_set])
|
||||
thresholds[best_threshold_index], dist[test_set], actual_issame[test_set]
|
||||
)
|
||||
|
||||
tpr = np.mean(tprs, 0)
|
||||
fpr = np.mean(fprs, 0)
|
||||
@@ -112,8 +113,8 @@ def calculate_accuracy(threshold, dist, actual_issame):
|
||||
tp = np.sum(np.logical_and(predict_issame, actual_issame))
|
||||
fp = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame)))
|
||||
tn = np.sum(
|
||||
np.logical_and(np.logical_not(predict_issame),
|
||||
np.logical_not(actual_issame)))
|
||||
np.logical_and(np.logical_not(predict_issame), np.logical_not(actual_issame))
|
||||
)
|
||||
fn = np.sum(np.logical_and(np.logical_not(predict_issame), actual_issame))
|
||||
|
||||
tpr = 0 if (tp + fn == 0) else float(tp) / float(tp + fn)
|
||||
@@ -122,14 +123,11 @@ def calculate_accuracy(threshold, dist, actual_issame):
|
||||
return tpr, fpr, acc
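
Note on `calculate_accuracy`: the visible lines count the four confusion-matrix cells and guard the TPR against an empty denominator. A self-contained sketch of the whole computation follows. The prediction rule (`dist < threshold`) and the `fpr` / `acc` formulas are assumptions modelled on `calculate_val_far` and on the standard definitions, since those lines fall outside the hunk. `confusion_rates` is a hypothetical name.

```python
import numpy as np


def confusion_rates(threshold, dist, actual_issame):
    """Return (tpr, fpr, accuracy) for a distance threshold on pair labels."""
    predicted = np.less(dist, threshold)  # a small distance means "same identity"
    actual = np.asarray(actual_issame, dtype=bool)
    tp = np.sum(predicted & actual)
    fp = np.sum(predicted & ~actual)
    tn = np.sum(~predicted & ~actual)
    fn = np.sum(~predicted & actual)
    tpr = 0.0 if tp + fn == 0 else tp / (tp + fn)
    fpr = 0.0 if fp + tn == 0 else fp / (fp + tn)
    acc = (tp + tn) / predicted.size
    return float(tpr), float(fpr), float(acc)
```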
|
||||
|
||||
|
||||
def calculate_val(thresholds,
|
||||
embeddings1,
|
||||
embeddings2,
|
||||
actual_issame,
|
||||
far_target,
|
||||
nrof_folds=10):
|
||||
assert (embeddings1.shape[0] == embeddings2.shape[0])
|
||||
assert (embeddings1.shape[1] == embeddings2.shape[1])
|
||||
def calculate_val(
|
||||
thresholds, embeddings1, embeddings2, actual_issame, far_target, nrof_folds=10
|
||||
):
|
||||
assert embeddings1.shape[0] == embeddings2.shape[0]
|
||||
assert embeddings1.shape[1] == embeddings2.shape[1]
|
||||
nrof_pairs = min(len(actual_issame), embeddings1.shape[0])
|
||||
nrof_thresholds = len(thresholds)
|
||||
k_fold = LFold(n_splits=nrof_folds, shuffle=False)
|
||||
@@ -147,15 +145,17 @@ def calculate_val(thresholds,
|
||||
far_train = np.zeros(nrof_thresholds)
|
||||
for threshold_idx, threshold in enumerate(thresholds):
|
||||
_, far_train[threshold_idx] = calculate_val_far(
|
||||
threshold, dist[train_set], actual_issame[train_set])
|
||||
threshold, dist[train_set], actual_issame[train_set]
|
||||
)
|
||||
if np.max(far_train) >= far_target:
|
||||
f = interpolate.interp1d(far_train, thresholds, kind='slinear')
|
||||
f = interpolate.interp1d(far_train, thresholds, kind="slinear")
|
||||
threshold = f(far_target)
|
||||
else:
|
||||
threshold = 0.0
|
||||
|
||||
val[fold_idx], far[fold_idx] = calculate_val_far(
|
||||
threshold, dist[test_set], actual_issame[test_set])
|
||||
threshold, dist[test_set], actual_issame[test_set]
|
||||
)
|
||||
|
||||
val_mean = np.mean(val)
|
||||
far_mean = np.mean(far)
|
||||
@@ -166,11 +166,11 @@ def calculate_val(thresholds,
|
||||
def calculate_val_far(threshold, dist, actual_issame):
|
||||
predict_issame = np.less(dist, threshold)
|
||||
true_accept = np.sum(np.logical_and(predict_issame, actual_issame))
|
||||
false_accept = np.sum(
|
||||
np.logical_and(predict_issame, np.logical_not(actual_issame)))
|
||||
false_accept = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame)))
|
||||
n_same = np.sum(actual_issame)
|
||||
n_diff = np.sum(np.logical_not(actual_issame))
|
||||
|
||||
# print(true_accept, false_accept)
|
||||
# print(n_same, n_diff)
|
||||
val = float(true_accept) / float(n_same)
|
||||
far = float(false_accept) / float(n_diff)
|
||||
return val, far
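
Note on `calculate_val_far`: as written, `float(true_accept) / float(n_same)` raises `ZeroDivisionError` when a fold contains no genuine pairs, and likewise for `n_diff`. A guarded sketch follows. `val_far` is a hypothetical name, and returning 0.0 for an empty class is an assumption, not the behaviour of the code in the diff.

```python
import numpy as np


def val_far(threshold, dist, actual_issame):
    """Validation rate and false-accept rate at a distance threshold."""
    predicted = np.less(dist, threshold)
    actual = np.asarray(actual_issame, dtype=bool)
    true_accept = np.sum(predicted & actual)
    false_accept = np.sum(predicted & ~actual)
    n_same = np.sum(actual)
    n_diff = np.sum(~actual)
    # Assumption: report 0.0 instead of failing when a class is absent.
    val = float(true_accept) / float(n_same) if n_same else 0.0
    far = float(false_accept) / float(n_diff) if n_diff else 0.0
    return val, far
```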
|
||||
@@ -181,28 +181,31 @@ def evaluate(embeddings, actual_issame, nrof_folds=10, pca=0):
|
||||
thresholds = np.arange(0, 4, 0.01)
|
||||
embeddings1 = embeddings[0::2]
|
||||
embeddings2 = embeddings[1::2]
|
||||
tpr, fpr, accuracy = calculate_roc(thresholds,
|
||||
embeddings1,
|
||||
embeddings2,
|
||||
np.asarray(actual_issame),
|
||||
nrof_folds=nrof_folds,
|
||||
pca=pca)
|
||||
tpr, fpr, accuracy = calculate_roc(
|
||||
thresholds,
|
||||
embeddings1,
|
||||
embeddings2,
|
||||
np.asarray(actual_issame),
|
||||
nrof_folds=nrof_folds,
|
||||
pca=pca,
|
||||
)
|
||||
thresholds = np.arange(0, 4, 0.001)
|
||||
val, val_std, far = calculate_val(thresholds,
|
||||
embeddings1,
|
||||
embeddings2,
|
||||
np.asarray(actual_issame),
|
||||
1e-3,
|
||||
nrof_folds=nrof_folds)
|
||||
val, val_std, far = calculate_val(
|
||||
thresholds,
|
||||
embeddings1,
|
||||
embeddings2,
|
||||
np.asarray(actual_issame),
|
||||
1e-3,
|
||||
nrof_folds=nrof_folds,
|
||||
)
|
||||
return tpr, fpr, accuracy, val, val_std, far
|
||||
|
||||
|
||||
def load_bin_cv(path, image_size):
|
||||
bins, issame_list = pickle.load(open(path, 'rb'), encoding='bytes')
|
||||
bins, issame_list = pickle.load(open(path, "rb"), encoding="bytes")
|
||||
data_list = []
|
||||
for flip in [0, 1]:
|
||||
data = np.empty(
|
||||
(len(issame_list) * 2, 3, image_size[0], image_size[1]))
|
||||
data = flow.empty(len(issame_list) * 2, 3, image_size[0], image_size[1])
|
||||
data_list.append(data)
|
||||
for i in range(len(issame_list) * 2):
|
||||
_bin = bins[i]
|
||||
@@ -214,21 +217,24 @@ def load_bin_cv(path, image_size):
|
||||
img = cv.flip(img, 1)
|
||||
img = np.array(img).transpose((2, 0, 1))
|
||||
img = (img - 127.5) * 0.00784313725
|
||||
data_list[flip][i][:] = img
|
||||
data_list[flip][i] = flow.tensor(img, dtype=flow.float)
|
||||
|
||||
if i % 1000 == 0:
|
||||
print('loading bin', i)
|
||||
print(data_list[0].shape)
|
||||
logging.info("loading bin:%d", i)
|
||||
logging.info(data_list[0].shape)
|
||||
return data_list, issame_list
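
Note on the preprocessing in `load_bin_cv`: the constant `0.00784313725` is approximately `1 / 127.5`, so `(img - 127.5) * 0.00784313725` maps 8-bit pixel values from [0, 255] to roughly [-1, 1] after the HWC-to-CHW transpose. A small NumPy sketch follows. `to_model_input` is a hypothetical name, and the OneFlow tensor conversion from the diff is left out.

```python
import numpy as np


def to_model_input(img_hwc):
    """Convert an HxWx3 uint8 image to a CHW float array scaled to about [-1, 1]."""
    chw = np.asarray(img_hwc, dtype=np.float32).transpose((2, 0, 1))
    return (chw - 127.5) * 0.00784313725  # 0.00784313725 is about 1 / 127.5
```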
|
||||
|
||||
|
||||
def test(data_set, backbone, batch_size, nfolds=10):
|
||||
print('testing verification..')
|
||||
|
||||
@flow.no_grad()
|
||||
def test(data_set, backbone, batch_size, nfolds=10, is_consistent=False):
|
||||
logging.info("testing verification..")
|
||||
data_list = data_set[0]
|
||||
|
||||
issame_list = data_set[1]
|
||||
embeddings_list = []
|
||||
time_consumed = 0.0
|
||||
if is_consistent:
|
||||
placement = flow.env.all_device_placement("cpu")
|
||||
sbp = flow.sbp.split(0)
|
||||
|
||||
for i in range(len(data_list)):
|
||||
data = data_list[i]
|
||||
@@ -237,17 +243,23 @@ def test(data_set, backbone, batch_size, nfolds=10):
|
||||
while ba < data.shape[0]:
|
||||
bb = min(ba + batch_size, data.shape[0])
|
||||
count = bb - ba
|
||||
_data = data[bb - batch_size: bb]
|
||||
img = data[bb - batch_size : bb]
|
||||
time0 = datetime.datetime.now()
|
||||
with flow.no_grad():
|
||||
if is_consistent:
|
||||
img = img.to_consistent(placement=placement, sbp=sbp)
|
||||
net_out = backbone(img.to("cuda"))
|
||||
|
||||
net_out = backbone(_data)
|
||||
_embeddings = net_out.get().numpy()
|
||||
if is_consistent:
|
||||
_embeddings = net_out.to_local().numpy()
|
||||
else:
|
||||
_embeddings = net_out.detach().numpy()
|
||||
time_now = datetime.datetime.now()
|
||||
diff = time_now - time0
|
||||
time_consumed += diff.total_seconds()
|
||||
if embeddings is None:
|
||||
embeddings = np.zeros((data.shape[0], _embeddings.shape[1]))
|
||||
embeddings[ba:bb, :] = _embeddings[(batch_size - count):, :]
|
||||
embeddings[ba:bb, :] = _embeddings[(batch_size - count) :, :]
|
||||
ba = bb
|
||||
embeddings_list.append(embeddings)
|
||||
|
||||
@@ -267,9 +279,49 @@ def test(data_set, backbone, batch_size, nfolds=10):
|
||||
std1 = 0.0
|
||||
embeddings = embeddings_list[0] + embeddings_list[1]
|
||||
embeddings = sklearn.preprocessing.normalize(embeddings)
|
||||
print(embeddings.shape)
|
||||
print('infer time', time_consumed)
|
||||
logging.info(embeddings.shape)
|
||||
logging.info("infer time:%f" % time_consumed)
|
||||
_, _, accuracy, val, val_std, far = evaluate(
|
||||
embeddings, issame_list, nrof_folds=nfolds)
|
||||
embeddings, issame_list, nrof_folds=nfolds
|
||||
)
|
||||
acc2, std2 = np.mean(accuracy), np.std(accuracy)
|
||||
return acc1, std1, acc2, std2, _xnorm, embeddings_list
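
Note on the batching loop in `test`: the last batch is kept at full size by sliding the window back (`data[bb - batch_size : bb]`) and then keeping only the final `count` rows of the output. A NumPy sketch of that pattern follows. `embed_in_batches` and `embed_fn` are hypothetical names, and the explicit size check is an addition: with fewer rows than `batch_size`, the start index `bb - batch_size` would go negative.

```python
import numpy as np


def embed_in_batches(data, embed_fn, batch_size):
    """Run embed_fn on fixed-size batches and return one output row per input row."""
    n = data.shape[0]
    if n < batch_size:
        raise ValueError("this sketch assumes at least one full batch")
    out = None
    ba = 0
    while ba < n:
        bb = min(ba + batch_size, n)
        count = bb - ba  # rows that are new in this step
        batch_out = embed_fn(data[bb - batch_size : bb])  # always a full batch
        if out is None:
            out = np.zeros((n, batch_out.shape[1]))
        # Only the last `count` rows of the batch have not been stored yet.
        out[ba:bb] = batch_out[batch_size - count :]
        ba = bb
    return out
```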
|
||||
|
||||
|
||||
def dumpR(data_set, backbone, batch_size, name="", data_extra=None, label_shape=None):
|
||||
print("dump verification embedding..")
|
||||
data_list = data_set[0]
|
||||
issame_list = data_set[1]
|
||||
embeddings_list = []
|
||||
time_consumed = 0.0
|
||||
for i in range(len(data_list)):
|
||||
data = data_list[i]
|
||||
embeddings = None
|
||||
ba = 0
|
||||
while ba < data.shape[0]:
|
||||
bb = min(ba + batch_size, data.shape[0])
|
||||
count = bb - ba
|
||||
|
||||
_data = nd.slice_axis(data, axis=0, begin=bb - batch_size, end=bb)
|
||||
time0 = datetime.datetime.now()
|
||||
if data_extra is None:
|
||||
db = mx.io.DataBatch(data=(_data,), label=(_label,))
|
||||
else:
|
||||
db = mx.io.DataBatch(data=(_data, _data_extra), label=(_label,))
|
||||
model.forward(db, is_train=False)
|
||||
net_out = model.get_outputs()
|
||||
_embeddings = net_out[0].asnumpy()
|
||||
time_now = datetime.datetime.now()
|
||||
diff = time_now - time0
|
||||
time_consumed += diff.total_seconds()
|
||||
if embeddings is None:
|
||||
embeddings = np.zeros((data.shape[0], _embeddings.shape[1]))
|
||||
embeddings[ba:bb, :] = _embeddings[(batch_size - count) :, :]
|
||||
ba = bb
|
||||
embeddings_list.append(embeddings)
|
||||
embeddings = embeddings_list[0] + embeddings_list[1]
|
||||
embeddings = sklearn.preprocessing.normalize(embeddings)
|
||||
actual_issame = np.asarray(issame_list)
|
||||
outname = os.path.join("temp.bin")
|
||||
with open(outname, "wb") as f:
|
||||
pickle.dump((embeddings, issame_list), f, protocol=pickle.HIGHEST_PROTOCOL)
|
||||
|
||||
@@ -1,174 +1,261 @@
|
||||
|
||||
import argparse
|
||||
import logging
|
||||
import os
|
||||
|
||||
import oneflow as flow
|
||||
import oneflow.nn as nn
|
||||
|
||||
import sys
|
||||
from oneflow.nn.parallel import DistributedDataParallel as ddp
|
||||
from utils.ofrecord_data_utils import OFRecordDataLoader, SyntheticDataLoader
|
||||
from utils.utils_logging import AverageMeter
|
||||
from utils.utils_callbacks import CallBackVerification, CallBackLogging, CallBackModelCheckpoint
|
||||
from backbones import get_model
|
||||
import math
|
||||
from utils.utils_config import get_config
|
||||
import numpy as np
|
||||
import pickle
|
||||
import time
|
||||
from utils.ofrecord_data_utils import load_train_dataset, load_synthetic
|
||||
from graph import TrainGraph, EvalGraph
|
||||
from losses import CrossEntropyLoss_sbp
|
||||
import logging
|
||||
|
||||
|
||||
class Validator(object):
|
||||
def __init__(self, cfg):
|
||||
self.cfg = cfg
|
||||
def make_data_loader(args, mode, is_consistent=False, synthetic=False):
|
||||
assert mode in ("train", "validation")
|
||||
|
||||
def get_val_config():
|
||||
config = flow.function_config()
|
||||
config.default_logical_view(flow.scope.consistent_view())
|
||||
config.default_data_type(flow.float)
|
||||
return config
|
||||
function_config = get_val_config()
|
||||
if mode == "train":
|
||||
total_batch_size = args.batch_size*flow.env.get_world_size()
|
||||
batch_size = args.batch_size
|
||||
num_samples = args.num_image
|
||||
else:
|
||||
total_batch_size = args.val_global_batch_size
|
||||
batch_size = args.val_batch_size
|
||||
num_samples = args.val_samples_per_epoch
|
||||
|
||||
@flow.global_function(type="predict", function_config=function_config)
|
||||
def get_symbol_val_job(
|
||||
images: flow.typing.Numpy.Placeholder(
|
||||
(self.cfg.val_batch_size, 3, 112, 112)
|
||||
)
|
||||
):
|
||||
print("val batch data: ", images.shape)
|
||||
embedding = get_model(cfg.network, images, cfg)
|
||||
return embedding
|
||||
placement = None
|
||||
sbp = None
|
||||
|
||||
self.get_symbol_val_fn = get_symbol_val_job
|
||||
if is_consistent:
|
||||
placement = flow.env.all_device_placement("cpu")
|
||||
sbp = flow.sbp.split(0)
|
||||
batch_size = total_batch_size
|
||||
|
||||
def load_checkpoint(self, model_path):
|
||||
flow.load_variables(flow.checkpoint.get(model_path))
|
||||
if synthetic:
|
||||
|
||||
|
||||
def get_train_config(cfg):
|
||||
|
||||
cfg.cudnn_conv_heuristic_search_algo = False
|
||||
cfg.enable_fuse_model_update_ops = True
|
||||
cfg.enable_fuse_add_to_output = True
|
||||
func_config = flow.FunctionConfig()
|
||||
func_config.default_logical_view(flow.scope.consistent_view())
|
||||
func_config.default_data_type(flow.float)
|
||||
func_config.cudnn_conv_heuristic_search_algo(
|
||||
cfg.cudnn_conv_heuristic_search_algo
|
||||
)
|
||||
|
||||
func_config.enable_fuse_model_update_ops(
|
||||
cfg.enable_fuse_model_update_ops)
|
||||
func_config.enable_fuse_add_to_output(cfg.enable_fuse_add_to_output)
|
||||
if cfg.fp16:
|
||||
logging.info("Training with FP16 now.")
|
||||
func_config.enable_auto_mixed_precision(True)
|
||||
if cfg.partial_fc:
|
||||
func_config.enable_fuse_model_update_ops(False)
|
||||
func_config.indexed_slices_optimizer_conf(
|
||||
dict(include_op_names=dict(op_name=['fc7-weight'])))
|
||||
if cfg.fp16 and (cfg.num_nodes * cfg.device_num_per_node) > 1:
|
||||
flow.config.collective_boxing.nccl_fusion_all_reduce_use_buffer(False)
|
||||
if cfg.nccl_fusion_threshold_mb:
|
||||
flow.config.collective_boxing.nccl_fusion_threshold_mb(
|
||||
cfg.nccl_fusion_threshold_mb)
|
||||
if cfg.nccl_fusion_max_ops:
|
||||
flow.config.collective_boxing.nccl_fusion_max_ops(
|
||||
cfg.nccl_fusion_max_ops)
|
||||
|
||||
return func_config
|
||||
|
||||
|
||||
def make_train_func(cfg):
|
||||
@flow.global_function(type="train", function_config=get_train_config(cfg))
|
||||
def get_symbol_train_job():
|
||||
if cfg.use_synthetic_data:
|
||||
(labels, images) = load_synthetic(cfg)
|
||||
else:
|
||||
labels, images = load_train_dataset(cfg)
|
||||
image_size = images.shape[2:]
|
||||
assert len(
|
||||
image_size) == 2, "The length of image size must be equal to 2."
|
||||
assert image_size[0] == image_size[1], "image_size[0] should be equal to image_size[1]."
|
||||
|
||||
embedding = get_model(cfg.network, images, cfg)
|
||||
|
||||
def _get_initializer():
|
||||
return flow.random_normal_initializer(mean=0.0, stddev=0.01)
|
||||
|
||||
trainable = True
|
||||
|
||||
if cfg.model_parallel and cfg.device_num_per_node > 1:
|
||||
logging.info("Training is using model parallelism now.")
|
||||
labels = labels.with_distribute(flow.distribute.broadcast())
|
||||
fc1_distribute = flow.distribute.broadcast()
|
||||
fc7_data_distribute = flow.distribute.split(1)
|
||||
fc7_model_distribute = flow.distribute.split(0)
|
||||
else:
|
||||
fc1_distribute = flow.distribute.split(0)
|
||||
fc7_data_distribute = flow.distribute.split(0)
|
||||
fc7_model_distribute = flow.distribute.broadcast()
|
||||
weight_regularizer = flow.regularizers.l2(0.0005)
|
||||
fc7_weight = flow.get_variable(
|
||||
name="fc7-weight",
|
||||
shape=(cfg.num_classes, embedding.shape[1]),
|
||||
dtype=embedding.dtype,
|
||||
initializer=_get_initializer(),
|
||||
regularizer=weight_regularizer,
|
||||
trainable=trainable,
|
||||
model_name="weight",
|
||||
distribute=fc7_model_distribute,
|
||||
data_loader = SyntheticDataLoader(
|
||||
batch_size=batch_size,
|
||||
num_classes=args.num_classes,
|
||||
placement=placement,
|
||||
sbp=sbp,
|
||||
)
|
||||
if cfg.partial_fc and cfg.model_parallel:
|
||||
logging.info(
|
||||
"Training is using model parallelism and optimized by partial_fc now."
|
||||
)
|
||||
return data_loader.to("cuda")
|
||||
|
||||
size = cfg.device_num_per_node * cfg.num_nodes
|
||||
num_local = (cfg.num_classes + size - 1) // size
|
||||
num_sample = int(num_local * cfg.sample_rate)
|
||||
total_num_sample = num_sample * size
|
||||
ofrecord_data_loader = OFRecordDataLoader(
|
||||
ofrecord_root=args.ofrecord_path,
|
||||
mode=mode,
|
||||
dataset_size=num_samples,
|
||||
batch_size=batch_size,
|
||||
total_batch_size=total_batch_size,
|
||||
data_part_num=args.ofrecord_part_num,
|
||||
placement=placement,
|
||||
sbp=sbp,
|
||||
)
|
||||
return ofrecord_data_loader
|
||||
|
||||
|
||||
def make_optimizer(args, model):
|
||||
param_group = {"params": [p for p in model.parameters() if p is not None]}
|
||||
|
||||
optimizer = flow.optim.SGD(
|
||||
[param_group],
|
||||
lr=args.lr,
|
||||
momentum=args.momentum,
|
||||
weight_decay=args.weight_decay,
|
||||
)
|
||||
return optimizer
|
||||
|
||||
|
||||
class FC7(flow.nn.Module):
|
||||
def __init__(self, embedding_size, num_classes, cfg, partial_fc=False, bias=False):
|
||||
super(FC7, self).__init__()
|
||||
self.weight = flow.nn.Parameter(
|
||||
flow.empty(num_classes, embedding_size))
|
||||
flow.nn.init.normal_(self.weight, mean=0, std=0.01)
|
||||
|
||||
self.partial_fc = partial_fc
|
||||
|
||||
size = flow.env.get_world_size()
|
||||
num_local = (cfg.num_classes + size - 1) // size
|
||||
self.num_sample = int(num_local * cfg.sample_rate)
|
||||
self.total_num_sample = self.num_sample * size
|
||||
|
||||
def forward(self, x, label):
|
||||
x = flow.nn.functional.l2_normalize(input=x, dim=1, epsilon=1e-10)
|
||||
if self.partial_fc:
|
||||
(
|
||||
mapped_label,
|
||||
sampled_label,
|
||||
sampled_weight,
|
||||
) = flow.distributed_partial_fc_sample(
|
||||
weight=fc7_weight, label=labels, num_sample=total_num_sample,
|
||||
weight=self.weight, label=label, num_sample=self.total_num_sample,
|
||||
)
|
||||
labels = mapped_label
|
||||
fc7_weight = sampled_weight
|
||||
fc7_weight = flow.math.l2_normalize(
|
||||
input=fc7_weight, axis=1, epsilon=1e-10)
|
||||
fc1 = flow.math.l2_normalize(
|
||||
input=embedding, axis=1, epsilon=1e-10)
|
||||
fc7 = flow.matmul(
|
||||
a=fc1.with_distribute(fc1_distribute), b=fc7_weight, transpose_b=True
|
||||
)
|
||||
fc7 = fc7.with_distribute(fc7_data_distribute)
|
||||
|
||||
if cfg.loss == "cosface":
|
||||
fc7 = (flow.combined_margin_loss(
|
||||
fc7, labels, m1=1, m2=0.0, m3=0.4) * 64)
|
||||
elif cfg.loss == "arcface":
|
||||
fc7 = (flow.combined_margin_loss(
|
||||
fc7, labels, m1=1, m2=0.5, m3=0.0) * 64)
|
||||
label = mapped_label
|
||||
weight = sampled_weight
|
||||
else:
|
||||
raise ValueError()
|
||||
weight = self.weight
|
||||
weight = flow.nn.functional.l2_normalize(
|
||||
input=weight, dim=1, epsilon=1e-10)
|
||||
x = flow.matmul(x, weight, transpose_b=True)
|
||||
if x.is_consistent:
|
||||
return x, label
|
||||
else:
|
||||
return x
|
||||
|
||||
fc7 = fc7.with_distribute(fc7_data_distribute)
|
||||
|
||||
loss = flow.nn.sparse_softmax_cross_entropy_with_logits(
|
||||
labels, fc7, name="softmax_loss"
|
||||
class Train_Module(flow.nn.Module):
|
||||
def __init__(self, cfg, backbone, placement, world_size):
|
||||
super(Train_Module, self).__init__()
|
||||
self.placement = placement
|
||||
|
||||
if cfg.graph:
|
||||
if cfg.model_parallel:
|
||||
input_size = cfg.embedding_size
|
||||
output_size = int(cfg.num_classes/world_size)
|
||||
self.fc = FC7(input_size, output_size, cfg, partial_fc=cfg.partial_fc).to_consistent(
|
||||
placement=placement, sbp=flow.sbp.split(0))
|
||||
else:
|
||||
self.fc = FC7(cfg.embedding_size, cfg.num_classes, cfg).to_consistent(
|
||||
placement=placement, sbp=flow.sbp.broadcast)
|
||||
self.backbone = backbone.to_consistent(
|
||||
placement=placement, sbp=flow.sbp.broadcast)
|
||||
else:
|
||||
self.backbone = backbone
|
||||
self.fc = FC7(cfg.embedding_size, cfg.num_classes, cfg)
|
||||
|
||||
def forward(self, x, labels):
|
||||
x = self.backbone(x)
|
||||
if x.is_consistent:
|
||||
x = x.to_consistent(sbp=flow.sbp.broadcast)
|
||||
x = self.fc(x, labels)
|
||||
return x
|
||||
|
||||
|
||||
class Trainer(object):
|
||||
def __init__(self, cfg, placement, load_path, world_size, rank):
|
||||
self.placement = placement
|
||||
self.load_path = load_path
|
||||
self.cfg = cfg
|
||||
self.world_size = world_size
|
||||
self.rank = rank
|
||||
|
||||
# model
|
||||
self.backbone = get_model(cfg.network, dropout=0.0,
|
||||
num_features=cfg.embedding_size).to("cuda")
|
||||
self.train_module = Train_Module(
|
||||
cfg, self.backbone, self.placement, world_size).to("cuda")
|
||||
if cfg.resume:
|
||||
if load_path is not None:
|
||||
self.load_state_dict()
|
||||
else:
|
||||
logging.info("Model resume failed! load path is None ")
|
||||
|
||||
# optimizer
|
||||
self.optimizer = make_optimizer(cfg, self.train_module)
|
||||
|
||||
# data
|
||||
self.train_data_loader = make_data_loader(
|
||||
cfg, 'train', self.cfg.graph, self.cfg.synthetic)
|
||||
|
||||
# loss
|
||||
if cfg.loss == "cosface":
|
||||
self.margin_softmax = flow.nn.CombinedMarginLoss(
|
||||
1, 0., 0.4).to("cuda")
|
||||
else:
|
||||
self.margin_softmax = flow.nn.CombinedMarginLoss(
|
||||
1, 0.5, 0.).to("cuda")
|
||||
|
||||
self.of_cross_entropy = CrossEntropyLoss_sbp()
|
||||
|
||||
# lr_scheduler
|
||||
self.decay_step = self.cal_decay_step()
|
||||
self.scheduler = flow.optim.lr_scheduler.MultiStepLR(
|
||||
optimizer=self.optimizer, milestones=self.decay_step, gamma=0.1
|
||||
)
|
||||
|
||||
lr_scheduler = flow.optimizer.PiecewiseScalingScheduler(
|
||||
base_lr=cfg.lr,
|
||||
boundaries=cfg.lr_steps,
|
||||
scale=cfg.lr_scales,
|
||||
warmup=None
|
||||
)
|
||||
flow.optimizer.SGD(lr_scheduler,
|
||||
momentum=cfg.momentum if cfg.momentum > 0 else None,
|
||||
).minimize(loss)
|
||||
# log
|
||||
self.callback_logging = CallBackLogging(
|
||||
50, rank, cfg.total_step, cfg.batch_size, world_size, None)
|
||||
# val
|
||||
self.callback_verification = CallBackVerification(
|
||||
600, rank, cfg.val_targets, cfg.ofrecord_path, is_consistent=cfg.graph)
|
||||
# save checkpoint
|
||||
self.callback_checkpoint = CallBackModelCheckpoint(rank, cfg.output)
|
||||
|
||||
return loss
|
||||
self.losses = AverageMeter()
|
||||
self.start_epoch = 0
|
||||
self.global_step = 0
|
||||
|
||||
return get_symbol_train_job
|
||||
def __call__(self):
|
||||
# Train
|
||||
if self.cfg.graph:
|
||||
self.train_graph()
|
||||
else:
|
||||
self.train_eager()
|
||||
|
||||
def load_state_dict(self):
|
||||
|
||||
if self.cfg.graph:
|
||||
state_dict = flow.load(self.load_path, consistent_src_rank=0)
|
||||
elif self.rank == 0:
|
||||
state_dict = flow.load(self.load_path)
|
||||
else:
|
||||
return
|
||||
logging.info("Model resume successfully!")
|
||||
self.train_module.load_state_dict(state_dict)
|
||||
|
||||
def cal_decay_step(self):
|
||||
cfg = self.cfg
|
||||
num_image = cfg.num_image
|
||||
total_batch_size = cfg.batch_size * self.world_size
|
||||
self.warmup_step = num_image // total_batch_size * cfg.warmup_epoch
|
||||
self.cfg.total_step = num_image // total_batch_size * cfg.num_epoch
|
||||
logging.info("Total Step is:%d" % self.cfg.total_step)
|
||||
return [x * num_image // total_batch_size for x in cfg.decay_epoch]
|
||||
|
||||
def train_graph(self):
|
||||
train_graph = TrainGraph(self.train_module, self.cfg, self.margin_softmax,
|
||||
self.of_cross_entropy, self.train_data_loader, self.optimizer, self.scheduler)
|
||||
# train_graph.debug()
|
||||
val_graph = EvalGraph(self.backbone, self.cfg)
|
||||
|
||||
for epoch in range(self.start_epoch, self.cfg.num_epoch):
|
||||
self.train_module.train()
|
||||
one_epoch_steps = len(self.train_data_loader)
|
||||
for steps in range(one_epoch_steps):
|
||||
self.global_step += 1
|
||||
loss = train_graph()
|
||||
loss = loss.to_consistent(
|
||||
sbp=flow.sbp.broadcast).to_local().numpy()
|
||||
self.losses.update(loss, 1)
|
||||
self.callback_logging(self.global_step, self.losses, epoch, False,
|
||||
self.scheduler.get_last_lr()[0])
|
||||
self.callback_verification(
|
||||
self.global_step, self.train_module, val_graph)
|
||||
self.callback_checkpoint(self.global_step, epoch,
|
||||
self.train_module, is_consistent=True)
|
||||
|
||||
def train_eager(self):
|
||||
self.train_module = ddp(self.train_module)
|
||||
for epoch in range(self.start_epoch, self.cfg.num_epoch):
|
||||
self.train_module.train()
|
||||
|
||||
one_epoch_steps = len(self.train_data_loader)
|
||||
for steps in range(one_epoch_steps):
|
||||
self.global_step += 1
|
||||
image, label = self.train_data_loader()
|
||||
image = image.to("cuda")
|
||||
label = label.to("cuda")
|
||||
features_fc7 = self.train_module(image, label)
|
||||
features_fc7 = self.margin_softmax(features_fc7, label)*64
|
||||
loss = self.of_cross_entropy(features_fc7, label)
|
||||
loss.backward()
|
||||
self.optimizer.step()
|
||||
self.optimizer.zero_grad()
|
||||
|
||||
loss = loss.numpy()
|
||||
self.losses.update(loss, 1)
|
||||
self.callback_logging(self.global_step, self.losses, epoch, False,
|
||||
self.scheduler.get_last_lr()[0])
|
||||
self.callback_verification(self.global_step, self.backbone)
|
||||
self.scheduler.step()
|
||||
self.callback_checkpoint(
|
||||
self.global_step, epoch, self.train_module)
|
||||
|
||||
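The step arithmetic in `cal_decay_step` can be checked in isolation. The sketch below mirrors the integer expressions from the diff; the function name `schedule_steps` and the sample numbers are illustrative assumptions, not values from the PR.

```python
def schedule_steps(num_image, batch_size, world_size,
                   num_epoch, warmup_epoch, decay_epoch):
    """Mirror the integer arithmetic of cal_decay_step (illustrative only)."""
    total_batch_size = batch_size * world_size
    steps_per_epoch = num_image // total_batch_size
    warmup_step = steps_per_epoch * warmup_epoch
    total_step = steps_per_epoch * num_epoch
    # Same precedence as the source: (x * num_image) // total_batch_size.
    milestones = [x * num_image // total_batch_size for x in decay_epoch]
    return warmup_step, total_step, milestones


if __name__ == "__main__":
    print(schedule_steps(1000, 10, 4, num_epoch=4, warmup_epoch=1,
                         decay_epoch=[2, 3]))
```

Note that the milestones multiply before dividing, so they can differ slightly from `epoch * steps_per_epoch` when `num_image` is not a multiple of the total batch size.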
75
recognition/arcface_oneflow/graph.py
Normal file
@@ -0,0 +1,75 @@
|
||||
import oneflow as flow
|
||||
import oneflow.nn as nn
|
||||
|
||||
|
||||
def make_static_grad_scaler():
|
||||
return flow.amp.StaticGradScaler(flow.env.get_world_size())
|
||||
|
||||
|
||||
def make_grad_scaler():
|
||||
return flow.amp.GradScaler(
|
||||
init_scale=2 ** 30, growth_factor=2.0, backoff_factor=0.5, growth_interval=2000,
|
||||
)
|
||||
|
||||
|
||||
def meter(self, mkey, *args):
|
||||
assert mkey in self.m
|
||||
self.m[mkey]["meter"].record(*args)
|
||||
|
||||
|
||||
class TrainGraph(flow.nn.Graph):
|
||||
def __init__(
|
||||
self,
|
||||
model,
|
||||
cfg,
|
||||
combine_margin,
|
||||
cross_entropy,
|
||||
data_loader,
|
||||
optimizer,
|
||||
lr_scheduler=None,
|
||||
):
|
||||
super().__init__()
|
||||
|
||||
if cfg.use_fp16:
|
||||
self.config.enable_amp(True)
|
||||
self.set_grad_scaler(make_grad_scaler())
|
||||
elif cfg.scale_grad:
|
||||
self.set_grad_scaler(make_static_grad_scaler())
|
||||
|
||||
|
||||
|
||||
self.config.allow_fuse_add_to_output(True)
|
||||
self.config.allow_fuse_model_update_ops(True)
|
||||
|
||||
self.model = model
|
||||
|
||||
self.cross_entropy = cross_entropy
|
||||
self.combine_margin = combine_margin
|
||||
self.data_loader = data_loader
|
||||
self.add_optimizer(optimizer, lr_sch=lr_scheduler)
|
||||
|
||||
def build(self):
|
||||
image, label = self.data_loader()
|
||||
|
||||
image = image.to("cuda")
|
||||
label = label.to("cuda")
|
||||
|
||||
logits, label = self.model(image, label)
|
||||
logits = self.combine_margin(logits, label) * 64
|
||||
loss = self.cross_entropy(logits, label)
|
||||
|
||||
loss.backward()
|
||||
return loss
|
||||
|
||||
|
||||
class EvalGraph(flow.nn.Graph):
|
||||
def __init__(self, model, cfg):
|
||||
super().__init__()
|
||||
self.config.allow_fuse_add_to_output(True)
|
||||
self.model = model
|
||||
if cfg.fp16:
|
||||
self.config.enable_amp(True)
|
||||
|
||||
def build(self, image):
|
||||
logits = self.model(image)
|
||||
return logits
|
||||
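`make_grad_scaler` passes `init_scale`, `growth_factor`, `backoff_factor` and `growth_interval` to `flow.amp.GradScaler`. The diff does not show how OneFlow uses them, so the sketch below assumes the conventional dynamic loss-scaling rule (shrink on overflow, grow after a run of clean steps). The class `DynamicLossScale` is hypothetical and only illustrates that rule.

```python
class DynamicLossScale:
    """Conventional dynamic loss scaling; an assumption, not OneFlow's code."""

    def __init__(self, init_scale=2.0 ** 30, growth_factor=2.0,
                 backoff_factor=0.5, growth_interval=2000):
        self.scale = float(init_scale)
        self.growth_factor = growth_factor
        self.backoff_factor = backoff_factor
        self.growth_interval = growth_interval
        self._good_steps = 0  # consecutive steps without inf/nan gradients

    def update(self, found_inf):
        if found_inf:
            # Overflow: shrink the scale and restart the clean-step counter.
            self.scale *= self.backoff_factor
            self._good_steps = 0
        else:
            self._good_steps += 1
            if self._good_steps == self.growth_interval:
                # Enough clean steps in a row: try a larger scale.
                self.scale *= self.growth_factor
                self._good_steps = 0
        return self.scale


if __name__ == "__main__":
    scaler = DynamicLossScale(init_scale=8.0, growth_interval=2)
    print([scaler.update(flag) for flag in (True, False, False)])
```

Under this rule, a large initial scale such as `2 ** 30` is typically halved several times in the first steps until gradients stop overflowing.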
@@ -1,27 +1,49 @@
|
||||
import os
|
||||
from os import mkdir
|
||||
import oneflow.typing as tp
|
||||
import onnx
|
||||
import onnxruntime as ort
|
||||
import numpy as np
|
||||
from oneflow_onnx.oneflow2onnx.util import convert_to_onnx_and_check
|
||||
import oneflow as flow
|
||||
|
||||
import logging
|
||||
from easydict import EasyDict as edict
|
||||
from backbones import get_model
|
||||
from utils.utils_config import get_config
|
||||
import argparse
|
||||
import tempfile
|
||||
|
||||
|
||||
def convert_func(cfg, model_path, out_path):
|
||||
@flow.global_function()
|
||||
def InferenceNet(images: tp.Numpy.Placeholder((1, 3, 112, 112))):
|
||||
class ModelGraph(flow.nn.Graph):
|
||||
def __init__(self, model):
|
||||
super().__init__()
|
||||
self.backbone = model
|
||||
|
||||
logits = get_model(cfg.network, images, cfg)
|
||||
return logits
|
||||
print(convert_to_onnx_and_check(InferenceNet,
|
||||
flow_weight_dir=None, onnx_model_path=out_path))
|
||||
def build(self, x):
|
||||
x = x.to("cuda")
|
||||
out = self.backbone(x)
|
||||
return out
|
||||
|
||||
|
||||
def convert_func(cfg, model_path, out_path, image_size):
|
||||
|
||||
model_module = get_model(cfg.network, dropout=0.0,
|
||||
num_features=cfg.embedding_size).to("cuda")
|
||||
model_module.eval()
|
||||
print(model_module)
|
||||
model_graph = ModelGraph(model_module)
|
||||
model_graph._compile(flow.randn(1, 3, image_size, image_size).to("cuda"))
|
||||
|
||||
with tempfile.TemporaryDirectory() as tmpdirname:
|
||||
new_parameters = dict()
|
||||
parameters = flow.load(model_path)
|
||||
for key, value in parameters.items():
|
||||
if "num_batches_tracked" not in key:
|
||||
if key == "fc.weight":
|
||||
continue
|
||||
val = value
|
||||
new_key = key.replace("backbone.", "")
|
||||
new_parameters[new_key] = val
|
||||
model_module.load_state_dict(new_parameters)
|
||||
flow.save(model_module.state_dict(), tmpdirname)
|
||||
convert_to_onnx_and_check(
|
||||
model_graph, flow_weight_dir=tmpdirname, onnx_model_path="./", print_outlier=True)
|
||||
|
||||
|
||||
def main(args):
|
||||
@@ -30,7 +52,7 @@ def main(args):
|
||||
cfg = get_config(args.config)
|
||||
if not os.path.exists(args.out_path):
|
||||
mkdir(args.out_path)
|
||||
convert_func(cfg, args.model_path, args.out_path)
|
||||
convert_func(cfg, args.model_path, args.out_path, args.image_size)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
@@ -38,6 +60,8 @@ if __name__ == "__main__":
|
||||
parser = argparse.ArgumentParser(description='OneFlow ArcFace val')
|
||||
parser.add_argument('config', type=str, help='py config file')
|
||||
parser.add_argument('--model_path', type=str, help='model path')
|
||||
parser.add_argument('--image_size', type=int,
|
||||
default=112, help='input image size')
|
||||
parser.add_argument('--out_path', type=str,
|
||||
default="onnx_model", help='out path')
|
||||
main(parser.parse_args())
|
||||
|
||||
|
||||
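The checkpoint key handling in the new `convert_func` can be exercised without OneFlow. The sketch below applies the same three rules to a plain dict: drop `num_batches_tracked` entries, skip `fc.weight`, and strip the `backbone.` prefix. The function name `remap_backbone_keys` and the sample keys are hypothetical.

```python
def remap_backbone_keys(parameters):
    """Keep only backbone weights, renamed as the bare backbone expects."""
    remapped = {}
    for key, value in parameters.items():
        if "num_batches_tracked" in key:
            continue  # BatchNorm bookkeeping, not needed for export
        if key == "fc.weight":
            continue  # classification head, not part of the backbone
        # Like the source, this removes every occurrence of "backbone.".
        remapped[key.replace("backbone.", "")] = value
    return remapped


if __name__ == "__main__":
    checkpoint = {
        "backbone.conv1.weight": 1,
        "backbone.bn1.num_batches_tracked": 2,
        "fc.weight": 3,
    }
    print(remap_backbone_keys(checkpoint))
```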
7
recognition/arcface_oneflow/requirements.txt
Normal file
@@ -0,0 +1,7 @@
|
||||
numpy
|
||||
matplotlib
|
||||
Pillow
|
||||
opencv-python
|
||||
scikit-learn
|
||||
scipy
|
||||
easydict
|
||||
@@ -1,11 +0,0 @@
|
||||
#!/bin/bash
|
||||
|
||||
export PYTHONUNBUFFERED=1
|
||||
echo PYTHONUNBUFFERED=$PYTHONUNBUFFERED
|
||||
export NCCL_LAUNCH_MODE=PARALLEL
|
||||
echo NCCL_LAUNCH_MODE=$NCCL_LAUNCH_MODE
|
||||
export NCCL_DEBUG=False
|
||||
export ONEFLOW_DEBUG_MODE=False
|
||||
|
||||
#CUDA_VISIBLE_DEVICES='1'
|
||||
python train.py configs/ms1mv3_r50.py --device_num_per_node 8
|
||||
@@ -1,3 +0,0 @@
|
||||
#!/bin/bash
|
||||
|
||||
python val.py configs/ms1mv3_r50 --model_path work_dir/lazy_r50/snapshot_0
|
||||
@@ -37,10 +37,7 @@ def load_train_data(data_dir):
|
||||
)
|
||||
)
|
||||
|
||||
imgrec = recordio.MXIndexedRecordIO(
|
||||
path_imgidx, path_imgrec, "r", key_type=int
|
||||
)
|
||||
# TODO: key_type ??
|
||||
imgrec = recordio.MXIndexedRecordIO(path_imgidx, path_imgrec, "r", key_type=int)
|
||||
|
||||
# Read header0 to get some info.
|
||||
identity_key_start = 0
|
||||
@@ -64,15 +61,6 @@ def load_train_data(data_dir):
|
||||
|
||||
else:
|
||||
imgidx_list = imgrec.keys
|
||||
|
||||
# print id2range to txt file
|
||||
# with open('id2range.txt', 'w') as f:
|
||||
# for identity in range(identity_key_start, identity_key_end):
|
||||
# l = str(identity) \
|
||||
# + ' ' \
|
||||
# + str(id2range[identity][0]) \
|
||||
# + ' ' + str(id2range[identity][1]) + '\n'
|
||||
# f.write(l)
|
||||
return imgrec, imgidx_list
|
||||
|
||||
|
||||
@@ -129,11 +117,7 @@ def main(args):
|
||||
with open(output_file, "wb") as f:
|
||||
for idx in imgidx_list:
|
||||
if idx % 10000 == 0:
|
||||
print(
|
||||
"Converting images: {} of {}".format(
|
||||
idx, len(imgidx_list)
|
||||
)
|
||||
)
|
||||
print("Converting images: {} of {}".format(idx, len(imgidx_list)))
|
||||
|
||||
img_data = {}
|
||||
rec = imgrec.read_idx(idx)
|
||||
@@ -25,10 +25,7 @@ def parse_arguement(argv):
|
||||
help="Path to output OFRecord.",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--num_part",
|
||||
type=int,
|
||||
default=96,
|
||||
help="num_part of OFRecord to generate.",
|
||||
"--num_part", type=int, default=96, help="num_part of OFRecord to generate.",
|
||||
)
|
||||
return parser.parse_args(argv)
|
||||
|
||||
@@ -40,14 +37,12 @@ def load_train_data(data_dir):
|
||||
|
||||
print(
|
||||
"Loading recordio {}\n\
|
||||
Corresponding record idx is {}".format(
|
||||
Corresponding record idx is {}".format(
|
||||
path_imgrec, path_imgidx
|
||||
)
|
||||
)
|
||||
|
||||
imgrec = recordio.MXIndexedRecordIO(
|
||||
path_imgidx, path_imgrec, "r", key_type=int
|
||||
)
|
||||
imgrec = recordio.MXIndexedRecordIO(path_imgidx, path_imgrec, "r", key_type=int)
|
||||
|
||||
# Read header0 to get some info.
|
||||
identity_key_start = 0
|
||||
@@ -135,16 +130,12 @@ def main(args):
|
||||
output_file = os.path.join(output_dir, part_name)
|
||||
file_idx_start = part_id * num_images_per_part
|
||||
file_idx_end = min((part_id + 1) * num_images_per_part, num_images)
|
||||
print("part-"+str(part_id), "start", file_idx_start, "end", file_idx_end)
|
||||
print("part-" + str(part_id), "start", file_idx_start, "end", file_idx_end)
|
||||
with open(output_file, "wb") as f:
|
||||
for file_idx in range(file_idx_start, file_idx_end):
|
||||
idx = imgidx_list[file_idx]
|
||||
if idx % 10000 == 0:
|
||||
print(
|
||||
"Converting images: {} of {}".format(
|
||||
idx, len(imgidx_list)
|
||||
)
|
||||
)
|
||||
print("Converting images: {} of {}".format(idx, len(imgidx_list)))
|
||||
|
||||
img_data = {}
|
||||
rec = imgrec.read_idx(idx)
|
||||
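The per-part index ranges in the conversion tool follow from `file_idx_start` and `file_idx_end` above. The diff does not show how `num_images_per_part` is computed, so the sketch below assumes ceil division; `part_ranges` is a hypothetical helper.

```python
import math


def part_ranges(num_images, num_part):
    """Return (start, end) index pairs, one per OFRecord part."""
    # Assumption: ceil division, so every image lands in some part.
    num_images_per_part = math.ceil(num_images / num_part)
    ranges = []
    for part_id in range(num_part):
        start = part_id * num_images_per_part
        end = min((part_id + 1) * num_images_per_part, num_images)
        ranges.append((start, end))
    return ranges


if __name__ == "__main__":
    for part_id, (start, end) in enumerate(part_ranges(10, 4)):
        print("part-" + str(part_id), "start", start, "end", end)
```

The `min` clamp is what keeps the last part from running past the end of the shuffled index list.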
@@ -2,92 +2,42 @@ import argparse
|
||||
import logging
|
||||
import os
|
||||
import oneflow as flow
|
||||
import oneflow.nn as nn
|
||||
import sys
|
||||
import math
|
||||
import numpy as np
|
||||
import pickle
|
||||
import time
|
||||
from backbones import get_model
|
||||
from utils.utils_callbacks import CallBackVerification, CallBackLogging
|
||||
|
||||
from function import Trainer
|
||||
from utils.utils_logging import init_logging
|
||||
from utils.utils_config import get_config
|
||||
from utils.utils_logging import AverageMeter, init_logging
|
||||
from utils.ofrecord_data_utils import load_train_dataset, load_synthetic
|
||||
from function import make_train_func, Validator
|
||||
|
||||
|
||||
def main(args):
|
||||
cfg = get_config(args.config)
|
||||
cfg.graph = args.graph
|
||||
rank = flow.env.get_rank()
|
||||
world_size = flow.env.get_world_size()
|
||||
placement = flow.env.all_device_placement("cuda")
|
||||
|
||||
cfg.device_num_per_node = args.device_num_per_node
|
||||
cfg.total_batch_size = cfg.batch_size*cfg.device_num_per_node*cfg.num_nodes
|
||||
cfg.steps_per_epoch = math.ceil(cfg.num_image / cfg.total_batch_size)
|
||||
cfg.total_step = cfg.num_epoch*cfg.steps_per_epoch
|
||||
cfg.lr_steps = (np.array(cfg.decay_epoch)*cfg.steps_per_epoch).tolist()
|
||||
lr_scales = [0.1, 0.01, 0.001, 0.0001]
|
||||
cfg.lr_scales = lr_scales[:len(cfg.lr_steps)]
|
||||
cfg.output = os.path.join("work_dir", cfg.output, cfg.loss)
|
||||
|
||||
world_size = cfg.num_nodes
|
||||
os.makedirs(cfg.output, exist_ok=True)
|
||||
|
||||
log_root = logging.getLogger()
|
||||
init_logging(log_root, cfg.output)
|
||||
flow.config.gpu_device_num(cfg.device_num_per_node)
|
||||
logging.info("gpu num: %d" % cfg.device_num_per_node)
|
||||
if cfg.num_nodes > 1:
|
||||
assert cfg.num_nodes <= len(
|
||||
cfg.node_ips), "The number of nodes should not be greater than length of node_ips list."
|
||||
flow.env.ctrl_port(12138)
|
||||
nodes = []
|
||||
for ip in cfg.node_ips:
|
||||
addr_dict = {}
|
||||
addr_dict["addr"] = ip
|
||||
nodes.append(addr_dict)
|
||||
flow.env.machine(nodes)
|
||||
flow.env.log_dir(cfg.output)
|
||||
init_logging(log_root, rank, cfg.output)
|
||||
|
||||
# root dir of loading checkpoint
|
||||
load_path = None
|
||||
|
||||
for key, value in cfg.items():
|
||||
num_space = 35 - len(key)
|
||||
num_space = 25 - len(key)
|
||||
logging.info(": " + key + " " * num_space + str(value))
|
||||
|
||||
train_func = make_train_func(cfg)
|
||||
val_infer = Validator(cfg)
|
||||
|
||||
callback_verification = CallBackVerification(
|
||||
3000, cfg.val_targets, cfg.eval_ofrecord_path)
|
||||
callback_logging = CallBackLogging(
|
||||
50, cfg.total_step, cfg.total_batch_size, world_size, None)
|
||||
|
||||
if cfg.resume and os.path.exists(cfg.model_load_dir):
|
||||
logging.info("Loading model from {}".format(cfg.model_load_dir))
|
||||
variables = flow.checkpoint.get(cfg.model_load_dir)
|
||||
flow.load_variables(variables)
|
||||
|
||||
start_epoch = 0
|
||||
global_step = 0
|
||||
lr = cfg.lr
|
||||
for epoch in range(start_epoch, cfg.num_epoch):
|
||||
for steps in range(cfg.steps_per_epoch):
|
||||
train_func().async_get(callback_logging.metric_cb(global_step, epoch, lr))
|
||||
callback_verification(global_step, val_infer.get_symbol_val_fn)
|
||||
global_step += 1
|
||||
if epoch in cfg.decay_epoch:
|
||||
lr *= 0.1
|
||||
logging.info("lr_steps: %d" % global_step)
|
||||
logging.info("lr change to %f" % lr)
|
||||
|
||||
# snapshot
|
||||
path = os.path.join(
|
||||
cfg.output, "snapshot_" + str(epoch))
|
||||
flow.checkpoint.save(path)
|
||||
logging.info("oneflow Model Saved in '{}'".format(path))
|
||||
trainer = Trainer(cfg, placement, load_path, world_size, rank)
|
||||
trainer()
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
parser = argparse.ArgumentParser(description='OneFlow ArcFace Training')
|
||||
parser.add_argument('config', type=str, help='py config file')
|
||||
parser.add_argument('--local_rank', type=int, default=0, help='local_rank')
|
||||
parser.add_argument('--device_num_per_node', type=int,
|
||||
default=1, help='local_rank')
|
||||
|
||||
parser = argparse.ArgumentParser(description="OneFlow ArcFace Training")
|
||||
parser.add_argument("config", type=str, help="py config file")
|
||||
parser.add_argument(
|
||||
"--graph",
|
||||
action="store_true",
|
||||
help="Run the model in graph mode; otherwise run it in ddp mode.",
|
||||
)
|
||||
parser.add_argument("--local_rank", type=int, default=0, help="local_rank")
|
||||
main(parser.parse_args())
|
||||
|
||||
25
recognition/arcface_oneflow/train_ddp.sh
Normal file
@@ -0,0 +1,25 @@
|
||||
# set -aux
|
||||
|
||||
MASTER_ADDR=127.0.0.1
|
||||
MASTER_PORT=17788
|
||||
DEVICE_NUM_PER_NODE=8
|
||||
NUM_NODES=1
|
||||
NODE_RANK=0
|
||||
|
||||
|
||||
export PYTHONUNBUFFERED=1
|
||||
echo PYTHONUNBUFFERED=$PYTHONUNBUFFERED
|
||||
export NCCL_LAUNCH_MODE=PARALLEL
|
||||
echo NCCL_LAUNCH_MODE=$NCCL_LAUNCH_MODE
|
||||
export NCCL_DEBUG=INFO
|
||||
export ONEFLOW_DEBUG_MODE=True
|
||||
|
||||
|
||||
NCCL_DEBUG=INFO \
|
||||
python3 -m oneflow.distributed.launch \
|
||||
--nproc_per_node $DEVICE_NUM_PER_NODE \
|
||||
--nnodes $NUM_NODES \
|
||||
--node_rank $NODE_RANK \
|
||||
--master_addr $MASTER_ADDR \
|
||||
--master_port $MASTER_PORT \
|
||||
train.py configs/ms1mv3_r50.py
|
||||
26
recognition/arcface_oneflow/train_graph_distributed.sh
Normal file
@@ -0,0 +1,26 @@
|
||||
# set -aux
|
||||
|
||||
MASTER_ADDR=127.0.0.1
|
||||
MASTER_PORT=17788
|
||||
DEVICE_NUM_PER_NODE=8
|
||||
NUM_NODES=1
|
||||
NODE_RANK=0
|
||||
|
||||
|
||||
|
||||
export PYTHONUNBUFFERED=1
|
||||
echo PYTHONUNBUFFERED=$PYTHONUNBUFFERED
|
||||
export NCCL_LAUNCH_MODE=PARALLEL
|
||||
echo NCCL_LAUNCH_MODE=$NCCL_LAUNCH_MODE
|
||||
#export NCCL_DEBUG=INFO
|
||||
export ONEFLOW_DEBUG_MODE=True
|
||||
|
||||
|
||||
#NCCL_DEBUG=INFO
|
||||
python3 -m oneflow.distributed.launch \
|
||||
--nproc_per_node $DEVICE_NUM_PER_NODE \
|
||||
--nnodes $NUM_NODES \
|
||||
--node_rank $NODE_RANK \
|
||||
--master_addr $MASTER_ADDR \
|
||||
--master_port $MASTER_PORT \
|
||||
train.py configs/ms1mv3_r50.py --graph
|
||||
66
recognition/arcface_oneflow/utils/losses.py
Normal file
@@ -0,0 +1,66 @@
|
||||
import oneflow as flow
|
||||
from oneflow import nn
|
||||
|
||||
|
||||
def get_loss(name):
|
||||
if name == "cosface":
|
||||
return CosFace()
|
||||
elif name == "arcface":
|
||||
return ArcFace()
|
||||
else:
|
||||
raise ValueError()
|
||||
|
||||
|
||||
class CrossEntropyLoss_sbp(nn.Module):
|
||||
def __init__(self):
|
||||
super(CrossEntropyLoss_sbp, self).__init__()
|
||||
|
||||
def forward(self, logits, label):
|
||||
loss = flow._C.sparse_softmax_cross_entropy(
|
||||
logits, label)
|
||||
loss = flow.mean(loss)
|
||||
return loss
|
||||
|
||||
|
||||
def get_loss(name):
|
||||
if name == "cosface":
|
||||
return CosFace()
|
||||
elif name == "arcface":
|
||||
return ArcFace()
|
||||
else:
|
||||
raise ValueError()
|
||||
|
||||
|
||||
class CosFace(nn.Module):
|
||||
def __init__(self, s=64.0, m=0.40):
|
||||
super(CosFace, self).__init__()
|
||||
self.s = s
|
||||
self.m = m
|
||||
|
||||
def forward(self, cosine, label):
|
||||
index = flow.where(label != -1)[0]
|
||||
m_hot = flow.zeros(index.size()[0], cosine.size()[
|
||||
1], device=cosine.device)
|
||||
|
||||
m_hot = flow.scatter(m_hot, 1, label[index, None], self.m)
|
||||
cosine = cosine[index] - m_hot
|
||||
|
||||
ret = cosine * self.s
|
||||
return ret
|
||||
|
||||
|
||||
class ArcFace(nn.Module):
|
||||
def __init__(self, s=64.0, m=0.5):
|
||||
super(ArcFace, self).__init__()
|
||||
self.s = s
|
||||
self.m = m
|
||||
|
||||
def forward(self, cosine: flow.Tensor, label):
|
||||
index = flow.where(label != -1)[0]
|
||||
m_hot = flow.zeros(index.size()[0], cosine.size()[
|
||||
1], device=cosine.device)
|
||||
m_hot.scatter_(1, label[index, None], self.m)
|
||||
cosine.acos_()
|
||||
cosine[index] += m_hot
|
||||
cosine.cos_().mul_(self.s)
|
||||
return cosine
|
||||
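`CosFace` and `ArcFace` above differ only in where the margin is applied to the target-class cosine. A minimal single-sample sketch follows, assuming the conventional combined-margin form `cos(m1 * theta + m2) - m3`, which would match the `(1, 0., 0.4)` and `(1, 0.5, 0.)` arguments passed to `flow.nn.CombinedMarginLoss` elsewhere in this PR. The function `combined_margin_logits` is hypothetical, and the clamp before `acos` is a guard added by this sketch.

```python
import math


def combined_margin_logits(cosines, label, m1=1.0, m2=0.5, m3=0.0, s=64.0):
    """Apply a margin to the target cosine of one sample, then scale by s.

    cosines: list of cos(theta_j) for each class; label: target class index.
    """
    out = list(cosines)
    # Guard added here: keep acos inside its domain despite rounding error.
    target = max(-1.0, min(1.0, out[label]))
    theta = math.acos(target)
    out[label] = math.cos(m1 * theta + m2) - m3
    return [s * value for value in out]


if __name__ == "__main__":
    cosines = [0.5, 0.2]
    print(combined_margin_logits(cosines, 0, m2=0.5, m3=0.0))  # ArcFace-style
    print(combined_margin_logits(cosines, 0, m2=0.0, m3=0.4))  # CosFace-style
```

In both settings only the target-class logit is reduced; the other logits are just multiplied by `s`.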
@@ -1,70 +1,148 @@
|
||||
import os
|
||||
import oneflow as flow
|
||||
import oneflow.nn as nn
|
||||
import os
|
||||
from typing import List, Union
|
||||
|
||||
|
||||
def train_dataset_reader(
|
||||
args, data_dir, batch_size, data_part_num, part_name_suffix_length=1
|
||||
):
|
||||
if os.path.exists(data_dir):
|
||||
print("Loading train data from {}".format(data_dir))
|
||||
else:
|
||||
raise Exception("Invalid train dataset dir", data_dir)
|
||||
image_blob_conf = flow.data.BlobConf(
|
||||
"encoded",
|
||||
shape=(112, 112, 3),
|
||||
dtype=flow.float,
|
||||
codec=flow.data.ImageCodec(
|
||||
image_preprocessors=[
|
||||
flow.data.ImagePreprocessor("bgr2rgb"),
|
||||
flow.data.ImagePreprocessor("mirror"),
|
||||
]
|
||||
),
|
||||
preprocessors=[
|
||||
flow.data.NormByChannelPreprocessor(
|
||||
mean_values=(127.5, 127.5, 127.5), std_values=(127.5, 127.5, 127.5), data_format="NCHW"
|
||||
),
|
||||
],
|
||||
)
|
||||
class OFRecordDataLoader(nn.Module):
|
||||
def __init__(
|
||||
self,
|
||||
ofrecord_root: str = "./ofrecord",
|
||||
mode: str = "train", # "val"
|
||||
dataset_size: int = 9469,
|
||||
batch_size: int = 1,
|
||||
total_batch_size: int = 1,
|
||||
data_part_num: int = 8,
|
||||
placement: flow.placement = None,
|
||||
sbp: Union[flow.sbp.sbp, List[flow.sbp.sbp]] = None,
|
||||
):
|
||||
super().__init__()
|
||||
channel_last = False
|
||||
output_layout = "NHWC" if channel_last else "NCHW"
|
||||
assert ofrecord_root and mode
|
||||
self.train_record_reader = flow.nn.OfrecordReader(
|
||||
os.path.join(ofrecord_root, mode),
|
||||
batch_size=batch_size,
|
||||
data_part_num=data_part_num,
|
||||
part_name_suffix_length=5,
|
||||
random_shuffle=True if mode == "train" else False,
|
||||
shuffle_after_epoch=True if mode == "train" else False,
|
||||
placement=placement,
|
||||
sbp=sbp,
|
||||
)
|
||||
self.record_label_decoder = flow.nn.OfrecordRawDecoder(
|
||||
"label", shape=(), dtype=flow.int32
|
||||
)
|
||||
|
||||
label_blob_conf = flow.data.BlobConf(
|
||||
"label", shape=(), dtype=flow.int32, codec=flow.data.RawCodec()
|
||||
)
|
||||
color_space = "RGB"
|
||||
height = 112
|
||||
width = 112
|
||||
|
||||
return flow.data.decode_ofrecord(
|
||||
data_dir,
|
||||
(label_blob_conf, image_blob_conf),
|
||||
batch_size=batch_size,
|
||||
data_part_num=data_part_num,
|
||||
part_name_prefix=args.part_name_prefix,
|
||||
part_name_suffix_length=args.part_name_suffix_length,
|
||||
shuffle=args.shuffle,
|
||||
buffer_size=16384,
|
||||
)
|
||||
self.record_image_decoder = flow.nn.OFRecordImageDecoder(
|
||||
"encoded", color_space=color_space
|
||||
)
|
||||
self.resize = (
|
||||
flow.nn.image.Resize(target_size=[height, width])
|
||||
if mode == "train"
|
||||
else flow.nn.image.Resize(
|
||||
resize_side="shorter", keep_aspect_ratio=True, target_size=112
|
||||
)
|
||||
)
|
||||
|
||||
self.flip = (
|
||||
flow.nn.CoinFlip(batch_size=batch_size, placement=placement, sbp=sbp)
|
||||
if mode == "train"
|
||||
else None
|
||||
)
|
||||
|
||||
rgb_mean = [127.5, 127.5, 127.5]
|
||||
rgb_std = [127.5, 127.5, 127.5]
|
||||
self.crop_mirror_norm = (
|
||||
flow.nn.CropMirrorNormalize(
|
||||
color_space=color_space,
|
||||
output_layout=output_layout,
|
||||
mean=rgb_mean,
|
||||
std=rgb_std,
|
||||
output_dtype=flow.float,
|
||||
)
|
||||
if mode == "train"
|
||||
else flow.nn.CropMirrorNormalize(
|
||||
color_space=color_space,
|
||||
output_layout=output_layout,
|
||||
crop_h=0,
|
||||
crop_w=0,
|
||||
crop_pos_y=0.5,
|
||||
crop_pos_x=0.5,
|
||||
mean=rgb_mean,
|
||||
std=rgb_std,
|
||||
output_dtype=flow.float,
|
||||
)
|
||||
)
|
||||
|
||||
self.batch_size = batch_size
|
||||
self.total_batch_size = total_batch_size
|
||||
self.dataset_size = dataset_size
|
||||
|
||||
def __len__(self):
|
||||
return self.dataset_size // self.total_batch_size
|
||||
|
||||
def forward(self):
|
||||
train_record = self.train_record_reader()
|
||||
label = self.record_label_decoder(train_record)
|
||||
image_raw_buffer = self.record_image_decoder(train_record)
|
||||
|
||||
image = self.resize(image_raw_buffer)[0]
|
||||
|
||||
rng = self.flip() if self.flip is not None else None
|
||||
image = self.crop_mirror_norm(image, rng)
|
||||
|
||||
return image, label
|
||||
|
||||
|
||||
def load_synthetic(config):
|
||||
batch_size = config.train_batch_size
|
||||
image_size = 112
|
||||
label = flow.data.decode_random(
|
||||
shape=(),
|
||||
dtype=flow.int32,
|
||||
batch_size=batch_size,
|
||||
initializer=flow.zeros_initializer(flow.int32),
|
||||
)
|
||||
class SyntheticDataLoader(flow.nn.Module):
|
||||
def __init__(
|
||||
self, batch_size, image_size=112, num_classes=10000, placement=None, sbp=None,
|
||||
):
|
||||
super().__init__()
|
||||
|
||||
image = flow.data.decode_random(
|
||||
shape=(image_size, image_size, 3), dtype=flow.float, batch_size=batch_size,
|
||||
)
|
||||
return label, image
|
||||
self.image_shape = (batch_size, 3, image_size, image_size)
|
||||
self.label_shape = (batch_size,)
|
||||
self.num_classes = num_classes
|
||||
self.placement = placement
|
||||
self.sbp = sbp
|
||||
|
||||
if self.placement is not None and self.sbp is not None:
|
||||
self.image = flow.nn.Parameter(
|
||||
flow.randint(
|
||||
0,
|
||||
high=255,
|
||||
size=self.image_shape,
|
||||
dtype=flow.float32,
|
||||
placement=self.placement,
|
||||
sbp=self.sbp,
|
||||
),
|
||||
requires_grad=False,
|
||||
)
|
||||
self.label = flow.nn.Parameter(
|
||||
flow.randint(
|
||||
0,
|
||||
high=self.num_classes,
|
||||
size=self.label_shape,
|
||||
placement=self.placement,
|
||||
sbp=self.sbp,
|
||||
).to(dtype=flow.int32),
|
||||
requires_grad=False,
|
||||
)
|
||||
else:
|
||||
self.image = flow.randint(
|
||||
0, high=255, size=self.image_shape, dtype=flow.float32, device="cuda"
|
||||
)
|
||||
self.label = flow.randint(
|
||||
0, high=self.num_classes, size=self.label_shape, device="cuda",
|
||||
).to(dtype=flow.int32)
|
||||
|
||||
def load_train_dataset(args):
|
||||
data_dir = args.ofrecord_path
|
||||
batch_size = args.total_batch_size
|
||||
data_part_num = args.train_data_part_num
|
||||
part_name_suffix_length = args.part_name_suffix_length
|
||||
print("train batch size in load train dataset: ", batch_size)
|
||||
labels, images = train_dataset_reader(
|
||||
args, data_dir, batch_size, data_part_num, part_name_suffix_length
|
||||
)
|
||||
return labels, images
|
||||
def __len__(self):
|
||||
return 10000
|
||||
|
||||
def forward(self):
|
||||
return self.image, self.label
|
||||
|
||||
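Two small pieces of arithmetic in `OFRecordDataLoader` are easy to check by hand: the `__len__` value and the effect of `rgb_mean` and `rgb_std`. The sketch below assumes `CropMirrorNormalize` applies the usual per-channel `(x - mean) / std`; the diff does not show that operator's internals, and both helper names are hypothetical.

```python
def steps_per_epoch(dataset_size, total_batch_size):
    """Same floor division as OFRecordDataLoader.__len__."""
    return dataset_size // total_batch_size


def normalize_pixel(value, mean=127.5, std=127.5):
    """Assumed normalization: maps the 0..255 pixel range onto -1..1."""
    return (value - mean) / std


if __name__ == "__main__":
    print(steps_per_epoch(9469, 64))
    print(normalize_pixel(0), normalize_pixel(127.5), normalize_pixel(255))
```

The floor division means a trailing partial batch is not counted as a step.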
@@ -9,55 +9,85 @@ from eval import verification
|
||||
from utils.utils_logging import AverageMeter
|
||||
|
||||
|
||||
|
||||
class CallBackVerification(object):
|
||||
def __init__(self, frequent, val_targets, rec_prefix, image_size=(112, 112),world_size=1):
|
||||
def __init__(
|
||||
self,
|
||||
frequent,
|
||||
rank,
|
||||
val_targets,
|
||||
rec_prefix,
|
||||
image_size=(112, 112),
|
||||
world_size=1,
|
||||
is_consistent=False,
|
||||
):
|
||||
self.frequent: int = frequent
|
||||
|
||||
self.rank: int = rank
|
||||
self.highest_acc: float = 0.0
|
||||
self.highest_acc_list: List[float] = [0.0] * len(val_targets)
|
||||
self.ver_list: List[object] = []
|
||||
self.ver_name_list: List[str] = []
|
||||
self.world_size=world_size
|
||||
|
||||
self.init_dataset(val_targets=val_targets, data_dir=rec_prefix, image_size=image_size)
|
||||
self.world_size = world_size
|
||||
self.is_consistent = is_consistent
|
||||
|
||||
if self.is_consistent:
|
||||
self.init_dataset(
|
||||
val_targets=val_targets, data_dir=rec_prefix, image_size=image_size
|
||||
)
|
||||
else:
|
||||
if self.rank == 0:
|
||||
self.init_dataset(
|
||||
val_targets=val_targets, data_dir=rec_prefix, image_size=image_size
|
||||
)
|
||||
|
||||
def ver_test(self, backbone: flow.nn.Module, global_step: int):
|
||||
results = []
|
||||
for i in range(len(self.ver_list)):
|
||||
|
||||
|
||||
acc1, std1, acc2, std2, xnorm, embeddings_list = verification.test(
|
||||
self.ver_list[i], backbone, 10, 10)
|
||||
logging.info('[%s][%d]XNorm: %f' % (self.ver_name_list[i], global_step, xnorm))
|
||||
logging.info('[%s][%d]Accuracy-Flip: %1.5f+-%1.5f' % (self.ver_name_list[i], global_step, acc2, std2))
|
||||
self.ver_list[i], backbone, 10, 10, self.is_consistent
|
||||
)
|
||||
logging.info(
|
||||
"[%s][%d]XNorm: %f" % (self.ver_name_list[i], global_step, xnorm)
|
||||
)
|
||||
logging.info(
|
||||
"[%s][%d]Accuracy-Flip: %1.5f+-%1.5f"
|
||||
% (self.ver_name_list[i], global_step, acc2, std2)
|
||||
)
|
||||
if acc2 > self.highest_acc_list[i]:
|
||||
self.highest_acc_list[i] = acc2
|
||||
logging.info(
|
||||
'[%s][%d]Accuracy-Highest: %1.5f' % (self.ver_name_list[i], global_step, self.highest_acc_list[i]))
|
||||
"[%s][%d]Accuracy-Highest: %1.5f"
|
||||
% (self.ver_name_list[i], global_step, self.highest_acc_list[i])
|
||||
)
|
||||
results.append(acc2)
|
||||
|
||||
|
||||
def init_dataset(self, val_targets, data_dir, image_size):
|
||||
|
||||
for name in val_targets:
|
||||
path = os.path.join(data_dir, name + ".bin")
|
||||
path = os.path.join(data_dir, name + ".bin")
|
||||
if os.path.exists(path):
|
||||
data_set = verification.load_bin_cv(path, image_size)
|
||||
self.ver_list.append(data_set)
|
||||
self.ver_name_list.append(name)
|
||||
|
||||
def __call__(self, num_update, backbone: flow.nn.Module, backbone_graph=None):
|
||||
|
||||
def __call__(self, num_update, backbone):
|
||||
if num_update > 0 and num_update % self.frequent == 0:
|
||||
self.ver_test(backbone, num_update)
|
||||
|
||||
if self.is_consistent:
|
||||
if num_update > 0 and num_update % self.frequent == 0:
|
||||
backbone.eval()
|
||||
self.ver_test(backbone_graph, num_update)
|
||||
backbone.train()
|
||||
else:
|
||||
if self.rank == 0 and num_update > 0 and num_update % self.frequent == 0:
|
||||
backbone.eval()
|
||||
self.ver_test(backbone, num_update)
|
||||
backbone.train()
|
||||
|
||||
|
||||
class CallBackLogging(object):
|
||||
def __init__(self, frequent, total_step, batch_size, world_size, writer=None):
|
||||
def __init__(self, frequent, rank, total_step, batch_size, world_size, writer=None):
|
||||
self.frequent: int = frequent
|
||||
|
||||
self.rank: int = rank
|
||||
self.time_start = time.time()
|
||||
self.total_step: int = total_step
|
||||
self.batch_size: int = batch_size
|
||||
@@ -66,42 +96,80 @@ class CallBackLogging(object):
|
||||
|
||||
self.init = False
|
||||
self.tic = 0
|
||||
self.losses=AverageMeter()
|
||||
|
||||
def metric_cb(self,
|
||||
global_step: int,
|
||||
epoch: int,
|
||||
learning_rate: float):
|
||||
def callback(loss):
|
||||
loss=loss.mean()
|
||||
self.losses.update(loss, 1)
|
||||
if global_step % self.frequent == 0:
|
||||
def __call__(
|
||||
self,
|
||||
global_step: int,
|
||||
loss: AverageMeter,
|
||||
epoch: int,
|
||||
fp16: bool,
|
||||
learning_rate: float,
|
||||
grad_scaler=None,
|
||||
):
|
||||
if self.rank == 0 and global_step % self.frequent == 0:
|
||||
if self.init:
|
||||
try:
|
||||
speed: float = self.frequent * self.batch_size / (
|
||||
time.time() - self.tic
|
||||
)
|
||||
speed_total = speed * self.world_size
|
||||
except ZeroDivisionError:
|
||||
speed_total = float("inf")
|
||||
|
||||
if self.init:
|
||||
try:
|
||||
speed: float = self.frequent * self.batch_size / (time.time() - self.tic)
|
||||
speed_total = speed * self.world_size
|
||||
except ZeroDivisionError:
|
||||
speed_total = float('inf')
|
||||
|
||||
time_now = (time.time() - self.time_start) / 3600
|
||||
time_total = time_now / ((global_step + 1) / self.total_step)
|
||||
time_for_end = time_total - time_now
|
||||
if self.writer is not None:
|
||||
self.writer.add_scalar('time_for_end', time_for_end, global_step)
|
||||
self.writer.add_scalar('learning_rate', learning_rate, global_step)
|
||||
self.writer.add_scalar('loss', loss.avg, global_step)
|
||||
else:
|
||||
msg = "Speed %.2f samples/sec Loss %.4f LearningRate %.4f Epoch: %d Global Step: %d " \
|
||||
"Required: %1.f hours" % (
|
||||
speed_total, self.losses.avg, learning_rate, epoch, global_step, time_for_end
|
||||
)
|
||||
logging.info(msg)
|
||||
self.losses.reset()
|
||||
self.tic = time.time()
|
||||
time_now = (time.time() - self.time_start) / 3600
|
||||
time_total = time_now / ((global_step + 1) / self.total_step)
|
||||
time_for_end = time_total - time_now
|
||||
if self.writer is not None:
|
||||
self.writer.add_scalar("time_for_end", time_for_end, global_step)
|
||||
self.writer.add_scalar("learning_rate", learning_rate, global_step)
|
||||
self.writer.add_scalar("loss", loss.avg, global_step)
|
||||
if fp16:
|
||||
msg = (
|
||||
"Speed %.2f samples/sec Loss %.4f LearningRate %.4f Epoch: %d Global Step: %d "
|
||||
"Required: %1.f hours"
|
||||
% (
|
||||
speed_total,
|
||||
loss.avg,
|
||||
learning_rate,
|
||||
epoch,
|
||||
global_step,
|
||||
time_for_end,
|
||||
)
|
||||
)
|
||||
else:
|
||||
self.init = True
|
||||
self.tic = time.time()
|
||||
return callback
|
||||
msg = (
|
||||
"Speed %.2f samples/sec Loss %.4f LearningRate %.4f Epoch: %d Global Step: %d "
|
||||
"Required: %1.f hours"
|
||||
% (
|
||||
speed_total,
|
||||
loss.avg,
|
||||
learning_rate,
|
||||
epoch,
|
||||
global_step,
|
||||
time_for_end,
|
||||
)
|
||||
)
|
||||
logging.info(msg)
|
||||
loss.reset()
|
||||
self.tic = time.time()
|
||||
else:
|
||||
self.init = True
|
||||
self.tic = time.time()
|
||||
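The speed and remaining-time arithmetic in `CallBackLogging.__call__` can be checked in isolation. The following is a minimal sketch, not part of the commit: the helper names `throughput` and `hours_remaining` are hypothetical, and only the formulas are taken from the diff above.

```python
def throughput(frequent: int, batch_size: int, world_size: int, elapsed: float) -> float:
    """Samples per second summed over all ranks for the last `frequent` steps.

    Mirrors `speed = frequent * batch_size / elapsed` followed by
    `speed_total = speed * world_size` in the callback.
    """
    try:
        return frequent * batch_size / elapsed * world_size
    except ZeroDivisionError:
        # Same fallback as the callback when no time has elapsed.
        return float("inf")


def hours_remaining(global_step: int, total_step: int, hours_elapsed: float) -> float:
    """Extrapolate the remaining hours from the fraction of steps completed."""
    hours_total = hours_elapsed / ((global_step + 1) / total_step)
    return hours_total - hours_elapsed
```

For example, 50 steps of batch size 128 on 8 ranks in 10 seconds gives 5120 samples per second, and one hour spent on the first 100 of 1000 steps leaves roughly nine hours.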
|
||||
|
||||
class CallBackModelCheckpoint(object):
|
||||
def __init__(self, rank, output="./"):
|
||||
self.rank: int = rank
|
||||
self.output: str = output
|
||||
|
||||
def __call__(self, global_step, epoch, backbone, is_consistent=False):
|
||||
|
||||
if global_step > 100 and backbone is not None:
|
||||
path_module = os.path.join(self.output, "epoch_%d" % (epoch))
|
||||
|
||||
if is_consistent:
|
||||
flow.save(backbone.state_dict(), path_module, consistent_dst_rank=0)
|
||||
else:
|
||||
if self.rank == 0:
|
||||
flow.save(backbone.state_dict(), path_module)
|
||||
logging.info("oneflow Model Saved in '{}'".format(path_module))
|
||||
|
||||
@@ -4,7 +4,8 @@ import os.path as osp
|
||||
|
||||
def get_config(config_file):
|
||||
assert config_file.startswith(
|
||||
'configs/'), 'config file setting must start with configs/'
|
||||
"configs/"
|
||||
), "config file setting must start with configs/"
|
||||
temp_config_name = osp.basename(config_file)
|
||||
temp_module_name = osp.splitext(temp_config_name)[0]
|
||||
config = importlib.import_module("configs.base")
|
||||
@@ -13,5 +14,5 @@ def get_config(config_file):
|
||||
job_cfg = config.config
|
||||
cfg.update(job_cfg)
|
||||
if cfg.output is None:
|
||||
cfg.output = osp.join('work_dirs', temp_module_name)
|
||||
cfg.output = osp.join("work_dirs", temp_module_name)
|
||||
return cfg
|
||||
|
||||
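As a rough illustration of how `get_config` derives the job name and the default output directory from the config path, here is a small sketch using only the standard library. The helper names `module_name_from_config` and `default_output_dir` are hypothetical; the real function additionally imports `configs.base` and the job module, which is omitted here.

```python
import os.path as osp


def module_name_from_config(config_file: str) -> str:
    """Return the config module name, e.g. 'ms1mv3_r50' for 'configs/ms1mv3_r50'."""
    assert config_file.startswith(
        "configs/"
    ), "config file setting must start with configs/"
    # basename drops the directory, splitext drops an optional '.py' suffix.
    return osp.splitext(osp.basename(config_file))[0]


def default_output_dir(config_file: str) -> str:
    """Directory used when the config leaves `output` unset."""
    return osp.join("work_dirs", module_name_from_config(config_file))
```

Under this sketch, both `configs/ms1mv3_r50` and `configs/ms1mv3_r50.py` resolve to the same module name.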
@@ -27,14 +27,14 @@ class AverageMeter(object):
|
||||
self.avg = self.sum / self.count
|
||||
|
||||
|
||||
def init_logging(log_root, models_root):
|
||||
|
||||
log_root.setLevel(logging.INFO)
|
||||
formatter = logging.Formatter("Training: %(asctime)s-%(message)s")
|
||||
handler_file = logging.FileHandler(
|
||||
os.path.join(models_root, "training.log"))
|
||||
handler_stream = logging.StreamHandler(sys.stdout)
|
||||
handler_file.setFormatter(formatter)
|
||||
handler_stream.setFormatter(formatter)
|
||||
log_root.addHandler(handler_file)
|
||||
log_root.addHandler(handler_stream)
|
||||
def init_logging(log_root, rank, models_root):
|
||||
if rank == 0:
|
||||
log_root.setLevel(logging.INFO)
|
||||
formatter = logging.Formatter("Training: %(asctime)s-%(message)s")
|
||||
handler_file = logging.FileHandler(os.path.join(models_root, "training.log"))
|
||||
handler_stream = logging.StreamHandler(sys.stdout)
|
||||
handler_file.setFormatter(formatter)
|
||||
handler_stream.setFormatter(formatter)
|
||||
log_root.addHandler(handler_file)
|
||||
log_root.addHandler(handler_stream)
|
||||
log_root.info("rank_id: %d" % rank)
|
||||
|
||||
@@ -1,33 +1,44 @@
|
||||
from utils.utils_logging import AverageMeter, init_logging
|
||||
import argparse
|
||||
from function import Validator
|
||||
from utils.utils_config import get_config
|
||||
import logging
|
||||
import os
|
||||
from backbones import get_model
|
||||
from utils.utils_callbacks import CallBackVerification
|
||||
from eval import verification
|
||||
import backbones
|
||||
import oneflow as flow
|
||||
import sys
|
||||
from utils.utils_callbacks import CallBackVerification
|
||||
from backbones import get_model
|
||||
from graph import TrainGraph, EvalGraph
|
||||
import logging
|
||||
import argparse
|
||||
from utils.utils_config import get_config
|
||||
from function import EvalGraph
|
||||
|
||||
|
||||
def main(args):
|
||||
|
||||
cfg = get_config(args.config)
|
||||
|
||||
logging.basicConfig(level=logging.NOTSET)
|
||||
logging.info(args.model_path)
|
||||
val_infer = Validator(cfg)
|
||||
val_callback = CallBackVerification(
|
||||
1, cfg.val_targets, cfg.eval_ofrecord_path, image_nums=cfg.val_image_num)
|
||||
val_infer.load_checkpoint(args.model_path)
|
||||
|
||||
val_callback(1000, val_infer.get_symbol_val_fn)
|
||||
backbone = get_model(cfg.network, dropout=0.0, num_features=cfg.embedding_size).to(
|
||||
"cuda"
|
||||
)
|
||||
val_callback = CallBackVerification(1, 0, cfg.val_targets, cfg.ofrecord_path)
|
||||
|
||||
state_dict = flow.load(args.model_path)
|
||||
|
||||
new_parameters = dict()
|
||||
for key, value in state_dict.items():
|
||||
if "num_batches_tracked" not in key:
|
||||
if key == "fc.weight":
|
||||
continue
|
||||
new_key = key.replace("backbone.", "")
|
||||
new_parameters[new_key] = value
|
||||
|
||||
backbone.load_state_dict(new_parameters)
|
||||
|
||||
infer_graph = EvalGraph(backbone)
|
||||
val_callback(1000, backbone, infer_graph)
|
||||
|
||||
|
||||
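The checkpoint key remapping in `main` can be tried without OneFlow by treating the state dict as a plain dictionary. This is a sketch under that assumption: `strip_backbone_prefix` is a hypothetical helper name, and the values below are placeholder strings rather than real tensors.

```python
def strip_backbone_prefix(state_dict: dict) -> dict:
    """Keep only backbone weights and drop the 'backbone.' key prefix.

    Skips BatchNorm 'num_batches_tracked' counters and the classification
    head 'fc.weight', as the loop in val.py does.
    """
    new_parameters = {}
    for key, value in state_dict.items():
        if "num_batches_tracked" in key or key == "fc.weight":
            continue
        new_parameters[key.replace("backbone.", "")] = value
    return new_parameters


# Placeholder checkpoint: keys follow the pattern handled in val.py.
checkpoint = {
    "backbone.conv1.weight": "w0",
    "backbone.bn1.num_batches_tracked": "n0",
    "fc.weight": "w_fc",
}
remapped = strip_backbone_prefix(checkpoint)
```

The remapped dictionary contains only `conv1.weight`, which is the form `backbone.load_state_dict` expects in the diff above.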
if __name__ == "__main__":
|
||||
|
||||
parser = argparse.ArgumentParser(description='OneFlow ArcFace val')
|
||||
parser.add_argument('config', type=str, help='py config file')
|
||||
parser.add_argument('--model_path', type=str, help='model path')
|
||||
parser = argparse.ArgumentParser(description="OneFlow ArcFace val")
|
||||
parser.add_argument("config", type=str, help="py config file")
|
||||
parser.add_argument("--model_path", type=str, help="model path")
|
||||
main(parser.parse_args())
|
||||
|
||||
3
recognition/arcface_oneflow/val.sh
Normal file
@@ -0,0 +1,3 @@
|
||||
#!/bin/bash
|
||||
|
||||
python val.py configs/ms1mv3_r50 --model_path eager_test/epoch_0
|
||||