mirror of
https://github.com/deepinsight/insightface.git
synced 2026-05-22 01:27:48 +00:00
## 1. Download Datasets and Unzip

Download WebFace42M from [https://www.face-benchmark.org/download.html](https://www.face-benchmark.org/download.html).

After being unarchived, the raw data of `WebFace42M` contains 10 directories. Each dataset variant uses the following directories:

- `WebFace4M` contains 1 directory: `0`.
- `WebFace12M` contains 3 directories: `0,1,2`.
- `WebFace42M` contains 10 directories: `0,1,2,3,4,5,6,7,8,9`.

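
As a quick sanity check after unarchiving, the directory layout listed above can be verified with a few lines of Python. This is a minimal sketch, not part of the original tooling: the names `SUBSET_DIRS` and `missing_dirs` are hypothetical, and it assumes the numbered directories sit directly under the folder you pass in.

```python
from pathlib import Path

# Top-level directory names per subset, as listed in this section.
SUBSET_DIRS = {
    "WebFace4M": ["0"],
    "WebFace12M": ["0", "1", "2"],
    "WebFace42M": [str(i) for i in range(10)],
}


def missing_dirs(root, subset):
    """Return the expected top-level directories that are absent under root."""
    root = Path(root)
    return [name for name in SUBSET_DIRS[subset] if not (root / name).is_dir()]
```

An empty list from `missing_dirs(your_unzipped_folder, "WebFace12M")` suggests the subset is complete at the directory level. It does not check the images inside each directory.
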
## 2. Create Shuffled Rec File for DALI

Note: A shuffled rec file is very important for DALI, and an unshuffled rec file can cause performance degradation. The original InsightFace-style rec file does not support NVIDIA DALI, so you must use [mxnet.tools.im2rec](https://github.com/apache/incubator-mxnet/blob/master/tools/im2rec.py) with the commands below to generate a shuffled rec file.

```shell
# directories and files for your datasets
/WebFace42M_Root
├── 0_0_0000000
│   ├── 0_0.jpg
│   ├── 0_1.jpg
│   ├── 0_2.jpg
│   ├── 0_3.jpg
│   └── 0_4.jpg
├── 0_0_0000001
│   ├── 0_5.jpg
│   ├── 0_6.jpg
│   ├── 0_7.jpg
│   ├── 0_8.jpg
│   └── 0_9.jpg
├── 0_0_0000002
│   ├── 0_10.jpg
│   ├── 0_11.jpg
│   ├── 0_12.jpg
│   ├── 0_13.jpg
│   ├── 0_14.jpg
│   ├── 0_15.jpg
│   ├── 0_16.jpg
│   └── 0_17.jpg
├── 0_0_0000003
│   ├── 0_18.jpg
│   ├── 0_19.jpg
│   └── 0_20.jpg
├── 0_0_0000004

# 1) create train.lst using the following command
python -m mxnet.tools.im2rec --list --recursive train WebFace42M_Root

# 2) create train.rec and train.idx from train.lst using the following command
python -m mxnet.tools.im2rec --num-thread 16 --quality 100 train WebFace42M_Root
```
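
If you prefer to drive the two steps from Python, for example inside a larger preparation script, the commands above can be wrapped as follows. This is a sketch under assumptions: the helper names `im2rec_commands` and `run_im2rec` are hypothetical, and the only flags used are the ones shown in the shell block above.

```python
import subprocess
import sys


def im2rec_commands(prefix, root, num_thread=16, quality=100):
    """Return the two im2rec invocations from this section as argument lists."""
    base = [sys.executable, "-m", "mxnet.tools.im2rec"]
    # Step 1: write <prefix>.lst by scanning root recursively.
    make_list = base + ["--list", "--recursive", prefix, root]
    # Step 2: pack the listed images into <prefix>.rec and <prefix>.idx.
    make_rec = base + [
        "--num-thread", str(num_thread),
        "--quality", str(quality),
        prefix, root,
    ]
    return make_list, make_rec


def run_im2rec(prefix, root, runner=subprocess.run):
    """Run both steps in order; check=True stops at the first failing step."""
    for cmd in im2rec_commands(prefix, root):
        runner(cmd, check=True)
```

Calling `run_im2rec("train", "WebFace42M_Root")` is intended to be equivalent to running the two shell commands in sequence. The `runner` parameter exists only so the calls can be inspected without executing anything.
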

Finally, you will get three files: `train.lst`, `train.rec`, and `train.idx`. Of these, `train.rec` and `train.idx` are used for training.
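
Because shuffling matters for DALI, it can be worth checking `train.lst` before packing or training. The sketch below is a heuristic, not a guarantee: it assumes each line of the `.lst` file is tab-separated as index, label, relative path, and it only reports whether the labels appear in sorted order, which would suggest an unshuffled list. The function names are hypothetical.

```python
def read_labels(lst_path):
    """Read the label column of a .lst file (assumed: index<TAB>label<TAB>path)."""
    labels = []
    with open(lst_path, encoding="utf-8") as f:
        for line in f:
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 3:  # skip blank or malformed lines
                labels.append(float(fields[1]))
    return labels


def looks_shuffled(labels):
    """Return True if some label is followed by a smaller one (not sorted by class)."""
    return any(a > b for a, b in zip(labels, labels[1:]))
```

For example, `looks_shuffled(read_labels("train.lst"))` returning `False` on a large list is a hint that the list was written in class order and should be regenerated.
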