2021-08-07 17:27:09 +08:00
# Training performance report on NVIDIA RTX 3090

[GEFORCE RTX 3090](https://www.nvidia.com/en-us/geforce/graphics-cards/30-series/rtx-3090/)

The GeForce RTX™ 3090 is a big ferocious GPU (BFGPU) with TITAN class performance.

Its FP16 and TF32 support also makes the GeForce RTX™ 3090 suitable for training deep learning models.

## Test Server Spec

| Key          | Value                                             |
|--------------|---------------------------------------------------|
| CPU          | 2 x Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz |
| Memory       | 384GB                                             |
| GPU          | 8 x GeForce RTX™ 3090                             |
| OS           | Ubuntu 18.04.4 LTS                                |
| Installation | CUDA 11.1                                         |
| Installation | Python 3.7.3                                      |
| Installation | PyTorch 1.9.0 (pip)                               |

## Experiments on arcface_torch

We report training speed in the following tables. Please note that:

1. The training dataset is SyntheticDataset.
2. The embedding size is set to 512 in all experiments.

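
As a rough illustration of what a Samples/sec figure means, the sketch below times a number of training steps and divides the samples processed by the elapsed time. It is not the measurement code used for this report: `measure_throughput`, `dummy_step`, and the warm-up handling are hypothetical names and choices introduced only for this example.

```python
import time
from typing import Callable


def measure_throughput(
    step_fn: Callable[[], None],
    global_batch_size: int,
    num_steps: int,
    warmup_steps: int = 0,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return samples processed per second over `num_steps` timed steps."""
    # Untimed warm-up steps, so one-off start-up costs do not skew the result.
    for _ in range(warmup_steps):
        step_fn()

    start = clock()
    for _ in range(num_steps):
        step_fn()
    # Caveat: if step_fn launches asynchronous GPU work, it has to wait for
    # the device to finish, otherwise only the launch time is measured.
    elapsed = clock() - start
    return global_batch_size * num_steps / elapsed


def dummy_step() -> None:
    # Hypothetical stand-in for a real forward/backward/optimizer step.
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    rate = measure_throughput(
        dummy_step, global_batch_size=512, num_steps=20, warmup_steps=5
    )
    print(f"{rate:.0f} samples/sec")
```

The `clock` parameter exists only so the function can be checked with a fake timer; in normal use the default `time.perf_counter` is enough.
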
### 1. 2 Million Identities

We use a large dataset that contains about 2 million identities to simulate real cases.

| Dataset    | Classes   | Backbone   | Batch-size | FP16 | TF32 | Partial FC | Samples/sec |
|------------|-----------|------------|------------|------|------|------------|-------------|
| WebFace40M | 2 Million | IResNet-50 | 512        | ×    | ×    | ×          | ~1750       |
| WebFace40M | 2 Million | IResNet-50 | 512        | ×    | √    | ×          | ~1810       |
| WebFace40M | 2 Million | IResNet-50 | 512        | √    | √    | ×          | ~2056       |
| WebFace40M | 2 Million | IResNet-50 | 512        | √    | √    | √          | ~2850       |
| WebFace40M | 2 Million | IResNet-50 | 1024       | √    | √    | ×          | ~2810       |
| WebFace40M | 2 Million | IResNet-50 | 1024       | √    | √    | √          | ~4220       |
| WebFace40M | 2 Million | IResNet-50 | 2048       | √    | √    | √          | ~5330       |

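
To make the 2 million identity results easier to compare, the following sketch computes relative speedups between configurations. The throughput values are copied from the table and are approximate; the dictionary layout and the `speedup` helper are illustrative names, not part of arcface_torch.

```python
# Samples/sec copied from the 2 Million Identities table (approximate values).
# Key: (batch_size, fp16, tf32, partial_fc)
THROUGHPUT_2M = {
    (512, False, False, False): 1750,
    (512, False, True, False): 1810,
    (512, True, True, False): 2056,
    (512, True, True, True): 2850,
    (1024, True, True, False): 2810,
    (1024, True, True, True): 4220,
    (2048, True, True, True): 5330,
}


def speedup(new, base, table=THROUGHPUT_2M) -> float:
    """Return throughput of configuration `new` relative to `base`."""
    return table[new] / table[base]


if __name__ == "__main__":
    baseline = (512, False, False, False)
    comparisons = {
        "TF32 vs. baseline (batch 512)": ((512, False, True, False), baseline),
        "FP16 + TF32 vs. baseline (batch 512)": ((512, True, True, False), baseline),
        "Partial FC on vs. off (batch 512)": (
            (512, True, True, True),
            (512, True, True, False),
        ),
        "Partial FC on vs. off (batch 1024)": (
            (1024, True, True, True),
            (1024, True, True, False),
        ),
    }
    for label, (new, base) in comparisons.items():
        print(f"{label}: {speedup(new, base):.2f}x")
```

Because the table entries are rounded, the resulting ratios should be read as estimates rather than exact measurements.
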
### 2. 600K Identities

We use a large dataset that contains about 600K identities to simulate real cases.

| Dataset     | Classes | Backbone   | Batch-size | FP16 | Samples/sec |
|-------------|---------|------------|------------|------|-------------|
| WebFace600K | 618K    | IResNet-50 | 512        | ×    | ~2220       |
| WebFace600K | 618K    | IResNet-50 | 512        | √    | ~2610       |
| WebFace600K | 618K    | IResNet-50 | 1024       | ×    | ~2940       |
| WebFace600K | 618K    | IResNet-50 | 1024       | √    | ~3790       |
| WebFace600K | 618K    | IResNet-50 | 2048       | √    | ~4680       |

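
The 600K results can also be read as a batch-size scaling study. The sketch below, using the approximate values from the table, computes how much throughput grows each time the batch size doubles; `batch_scaling` is a hypothetical helper written for this example.

```python
# Samples/sec copied from the 600K Identities table (approximate values).
# Key: (batch_size, fp16)
THROUGHPUT_600K = {
    (512, False): 2220,
    (512, True): 2610,
    (1024, False): 2940,
    (1024, True): 3790,
    (2048, True): 4680,
}


def batch_scaling(table, fp16):
    """Return (smaller_batch, larger_batch, throughput_ratio) for each pair
    of consecutive batch sizes measured with the given FP16 setting."""
    sizes = sorted(batch for (batch, used_fp16) in table if used_fp16 == fp16)
    return [
        (small, large, table[(large, fp16)] / table[(small, fp16)])
        for small, large in zip(sizes, sizes[1:])
    ]


if __name__ == "__main__":
    for fp16 in (False, True):
        for small, large, ratio in batch_scaling(THROUGHPUT_600K, fp16):
            print(f"FP16={fp16}: batch {small} -> {large}: {ratio:.2f}x")
```

A ratio below 2 means that throughput grows more slowly than the batch size itself.
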