# Training performance report on NVIDIA RTX3090

The GeForce RTX™ 3090 is a big ferocious GPU (BFGPU) with TITAN-class performance.

We can also use the GeForce RTX™ 3090 to train deep learning models, thanks to its FP16 and TF32 support.
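
As a rough illustration of what the FP16 and TF32 columns below toggle, here is a minimal PyTorch 1.9 sketch of enabling TF32 matmuls and FP16 automatic mixed precision. The `model`, `optimizer`, and `loss_fn` names are placeholders, not code from arcface_torch.

```python
import torch

# TF32 matmuls are enabled by default on Ampere GPUs in PyTorch 1.9,
# but both matmul and cuDNN TF32 kernels can be toggled explicitly.
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

# FP16 training via automatic mixed precision (AMP).
scaler = torch.cuda.amp.GradScaler()

def train_step(model, optimizer, loss_fn, images, labels):
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():       # forward pass runs in FP16 where safe
        embeddings = model(images)
        loss = loss_fn(embeddings, labels)
    scaler.scale(loss).backward()         # scale the loss to avoid FP16 underflow
    scaler.step(optimizer)
    scaler.update()
    return loss
```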

## Test Server Spec

| Key          | Value                                             |
| :----------- | :------------------------------------------------ |
| CPU          | 2 x Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz |
| Memory       | 384GB                                             |
| GPU          | 8 x GeForce RTX™ 3090                             |
| OS           | Ubuntu 18.04.4 LTS                                |
| Installation | CUDA 11.1                                         |
| Installation | Python 3.7.3                                      |
| Installation | PyTorch 1.9.0 (pip)                               |

## Experiments on arcface_torch

We report training speed in the following tables. Please also note that:

  1. The training dataset is `SyntheticDataset` (see the sketch after this list).

  2. The embedding size is set to 512 in all experiments.
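
A synthetic dataset keeps data loading and JPEG decoding out of the measurement, so the numbers below reflect GPU compute. The following is only a rough stand-in with assumed names and the usual 112×112 insightface crop size, not the actual `SyntheticDataset` implementation in arcface_torch.

```python
import torch
from torch.utils.data import Dataset

class RandomFaceDataset(Dataset):
    """Hypothetical synthetic dataset: yields random 112x112 "face" tensors
    and random labels, so throughput is not limited by disk I/O or decoding."""

    def __init__(self, num_classes=2_000_000, length=1_000_000):
        self.num_classes = num_classes
        self.length = length

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        img = torch.rand(3, 112, 112)                         # fake normalized image
        label = torch.randint(0, self.num_classes, (1,)).item()
        return img, label
```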

### 1. 2 Million Identities

We use a large dataset which contains about 2 million identities to simulate real-world cases.

| Dataset    | Classes   | Backbone   | Batch-size | FP16 | TF32 | Partial FC | Samples/sec |
| :--------- | :-------- | :--------- | :--------- | :--- | :--- | :--------- | :---------- |
| WebFace40M | 2 Million | IResNet-50 | 512        | ×    | ×    | ×          | ~1750       |
| WebFace40M | 2 Million | IResNet-50 | 512        | ×    | √    | ×          | ~1810       |
| WebFace40M | 2 Million | IResNet-50 | 512        | √    | √    | ×          | ~2056       |
| WebFace40M | 2 Million | IResNet-50 | 512        | √    | √    | √          | ~2850       |
| WebFace40M | 2 Million | IResNet-50 | 1024       | √    | √    | ×          | ~2810       |
| WebFace40M | 2 Million | IResNet-50 | 1024       | √    | √    | √          | ~4220       |
| WebFace40M | 2 Million | IResNet-50 | 2048       | √    | √    | √          | ~5330       |
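
The Partial FC rows are faster because the classification layer computes logits against only a sampled subset of the 2 million class centers each iteration, instead of all of them. The sketch below illustrates the sampling idea on a single GPU with assumed names and sample rate; the real implementation additionally shards the class centers across GPUs.

```python
import torch
import torch.nn.functional as F

def partial_fc_logits(embeddings, labels, class_centers, sample_rate=0.1):
    """Compute logits against this batch's positive classes plus a random
    sample of the remaining (negative) classes, instead of all classes."""
    num_classes = class_centers.shape[0]
    num_sample = int(sample_rate * num_classes)

    positive = labels.unique()
    # Sample classes uniformly, then make sure every positive class is included.
    perm = torch.randperm(num_classes, device=class_centers.device)[:num_sample]
    sampled = torch.unique(torch.cat([positive, perm]))

    # Remap the original labels into the sampled index space.
    remap = torch.full((num_classes,), -1, dtype=torch.long, device=labels.device)
    remap[sampled] = torch.arange(sampled.numel(), device=labels.device)
    sampled_labels = remap[labels]

    # Cosine logits against the sampled class centers only.
    logits = F.linear(F.normalize(embeddings), F.normalize(class_centers[sampled]))
    return logits, sampled_labels
```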

### 2. 600K Identities

We use a large dataset which contains about 600K identities to simulate real-world cases.

| Dataset     | Classes | Backbone   | Batch-size | FP16 | Samples/sec |
| :---------- | :------ | :--------- | :--------- | :--- | :---------- |
| WebFace600K | 618K    | IResNet-50 | 512        | ×    | ~2220       |
| WebFace600K | 618K    | IResNet-50 | 512        | √    | ~2610       |
| WebFace600K | 618K    | IResNet-50 | 1024       | ×    | ~2940       |
| WebFace600K | 618K    | IResNet-50 | 1024       | √    | ~3790       |
| WebFace600K | 618K    | IResNet-50 | 2048       | √    | ~4680       |
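
Samples/sec in these tables is the total number of images processed per second across all eight GPUs. A sketch of how such a throughput number can be measured is given below; the warm-up count, iteration count, and function names are assumptions rather than the actual benchmark script.

```python
import time
import torch

def measure_throughput(train_step, data_loader, warmup_iters=50, timed_iters=200):
    """Return images processed per second on one process/GPU; multiply by
    the number of GPUs (or all-reduce) to get the global samples/sec."""
    it = iter(data_loader)
    for _ in range(warmup_iters):                 # let cuDNN autotuning and caches settle
        images, labels = next(it)
        train_step(images.cuda(), labels.cuda())

    torch.cuda.synchronize()
    start = time.time()
    seen = 0
    for _ in range(timed_iters):
        images, labels = next(it)
        train_step(images.cuda(), labels.cuda())
        seen += images.size(0)
    torch.cuda.synchronize()                      # flush queued GPU work before stopping the clock
    return seen / (time.time() - start)
```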