Files
uniface/docs/models.md
Yakhyokhuja Valikhujaev efb40f2e91 feat: Upgrade docs and Add google colab support (#52)
* docs: Add announcement section

* docs: Add landing page and improve the docs

* docs: Update docs

* docs: Update documentation

* chore: Update all examples and add google colab support

* docs: Update README.md
2025-12-31 18:07:04 +09:00

13 KiB
Raw Blame History

Model Zoo

Complete guide to all available models, their performance characteristics, and selection criteria.


Face Detection Models

RetinaFace Family

RetinaFace models are trained on the WIDER FACE dataset and provide excellent accuracy-speed tradeoffs.

Model Name Params Size Easy Medium Hard Use Case
MNET_025 0.4M 1.7MB 88.48% 87.02% 80.61% Mobile/Edge devices
MNET_050 1.0M 2.6MB 89.42% 87.97% 82.40% Mobile/Edge devices
MNET_V1 3.5M 3.8MB 90.59% 89.14% 84.13% Balanced mobile
MNET_V2 :material-check-circle: 3.2M 3.5MB 91.70% 91.03% 86.60% Default
RESNET18 11.7M 27MB 92.50% 91.02% 86.63% Server/High accuracy
RESNET34 24.8M 56MB 94.16% 93.12% 88.90% Maximum accuracy

!!! info "Accuracy & Benchmarks" Accuracy: WIDER FACE validation set (Easy/Medium/Hard subsets) - from RetinaFace paper

**Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --iterations 100`

SCRFD Family

SCRFD (Sample and Computation Redistribution for Efficient Face Detection) models offer state-of-the-art speed-accuracy tradeoffs.

Model Name Params Size Easy Medium Hard Use Case
SCRFD_500M 0.6M 2.5MB 90.57% 88.12% 68.51% Real-time applications
SCRFD_10G :material-check-circle: 4.2M 17MB 95.16% 93.87% 83.05% High accuracy + speed

!!! info "Accuracy & Benchmarks" Accuracy: WIDER FACE validation set - from SCRFD paper

**Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --iterations 100`

YOLOv5-Face Family

YOLOv5-Face models provide excellent detection accuracy with 5-point facial landmarks, optimized for real-time applications.

Model Name Size Easy Medium Hard Use Case
YOLOV5N 11MB 93.61% 91.52% 80.53% Lightweight/Mobile
YOLOV5S :material-check-circle: 28MB 94.33% 92.61% 83.15% Real-time + accuracy
YOLOV5M 82MB 95.30% 93.76% 85.28% High accuracy

!!! info "Accuracy & Benchmarks" Accuracy: WIDER FACE validation set - from YOLOv5-Face paper

**Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --iterations 100`

!!! note "Fixed Input Size" All YOLOv5-Face models use a fixed input size of 640×640. Models exported to ONNX from deepcam-cn/yolov5-face.


Face Recognition Models

ArcFace

State-of-the-art face recognition using additive angular margin loss.

Model Name Backbone Params Size Use Case
MNET :material-check-circle: MobileNet 2.0M 8MB Balanced (recommended)
RESNET ResNet50 43.6M 166MB Maximum accuracy

!!! info "Training Data" Dataset: Trained on MS1M-V2 (5.8M images, 85K identities)

**Accuracy**: Benchmark on your own dataset or use standard face verification benchmarks

MobileFace

Lightweight face recognition optimized for mobile devices.

Model Name Backbone Params Size LFW CALFW CPLFW AgeDB-30 Use Case
MNET_025 MobileNetV1 0.25 0.36M 1MB 98.76% 92.02% 82.37% 90.02% Ultra-lightweight
MNET_V2 :material-check-circle: MobileNetV2 2.29M 4MB 99.55% 94.87% 86.89% 95.16% Mobile/Edge
MNET_V3_SMALL MobileNetV3-S 1.25M 3MB 99.30% 93.77% 85.29% 92.79% Mobile optimized
MNET_V3_LARGE MobileNetV3-L 3.52M 10MB 99.53% 94.56% 86.79% 95.13% Balanced mobile

!!! info "Training Data" Dataset: Trained on MS1M-V2 (5.8M images, 85K identities)

**Accuracy**: Evaluated on LFW, CALFW, CPLFW, and AgeDB-30 benchmarks

!!! tip "Use Case" These models are lightweight alternatives to ArcFace for resource-constrained environments.


SphereFace

Face recognition using angular softmax loss.

Model Name Backbone Params Size LFW CALFW CPLFW AgeDB-30 Use Case
SPHERE20 Sphere20 24.5M 50MB 99.67% 95.61% 88.75% 96.58% Research/Comparison
SPHERE36 Sphere36 34.6M 92MB 99.72% 95.64% 89.92% 96.83% Research/Comparison

!!! info "Training Data" Dataset: Trained on MS1M-V2 (5.8M images, 85K identities)

**Accuracy**: Evaluated on LFW, CALFW, CPLFW, and AgeDB-30 benchmarks

!!! note "Architecture" SphereFace uses angular softmax loss, an earlier approach before ArcFace. These models provide good accuracy with moderate resource requirements.


Facial Landmark Models

106-Point Landmark Detection

High-precision facial landmark localization.

Model Name Points Params Size Use Case
2D106 106 3.7M 14MB Face alignment, analysis

Landmark Groups:

Group Points Count
Face contour 0-32 33 points
Eyebrows 33-50 18 points
Nose 51-62 12 points
Eyes 63-86 24 points
Mouth 87-105 19 points

Attribute Analysis Models

Age & Gender Detection

Model Name Attributes Params Size Use Case
AgeGender Age, Gender 2.1M 8MB General purpose

!!! info "Training Data" Dataset: Trained on CelebA

!!! warning "Accuracy Note" Accuracy varies by demographic and image quality. Test on your specific use case.


FairFace Attributes

Model Name Attributes Params Size Use Case
FairFace Race, Gender, Age Group - 44MB Balanced demographic prediction

!!! info "Training Data" Dataset: Trained on FairFace dataset with balanced demographics

!!! tip "Equitable Predictions" FairFace provides more equitable predictions across different racial and gender groups.

Race Categories (7): White, Black, Latino Hispanic, East Asian, Southeast Asian, Indian, Middle Eastern

Age Groups (9): 0-2, 3-9, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70+


Emotion Detection

Model Name Classes Params Size Use Case
AFFECNET7 7 0.5M 2MB 7-class emotion
AFFECNET8 8 0.5M 2MB 8-class emotion

Classes (7): Neutral, Happy, Sad, Surprise, Fear, Disgust, Anger

Classes (8): Above + Contempt

!!! info "Training Data" Dataset: Trained on AffectNet

!!! note "Accuracy Note" Emotion detection accuracy depends heavily on facial expression clarity and cultural context.


Gaze Estimation Models

MobileGaze Family

Real-time gaze direction prediction models trained on Gaze360 dataset. Returns pitch (vertical) and yaw (horizontal) angles in radians.

Model Name Params Size MAE* Use Case
RESNET18 11.7M 43 MB 12.84 Balanced accuracy/speed
RESNET34 :material-check-circle: 24.8M 81.6 MB 11.33 Default
RESNET50 25.6M 91.3 MB 11.34 High accuracy
MOBILENET_V2 3.5M 9.59 MB 13.07 Mobile/Edge devices
MOBILEONE_S0 2.1M 4.8 MB 12.58 Lightweight/Real-time

*MAE (Mean Absolute Error) in degrees on Gaze360 test set - lower is better

!!! info "Training Data" Dataset: Trained on Gaze360 (indoor/outdoor scenes with diverse head poses)

**Training**: 200 epochs with classification-based approach (binned angles)

!!! note "Input Requirements" Requires face crop as input. Use face detection first to obtain bounding boxes.


Face Parsing Models

BiSeNet Family

BiSeNet (Bilateral Segmentation Network) models for semantic face parsing. Segments face images into 19 facial component classes.

Model Name Params Size Classes Use Case
RESNET18 :material-check-circle: 13.3M 50.7 MB 19 Default
RESNET34 24.1M 89.2 MB 19 Higher accuracy

!!! info "Training Data" Dataset: Trained on CelebAMask-HQ

**Architecture**: BiSeNet with ResNet backbone

**Input Size**: 512×512 (automatically resized)

19 Facial Component Classes:

# Class # Class # Class
1 Background 8 Left Ear 15 Neck
2 Skin 9 Right Ear 16 Neck Lace
3 Left Eyebrow 10 Ear Ring 17 Cloth
4 Right Eyebrow 11 Nose 18 Hair
5 Left Eye 12 Mouth 19 Hat
6 Right Eye 13 Upper Lip
7 Eye Glasses 14 Lower Lip

Applications:

  • Face makeup and beauty applications
  • Virtual try-on systems
  • Face editing and manipulation
  • Facial feature extraction
  • Portrait segmentation

!!! note "Input Requirements" Input should be a cropped face image. For full pipeline, use face detection first to obtain face crops.


Anti-Spoofing Models

MiniFASNet Family

Lightweight face anti-spoofing models for liveness detection. Detect if a face is real (live) or fake (photo, video replay, mask).

Model Name Size Scale Use Case
V1SE 1.2 MB 4.0 Squeeze-and-excitation variant
V2 :material-check-circle: 1.2 MB 2.7 Default

!!! info "Output Format" Output: Returns SpoofingResult(is_real, confidence) where is_real: True=Real, False=Fake

!!! note "Input Requirements" Requires face bounding box from a detector. Use with RetinaFace, SCRFD, or YOLOv5Face.


Model Management

Models are automatically downloaded and cached on first use.

  • Cache location: ~/.uniface/models/
  • Verification: Models are verified with SHA-256 checksums
  • Manual download: Use python tools/download_model.py to pre-download models

References

Model Training & Architectures

Papers