mirror of
https://github.com/yakhyo/uniface.git
synced 2026-05-17 14:23:46 +00:00
# Model Zoo

Complete guide to all available models and their performance characteristics.

---

## Face Detection Models

### RetinaFace Family

RetinaFace models are trained on the [WIDER FACE](datasets.md#wider-face) dataset.

| Model Name     | Params | Size  | Easy   | Medium | Hard   |
| -------------- | ------ | ----- | ------ | ------ | ------ |
| `MNET_025`     | 0.4M   | 1.7MB | 88.48% | 87.02% | 80.61% |
| `MNET_050`     | 1.0M   | 2.6MB | 89.42% | 87.97% | 82.40% |
| `MNET_V1`      | 3.5M   | 3.8MB | 90.59% | 89.14% | 84.13% |
| `MNET_V2` :material-check-circle: | 3.2M   | 3.5MB | 91.70% | 91.03% | 86.60% |
| `RESNET18`     | 11.7M  | 27MB  | 92.50% | 91.02% | 86.63% |
| `RESNET34`     | 24.8M  | 56MB  | 94.16% | 93.12% | 88.90% |

!!! info "Accuracy & Benchmarks"
    **Accuracy**: WIDER FACE validation set (Easy/Medium/Hard subsets) - from [RetinaFace paper](https://arxiv.org/abs/1905.00641)

    **Speed**: Benchmark on your own hardware using `python tools/detect.py --source <image>`

---

### SCRFD Family

SCRFD (Sample and Computation Redistribution for Efficient Face Detection) models are trained on the [WIDER FACE](datasets.md#wider-face) dataset.

| Model Name       | Params | Size  | Easy   | Medium | Hard   |
| ---------------- | ------ | ----- | ------ | ------ | ------ |
| `SCRFD_500M_KPS` | 0.6M   | 2.5MB | 90.57% | 88.12% | 68.51% |
| `SCRFD_10G_KPS` :material-check-circle: | 4.2M   | 17MB  | 95.16% | 93.87% | 83.05% |

!!! info "Accuracy & Benchmarks"
    **Accuracy**: WIDER FACE validation set - from [SCRFD paper](https://arxiv.org/abs/2105.04714)

    **Speed**: Benchmark on your own hardware using `python tools/detect.py --source <image>`

---

### YOLOv5-Face Family

YOLOv5-Face models provide detection with 5-point facial landmarks and are trained on the [WIDER FACE](datasets.md#wider-face) dataset.

| Model Name     | Size | Easy   | Medium | Hard   |
| -------------- | ---- | ------ | ------ | ------ |
| `YOLOV5N`      | 11MB | 93.61% | 91.52% | 80.53% |
| `YOLOV5S` :material-check-circle: | 28MB | 94.33% | 92.61% | 83.15% |
| `YOLOV5M`      | 82MB | 95.30% | 93.76% | 85.28% |

!!! info "Accuracy & Benchmarks"
    **Accuracy**: WIDER FACE validation set - from [YOLOv5-Face paper](https://arxiv.org/abs/2105.12931)

    **Speed**: Benchmark on your own hardware using `python tools/detect.py --source <image>`

!!! note "Fixed Input Size"
    All YOLOv5-Face models use a fixed input size of 640×640.

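A fixed 640×640 input means images of other sizes have to be resized before inference. The sketch below shows one common way to do that, an aspect-preserving "letterbox" resize with padding, and how to map a detected coordinate back to the original image. Whether uniface preprocesses images exactly this way is an assumption; `letterbox_params` and `unletterbox_point` are hypothetical helper names, not library functions.

```python
def letterbox_params(width, height, target=640):
    """Return (scale, new_w, new_h, pad_x, pad_y) for an aspect-preserving resize.

    The image is scaled so its longer side fits `target`, then centered
    on a target x target canvas with equal padding on both sides.
    """
    scale = min(target / width, target / height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    pad_x = (target - new_w) // 2
    pad_y = (target - new_h) // 2
    return scale, new_w, new_h, pad_x, pad_y


def unletterbox_point(x, y, scale, pad_x, pad_y):
    """Map a point from the padded 640x640 canvas back to original pixels."""
    return (x - pad_x) / scale, (y - pad_y) / scale


# A 1280x720 frame is halved to 640x360 and padded by 140 px top and bottom.
scale, new_w, new_h, pad_x, pad_y = letterbox_params(1280, 720)
center = unletterbox_point(320, 320, scale, pad_x, pad_y)
```

The same arithmetic applies to any detector with a square fixed input; only `target` changes.
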
---

### YOLOv8-Face Family

YOLOv8-Face models use an anchor-free design with DFL (Distribution Focal Loss) for bounding-box regression and provide detection with 5-point facial landmarks.

| Model Name       | Size   | Easy   | Medium | Hard   |
| ---------------- | ------ | ------ | ------ | ------ |
| `YOLOV8_LITE_S`  | 7.4MB  | 93.4%  | 91.2%  | 78.6%  |
| `YOLOV8N` :material-check-circle: | 12MB   | 94.6%  | 92.3%  | 79.6%  |

!!! info "Accuracy & Benchmarks"
    **Accuracy**: WIDER FACE validation set (Easy/Medium/Hard subsets)

    **Speed**: Benchmark on your own hardware using `python tools/detect.py --source <image> --method yolov8face`

!!! note "Fixed Input Size"
    All YOLOv8-Face models use a fixed input size of 640×640.

---

## Face Recognition Models

### AdaFace

Face recognition using an adaptive margin based on image quality.

| Model Name  | Backbone | Dataset     | Size   | IJB-B TAR | IJB-C TAR |
| ----------- | -------- | ----------- | ------ | --------- | --------- |
| `IR_18` :material-check-circle: | IR-18    | WebFace4M   | 92 MB  | 93.03%    | 94.99%    |
| `IR_101`    | IR-101   | WebFace12M  | 249 MB | -         | 97.66%    |

!!! info "Training Data & Accuracy"
    **Dataset**: [WebFace4M / WebFace12M](datasets.md#webface4m--webface12m) (4M / 12M images)

    **Accuracy**: IJB-B and IJB-C benchmarks, TAR@FAR=0.01%

!!! tip "Key Innovation"
    AdaFace introduces an adaptive margin that adjusts to image quality, giving better performance on low-quality images than fixed-margin approaches.

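TAR@FAR is the fraction of genuine (same-identity) pairs accepted at the score threshold where only the stated fraction of impostor pairs is falsely accepted; FAR=0.01% is the same operating point as FAR=1e-4. The sketch below is a minimal pure-Python illustration of that definition, not the evaluation code behind the numbers in the tables. `tar_at_far` is a hypothetical helper, and it assumes higher scores mean "more similar".

```python
def tar_at_far(genuine_scores, impostor_scores, far=1e-4):
    """Return (TAR, threshold) at the given False Accept Rate.

    The threshold is chosen so that at most `far` of the impostor
    scores lie strictly above it.
    """
    impostors = sorted(impostor_scores, reverse=True)
    # Number of impostor pairs we are allowed to accept by mistake.
    allowed = int(far * len(impostors))
    # Scores strictly above this value are accepted.
    threshold = impostors[allowed] if allowed < len(impostors) else float("-inf")
    accepted = sum(1 for score in genuine_scores if score > threshold)
    return accepted / len(genuine_scores), threshold


# Toy example: with FAR=25% one of the four impostor pairs may pass.
tar, threshold = tar_at_far(
    genuine_scores=[0.95, 0.8, 0.25, 0.5],
    impostor_scores=[0.1, 0.2, 0.3, 0.9],
    far=0.25,
)
```

At FAR=1e-4 a meaningful estimate needs well over ten thousand impostor pairs, which is why this metric is reported on large benchmarks such as IJB-B and IJB-C.
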
---

### ArcFace

Face recognition using additive angular margin loss.

| Model Name  | Backbone  | Params | Size  | LFW    | CFP-FP | AgeDB-30 | IJB-C  |
| ----------- | --------- | ------ | ----- | ------ | ------ | -------- | ------ |
| `MNET` :material-check-circle: | MobileNet | 2.0M   | 8MB   | 99.70% | 98.00% | 96.58%   | 95.02% |
| `RESNET`    | ResNet50  | 43.6M  | 166MB | 99.83% | 99.33% | 98.23%   | 97.25% |

!!! info "Training Data"
    **Dataset**: Trained on [WebFace600K](datasets.md#webface600k) (600K images)

    **Accuracy**: IJB-C accuracy reported as TAR@FAR=1e-4

---

### MobileFace

Lightweight face recognition models with MobileNet backbones.

| Model Name        | Backbone         | Params | Size | LFW    | CALFW  | CPLFW  | AgeDB-30 |
| ----------------- | ---------------- | ------ | ---- | ------ | ------ | ------ | -------- |
| `MNET_025`        | MobileNetV1 0.25 | 0.36M  | 1MB  | 98.76% | 92.02% | 82.37% | 90.02%   |
| `MNET_V2` :material-check-circle: | MobileNetV2      | 2.29M  | 4MB  | 99.55% | 94.87% | 86.89% | 95.16%   |
| `MNET_V3_SMALL`   | MobileNetV3-S    | 1.25M  | 3MB  | 99.30% | 93.77% | 85.29% | 92.79%   |
| `MNET_V3_LARGE`   | MobileNetV3-L    | 3.52M  | 10MB | 99.53% | 94.56% | 86.79% | 95.13%   |

!!! info "Training Data"
    **Dataset**: Trained on [MS1MV2](datasets.md#ms1mv2) (5.8M images, 85K identities)

    **Accuracy**: Evaluated on LFW, CALFW, CPLFW, and AgeDB-30 benchmarks

---

### SphereFace

Face recognition using angular softmax loss.

| Model Name   | Backbone | Params | Size | LFW    | CALFW  | CPLFW  | AgeDB-30 |
| ------------ | -------- | ------ | ---- | ------ | ------ | ------ | -------- |
| `SPHERE20`   | Sphere20 | 24.5M  | 50MB | 99.67% | 95.61% | 88.75% | 96.58%   |
| `SPHERE36`   | Sphere36 | 34.6M  | 92MB | 99.72% | 95.64% | 89.92% | 96.83%   |

!!! info "Training Data"
    **Dataset**: Trained on [MS1MV2](datasets.md#ms1mv2) (5.8M images, 85K identities)

    **Accuracy**: Evaluated on LFW, CALFW, CPLFW, and AgeDB-30 benchmarks

!!! note "Architecture"
    SphereFace uses angular softmax loss, an approach that predates ArcFace. These models provide good accuracy with moderate resource requirements.

---

### EdgeFace

Efficient face recognition designed for edge devices, using an EdgeNeXt backbone with optional LoRA compression.

| Model Name      | Backbone | Params | MFLOPs | Size   | LFW    | CALFW  | CPLFW  | CFP-FP | AgeDB-30 |
| --------------- | -------- | ------ | ------ | ------ | ------ | ------ | ------ | ------ | -------- |
| `XXS` :material-check-circle: | EdgeNeXt | 1.24M  | 94     | ~5 MB  | 99.57% | 94.83% | 90.27% | 93.63% | 94.92%   |
| `XS_GAMMA_06`   | EdgeNeXt | 1.77M  | 154    | ~7 MB  | 99.73% | 95.28% | 91.58% | 94.71% | 96.08%   |
| `S_GAMMA_05`    | EdgeNeXt | 3.65M  | 306    | ~14 MB | 99.78% | 95.55% | 92.48% | 95.74% | 97.03%   |
| `BASE`          | EdgeNeXt | 18.2M  | 1399   | ~70 MB | 99.83% | 96.07% | 93.75% | 97.01% | 97.60%   |

!!! info "Training Data & Reference"
    **Paper**: [EdgeFace: Efficient Face Recognition Model for Edge Devices](https://arxiv.org/abs/2307.01838v2) (IEEE T-BIOM 2024)

    **Source**: [github.com/otroshi/edgeface](https://github.com/otroshi/edgeface) | [github.com/yakhyo/edgeface-onnx](https://github.com/yakhyo/edgeface-onnx)

---

## Facial Landmark Models

### 106-Point Landmark Detection

Facial landmark localization model.

| Model Name | Points | Params | Size |
| ---------- | ------ | ------ | ---- |
| `2D106`    | 106    | 3.7M   | 14MB |

**Landmark Groups:**

| Group        | Points | Count     |
|--------------|--------|-----------|
| Face contour | 0-32   | 33 points |
| Eyebrows     | 33-50  | 18 points |
| Nose         | 51-62  | 12 points |
| Eyes         | 63-86  | 24 points |
| Mouth        | 87-105 | 19 points |

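The index ranges in the landmark groups table can be written down directly as Python ranges. The sketch below splits a list of 106 points into those groups; the group names and the `split_landmarks` helper are illustrative choices, not identifiers from the library, and the points are assumed to be ordered by landmark index.

```python
# Index ranges copied from the landmark groups table (end index is exclusive).
LANDMARK_GROUPS = {
    "face_contour": range(0, 33),
    "eyebrows": range(33, 51),
    "nose": range(51, 63),
    "eyes": range(63, 87),
    "mouth": range(87, 106),
}


def split_landmarks(points):
    """Split a sequence of 106 (x, y) points into named facial groups."""
    if len(points) != 106:
        raise ValueError(f"expected 106 landmarks, got {len(points)}")
    return {
        name: [points[i] for i in indices]
        for name, indices in LANDMARK_GROUPS.items()
    }


# Dummy landmarks where point i sits at (i, i), just to show the grouping.
groups = split_landmarks([(i, i) for i in range(106)])
```
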
---

### PIPNet (98 / 68 points)

PIPNet (Pixel-in-Pixel Net) facial landmark detector. ResNet-18 backbone, 256×256 input.

| Model Name | Points | Backbone | Dataset | Size |
| ---------- | ------ | -------- | ------- | ---- |
| `WFLW_98` :material-check-circle: | 98 | ResNet-18 | WFLW (supervised) | 47 MB |
| `DW300_CELEBA_68` | 68 | ResNet-18 | 300W+CelebA (GSSL) | 46 MB |

!!! info "Reference"
    **Paper**: [PIPNet: Towards Efficient Facial Landmark Detection in the Wild](https://arxiv.org/abs/2003.03771) (IJCV 2021)

    **Source**: [yakhyo/pipnet-onnx](https://github.com/yakhyo/pipnet-onnx) - ONNX export from [jhb86253817/PIPNet](https://github.com/jhb86253817/PIPNet)

!!! note "Auto-selected meanface"
    Both variants share the same architecture; the number of landmarks (and the matching
    meanface table) is inferred from the ONNX output channel count.

---

## Attribute Analysis Models

### Age & Gender Detection

| Model Name  | Attributes  | Params | Size |
| ----------- | ----------- | ------ | ---- |
| `AgeGender` | Age, Gender | 2.1M   | 8MB  |

!!! info "Training Data"
    **Dataset**: Trained on [CelebA](datasets.md#celeba)

!!! warning "Accuracy Note"
    Accuracy varies by demographic and image quality. Test on your specific use case.

---

### FairFace Attributes

| Model Name  | Attributes              | Params | Size  |
| ----------- | ----------------------- | ------ | ----- |
| `FairFace`  | Race, Gender, Age Group | -      | 44MB  |

!!! info "Training Data"
    **Dataset**: Trained on [FairFace](datasets.md#fairface) dataset with balanced demographics

!!! tip "Equitable Predictions"
    FairFace provides more equitable predictions across different racial and gender groups.

**Race Categories (7):** White, Black, Latino Hispanic, East Asian, Southeast Asian, Indian, Middle Eastern

**Age Groups (9):** 0-2, 3-9, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70+

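When comparing FairFace predictions against data labeled with exact ages, it can help to bucket those ages into the same nine groups. The sketch below does this with `bisect`; `age_to_group` is a hypothetical helper, ages are assumed to be whole years, and nothing here says anything about the order of the model's output indices.

```python
import bisect

# Labels copied from the list of age groups above.
AGE_GROUPS = ["0-2", "3-9", "10-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70+"]

# Inclusive upper bound of every group except the open-ended "70+".
_UPPER_BOUNDS = [2, 9, 19, 29, 39, 49, 59, 69]


def age_to_group(age):
    """Return the age-group label that a whole-year age falls into."""
    if age < 0:
        raise ValueError("age must be non-negative")
    return AGE_GROUPS[bisect.bisect_left(_UPPER_BOUNDS, age)]


labels = [age_to_group(a) for a in (1, 9, 10, 45, 82)]
```
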
---

### Emotion Detection

| Model Name    | Classes | Params | Size |
| ------------- | ------- | ------ | ---- |
| `AFFECNET7`   | 7       | 0.5M   | 2MB  |
| `AFFECNET8`   | 8       | 0.5M   | 2MB  |

**Classes (7)**: Neutral, Happy, Sad, Surprise, Fear, Disgust, Angry

**Classes (8)**: Above + Contempt

!!! info "Training Data"
    **Dataset**: Trained on [AffectNet](datasets.md#affectnet)

!!! note "Accuracy Note"
    Emotion detection accuracy depends heavily on facial expression clarity and cultural context.

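The two class lists differ only by the extra Contempt class, so a per-class score vector can be decoded by its length. The sketch below assumes the model's output indices follow the order in which the classes are listed above, which this page does not confirm; `top_emotion` is a hypothetical helper.

```python
EMOTIONS_7 = ["Neutral", "Happy", "Sad", "Surprise", "Fear", "Disgust", "Angry"]
EMOTIONS_8 = EMOTIONS_7 + ["Contempt"]


def top_emotion(scores):
    """Return (label, score) for the highest-scoring class.

    Assumes index i of `scores` corresponds to the i-th listed class.
    """
    labels = {7: EMOTIONS_7, 8: EMOTIONS_8}.get(len(scores))
    if labels is None:
        raise ValueError(f"expected 7 or 8 scores, got {len(scores)}")
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best], scores[best]


label, score = top_emotion([0.10, 0.70, 0.05, 0.05, 0.04, 0.03, 0.03])
```
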
---

## Gaze Estimation Models

### MobileGaze Family

Gaze direction prediction models trained on the [Gaze360](datasets.md#gaze360) dataset. They return pitch (vertical) and yaw (horizontal) angles in radians.

| Model Name     | Params | Size    | MAE*  |
| -------------- | ------ | ------- | ----- |
| `RESNET18`     | 11.7M  | 43 MB   | 12.84 |
| `RESNET34` :material-check-circle: | 24.8M  | 81.6 MB | 11.33 |
| `RESNET50`     | 25.6M  | 91.3 MB | 11.34 |
| `MOBILENET_V2` | 3.5M   | 9.59 MB | 13.07 |
| `MOBILEONE_S0` | 2.1M   | 4.8 MB  | 12.58 |

*MAE (Mean Absolute Error) in degrees on Gaze360 test set - lower is better

!!! info "Training Data"
    **Dataset**: Trained on [Gaze360](datasets.md#gaze360) (indoor/outdoor scenes with diverse head poses)

    **Training**: 200 epochs with classification-based approach (binned angles)

!!! note "Input Requirements"
    Requires face crop as input. Use face detection first to obtain bounding boxes.

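Because the angles come back in radians while the MAE column is in degrees, a small conversion step is often useful, as is turning the two angles into a 3D direction for drawing. The sketch below uses one common camera-facing convention (looking straight at the camera gives `(0, 0, -1)`); the axis convention is an assumption, not something this page specifies, and both helper names are hypothetical.

```python
import math


def gaze_to_degrees(pitch, yaw):
    """Convert (pitch, yaw) from radians to degrees."""
    return math.degrees(pitch), math.degrees(yaw)


def gaze_to_vector(pitch, yaw):
    """Convert (pitch, yaw) in radians to a unit 3D direction vector.

    Assumed convention: x to the right, y down, z away from the camera,
    so zero pitch and yaw point back toward the camera along -z.
    """
    x = -math.cos(pitch) * math.sin(yaw)
    y = -math.sin(pitch)
    z = -math.cos(pitch) * math.cos(yaw)
    return x, y, z


direction = gaze_to_vector(0.3, -0.5)
```

Whatever sign convention is used, the resulting vector always has unit length, which makes it easy to scale into an arrow for visualization.
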
---

## Head Pose Estimation Models

### HeadPose Family

Head pose estimation models using a 6D rotation representation. They are trained on the [300W-LP](datasets.md#300w-lp) dataset and evaluated on AFLW2000, and return pitch, yaw, and roll angles in degrees.

| Model Name     | Backbone | Size    | MAE*  |
| -------------- | -------- | ------- | ----- |
| `RESNET18` :material-check-circle: | ResNet18 | 43 MB   | 5.22° |
| `RESNET34`     | ResNet34 | 82 MB   | 5.07° |
| `RESNET50`     | ResNet50 | 91 MB   | 4.83° |
| `MOBILENET_V2` | MobileNetV2 | 9.6 MB  | 5.72° |
| `MOBILENET_V3_SMALL` | MobileNetV3-Small | 4.8 MB  | 6.31° |
| `MOBILENET_V3_LARGE` | MobileNetV3-Large | 16 MB   | 5.58° |

*MAE (Mean Absolute Error) in degrees on AFLW2000 test set - lower is better

!!! info "Training Data"
    **Dataset**: Trained on [300W-LP](datasets.md#300w-lp) (synthesized large-pose faces from 300W)

    **Method**: 6D rotation representation (rotation matrix → Euler angles)

!!! note "Input Requirements"
    Requires face crop as input. Use face detection first to obtain bounding boxes.

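The "6D representation → rotation matrix → Euler angles" path can be illustrated in a few lines. The sketch below applies the standard Gram-Schmidt construction to two predicted 3-vectors and then extracts angles with one common Euler convention. The exact vector layout and angle convention used by these models are assumptions here, and the function names are hypothetical.

```python
import math


def _normalize(v):
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def rotation_from_6d(rep):
    """Build a 3x3 rotation matrix (row-major) from a 6D representation."""
    a1, a2 = rep[:3], rep[3:6]
    b1 = _normalize(a1)
    # Remove the component of a2 along b1, then normalize (Gram-Schmidt).
    dot = sum(x * y for x, y in zip(b1, a2))
    b2 = _normalize([y - dot * x for x, y in zip(b1, a2)])
    b3 = _cross(b1, b2)
    # b1, b2, b3 become the columns of the matrix.
    return [[b1[i], b2[i], b3[i]] for i in range(3)]


def euler_degrees(R):
    """Return (pitch, yaw, roll) in degrees; the convention is an assumption."""
    sy = math.hypot(R[0][0], R[1][0])
    if sy > 1e-6:
        pitch = math.atan2(R[2][1], R[2][2])
        yaw = math.atan2(-R[2][0], sy)
        roll = math.atan2(R[1][0], R[0][0])
    else:  # near gimbal lock, roll is not observable; fix it to zero
        pitch = math.atan2(-R[1][2], R[1][1])
        yaw = math.atan2(-R[2][0], sy)
        roll = 0.0
    return tuple(math.degrees(a) for a in (pitch, yaw, roll))


# A pure 30-degree in-plane rotation, expressed as its first two columns.
c, s = math.cos(math.radians(30)), math.sin(math.radians(30))
angles = euler_degrees(rotation_from_6d([c, s, 0.0, -s, c, 0.0]))
```

The appeal of the 6D form is that any pair of non-parallel vectors maps to a valid rotation, so the network output never needs to satisfy orthogonality constraints itself.
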
---

## Face Parsing Models

### BiSeNet Family

BiSeNet (Bilateral Segmentation Network) models for semantic face parsing. Segments face images into 19 facial component classes.

| Model Name     | Params | Size    | Classes |
| -------------- | ------ | ------- | ------- |
| `RESNET18` :material-check-circle: | 13.3M  | 50.7 MB | 19      |
| `RESNET34`     | 24.1M  | 89.2 MB | 19      |

!!! info "Training Data"
    **Dataset**: Trained on [CelebAMask-HQ](datasets.md#celebamask-hq)

    **Architecture**: BiSeNet with ResNet backbone

    **Input Size**: 512×512 (automatically resized)

**19 Facial Component Classes:**

| # | Class | # | Class | # | Class |
|---|-------|---|-------|---|-------|
| 0 | Background | 7 | Left Ear | 14 | Neck |
| 1 | Skin | 8 | Right Ear | 15 | Neck Lace |
| 2 | Left Eyebrow | 9 | Ear Ring | 16 | Cloth |
| 3 | Right Eyebrow | 10 | Nose | 17 | Hair |
| 4 | Left Eye | 11 | Mouth | 18 | Hat |
| 5 | Right Eye | 12 | Upper Lip | | |
| 6 | Eye Glasses | 13 | Lower Lip | | |

**Applications:**

- Face makeup and beauty applications
- Virtual try-on systems
- Face editing and manipulation
- Facial feature extraction
- Portrait segmentation

!!! note "Input Requirements"
    Input should be a cropped face image. For full pipeline, use face detection first to obtain face crops.

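The class table translates directly into a lookup list, which is handy for turning a parsing result into a binary mask for selected components. The sketch below assumes the parser output is a 2D grid of class IDs (shown here as nested lists); that output shape and the `binary_mask` helper are assumptions for illustration, not the library's API.

```python
# Class names in ID order, copied from the table of 19 classes.
FACE_PARSING_CLASSES = [
    "Background", "Skin", "Left Eyebrow", "Right Eyebrow", "Left Eye",
    "Right Eye", "Eye Glasses", "Left Ear", "Right Ear", "Ear Ring",
    "Nose", "Mouth", "Upper Lip", "Lower Lip", "Neck", "Neck Lace",
    "Cloth", "Hair", "Hat",
]
CLASS_ID = {name: idx for idx, name in enumerate(FACE_PARSING_CLASSES)}


def binary_mask(class_map, names):
    """Return a 0/1 grid marking pixels whose class is in `names`."""
    wanted = {CLASS_ID[name] for name in names}
    return [[1 if value in wanted else 0 for value in row] for row in class_map]


# Tiny 2x3 "segmentation" containing background, skin, hair, and nose pixels.
mask = binary_mask([[0, 1, 1], [17, 17, 10]], ["Hair", "Nose"])
```
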
---

### XSeg

XSeg from DeepFaceLab outputs masks for face regions. It requires 5-point landmarks for face alignment.

| Model Name | Size   | Output      |
|------------|--------|-------------|
| `DEFAULT`  | 67 MB  | Mask [0, 1] |

!!! info "Model Details"
    **Origin**: DeepFaceLab

    **Input**: NHWC format, normalized to [0, 1]

    **Alignment**: Requires 5-point landmarks (not bbox crops)

**Applications:**

- Face region extraction
- Face swapping pipelines
- Occlusion handling

!!! note "Input Requirements"
    Requires 5-point facial landmarks. Use a face detector like RetinaFace to obtain landmarks first.

---

## Portrait Matting Models

### MODNet

MODNet (Real-Time Trimap-Free Portrait Matting) produces soft alpha mattes from full images without requiring a trimap. It uses a MobileNetV2 backbone with low-resolution, high-resolution, and fusion branches.

| Model Name | Variant | Size | Use Case |
| ---------- | ------- | ---- | -------- |
| `PHOTOGRAPHIC` :material-check-circle: | High-quality | 25 MB | Portrait photos |
| `WEBCAM` | Real-time | 25 MB | Webcam feeds |

!!! info "Model Details"
    **Paper**: [MODNet: Real-Time Trimap-Free Portrait Matting via Objective Decomposition](https://arxiv.org/abs/2011.11961) (AAAI 2022)

    **Source**: [yakhyo/modnet](https://github.com/yakhyo/modnet) - ported weights and clean inference codebase

    **Output**: Alpha matte `(H, W)` in `[0, 1]`

**Applications:**

- Background removal / replacement
- Green screen compositing
- Video conferencing virtual backgrounds
- Portrait editing

!!! note "Input Requirements"
    Operates on full images (not face crops). No trimap or face detection required.

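An alpha matte in `[0, 1]` is typically used for background replacement by blending each pixel as `alpha * foreground + (1 - alpha) * background`. The sketch below shows that blend on nested lists so it runs without any dependencies; real pipelines would do the same arithmetic on arrays. `composite_pixel` and `composite` are hypothetical helpers.

```python
def composite_pixel(foreground, background, alpha):
    """Blend one RGB pixel: alpha=1 keeps the foreground, alpha=0 the background."""
    return tuple(
        round(alpha * fg + (1.0 - alpha) * bg)
        for fg, bg in zip(foreground, background)
    )


def composite(image, background, matte):
    """Blend two same-sized images (rows of RGB tuples) using an (H, W) matte."""
    return [
        [composite_pixel(fg, bg, a) for fg, bg, a in zip(fg_row, bg_row, a_row)]
        for fg_row, bg_row, a_row in zip(image, background, matte)
    ]


# One opaque pixel and one mostly transparent pixel.
result = composite(
    image=[[(255, 255, 255), (200, 100, 0)]],
    background=[[(0, 0, 0), (0, 100, 200)]],
    matte=[[1.0, 0.25]],
)
```

Soft matte values between 0 and 1 are what keep hair and other fine edges from looking cut out.
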
---

## Anti-Spoofing Models

### MiniFASNet Family

Face anti-spoofing models for liveness detection. They detect whether a face is real (live) or fake (photo, video replay, mask).

| Model Name | Size   | Scale |
| ---------- | ------ | ----- |
| `V1SE`     | 1.2 MB | 4.0   |
| `V2` :material-check-circle: | 1.2 MB | 2.7   |

!!! info "Output Format"
    **Output**: Returns `SpoofingResult(is_real, confidence)`, where `is_real` is `True` for a real face and `False` for a fake one

!!! note "Input Requirements"
    Requires face bounding box from a detector.

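The Scale column is not explained on this page. One plausible reading, assumed here, is that it is the factor by which the detected bounding box is enlarged around its center before cropping, so the model also sees context around the face. The sketch below implements that interpretation with clamping to the image bounds; `expand_bbox` is a hypothetical helper and not the library's preprocessing code.

```python
def expand_bbox(x1, y1, x2, y2, scale, img_w, img_h):
    """Enlarge a box around its center by `scale`, keeping it inside the image."""
    w, h = x2 - x1, y2 - y1
    # Never grow the crop beyond the image itself.
    scale = min(scale, (img_w - 1) / w, (img_h - 1) / h)
    new_w, new_h = w * scale, h * scale
    cx, cy = x1 + w / 2, y1 + h / 2
    # Center the enlarged box, then shift it back inside the image if needed.
    left = min(max(cx - new_w / 2, 0), img_w - 1 - new_w)
    top = min(max(cy - new_h / 2, 0), img_h - 1 - new_h)
    return int(left), int(top), int(left + new_w), int(top + new_h)


# A 100x100 face near the top-left corner of a 640x480 frame.
crop_small = expand_bbox(100, 100, 200, 200, 2.0, 640, 480)
crop_large = expand_bbox(100, 100, 200, 200, 4.0, 640, 480)
```

With the larger factor the enlarged box would extend past the top-left corner, so it is shifted to start at the image origin instead of being shrunk.
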
---

## Model Management

Models are automatically downloaded and cached on first use.

- **Cache location**: `~/.uniface/models/` (configurable via `set_cache_dir()` or `UNIFACE_CACHE_DIR` env var)
- **Inspect cache path**: `get_cache_dir()` returns the resolved active path
- **Verification**: Models are verified with SHA-256 checksums
- **Concurrent download**: `download_models([...])` fetches multiple models in parallel
- **Manual download**: Use `python tools/download_model.py` to pre-download models

See [Model Cache & Offline Use](concepts/model-cache-offline.md) for full details.

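To make the cache and verification behavior concrete, the sketch below reimplements the two ideas with the standard library only: resolving the cache directory from `UNIFACE_CACHE_DIR` with a fallback to `~/.uniface/models/`, and checking a file against a SHA-256 digest. It is an illustration under those assumptions, not uniface's own code; `resolve_cache_dir` and `sha256_matches` are hypothetical names, and how `set_cache_dir()` interacts with the environment variable is not covered.

```python
import hashlib
import os
from pathlib import Path


def resolve_cache_dir(env=None):
    """Return the cache directory: the env override if set, else the default."""
    env = os.environ if env is None else env
    override = env.get("UNIFACE_CACHE_DIR")
    base = Path(override) if override else Path.home() / ".uniface" / "models"
    return base.expanduser()


def sha256_matches(path, expected_hex, chunk_size=1 << 20):
    """Hash a file in chunks and compare it with the expected hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large model files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()


default_dir = resolve_cache_dir(env={})
custom_dir = resolve_cache_dir(env={"UNIFACE_CACHE_DIR": "/tmp/uniface-cache"})
```

Passing `env` explicitly keeps the function easy to test without modifying the real process environment.
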
---

## References

### Model Training & Architectures

- **RetinaFace Training**: [yakhyo/retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) - PyTorch implementation and training code
- **YOLOv5-Face Original**: [deepcam-cn/yolov5-face](https://github.com/deepcam-cn/yolov5-face) - Original PyTorch implementation
- **YOLOv5-Face ONNX**: [yakhyo/yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference) - ONNX inference implementation
- **YOLOv8-Face Original**: [derronqi/yolov8-face](https://github.com/derronqi/yolov8-face) - Original PyTorch implementation
- **YOLOv8-Face ONNX**: [yakhyo/yolov8-face-onnx-inference](https://github.com/yakhyo/yolov8-face-onnx-inference) - ONNX inference implementation
- **AdaFace Original**: [mk-minchul/AdaFace](https://github.com/mk-minchul/AdaFace) - Original PyTorch implementation
- **AdaFace ONNX**: [yakhyo/adaface-onnx](https://github.com/yakhyo/adaface-onnx) - ONNX export and inference
- **Face Recognition Training**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) - ArcFace, MobileFace, SphereFace training code
- **Gaze Estimation Training**: [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation) - MobileGaze training code and pretrained weights
- **Head Pose Estimation**: [yakhyo/head-pose-estimation](https://github.com/yakhyo/head-pose-estimation) - 6D rotation head pose estimation training and ONNX models
- **Face Parsing Training**: [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing) - BiSeNet training code and pretrained weights
- **Face Segmentation**: [yakhyo/face-segmentation](https://github.com/yakhyo/face-segmentation) - XSeg ONNX inference
- **Portrait Matting**: [yakhyo/modnet](https://github.com/yakhyo/modnet) - MODNet ported weights and inference (from [ZHKKKe/MODNet](https://github.com/ZHKKKe/MODNet))
- **Face Anti-Spoofing**: [yakhyo/face-anti-spoofing](https://github.com/yakhyo/face-anti-spoofing) - MiniFASNet ONNX inference (weights from [minivision-ai/Silent-Face-Anti-Spoofing](https://github.com/minivision-ai/Silent-Face-Anti-Spoofing))
- **FairFace**: [yakhyo/fairface-onnx](https://github.com/yakhyo/fairface-onnx) - FairFace ONNX inference for race, gender, age prediction
- **PIPNet**: [yakhyo/pipnet-onnx](https://github.com/yakhyo/pipnet-onnx) - PIPNet ONNX export and inference (from [jhb86253817/PIPNet](https://github.com/jhb86253817/PIPNet))
- **InsightFace**: [deepinsight/insightface](https://github.com/deepinsight/insightface) - Model architectures and pretrained weights

### Papers

- **RetinaFace**: [Single-Shot Multi-Level Face Localisation in the Wild](https://arxiv.org/abs/1905.00641)
- **SCRFD**: [Sample and Computation Redistribution for Efficient Face Detection](https://arxiv.org/abs/2105.04714)
- **YOLOv5-Face**: [YOLO5Face: Why Reinventing a Face Detector](https://arxiv.org/abs/2105.12931)
- **AdaFace**: [AdaFace: Quality Adaptive Margin for Face Recognition](https://arxiv.org/abs/2204.00964)
- **ArcFace**: [Additive Angular Margin Loss for Deep Face Recognition](https://arxiv.org/abs/1801.07698)
- **SphereFace**: [Deep Hypersphere Embedding for Face Recognition](https://arxiv.org/abs/1704.08063)
- **MODNet**: [Real-Time Trimap-Free Portrait Matting via Objective Decomposition](https://arxiv.org/abs/2011.11961)
- **BiSeNet**: [Bilateral Segmentation Network for Real-time Semantic Segmentation](https://arxiv.org/abs/1808.00897)
- **PIPNet**: [Towards Efficient Facial Landmark Detection in the Wild](https://arxiv.org/abs/2003.03771)