Mirror of https://github.com/yakhyo/uniface.git (synced 2026-05-15 12:57:55 +00:00)

Compare commits (11 commits):

- edbab5f7bf
- cd8077e460
- 452b3381a2
- 07c8bd7b24
- 68179d1e2d
- 99b35dddb4
- 3b6d0a35a9
- 0bd808bcef
- 9edf8b6b3d
- efb40f2e91
- 376e7bc488
**.github/workflows/ci.yml** (vendored, 16 changed lines)

```diff
@@ -4,11 +4,9 @@ on:
   push:
     branches:
       - main
-      - develop
   pull_request:
     branches:
       - main
-      - develop

 concurrency:
   group: ${{ github.workflow }}-${{ github.ref }}
@@ -22,7 +20,7 @@ jobs:
       - uses: actions/checkout@v4
       - uses: actions/setup-python@v5
         with:
-          python-version: '3.11'
+          python-version: "3.10"
       - uses: pre-commit/action@v3.0.1

   test:
@@ -33,8 +31,16 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        os: [ubuntu-latest, macos-latest, windows-latest]
-        python-version: ["3.11", "3.13"]
+        include:
+          # Full Python range on Linux (fastest runner)
+          - os: ubuntu-latest
+            python-version: "3.10"
+          - os: ubuntu-latest
+            python-version: "3.13"
+          - os: macos-latest
+            python-version: "3.13"
+          - os: windows-latest
+            python-version: "3.13"

     steps:
       - name: Checkout code
```
**.github/workflows/docs.yml** (vendored, new file, 38 lines)

```yaml
name: Deploy docs

on:
  push:
    branches: [main]
  workflow_dispatch:

permissions:
  contents: write

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Fetch full history for git-committers and git-revision-date plugins

      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install mkdocs-material pymdown-extensions mkdocs-git-committers-plugin-2 mkdocs-git-revision-date-localized-plugin

      - name: Build docs
        env:
          MKDOCS_GIT_COMMITTERS_APIKEY: ${{ secrets.MKDOCS_GIT_COMMITTERS_APIKEY }}
        run: mkdocs build --strict

      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v4
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./site
          destination_dir: docs
```
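The same site can be previewed locally before pushing; `mkdocs serve` (a standard MkDocs command) runs a live-reloading dev server:

```bash
# Install the same plugin set the workflow uses, then preview at http://127.0.0.1:8000
pip install mkdocs-material pymdown-extensions mkdocs-git-committers-plugin-2 mkdocs-git-revision-date-localized-plugin
mkdocs serve
```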
**.github/workflows/publish.yml** (vendored, 4 changed lines)

```diff
@@ -24,7 +24,7 @@ jobs:
       - name: Set up Python
         uses: actions/setup-python@v5
         with:
-          python-version: "3.11"
+          python-version: "3.11" # Needs 3.11+ for tomllib

       - name: Get version from tag and pyproject.toml
         id: get_version
@@ -54,7 +54,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.11", "3.13"]
+        python-version: ["3.10", "3.13"]

     steps:
       - name: Checkout code
```
**.pre-commit-config.yaml**

```diff
@@ -10,6 +10,7 @@ repos:
       - id: trailing-whitespace
       - id: end-of-file-fixer
      - id: check-yaml
+        exclude: ^mkdocs.yml$
      - id: check-toml
      - id: check-added-large-files
        args: ['--maxkb=1000']
```
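These hooks run through the standard pre-commit CLI; the same checks can be reproduced locally with:

```bash
pip install pre-commit
pre-commit run --all-files
```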
**MODELS.md** (deleted, 533 lines)

# UniFace Model Zoo

Complete guide to all available models, their performance characteristics, and selection criteria.

---

## Face Detection Models

### RetinaFace Family

RetinaFace models are trained on the WIDER FACE dataset and provide excellent accuracy-speed tradeoffs.

| Model Name   | Params | Size  | Easy   | Medium | Hard   | Use Case                |
| ------------ | ------ | ----- | ------ | ------ | ------ | ----------------------- |
| `MNET_025`   | 0.4M   | 1.7MB | 88.48% | 87.02% | 80.61% | Mobile/Edge devices     |
| `MNET_050`   | 1.0M   | 2.6MB | 89.42% | 87.97% | 82.40% | Mobile/Edge devices     |
| `MNET_V1`    | 3.5M   | 3.8MB | 90.59% | 89.14% | 84.13% | Balanced mobile         |
| `MNET_V2` ⭐ | 3.2M   | 3.5MB | 91.70% | 91.03% | 86.60% | **Recommended default** |
| `RESNET18`   | 11.7M  | 27MB  | 92.50% | 91.02% | 86.63% | Server/High accuracy    |
| `RESNET34`   | 24.8M  | 56MB  | 94.16% | 93.12% | 88.90% | Maximum accuracy        |

**Accuracy**: WIDER FACE validation set (Easy/Medium/Hard subsets) - from [RetinaFace paper](https://arxiv.org/abs/1905.00641)
**Speed**: Benchmark on your own hardware using `tools/detection.py --source <image> --iterations 100`

#### Usage

```python
from uniface import RetinaFace
from uniface.constants import RetinaFaceWeights

# Default (recommended)
detector = RetinaFace()  # Uses MNET_V2

# Specific model
detector = RetinaFace(
    model_name=RetinaFaceWeights.MNET_025,  # Fastest
    confidence_threshold=0.5,
    nms_thresh=0.4,
    input_size=(640, 640)
)
```

---

### SCRFD Family

SCRFD (Sample and Computation Redistribution for Efficient Face Detection) models offer state-of-the-art speed-accuracy tradeoffs.

| Model Name     | Params | Size  | Easy   | Medium | Hard   | Use Case                  |
| -------------- | ------ | ----- | ------ | ------ | ------ | ------------------------- |
| `SCRFD_500M`   | 0.6M   | 2.5MB | 90.57% | 88.12% | 68.51% | Real-time applications    |
| `SCRFD_10G` ⭐ | 4.2M   | 17MB  | 95.16% | 93.87% | 83.05% | **High accuracy + speed** |

**Accuracy**: WIDER FACE validation set - from [SCRFD paper](https://arxiv.org/abs/2105.04714)
**Speed**: Benchmark on your own hardware using `tools/detection.py --source <image> --iterations 100`

#### Usage

```python
from uniface import SCRFD
from uniface.constants import SCRFDWeights

# Fast real-time detection
detector = SCRFD(
    model_name=SCRFDWeights.SCRFD_500M_KPS,
    confidence_threshold=0.5,
    input_size=(640, 640)
)

# High accuracy
detector = SCRFD(
    model_name=SCRFDWeights.SCRFD_10G_KPS,
    confidence_threshold=0.5
)
```

---

### YOLOv5-Face Family

YOLOv5-Face models provide excellent detection accuracy with 5-point facial landmarks, optimized for real-time applications.

| Model Name   | Size | Easy   | Medium | Hard   | Use Case                 |
| ------------ | ---- | ------ | ------ | ------ | ------------------------ |
| `YOLOV5N`    | 11MB | 93.61% | 91.52% | 80.53% | Lightweight/Mobile       |
| `YOLOV5S` ⭐ | 28MB | 94.33% | 92.61% | 83.15% | **Real-time + accuracy** |
| `YOLOV5M`    | 82MB | 95.30% | 93.76% | 85.28% | High accuracy            |

**Accuracy**: WIDER FACE validation set - from [YOLOv5-Face paper](https://arxiv.org/abs/2105.12931)
**Speed**: Benchmark on your own hardware using `tools/detection.py --source <image> --iterations 100`
**Note**: Fixed input size of 640×640. Models exported to ONNX from [deepcam-cn/yolov5-face](https://github.com/deepcam-cn/yolov5-face)

#### Usage

```python
from uniface import YOLOv5Face
from uniface.constants import YOLOv5FaceWeights

# Lightweight/Mobile
detector = YOLOv5Face(
    model_name=YOLOv5FaceWeights.YOLOV5N,
    confidence_threshold=0.6,
    nms_thresh=0.5
)

# Real-time detection (recommended)
detector = YOLOv5Face(
    model_name=YOLOv5FaceWeights.YOLOV5S,
    confidence_threshold=0.6,
    nms_thresh=0.5
)

# High accuracy
detector = YOLOv5Face(
    model_name=YOLOv5FaceWeights.YOLOV5M,
    confidence_threshold=0.6
)

# Detect faces with landmarks
faces = detector.detect(image)
for face in faces:
    bbox = face.bbox            # [x1, y1, x2, y2]
    confidence = face.confidence
    landmarks = face.landmarks  # 5-point landmarks (5, 2)
```

---

## Face Recognition Models

### ArcFace

State-of-the-art face recognition using additive angular margin loss.

| Model Name | Backbone  | Params | Size  | Use Case                   |
| ---------- | --------- | ------ | ----- | -------------------------- |
| `MNET` ⭐  | MobileNet | 2.0M   | 8MB   | **Balanced (recommended)** |
| `RESNET`   | ResNet50  | 43.6M  | 166MB | Maximum accuracy           |

**Dataset**: Trained on MS1M-V2 (5.8M images, 85K identities)
**Accuracy**: Benchmark on your own dataset or use standard face verification benchmarks

#### Usage

```python
from uniface import ArcFace
from uniface.constants import ArcFaceWeights

# Default (MobileNet backbone)
recognizer = ArcFace()

# High accuracy (ResNet50 backbone)
recognizer = ArcFace(model_name=ArcFaceWeights.RESNET)

# Extract embedding
embedding = recognizer.get_normalized_embedding(image, landmarks)
# Returns: (1, 512) normalized embedding vector
```
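Because the embeddings are L2-normalized, identity comparison reduces to a dot product (cosine similarity). A minimal verification sketch, assuming `person1.jpg`/`person2.jpg` each contain one face and using the detector/recognizer defaults shown above:

```python
import cv2
import numpy as np
from uniface import RetinaFace, ArcFace

detector = RetinaFace()
recognizer = ArcFace()

image1 = cv2.imread("person1.jpg")
image2 = cv2.imread("person2.jpg")

# Use the first detected face in each image
faces1 = detector.detect(image1)
faces2 = detector.detect(image2)

emb1 = recognizer.get_normalized_embedding(image1, faces1[0].landmarks)
emb2 = recognizer.get_normalized_embedding(image2, faces2[0].landmarks)

# Cosine similarity of normalized (1, 512) vectors
similarity = float(np.dot(emb1, emb2.T)[0][0])
print(f"Similarity: {similarity:.3f}")
```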
---

### MobileFace

Lightweight face recognition optimized for mobile devices.

| Model Name      | Backbone         | Params | Size | LFW    | CALFW  | CPLFW  | AgeDB-30 | Use Case          |
| --------------- | ---------------- | ------ | ---- | ------ | ------ | ------ | -------- | ----------------- |
| `MNET_025`      | MobileNetV1 0.25 | 0.36M  | 1MB  | 98.76% | 92.02% | 82.37% | 90.02%   | Ultra-lightweight |
| `MNET_V2` ⭐    | MobileNetV2      | 2.29M  | 4MB  | 99.55% | 94.87% | 86.89% | 95.16%   | **Mobile/Edge**   |
| `MNET_V3_SMALL` | MobileNetV3-S    | 1.25M  | 3MB  | 99.30% | 93.77% | 85.29% | 92.79%   | Mobile optimized  |
| `MNET_V3_LARGE` | MobileNetV3-L    | 3.52M  | 10MB | 99.53% | 94.56% | 86.79% | 95.13%   | Balanced mobile   |

**Dataset**: Trained on MS1M-V2 (5.8M images, 85K identities)
**Accuracy**: Evaluated on LFW, CALFW, CPLFW, and AgeDB-30 benchmarks
**Note**: These models are lightweight alternatives to ArcFace for resource-constrained environments

#### Usage

```python
from uniface import MobileFace
from uniface.constants import MobileFaceWeights

# Lightweight
recognizer = MobileFace(model_name=MobileFaceWeights.MNET_V2)
```
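MobileFace is presented as a drop-in alternative to ArcFace, so embedding extraction follows the same pattern; a short sketch, assuming MobileFace exposes the same `get_normalized_embedding(image, landmarks)` interface shown for ArcFace above:

```python
import cv2
from uniface import RetinaFace, MobileFace
from uniface.constants import MobileFaceWeights

detector = RetinaFace()
recognizer = MobileFace(model_name=MobileFaceWeights.MNET_V2)

image = cv2.imread("photo.jpg")
faces = detector.detect(image)

if faces:
    # Same call as ArcFace; landmarks come from the detector (assumed shared interface)
    embedding = recognizer.get_normalized_embedding(image, faces[0].landmarks)
    print(embedding.shape)
```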
---

### SphereFace

Face recognition using angular softmax loss.

| Model Name | Backbone | Params | Size | LFW    | CALFW  | CPLFW  | AgeDB-30 | Use Case            |
| ---------- | -------- | ------ | ---- | ------ | ------ | ------ | -------- | ------------------- |
| `SPHERE20` | Sphere20 | 24.5M  | 50MB | 99.67% | 95.61% | 88.75% | 96.58%   | Research/Comparison |
| `SPHERE36` | Sphere36 | 34.6M  | 92MB | 99.72% | 95.64% | 89.92% | 96.83%   | Research/Comparison |

**Dataset**: Trained on MS1M-V2 (5.8M images, 85K identities)
**Accuracy**: Evaluated on LFW, CALFW, CPLFW, and AgeDB-30 benchmarks
**Note**: SphereFace uses angular softmax loss, an approach that predates ArcFace. These models provide good accuracy with moderate resource requirements.

#### Usage

```python
from uniface import SphereFace
from uniface.constants import SphereFaceWeights

recognizer = SphereFace(model_name=SphereFaceWeights.SPHERE20)
```

---

## Facial Landmark Models

### 106-Point Landmark Detection

High-precision facial landmark localization.

| Model Name | Points | Params | Size | Use Case                 |
| ---------- | ------ | ------ | ---- | ------------------------ |
| `2D106`    | 106    | 3.7M   | 14MB | Face alignment, analysis |

**Note**: Provides 106 facial keypoints for detailed face analysis and alignment

#### Usage

```python
from uniface import Landmark106

landmarker = Landmark106()
landmarks = landmarker.get_landmarks(image, bbox)
# Returns: (106, 2) array of (x, y) coordinates
```

**Landmark Groups** (see the indexing sketch below):

- Face contour: 0-32 (33 points)
- Eyebrows: 33-50 (18 points)
- Nose: 51-62 (12 points)
- Eyes: 63-86 (24 points)
- Mouth: 87-105 (19 points)
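These index ranges slice directly into the `(106, 2)` array. A small sketch, assuming `landmarks` was obtained from `Landmark106.get_landmarks` as above (`LANDMARK_GROUPS` is a hypothetical helper name built from the ranges listed here):

```python
import numpy as np

# Group names mapped to the index ranges listed above
LANDMARK_GROUPS = {
    "contour":  slice(0, 33),
    "eyebrows": slice(33, 51),
    "nose":     slice(51, 63),
    "eyes":     slice(63, 87),
    "mouth":    slice(87, 106),
}

for name, idx in LANDMARK_GROUPS.items():
    points = landmarks[idx]        # (N, 2) subset for this group
    center = points.mean(axis=0)   # e.g. a rough group centroid
    print(f"{name}: {len(points)} points, center at {center.round(1)}")
```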
---

## Attribute Analysis Models

### Age & Gender Detection

| Model Name | Attributes  | Params | Size | Use Case        |
| ---------- | ----------- | ------ | ---- | --------------- |
| `DEFAULT`  | Age, Gender | 2.1M   | 8MB  | General purpose |

**Dataset**: Trained on CelebA
**Note**: Accuracy varies by demographic and image quality. Test on your specific use case.

#### Usage

```python
from uniface import AgeGender

predictor = AgeGender()
result = predictor.predict(image, bbox)
# Returns: AttributeResult with gender, age, and a sex property
# result.gender: 0 for Female, 1 for Male
# result.sex: "Female" or "Male"
# result.age: age in years
```

---

### FairFace Attributes

| Model Name | Attributes              | Params | Size | Use Case                         |
| ---------- | ----------------------- | ------ | ---- | -------------------------------- |
| `DEFAULT`  | Race, Gender, Age Group | -      | 44MB | Balanced demographic prediction  |

**Dataset**: Trained on FairFace dataset with balanced demographics
**Note**: FairFace provides more equitable predictions across different racial and gender groups

**Race Categories (7):** White, Black, Latino Hispanic, East Asian, Southeast Asian, Indian, Middle Eastern

**Age Groups (9):** 0-2, 3-9, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70+

#### Usage

```python
from uniface import FairFace

predictor = FairFace()
result = predictor.predict(image, bbox)
# Returns: AttributeResult with gender, age_group, race, and a sex property
# result.gender: 0 for Female, 1 for Male
# result.sex: "Female" or "Male"
# result.age_group: "20-29", "30-39", etc.
# result.race: "East Asian", "White", etc.
```

---

### Emotion Detection

| Model Name  | Classes | Params | Size | Use Case        |
| ----------- | ------- | ------ | ---- | --------------- |
| `AFFECNET7` | 7       | 0.5M   | 2MB  | 7-class emotion |
| `AFFECNET8` | 8       | 0.5M   | 2MB  | 8-class emotion |

**Classes (7)**: Neutral, Happy, Sad, Surprise, Fear, Disgust, Anger
**Classes (8)**: Above + Contempt

**Dataset**: Trained on AffectNet
**Note**: Emotion detection accuracy depends heavily on facial expression clarity and cultural context

#### Usage

```python
from uniface import Emotion
from uniface.constants import DDAMFNWeights

predictor = Emotion(model_name=DDAMFNWeights.AFFECNET7)
result = predictor.predict(image, landmarks)
# result.emotion: predicted emotion label
# result.confidence: confidence score
```

---

## Gaze Estimation Models

### MobileGaze Family

Real-time gaze direction prediction models trained on the Gaze360 dataset. Returns pitch (vertical) and yaw (horizontal) angles in radians.

| Model Name     | Params | Size    | MAE*  | Use Case                |
| -------------- | ------ | ------- | ----- | ----------------------- |
| `RESNET18`     | 11.7M  | 43 MB   | 12.84 | Balanced accuracy/speed |
| `RESNET34` ⭐  | 24.8M  | 81.6 MB | 11.33 | **Recommended default** |
| `RESNET50`     | 25.6M  | 91.3 MB | 11.34 | High accuracy           |
| `MOBILENET_V2` | 3.5M   | 9.59 MB | 13.07 | Mobile/Edge devices     |
| `MOBILEONE_S0` | 2.1M   | 4.8 MB  | 12.58 | Lightweight/Real-time   |

*MAE (Mean Absolute Error) in degrees on the Gaze360 test set - lower is better

**Dataset**: Trained on Gaze360 (indoor/outdoor scenes with diverse head poses)
**Training**: 200 epochs with a classification-based approach (binned angles)

#### Usage

```python
from uniface import MobileGaze
from uniface.constants import GazeWeights
import numpy as np

# Default (recommended)
gaze_estimator = MobileGaze()  # Uses RESNET34

# Lightweight model
gaze_estimator = MobileGaze(model_name=GazeWeights.MOBILEONE_S0)

# Estimate gaze from face crop
result = gaze_estimator.estimate(face_crop)
print(f"Pitch: {np.degrees(result.pitch):.1f}°, Yaw: {np.degrees(result.yaw):.1f}°")
```

**Note**: Requires a face crop as input. Run face detection first to obtain bounding boxes.
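For custom overlays, the pitch/yaw pair can be projected to a 2D arrow. A sketch, assuming `result` comes from `estimate()` as above and using a sign convention common in gaze-estimation demo code (conventions vary between papers, so verify against the library's own `draw_gaze`):

```python
import numpy as np

def gaze_arrow(pitch: float, yaw: float, length: float = 100.0):
    """Project gaze angles (radians) to an image-plane (dx, dy) offset."""
    dx = -length * np.sin(yaw) * np.cos(pitch)
    dy = -length * np.sin(pitch)
    return dx, dy

dx, dy = gaze_arrow(result.pitch, result.yaw)
# Draw from the face-crop center, e.g.:
# cv2.arrowedLine(image, (cx, cy), (int(cx + dx), int(cy + dy)), (0, 255, 0), 2)
```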
---

## Face Parsing Models

### BiSeNet Family

BiSeNet (Bilateral Segmentation Network) models for semantic face parsing. They segment face images into 19 facial component classes.

| Model Name    | Params | Size    | Classes | Use Case                |
| ------------- | ------ | ------- | ------- | ----------------------- |
| `RESNET18` ⭐ | 13.3M  | 50.7 MB | 19      | **Recommended default** |
| `RESNET34`    | 24.1M  | 89.2 MB | 19      | Higher accuracy         |

**19 Facial Component Classes:**

1. Background
2. Skin
3. Left Eyebrow
4. Right Eyebrow
5. Left Eye
6. Right Eye
7. Eye Glasses
8. Left Ear
9. Right Ear
10. Ear Ring
11. Nose
12. Mouth
13. Upper Lip
14. Lower Lip
15. Neck
16. Neck Lace
17. Cloth
18. Hair
19. Hat

**Dataset**: Trained on CelebAMask-HQ
**Architecture**: BiSeNet with ResNet backbone
**Input Size**: 512×512 (automatically resized)

#### Usage

```python
from uniface.parsing import BiSeNet
from uniface.constants import ParsingWeights
from uniface.visualization import vis_parsing_maps
import cv2
import numpy as np

# Default (recommended)
parser = BiSeNet()  # Uses RESNET18

# Higher accuracy model
parser = BiSeNet(model_name=ParsingWeights.RESNET34)

# Parse face image (already cropped)
mask = parser.parse(face_image)

# Visualize with overlay
face_rgb = cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)
vis_result = vis_parsing_maps(face_rgb, mask, save_image=False)

# mask shape: (H, W) with values 0-18 representing classes
print(f"Detected {len(np.unique(mask))} facial components")
```
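Since the mask stores class indices 0-18, individual components can be isolated with a simple comparison. A sketch, assuming `mask` comes from `parser.parse` as above and that indices follow the class list in order (so Background = 0, Skin = 1, Hair = 17):

```python
import cv2
import numpy as np

SKIN, HAIR = 1, 17  # assumed: indices follow the class list above, zero-based

skin_mask = (mask == SKIN).astype(np.uint8) * 255
hair_mask = (mask == HAIR).astype(np.uint8) * 255

# e.g. measure coverage, or reuse the binary masks for downstream editing
print(f"Skin pixels: {int(skin_mask.sum() / 255)}")
cv2.imwrite("skin_mask.png", skin_mask)
```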
**Applications:**

- Face makeup and beauty applications
- Virtual try-on systems
- Face editing and manipulation
- Facial feature extraction
- Portrait segmentation

**Note**: Input should be a cropped face image. For a full pipeline, run face detection first to obtain face crops.

---

## Anti-Spoofing Models

### MiniFASNet Family

Lightweight face anti-spoofing models for liveness detection. They detect whether a face is real (live) or fake (photo, video replay, mask).

| Model Name | Size   | Scale | Use Case                       |
| ---------- | ------ | ----- | ------------------------------ |
| `V1SE`     | 1.2 MB | 4.0   | Squeeze-and-excitation variant |
| `V2` ⭐    | 1.2 MB | 2.7   | **Recommended default**        |

**Dataset**: Trained on face anti-spoofing datasets
**Output**: Returns `SpoofingResult(is_real, confidence)`, where `is_real` is `True` for a real face and `False` for a fake one

#### Usage

```python
from uniface import RetinaFace
from uniface.spoofing import MiniFASNet
from uniface.constants import MiniFASNetWeights

# Default (V2, recommended)
detector = RetinaFace()
spoofer = MiniFASNet()

# V1SE variant
spoofer = MiniFASNet(model_name=MiniFASNetWeights.V1SE)

# Detect and check liveness
faces = detector.detect(image)
for face in faces:
    result = spoofer.predict(image, face.bbox)
    # result.is_real: True for real, False for fake
    label = 'Real' if result.is_real else 'Fake'
    print(f"{label}: {result.confidence:.1%}")
```

**Note**: Requires a face bounding box from a detector. Use with RetinaFace, SCRFD, or YOLOv5Face.

---

## Model Updates

Models are automatically downloaded and cached on first use. Cache location: `~/.uniface/models/`
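A quick way to see what has already been downloaded is to list that directory. A sketch, assuming the weights are stored as ONNX files directly under the cache directory:

```python
from pathlib import Path

cache_dir = Path.home() / ".uniface" / "models"
for weights in sorted(cache_dir.glob("*.onnx")):
    size_mb = weights.stat().st_size / 1e6
    print(f"{weights.name}: {size_mb:.1f} MB")
```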
### Manual Model Management

```python
from uniface.model_store import verify_model_weights
from uniface.constants import RetinaFaceWeights

# Download specific model
model_path = verify_model_weights(
    RetinaFaceWeights.MNET_V2,
    root='./custom_cache'
)

# Models are verified with SHA-256 checksums
```

### Download All Models

```bash
# Using the provided script
python tools/download_model.py

# Download specific model
python tools/download_model.py --model MNET_V2
```

---

## References

### Model Training & Architectures

- **RetinaFace Training**: [yakhyo/retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) - PyTorch implementation and training code
- **YOLOv5-Face Original**: [deepcam-cn/yolov5-face](https://github.com/deepcam-cn/yolov5-face) - Original PyTorch implementation
- **YOLOv5-Face ONNX**: [yakhyo/yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference) - ONNX inference implementation
- **Face Recognition Training**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) - ArcFace, MobileFace, SphereFace training code
- **Gaze Estimation Training**: [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation) - MobileGaze training code and pretrained weights
- **Face Parsing Training**: [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing) - BiSeNet training code and pretrained weights
- **Face Anti-Spoofing**: [yakhyo/face-anti-spoofing](https://github.com/yakhyo/face-anti-spoofing) - MiniFASNet ONNX inference (weights from [minivision-ai/Silent-Face-Anti-Spoofing](https://github.com/minivision-ai/Silent-Face-Anti-Spoofing))
- **FairFace**: [yakhyo/fairface-onnx](https://github.com/yakhyo/fairface-onnx) - FairFace ONNX inference for race, gender, age prediction
- **InsightFace**: [deepinsight/insightface](https://github.com/deepinsight/insightface) - Model architectures and pretrained weights

### Papers

- **RetinaFace**: [Single-Shot Multi-Level Face Localisation in the Wild](https://arxiv.org/abs/1905.00641)
- **SCRFD**: [Sample and Computation Redistribution for Efficient Face Detection](https://arxiv.org/abs/2105.04714)
- **YOLOv5-Face**: [YOLO5Face: Why Reinventing a Face Detector](https://arxiv.org/abs/2105.12931)
- **ArcFace**: [Additive Angular Margin Loss for Deep Face Recognition](https://arxiv.org/abs/1801.07698)
- **SphereFace**: [Deep Hypersphere Embedding for Face Recognition](https://arxiv.org/abs/1704.08063)
- **BiSeNet**: [Bilateral Segmentation Network for Real-time Semantic Segmentation](https://arxiv.org/abs/1808.00897)
**QUICKSTART.md** (deleted, 695 lines)

# UniFace Quick Start Guide

Get up and running with UniFace in 5 minutes! This guide covers the most common use cases.

---

## Installation

```bash
# macOS (Apple Silicon) - automatically includes ARM64 optimizations
pip install uniface

# Linux/Windows with NVIDIA GPU
pip install uniface[gpu]

# CPU-only (all platforms)
pip install uniface
```

---

## 1. Face Detection (30 seconds)

Detect faces in an image:

```python
import cv2
from uniface import RetinaFace

# Load image
image = cv2.imread("photo.jpg")

# Initialize detector (models auto-download on first use)
detector = RetinaFace()

# Detect faces
faces = detector.detect(image)

# Print results
for i, face in enumerate(faces):
    print(f"Face {i+1}:")
    print(f"  Confidence: {face.confidence:.2f}")
    print(f"  BBox: {face.bbox}")
    print(f"  Landmarks: {len(face.landmarks)} points")
```

**Output:**

```
Face 1:
  Confidence: 0.99
  BBox: [120.5, 85.3, 245.8, 210.6]
  Landmarks: 5 points
```

---

## 2. Visualize Detections (1 minute)

Draw bounding boxes and landmarks:

```python
import cv2
from uniface import RetinaFace
from uniface.visualization import draw_detections

# Detect faces
detector = RetinaFace()
image = cv2.imread("photo.jpg")
faces = detector.detect(image)

# Extract visualization data
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]

# Draw on image
draw_detections(
    image=image,
    bboxes=bboxes,
    scores=scores,
    landmarks=landmarks,
    vis_threshold=0.6,
)

# Save result
cv2.imwrite("output.jpg", image)
print("Saved output.jpg")
```

---

## 3. Face Recognition (2 minutes)

Compare two faces:

```python
import cv2
import numpy as np
from uniface import RetinaFace, ArcFace

# Initialize models
detector = RetinaFace()
recognizer = ArcFace()

# Load two images
image1 = cv2.imread("person1.jpg")
image2 = cv2.imread("person2.jpg")

# Detect faces
faces1 = detector.detect(image1)
faces2 = detector.detect(image2)

if faces1 and faces2:
    # Extract embeddings
    emb1 = recognizer.get_normalized_embedding(image1, faces1[0].landmarks)
    emb2 = recognizer.get_normalized_embedding(image2, faces2[0].landmarks)

    # Compute similarity (cosine similarity)
    similarity = np.dot(emb1, emb2.T)[0][0]

    # Interpret result
    if similarity > 0.6:
        print(f"Same person (similarity: {similarity:.3f})")
    else:
        print(f"Different people (similarity: {similarity:.3f})")
else:
    print("No faces detected")
```

**Similarity thresholds** (see the helper sketch below):

- `> 0.6`: Same person (high confidence)
- `0.4 - 0.6`: Uncertain (manual review)
- `< 0.4`: Different people
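These cutoffs translate directly into a tiny decision helper. A sketch using the thresholds above (`interpret_similarity` is a hypothetical name, not a library function):

```python
def interpret_similarity(similarity: float) -> str:
    """Map a cosine similarity score to a verification decision."""
    if similarity > 0.6:
        return "same person"
    if similarity >= 0.4:
        return "uncertain - manual review"
    return "different people"

print(interpret_similarity(0.72))  # same person
```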
---

## 4. Webcam Demo (2 minutes)

Real-time face detection:

```python
import cv2
from uniface import RetinaFace
from uniface.visualization import draw_detections

detector = RetinaFace()
cap = cv2.VideoCapture(0)

print("Press 'q' to quit")

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Detect faces
    faces = detector.detect(frame)

    # Draw results
    bboxes = [f.bbox for f in faces]
    scores = [f.confidence for f in faces]
    landmarks = [f.landmarks for f in faces]
    draw_detections(
        image=frame,
        bboxes=bboxes,
        scores=scores,
        landmarks=landmarks,
    )

    # Show frame
    cv2.imshow("UniFace - Press 'q' to quit", frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```

---

## 5. Age & Gender Detection (2 minutes)

Detect age and gender:

```python
import cv2
from uniface import RetinaFace, AgeGender

# Initialize models
detector = RetinaFace()
age_gender = AgeGender()

# Load image
image = cv2.imread("photo.jpg")
faces = detector.detect(image)

# Predict attributes
for i, face in enumerate(faces):
    result = age_gender.predict(image, face.bbox)
    print(f"Face {i+1}: {result.sex}, {result.age} years old")
    # result.gender: 0=Female, 1=Male
    # result.sex: "Female" or "Male"
    # result.age: age in years
```

**Output:**

```
Face 1: Male, 32 years old
Face 2: Female, 28 years old
```

---

## 5b. FairFace Attributes (2 minutes)

Detect race, gender, and age group with balanced demographics:

```python
import cv2
from uniface import RetinaFace, FairFace

# Initialize models
detector = RetinaFace()
fairface = FairFace()

# Load image
image = cv2.imread("photo.jpg")
faces = detector.detect(image)

# Predict attributes
for i, face in enumerate(faces):
    result = fairface.predict(image, face.bbox)
    print(f"Face {i+1}: {result.sex}, {result.age_group}, {result.race}")
    # result.gender: 0=Female, 1=Male
    # result.sex: "Female" or "Male"
    # result.age_group: "20-29", "30-39", etc.
    # result.race: "East Asian", "White", etc.
```

**Output:**

```
Face 1: Male, 30-39, East Asian
Face 2: Female, 20-29, White
```

**Race Categories:** White, Black, Latino Hispanic, East Asian, Southeast Asian, Indian, Middle Eastern

**Age Groups:** 0-2, 3-9, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70+

---

## 6. Facial Landmarks (2 minutes)

Detect 106 facial landmarks:

```python
import cv2
from uniface import RetinaFace, Landmark106

# Initialize models
detector = RetinaFace()
landmarker = Landmark106()

# Detect face and landmarks
image = cv2.imread("photo.jpg")
faces = detector.detect(image)

if faces:
    landmarks = landmarker.get_landmarks(image, faces[0].bbox)
    print(f"Detected {len(landmarks)} landmarks")

    # Draw landmarks
    for x, y in landmarks.astype(int):
        cv2.circle(image, (x, y), 2, (0, 255, 0), -1)

    cv2.imwrite("landmarks.jpg", image)
```

---

## 7. Gaze Estimation (2 minutes)

Estimate where a person is looking:

```python
import cv2
import numpy as np
from uniface import RetinaFace, MobileGaze
from uniface.visualization import draw_gaze

# Initialize models
detector = RetinaFace()
gaze_estimator = MobileGaze()

# Load image
image = cv2.imread("photo.jpg")
faces = detector.detect(image)

# Estimate gaze for each face
for i, face in enumerate(faces):
    x1, y1, x2, y2 = map(int, face.bbox[:4])
    face_crop = image[y1:y2, x1:x2]

    if face_crop.size > 0:
        result = gaze_estimator.estimate(face_crop)
        print(f"Face {i+1}: pitch={np.degrees(result.pitch):.1f}°, yaw={np.degrees(result.yaw):.1f}°")

        # Draw gaze direction
        draw_gaze(image, face.bbox, result.pitch, result.yaw)

cv2.imwrite("gaze_output.jpg", image)
```

**Output:**

```
Face 1: pitch=5.2°, yaw=-12.3°
Face 2: pitch=-8.1°, yaw=15.7°
```

---

## 8. Face Parsing (2 minutes)

Segment a face into semantic components (skin, eyes, nose, mouth, hair, etc.):

```python
import cv2
import numpy as np
from uniface.parsing import BiSeNet
from uniface.visualization import vis_parsing_maps

# Initialize parser
parser = BiSeNet()  # Uses ResNet18 by default

# Load face image (already cropped)
face_image = cv2.imread("face.jpg")

# Parse face into 19 components
mask = parser.parse(face_image)

# Visualize with overlay
face_rgb = cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)
vis_result = vis_parsing_maps(face_rgb, mask, save_image=False)

# Convert back to BGR for saving
vis_bgr = cv2.cvtColor(vis_result, cv2.COLOR_RGB2BGR)
cv2.imwrite("parsed_face.jpg", vis_bgr)

print(f"Detected {len(np.unique(mask))} facial components")
```

**Output:**

```
Detected 12 facial components
```

**19 Facial Component Classes:**

- Background, Skin, Eyebrows (L/R), Eyes (L/R), Eye Glasses
- Ears (L/R), Ear Ring, Nose, Mouth, Lips (Upper/Lower)
- Neck, Neck Lace, Cloth, Hair, Hat

---

## 9. Face Anonymization (2 minutes)

Automatically blur faces for privacy protection:

```python
import cv2
from uniface.privacy import anonymize_faces

# One-liner: automatic detection and blurring
image = cv2.imread("group_photo.jpg")
anonymized = anonymize_faces(image, method='pixelate')
cv2.imwrite("anonymized.jpg", anonymized)
print("Faces anonymized successfully!")
```

**Manual control with custom parameters:**

```python
from uniface import RetinaFace
from uniface.privacy import BlurFace

# Initialize detector and blurrer
detector = RetinaFace()
blurrer = BlurFace(method='gaussian', blur_strength=5.0)

# Detect and anonymize
faces = detector.detect(image)
anonymized = blurrer.anonymize(image, faces)
cv2.imwrite("output.jpg", anonymized)
```

**Available blur methods:**

```python
# Pixelation (news media standard)
blurrer = BlurFace(method='pixelate', pixel_blocks=8)

# Gaussian blur (smooth, natural)
blurrer = BlurFace(method='gaussian', blur_strength=4.0)

# Black boxes (maximum privacy)
blurrer = BlurFace(method='blackout', color=(0, 0, 0))

# Elliptical blur (natural face shape)
blurrer = BlurFace(method='elliptical', blur_strength=3.0, margin=30)

# Median blur (edge-preserving)
blurrer = BlurFace(method='median', blur_strength=3.0)
```

**Webcam anonymization:**

```python
import cv2
from uniface import RetinaFace
from uniface.privacy import BlurFace

detector = RetinaFace()
blurrer = BlurFace(method='pixelate')
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    faces = detector.detect(frame)
    frame = blurrer.anonymize(frame, faces, inplace=True)

    cv2.imshow('Anonymized', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```

**Command-line tool:**

```bash
# Anonymize image with pixelation
python tools/face_anonymize.py --source photo.jpg

# Real-time webcam anonymization
python tools/face_anonymize.py --source 0 --method gaussian

# Custom blur strength
python tools/face_anonymize.py --source photo.jpg --method gaussian --blur-strength 5.0
```

---

## 10. Face Anti-Spoofing (2 minutes)

Detect if a face is real or fake (photo, video replay, mask):

```python
import cv2
from uniface import RetinaFace
from uniface.spoofing import MiniFASNet

detector = RetinaFace()
spoofer = MiniFASNet()  # Uses V2 by default

image = cv2.imread("photo.jpg")
faces = detector.detect(image)

for i, face in enumerate(faces):
    result = spoofer.predict(image, face.bbox)
    # result.is_real: True for real, False for fake
    label = 'Real' if result.is_real else 'Fake'
    print(f"Face {i+1}: {label} ({result.confidence:.1%})")
```

**Output:**

```
Face 1: Real (98.5%)
```

**Command-line tool:**

```bash
# Image
python tools/spoofing.py --source photo.jpg

# Webcam
python tools/spoofing.py --source 0
```

---

## 11. Batch Processing (3 minutes)

Process multiple images:

```python
import cv2
from pathlib import Path
from uniface import RetinaFace

detector = RetinaFace()

# Process all images in a folder
image_dir = Path("images/")
output_dir = Path("output/")
output_dir.mkdir(exist_ok=True)

for image_path in image_dir.glob("*.jpg"):
    print(f"Processing {image_path.name}...")

    image = cv2.imread(str(image_path))
    faces = detector.detect(image)

    print(f"  Found {len(faces)} face(s)")

    # Save results
    output_path = output_dir / image_path.name
    # ... draw and save ...

print("Done!")
```

---

## 12. Model Selection

Choose the right model for your use case:

### Detection Models

```python
from uniface.detection import RetinaFace, SCRFD, YOLOv5Face
from uniface.constants import RetinaFaceWeights, SCRFDWeights, YOLOv5FaceWeights

# Fast detection (mobile/edge devices)
detector = RetinaFace(
    model_name=RetinaFaceWeights.MNET_025,
    confidence_threshold=0.7
)

# Balanced (recommended)
detector = RetinaFace(
    model_name=RetinaFaceWeights.MNET_V2
)

# Real-time with high accuracy
detector = YOLOv5Face(
    model_name=YOLOv5FaceWeights.YOLOV5S,
    confidence_threshold=0.6,
    nms_thresh=0.5
)

# High accuracy (server/GPU)
detector = SCRFD(
    model_name=SCRFDWeights.SCRFD_10G_KPS,
    confidence_threshold=0.5
)
```

### Recognition Models

```python
from uniface import ArcFace, MobileFace, SphereFace
from uniface.constants import MobileFaceWeights, SphereFaceWeights

# ArcFace (recommended for most use cases)
recognizer = ArcFace()  # Best accuracy

# MobileFace (lightweight for mobile/edge)
recognizer = MobileFace(model_name=MobileFaceWeights.MNET_V2)  # Fast, small size

# SphereFace (angular margin approach)
recognizer = SphereFace(model_name=SphereFaceWeights.SPHERE20)  # Alternative method
```

### Gaze Estimation Models

```python
from uniface import MobileGaze
from uniface.constants import GazeWeights

# Default (recommended)
gaze_estimator = MobileGaze()  # Uses RESNET34

# Lightweight (mobile/edge devices)
gaze_estimator = MobileGaze(model_name=GazeWeights.MOBILEONE_S0)

# High accuracy
gaze_estimator = MobileGaze(model_name=GazeWeights.RESNET50)
```

### Face Parsing Models

```python
from uniface.parsing import BiSeNet
from uniface.constants import ParsingWeights

# Default (recommended, 50.7 MB)
parser = BiSeNet()  # Uses RESNET18

# Higher accuracy (89.2 MB)
parser = BiSeNet(model_name=ParsingWeights.RESNET34)
```

---

## Common Issues

### 1. Models Not Downloading

```python
# Manually download a model
from uniface.model_store import verify_model_weights
from uniface.constants import RetinaFaceWeights

model_path = verify_model_weights(RetinaFaceWeights.MNET_V2)
print(f"Model downloaded to: {model_path}")
```

### 2. Check Hardware Acceleration

```python
import onnxruntime as ort
print("Available providers:", ort.get_available_providers())

# macOS M-series should show: ['CoreMLExecutionProvider', ...]
# NVIDIA GPU should show: ['CUDAExecutionProvider', ...]
```

### 3. Slow Performance on Mac

The standard installation includes ARM64 optimizations for Apple Silicon. If performance is slow, verify you're using the ARM64 build of Python:

```bash
python -c "import platform; print(platform.machine())"
# Should show: arm64 (not x86_64)
```

### 4. Import Errors

```python
# Correct imports
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.landmark import Landmark106

# Wrong imports
from uniface import retinaface  # Module, not class
```

---

## Next Steps

### Jupyter Notebook Examples

Explore interactive examples for common tasks:

| Example | Description | Notebook |
|---------|-------------|----------|
| **Face Detection** | Detect faces and facial landmarks | [01_face_detection.ipynb](examples/01_face_detection.ipynb) |
| **Face Alignment** | Align and crop faces for recognition | [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) |
| **Face Verification** | Compare two faces to verify identity | [03_face_verification.ipynb](examples/03_face_verification.ipynb) |
| **Face Search** | Find a person in a group photo | [04_face_search.ipynb](examples/04_face_search.ipynb) |
| **Face Analyzer** | All-in-one detection, recognition & attributes | [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) |
| **Face Parsing** | Segment face into semantic components | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) |
| **Face Anonymization** | Blur or pixelate faces for privacy protection | [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) |
| **Gaze Estimation** | Estimate gaze direction | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) |

### Additional Resources

- **Model Benchmarks**: See [MODELS.md](MODELS.md) for performance comparisons
- **Full Documentation**: Read [README.md](README.md) for complete API reference

---

## References

- **RetinaFace Training**: [yakhyo/retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch)
- **YOLOv5-Face ONNX**: [yakhyo/yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference)
- **Face Recognition Training**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition)
- **Gaze Estimation Training**: [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation)
- **Face Parsing Training**: [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing)
- **FairFace**: [yakhyo/fairface-onnx](https://github.com/yakhyo/fairface-onnx) - Race, gender, age prediction
- **InsightFace**: [deepinsight/insightface](https://github.com/deepinsight/insightface)
684
README.md
684
README.md
@@ -2,683 +2,125 @@
|
||||
|
||||
<div align="center">
|
||||
|
||||
[](https://pypi.org/project/uniface/)
|
||||
[](https://www.python.org/)
|
||||
[](https://opensource.org/licenses/MIT)
|
||||
[](https://www.python.org/)
|
||||
[](https://pypi.org/project/uniface/)
|
||||
[](https://github.com/yakhyo/uniface/actions)
|
||||
[](https://pepy.tech/project/uniface)
|
||||
[](https://deepwiki.com/yakhyo/uniface)
|
||||
[](https://pepy.tech/projects/uniface)
|
||||
[](https://yakhyo.github.io/uniface/)
|
||||
|
||||
</div>
|
||||
|
||||
<div align="center">
|
||||
<img src=".github/logos/logo_web.webp" width=75%>
|
||||
<img src=".github/logos/logo_web.webp" width=80%>
|
||||
</div>
|
||||
|
||||
**UniFace** is a lightweight, production-ready face analysis library built on ONNX Runtime. It provides high-performance face detection, recognition, landmark detection, face parsing, gaze estimation, and attribute analysis with hardware acceleration support across platforms.
|
||||
|
||||
> 💬 **Have questions?** [Chat with this codebase on DeepWiki](https://deepwiki.com/yakhyo/uniface) - AI-powered docs that let you ask anything about UniFace.
|
||||
|
||||
---
|
||||
|
||||
## Features
|
||||
|
||||
- **High-Speed Face Detection**: ONNX-optimized RetinaFace, SCRFD, and YOLOv5-Face models
|
||||
- **Facial Landmark Detection**: Accurate 106-point landmark localization
|
||||
- **Face Recognition**: ArcFace, MobileFace, and SphereFace embeddings
|
||||
- **Face Parsing**: BiSeNet-based semantic segmentation with 19 facial component classes
|
||||
- **Gaze Estimation**: Real-time gaze direction prediction with MobileGaze
|
||||
- **Attribute Analysis**: Age, gender, race (FairFace), and emotion detection
|
||||
- **Anti-Spoofing**: Face liveness detection with MiniFASNet models
|
||||
- **Face Anonymization**: Privacy-preserving face blurring with 5 methods (pixelate, gaussian, blackout, elliptical, median)
|
||||
- **Face Alignment**: Precise alignment for downstream tasks
|
||||
- **Hardware Acceleration**: ARM64 optimizations (Apple Silicon), CUDA (NVIDIA), CPU fallback
|
||||
- **Simple API**: Intuitive factory functions and clean interfaces
|
||||
- **Production-Ready**: Type hints, comprehensive logging, PEP8 compliant
|
||||
- **Face Detection** — RetinaFace, SCRFD, YOLOv5-Face, and YOLOv8-Face with 5-point landmarks
|
||||
- **Face Recognition** — ArcFace, MobileFace, and SphereFace embeddings
|
||||
- **Facial Landmarks** — 106-point landmark localization
|
||||
- **Face Parsing** — BiSeNet semantic segmentation (19 classes)
|
||||
- **Gaze Estimation** — Real-time gaze direction with MobileGaze
|
||||
- **Attribute Analysis** — Age, gender, race (FairFace), and emotion
|
||||
- **Anti-Spoofing** — Face liveness detection with MiniFASNet
|
||||
- **Face Anonymization** — 5 blur methods for privacy protection
|
||||
- **Hardware Acceleration** — ARM64 (Apple Silicon), CUDA (NVIDIA), CPU
|
||||
|
||||
---
|
||||
|
||||
## Installation
|
||||
|
||||
### Quick Install (All Platforms)
|
||||
|
||||
```bash
|
||||
# Standard installation
|
||||
pip install uniface
|
||||
```
|
||||
|
||||
### Platform-Specific Installation
|
||||
|
||||
#### macOS (Apple Silicon - M1/M2/M3/M4)
|
||||
|
||||
For Apple Silicon Macs, the standard installation automatically includes optimized ARM64 support:
|
||||
|
||||
```bash
|
||||
pip install uniface
|
||||
```
|
||||
|
||||
The base `onnxruntime` package (included with uniface) has native Apple Silicon support with ARM64 optimizations built-in since version 1.13+.
|
||||
|
||||
#### Linux/Windows with NVIDIA GPU
|
||||
|
||||
For CUDA acceleration on NVIDIA GPUs:
|
||||
|
||||
```bash
|
||||
# GPU support (CUDA)
|
||||
pip install uniface[gpu]
|
||||
```
|
||||
|
||||
**Requirements:**
|
||||
|
||||
- CUDA 11.x or 12.x
|
||||
- cuDNN 8.x
|
||||
- See [ONNX Runtime GPU requirements](https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html)
|
||||
|
||||
#### CPU-Only (All Platforms)
|
||||
|
||||
```bash
|
||||
pip install uniface
|
||||
```
|
||||
|
||||
### Install from Source
|
||||
|
||||
```bash
|
||||
# From source
|
||||
git clone https://github.com/yakhyo/uniface.git
|
||||
cd uniface
|
||||
pip install -e .
|
||||
cd uniface && pip install -e .
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Quick Start
|
||||
|
||||
### Face Detection
|
||||
## Quick Example
|
||||
|
||||
```python
|
||||
import cv2
|
||||
from uniface import RetinaFace
|
||||
|
||||
# Initialize detector
|
||||
# Initialize detector (models auto-download on first use)
|
||||
detector = RetinaFace()
|
||||
|
||||
# Load image
|
||||
image = cv2.imread("image.jpg")
|
||||
|
||||
# Detect faces
|
||||
faces = detector.detect(image)
|
||||
|
||||
# Process results
|
||||
for face in faces:
|
||||
bbox = face.bbox # np.ndarray [x1, y1, x2, y2]
|
||||
confidence = face.confidence
|
||||
landmarks = face.landmarks # np.ndarray (5, 2) landmarks
|
||||
print(f"Face detected with confidence: {confidence:.2f}")
|
||||
```
|
||||
|
||||
### Face Recognition
|
||||
|
||||
```python
|
||||
from uniface import ArcFace, RetinaFace
|
||||
from uniface import compute_similarity
|
||||
|
||||
# Initialize models
|
||||
detector = RetinaFace()
|
||||
recognizer = ArcFace()
|
||||
|
||||
# Detect and extract embeddings
|
||||
faces1 = detector.detect(image1)
|
||||
faces2 = detector.detect(image2)
|
||||
|
||||
embedding1 = recognizer.get_normalized_embedding(image1, faces1[0].landmarks)
|
||||
embedding2 = recognizer.get_normalized_embedding(image2, faces2[0].landmarks)
|
||||
|
||||
# Compare faces
|
||||
similarity = compute_similarity(embedding1, embedding2)
|
||||
print(f"Similarity: {similarity:.4f}")
|
||||
```
|
||||
|
||||
### Facial Landmarks
|
||||
|
||||
```python
|
||||
from uniface import RetinaFace, Landmark106
|
||||
|
||||
detector = RetinaFace()
|
||||
landmarker = Landmark106()
|
||||
|
||||
faces = detector.detect(image)
|
||||
landmarks = landmarker.get_landmarks(image, faces[0].bbox)
|
||||
# Returns 106 (x, y) landmark points
|
||||
```
|
||||
|
||||
### Age & Gender Detection
|
||||
|
||||
```python
|
||||
from uniface import RetinaFace, AgeGender
|
||||
|
||||
detector = RetinaFace()
|
||||
age_gender = AgeGender()
|
||||
|
||||
faces = detector.detect(image)
|
||||
result = age_gender.predict(image, faces[0].bbox)
|
||||
print(f"{result.sex}, {result.age} years old")
|
||||
# result.gender: 0=Female, 1=Male
|
||||
# result.sex: "Female" or "Male"
|
||||
# result.age: age in years
|
||||
```
|
||||
|
||||
### FairFace Attributes (Race, Gender, Age Group)
|
||||
|
||||
```python
|
||||
from uniface import RetinaFace, FairFace
|
||||
|
||||
detector = RetinaFace()
|
||||
fairface = FairFace()
|
||||
|
||||
faces = detector.detect(image)
|
||||
result = fairface.predict(image, faces[0].bbox)
|
||||
print(f"{result.sex}, {result.age_group}, {result.race}")
|
||||
# result.gender: 0=Female, 1=Male
|
||||
# result.sex: "Female" or "Male"
|
||||
# result.age_group: "20-29", "30-39", etc.
|
||||
# result.race: "East Asian", "White", etc.
|
||||
```
|
||||
|
||||
### Gaze Estimation
|
||||
|
||||
```python
|
||||
from uniface import RetinaFace, MobileGaze
|
||||
from uniface.visualization import draw_gaze
|
||||
import numpy as np
|
||||
|
||||
detector = RetinaFace()
|
||||
gaze_estimator = MobileGaze()
|
||||
|
||||
faces = detector.detect(image)
|
||||
for face in faces:
|
||||
x1, y1, x2, y2 = map(int, face.bbox[:4])
|
||||
face_crop = image[y1:y2, x1:x2]
|
||||
|
||||
result = gaze_estimator.estimate(face_crop)
|
||||
print(f"Gaze: pitch={np.degrees(result.pitch):.1f}°, yaw={np.degrees(result.yaw):.1f}°")
|
||||
|
||||
# Visualize
|
||||
draw_gaze(image, face.bbox, result.pitch, result.yaw)
|
||||
```
|
||||
|
||||
### Face Parsing
|
||||
|
||||
```python
|
||||
from uniface.parsing import BiSeNet
|
||||
from uniface.visualization import vis_parsing_maps
|
||||
|
||||
# Initialize parser
|
||||
parser = BiSeNet() # Uses ResNet18 by default
|
||||
|
||||
# Parse face image (already cropped)
|
||||
mask = parser.parse(face_image)
|
||||
|
||||
# Visualize with overlay
|
||||
import cv2
|
||||
face_rgb = cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)
|
||||
vis_result = vis_parsing_maps(face_rgb, mask, save_image=False)
|
||||
|
||||
# mask contains 19 classes: skin, eyes, nose, mouth, hair, etc.
|
||||
print(f"Unique classes: {len(np.unique(mask))}")
|
||||
```
|
||||
|
||||
### Face Anti-Spoofing
|
||||
|
||||
Detect if a face is real or fake (photo, video replay, mask):
|
||||
|
||||
```python
|
||||
from uniface import RetinaFace
|
||||
from uniface.spoofing import MiniFASNet
|
||||
|
||||
detector = RetinaFace()
|
||||
spoofer = MiniFASNet() # Uses V2 by default
|
||||
|
||||
faces = detector.detect(image)
|
||||
for face in faces:
|
||||
result = spoofer.predict(image, face.bbox)
|
||||
# result.is_real: True for real, False for fake
|
||||
# result.confidence: confidence score
|
||||
label = 'Real' if result.is_real else 'Fake'
|
||||
print(f"{label}: {result.confidence:.1%}")
|
||||
```
|
||||
|
||||
### Face Anonymization
|
||||
|
||||
Protect privacy by blurring or pixelating faces with 5 different methods:
|
||||
|
||||
```python
|
||||
from uniface import RetinaFace
|
||||
from uniface.privacy import BlurFace, anonymize_faces
|
||||
import cv2
|
||||
|
||||
# Method 1: One-liner with automatic detection
|
||||
image = cv2.imread("photo.jpg")
|
||||
anonymized = anonymize_faces(image, method='pixelate')
|
||||
cv2.imwrite("anonymized.jpg", anonymized)
|
||||
|
||||
# Method 2: Manual control with custom parameters
|
||||
detector = RetinaFace()
|
||||
blurrer = BlurFace(method='gaussian', blur_strength=5.0)
|
||||
|
||||
faces = detector.detect(image)
|
||||
anonymized = blurrer.anonymize(image, faces)
|
||||
|
||||
# Available blur methods:
|
||||
methods = {
|
||||
'pixelate': BlurFace(method='pixelate', pixel_blocks=10), # Blocky effect (news media standard)
|
||||
'gaussian': BlurFace(method='gaussian', blur_strength=3.0), # Smooth, natural blur
|
||||
'blackout': BlurFace(method='blackout', color=(0, 0, 0)), # Solid color boxes (maximum privacy)
|
||||
'elliptical': BlurFace(method='elliptical', margin=20), # Soft oval blur (natural face shape)
|
||||
'median': BlurFace(method='median', blur_strength=3.0) # Edge-preserving blur
|
||||
}
|
||||
|
||||
# Real-time webcam anonymization
|
||||
cap = cv2.VideoCapture(0)
|
||||
detector = RetinaFace()
|
||||
blurrer = BlurFace(method='pixelate')
|
||||
|
||||
while True:
|
||||
ret, frame = cap.read()
|
||||
if not ret:
|
||||
break
|
||||
|
||||
faces = detector.detect(frame)
|
||||
frame = blurrer.anonymize(frame, faces, inplace=True)
|
||||
|
||||
cv2.imshow('Anonymized', frame)
|
||||
if cv2.waitKey(1) & 0xFF == ord('q'):
|
||||
break
|
||||
|
||||
cap.release()
|
||||
cv2.destroyAllWindows()
|
||||
for face in faces:
|
||||
print(f"Confidence: {face.confidence:.2f}")
|
||||
print(f"BBox: {face.bbox}")
|
||||
print(f"Landmarks: {face.landmarks.shape}")
|
||||
```
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Documentation
|
||||
|
||||
- [**QUICKSTART.md**](QUICKSTART.md) - 5-minute getting started guide
|
||||
- [**MODELS.md**](MODELS.md) - Model zoo, benchmarks, and selection guide
|
||||
- [**Examples**](examples/) - Jupyter notebooks with detailed examples
|
||||
|
||||
---
|
||||
|
||||
## API Overview
|
||||
|
||||
### Factory Functions (Recommended)
|
||||
|
||||
```python
|
||||
from uniface.detection import RetinaFace, SCRFD
|
||||
from uniface.recognition import ArcFace
|
||||
from uniface.landmark import Landmark106
|
||||
from uniface.privacy import BlurFace, anonymize_faces
|
||||
|
||||
from uniface.constants import SCRFDWeights
|
||||
|
||||
# Create detector with default settings
|
||||
detector = RetinaFace()
|
||||
|
||||
# Create with custom config
|
||||
detector = SCRFD(
|
||||
model_name=SCRFDWeights.SCRFD_10G_KPS, # SCRFDWeights.SCRFD_500M_KPS
|
||||
confidence_threshold=0.4,
|
||||
input_size=(640, 640)
|
||||
)
|
||||
# Or with defaults settings: detector = SCRFD()
|
||||
|
||||
# Recognition and landmarks
|
||||
recognizer = ArcFace()
|
||||
landmarker = Landmark106()
|
||||
```
|
||||
|
||||
### Direct Model Instantiation
|
||||
|
||||
```python
|
||||
from uniface import RetinaFace, SCRFD, YOLOv5Face, ArcFace, MobileFace, SphereFace
|
||||
from uniface.constants import RetinaFaceWeights, YOLOv5FaceWeights
|
||||
|
||||
# Detection
|
||||
detector = RetinaFace(
|
||||
model_name=RetinaFaceWeights.MNET_V2,
|
||||
confidence_threshold=0.5,
|
||||
nms_threshold=0.4
|
||||
)
|
||||
# Or detector = RetinaFace()
|
||||
|
||||
# YOLOv5-Face detection
|
||||
detector = YOLOv5Face(
|
||||
model_name=YOLOv5FaceWeights.YOLOV5S,
|
||||
confidence_threshold=0.6,
|
||||
nms_threshold=0.5
|
||||
)
|
||||
# Or detector = YOLOv5Face()
|
||||
|
||||
# Recognition
|
||||
recognizer = ArcFace() # Uses default weights
|
||||
recognizer = MobileFace() # Lightweight alternative
|
||||
recognizer = SphereFace() # Angular softmax alternative
|
||||
```
|
||||
|
||||
### High-Level Detection API
|
||||
|
||||
```python
|
||||
from uniface import detect_faces
|
||||
|
||||
# One-line face detection
|
||||
faces = detect_faces(image, method='retinaface', confidence_threshold=0.8) # methods: retinaface, scrfd, yolov5face
|
||||
```
|
||||
|
||||
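The returned list holds the same `Face` objects used throughout the examples above, so downstream code is identical:

```python
for face in faces:
    print(f"{face.confidence:.2f} {face.bbox}")
```
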
### Key Parameters (quick reference)
|
||||
|
||||
**Detection**
|
||||
|
||||
| Class | Key params (defaults) | Notes |
|
||||
| -------------- | ------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------- |
|
||||
| `RetinaFace` | `model_name=RetinaFaceWeights.MNET_V2`, `confidence_threshold=0.5`, `nms_threshold=0.4`, `input_size=(640, 640)`, `dynamic_size=False` | Supports 5-point landmarks |
|
||||
| `SCRFD` | `model_name=SCRFDWeights.SCRFD_10G_KPS`, `confidence_threshold=0.5`, `nms_threshold=0.4`, `input_size=(640, 640)` | Supports 5-point landmarks |
|
||||
| `YOLOv5Face` | `model_name=YOLOv5FaceWeights.YOLOV5S`, `confidence_threshold=0.6`, `nms_threshold=0.5`, `input_size=640` (fixed) | Supports 5-point landmarks; models: YOLOV5N/S/M; `input_size` must be 640 |
|
||||
|
||||
**Recognition**
|
||||
|
||||
| Class | Key params (defaults) | Notes |
|
||||
| -------------- | ----------------------------------------- | ------------------------------------- |
|
||||
| `ArcFace` | `model_name=ArcFaceWeights.MNET` | Returns 512-dim normalized embeddings |
|
||||
| `MobileFace` | `model_name=MobileFaceWeights.MNET_V2` | Lightweight embeddings |
|
||||
| `SphereFace` | `model_name=SphereFaceWeights.SPHERE20` | Angular softmax variant |
|
||||
|
||||
**Landmark & Attributes**
|
||||
|
||||
| Class | Key params (defaults) | Notes |
|
||||
| --------------- | --------------------------------------------------------------------- | --------------------------------------- |
|
||||
| `Landmark106` | No required params | 106-point landmarks |
|
||||
| `AgeGender` | `model_name=AgeGenderWeights.DEFAULT`; `input_size` auto-detected | Returns `AttributeResult` with gender, age |
|
||||
| `FairFace` | `model_name=FairFaceWeights.DEFAULT`, `input_size=(224, 224)` | Returns `AttributeResult` with gender, age_group, race |
|
||||
| `Emotion` | `model_weights=DDAMFNWeights.AFFECNET7`, `input_size=(112, 112)` | Requires 5-point landmarks; TorchScript |
|
||||
|
||||
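The attribute predictors share a small `predict(image, bbox)` surface. A minimal sketch (assumes `detector` and `image` as in the examples above):

```python
from uniface import AgeGender

age_gender = AgeGender()

for face in detector.detect(image):
    attrs = age_gender.predict(image, face.bbox)  # AttributeResult
    print(f"{attrs.sex}, {attrs.age} years")
```
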
**Gaze Estimation**
|
||||
|
||||
| Class | Key params (defaults) | Notes |
|
||||
| ------------- | ------------------------------------------ | ------------------------------------ |
|
||||
| `MobileGaze` | `model_name=GazeWeights.RESNET34` | Returns `GazeResult(pitch, yaw)` in radians; trained on Gaze360 |
|
||||
|
||||
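A sketch of reading those angles (the `uniface.gaze` import path is assumed by analogy with the other submodules; `face_crop` is a cropped face image):

```python
import numpy as np
from uniface.gaze import MobileGaze  # import path assumed

gaze_estimator = MobileGaze()
result = gaze_estimator.estimate(face_crop)  # GazeResult(pitch, yaw) in radians
print(f"pitch: {np.degrees(result.pitch):.1f} deg, yaw: {np.degrees(result.yaw):.1f} deg")
```
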
**Face Parsing**
|
||||
|
||||
| Class | Key params (defaults) | Notes |
|
||||
| ---------- | ---------------------------------------- | ------------------------------------ |
|
||||
| `BiSeNet` | `model_name=ParsingWeights.RESNET18`, `input_size=(512, 512)` | 19 facial component classes; BiSeNet architecture with ResNet backbone |
|
||||
|
||||
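A sketch of the parsing call (the `uniface.parsing` import path is assumed by analogy with the other submodules; `parse` returns an `(H, W)` class mask):

```python
import numpy as np
from uniface.parsing import BiSeNet  # import path assumed

parser = BiSeNet()
mask = parser.parse(face_image)  # (H, W) array of class IDs 0-18
print(f"Classes present: {np.unique(mask)}")
```
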
**Anti-Spoofing**
|
||||
|
||||
| Class | Key params (defaults) | Notes |
|
||||
| ------------- | ----------------------------------------- | ------------------------------------ |
|
||||
| `MiniFASNet` | `model_name=MiniFASNetWeights.V2` | Returns `SpoofingResult(is_real, confidence)` |
|
||||
|
||||
---
|
||||
|
||||
## Model Performance
|
||||
|
||||
### Face Detection (WIDER FACE Dataset)
|
||||
|
||||
| Model | Easy | Medium | Hard | Use Case |
|
||||
| ------------------ | ------ | ------ | ------ | ---------------------- |
|
||||
| retinaface_mnet025 | 88.48% | 87.02% | 80.61% | Mobile/Edge devices |
|
||||
| retinaface_mnet_v2 | 91.70% | 91.03% | 86.60% | Balanced (recommended) |
|
||||
| retinaface_r34 | 94.16% | 93.12% | 88.90% | High accuracy |
|
||||
| scrfd_500m | 90.57% | 88.12% | 68.51% | Real-time applications |
|
||||
| scrfd_10g | 95.16% | 93.87% | 83.05% | Best accuracy/speed |
|
||||
| yolov5n_face | 93.61% | 91.52% | 80.53% | Lightweight/Mobile |
|
||||
| yolov5s_face | 94.33% | 92.61% | 83.15% | Real-time + accuracy |
|
||||
| yolov5m_face | 95.30% | 93.76% | 85.28% | High accuracy |
|
||||
|
||||
_Accuracy values from original papers: [RetinaFace](https://arxiv.org/abs/1905.00641), [SCRFD](https://arxiv.org/abs/2105.04714), [YOLOv5-Face](https://arxiv.org/abs/2105.12931)_
|
||||
|
||||
**Benchmark on your hardware:**
|
||||
|
||||
```bash
|
||||
python tools/detection.py --source assets/test.jpg --iterations 100
|
||||
```
|
||||
|
||||
See [MODELS.md](MODELS.md) for detailed model information and selection guide.
|
||||
|
||||
<div align="center">
|
||||
<img src="assets/test_result.png">
|
||||
</div>
|
||||
|
||||
---
|
||||
|
||||
## Documentation
|
||||
|
||||
📚 **Full documentation**: [yakhyo.github.io/uniface](https://yakhyo.github.io/uniface/)
|
||||
|
||||
| Resource | Description |
|
||||
|----------|-------------|
|
||||
| [Quickstart](https://yakhyo.github.io/uniface/quickstart/) | Get up and running in 5 minutes |
|
||||
| [Model Zoo](https://yakhyo.github.io/uniface/models/) | All models, benchmarks, and selection guide |
|
||||
| [API Reference](https://yakhyo.github.io/uniface/modules/detection/) | Detailed module documentation |
|
||||
| [Tutorials](https://yakhyo.github.io/uniface/recipes/image-pipeline/) | Step-by-step workflow examples |
|
||||
| [Guides](https://yakhyo.github.io/uniface/concepts/overview/) | Architecture and design principles |
|
||||
|
||||
---

## Examples

### Jupyter Notebooks
|
||||
|
||||
Interactive examples covering common face analysis tasks:
|
||||
|
||||
| Example | Description | Notebook | Colab |
|---------|-------------|----------|:-----:|
| **Face Detection** | Detect faces and facial landmarks | [01_face_detection.ipynb](examples/01_face_detection.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/01_face_detection.ipynb) |
| **Face Alignment** | Align and crop faces for recognition | [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/02_face_alignment.ipynb) |
| **Face Verification** | Compare two faces to verify identity | [03_face_verification.ipynb](examples/03_face_verification.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/03_face_verification.ipynb) |
| **Face Search** | Find a person in a group photo | [04_face_search.ipynb](examples/04_face_search.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/04_face_search.ipynb) |
| **Face Analyzer** | All-in-one detection, recognition & attributes | [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/05_face_analyzer.ipynb) |
| **Face Parsing** | Segment face into semantic components | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/06_face_parsing.ipynb) |
| **Face Anonymization** | Blur or pixelate faces for privacy protection | [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) |
| **Gaze Estimation** | Estimate gaze direction from face images | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) |
|
||||
|
||||
### Webcam Face Detection
|
||||
|
||||
```python
|
||||
import cv2
|
||||
from uniface import RetinaFace
|
||||
from uniface.visualization import draw_detections
|
||||
|
||||
detector = RetinaFace()
|
||||
cap = cv2.VideoCapture(0)
|
||||
|
||||
while True:
|
||||
ret, frame = cap.read()
|
||||
if not ret:
|
||||
break
|
||||
|
||||
faces = detector.detect(frame)
|
||||
|
||||
# Extract data for visualization
|
||||
bboxes = [f.bbox for f in faces]
|
||||
scores = [f.confidence for f in faces]
|
||||
landmarks = [f.landmarks for f in faces]
|
||||
|
||||
draw_detections(
|
||||
image=frame,
|
||||
bboxes=bboxes,
|
||||
scores=scores,
|
||||
landmarks=landmarks,
|
||||
vis_threshold=0.6,
|
||||
)
|
||||
|
||||
cv2.imshow("Face Detection", frame)
|
||||
if cv2.waitKey(1) & 0xFF == ord('q'):
|
||||
break
|
||||
|
||||
cap.release()
|
||||
cv2.destroyAllWindows()
|
||||
```
|
||||
|
||||
### Face Search System
|
||||
|
||||
```python
|
||||
import cv2
import numpy as np
|
||||
from uniface import RetinaFace, ArcFace
|
||||
|
||||
detector = RetinaFace()
|
||||
recognizer = ArcFace()
|
||||
|
||||
# Build face database (person_images maps person_id -> image path)
|
||||
database = {}
|
||||
for person_id, image_path in person_images.items():
|
||||
image = cv2.imread(image_path)
|
||||
faces = detector.detect(image)
|
||||
if faces:
|
||||
embedding = recognizer.get_normalized_embedding(
|
||||
image, faces[0].landmarks
|
||||
)
|
||||
database[person_id] = embedding
|
||||
|
||||
# Search for a face
|
||||
query_image = cv2.imread("query.jpg")
|
||||
query_faces = detector.detect(query_image)
|
||||
if query_faces:
|
||||
query_embedding = recognizer.get_normalized_embedding(
|
||||
query_image, query_faces[0].landmarks
|
||||
)
|
||||
|
||||
# Find best match
|
||||
best_match = None
|
||||
best_similarity = -1
|
||||
|
||||
for person_id, db_embedding in database.items():
|
||||
similarity = np.dot(query_embedding, db_embedding.T)[0][0]
|
||||
if similarity > best_similarity:
|
||||
best_similarity = similarity
|
||||
best_match = person_id
|
||||
|
||||
print(f"Best match: {best_match} (similarity: {best_similarity:.4f})")
|
||||
```
|
||||
|
||||
More examples in the [examples/](examples/) directory.
|
||||
|
||||
---
|
||||
|
||||
## Advanced Configuration
|
||||
|
||||
### Custom ONNX Runtime Providers
|
||||
|
||||
```python
|
||||
from uniface.onnx_utils import get_available_providers, create_onnx_session
|
||||
|
||||
# Check available providers
|
||||
providers = get_available_providers()
|
||||
print(f"Available: {providers}")
|
||||
|
||||
# Default construction; the provider is selected automatically
|
||||
from uniface import RetinaFace
|
||||
detector = RetinaFace()
|
||||
# Internally uses create_onnx_session() which auto-selects best provider
|
||||
```
|
||||
|
||||
### Model Download and Caching
|
||||
|
||||
Models are automatically downloaded on first use and cached in `~/.uniface/models/`.
|
||||
|
||||
```python
|
||||
from uniface.model_store import verify_model_weights
|
||||
from uniface.constants import RetinaFaceWeights
|
||||
|
||||
# Manually download and verify a model
|
||||
model_path = verify_model_weights(
|
||||
RetinaFaceWeights.MNET_V2,
|
||||
root='./custom_models' # Custom cache directory
|
||||
)
|
||||
```
|
||||
|
||||
### Logging Configuration
|
||||
|
||||
```python
|
||||
from uniface import Logger
|
||||
import logging
|
||||
|
||||
# Set logging level
|
||||
Logger.setLevel(logging.DEBUG) # DEBUG, INFO, WARNING, ERROR
|
||||
|
||||
# Disable logging
|
||||
Logger.setLevel(logging.CRITICAL)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Testing
|
||||
|
||||
```bash
|
||||
# Run all tests
|
||||
pytest
|
||||
|
||||
# Run with coverage
|
||||
pytest --cov=uniface --cov-report=html
|
||||
|
||||
# Run specific test file
|
||||
pytest tests/test_retinaface.py -v
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Development
|
||||
|
||||
### Setup Development Environment
|
||||
|
||||
```bash
|
||||
git clone https://github.com/yakhyo/uniface.git
|
||||
cd uniface
|
||||
|
||||
# Install in editable mode with dev dependencies
|
||||
pip install -e ".[dev]"
|
||||
|
||||
# Run tests
|
||||
pytest
|
||||
```
|
||||
|
||||
### Code Formatting
|
||||
|
||||
This project uses [Ruff](https://docs.astral.sh/ruff/) for linting and formatting.
|
||||
|
||||
```bash
|
||||
# Format code
|
||||
ruff format .
|
||||
|
||||
# Check for linting errors
|
||||
ruff check .
|
||||
|
||||
# Auto-fix linting errors
|
||||
ruff check . --fix
|
||||
```
|
||||
|
||||
Ruff configuration is in `pyproject.toml`. Key settings:
|
||||
|
||||
- Line length: 120
|
||||
- Python target: 3.10+
|
||||
- Import sorting: `uniface` as first-party
|
||||
|
||||
### Project Structure
|
||||
|
||||
```
|
||||
uniface/
|
||||
├── uniface/
|
||||
│ ├── detection/ # Face detection models
|
||||
│ ├── recognition/ # Face recognition models
|
||||
│ ├── landmark/ # Landmark detection
|
||||
│ ├── parsing/ # Face parsing
|
||||
│ ├── gaze/ # Gaze estimation
|
||||
│ ├── attribute/ # Age, gender, emotion
|
||||
│ ├── spoofing/ # Face anti-spoofing
|
||||
│ ├── privacy/ # Face anonymization & blurring
|
||||
│ ├── onnx_utils.py # ONNX Runtime utilities
|
||||
│ ├── model_store.py # Model download & caching
|
||||
│ └── visualization.py # Drawing utilities
|
||||
├── tests/ # Unit tests
|
||||
├── examples/ # Example notebooks
|
||||
└── tools/ # CLI utilities
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- **RetinaFace Training**: [yakhyo/retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) - PyTorch implementation and training code
|
||||
- **YOLOv5-Face ONNX**: [yakhyo/yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference) - ONNX inference implementation
- **YOLOv8-Face ONNX**: [yakhyo/yolov8-face-onnx-inference](https://github.com/yakhyo/yolov8-face-onnx-inference) - ONNX inference implementation
|
||||
- **Face Recognition Training**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) - ArcFace, MobileFace, SphereFace training code
|
||||
- **Face Parsing Training**: [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing) - BiSeNet face parsing training code and pretrained weights
|
||||
- **Gaze Estimation Training**: [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation) - MobileGaze training code and pretrained weights
|
||||
- **Face Anti-Spoofing**: [yakhyo/face-anti-spoofing](https://github.com/yakhyo/face-anti-spoofing) - MiniFASNet ONNX inference (weights from [minivision-ai/Silent-Face-Anti-Spoofing](https://github.com/minivision-ai/Silent-Face-Anti-Spoofing))
|
||||
- **FairFace**: [yakhyo/fairface-onnx](https://github.com/yakhyo/fairface-onnx) - FairFace ONNX inference for race, gender, age prediction
|
||||
- **InsightFace**: [deepinsight/insightface](https://github.com/deepinsight/insightface) - Model architectures and pretrained weights
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Contributing
|
||||
|
||||
Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines, or open an issue or pull request on [GitHub](https://github.com/yakhyo/uniface).
|
||||
|
||||
## License
|
||||
|
||||
This project is licensed under the [MIT License](LICENSE).
|
||||
|
||||
BIN docs/assets/logo.png (new file, 33 KiB, binary not shown)
BIN docs/assets/logo.webp (new file, 33 KiB, binary not shown)
191
docs/concepts/coordinate-systems.md
Normal file
@@ -0,0 +1,191 @@
|
||||
# Coordinate Systems
|
||||
|
||||
This page explains the coordinate formats used in UniFace.
|
||||
|
||||
---
|
||||
|
||||
## Image Coordinates
|
||||
|
||||
All coordinates use **pixel-based, top-left origin**:
|
||||
|
||||
```
|
||||
(0, 0) ────────────────► x (width)
|
||||
│
|
||||
│ Image
|
||||
│
|
||||
▼
|
||||
y (height)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Bounding Box Format
|
||||
|
||||
Bounding boxes use `[x1, y1, x2, y2]` format (top-left and bottom-right corners):
|
||||
|
||||
```
|
||||
(x1, y1) ─────────────────┐
|
||||
│ │
|
||||
│ Face │
|
||||
│ │
|
||||
└─────────────────────┘ (x2, y2)
|
||||
```
|
||||
|
||||
### Accessing Coordinates
|
||||
|
||||
```python
|
||||
face = faces[0]
|
||||
|
||||
# Direct access
|
||||
x1, y1, x2, y2 = face.bbox
|
||||
|
||||
# As properties
|
||||
bbox_xyxy = face.bbox_xyxy # [x1, y1, x2, y2]
|
||||
bbox_xywh = face.bbox_xywh # [x1, y1, width, height]
|
||||
```
|
||||
|
||||
### Conversion
|
||||
|
||||
```python
|
||||
import numpy as np
|
||||
|
||||
# xyxy → xywh
|
||||
def xyxy_to_xywh(bbox):
|
||||
x1, y1, x2, y2 = bbox
|
||||
return np.array([x1, y1, x2 - x1, y2 - y1])
|
||||
|
||||
# xywh → xyxy
|
||||
def xywh_to_xyxy(bbox):
|
||||
x, y, w, h = bbox
|
||||
return np.array([x, y, x + w, y + h])
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Landmarks
|
||||
|
||||
### 5-Point Landmarks (Detection)
|
||||
|
||||
Returned by all detection models:
|
||||
|
||||
```python
|
||||
landmarks = face.landmarks # Shape: (5, 2)
|
||||
```
|
||||
|
||||
| Index | Point |
|
||||
|-------|-------|
|
||||
| 0 | Left Eye |
|
||||
| 1 | Right Eye |
|
||||
| 2 | Nose Tip |
|
||||
| 3 | Left Mouth Corner |
|
||||
| 4 | Right Mouth Corner |
|
||||
|
||||
```
|
||||
0 ● ● 1
|
||||
|
||||
● 2
|
||||
|
||||
3 ● ● 4
|
||||
```
|
||||
|
||||
### 106-Point Landmarks
|
||||
|
||||
Returned by `Landmark106`:
|
||||
|
||||
```python
|
||||
from uniface import Landmark106
|
||||
|
||||
landmarker = Landmark106()
|
||||
landmarks = landmarker.get_landmarks(image, face.bbox)
|
||||
# Shape: (106, 2)
|
||||
```
|
||||
|
||||
**Landmark Groups:**
|
||||
|
||||
| Range | Group | Points |
|
||||
|-------|-------|--------|
|
||||
| 0-32 | Face Contour | 33 |
|
||||
| 33-50 | Eyebrows | 18 |
|
||||
| 51-62 | Nose | 12 |
|
||||
| 63-86 | Eyes | 24 |
|
||||
| 87-105 | Mouth | 19 |
|
||||
|
||||
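A minimal sketch of slicing out a group using the index ranges above (`landmarks` is the `(106, 2)` array returned by `Landmark106`):

```python
contour = landmarks[0:33]   # face contour, 33 points
eyes = landmarks[63:87]     # both eyes, 24 points
mouth = landmarks[87:106]   # mouth, 19 points

mouth_center = mouth.mean(axis=0)  # e.g. an anchor point for overlays
```
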
---
|
||||
|
||||
## Face Crop
|
||||
|
||||
To crop a face from an image:
|
||||
|
||||
```python
|
||||
def crop_face(image, bbox, margin=0):
|
||||
"""Crop face with optional margin."""
|
||||
h, w = image.shape[:2]
|
||||
x1, y1, x2, y2 = map(int, bbox)
|
||||
|
||||
# Add margin
|
||||
if margin > 0:
|
||||
bw, bh = x2 - x1, y2 - y1
|
||||
x1 = max(0, x1 - int(bw * margin))
|
||||
y1 = max(0, y1 - int(bh * margin))
|
||||
x2 = min(w, x2 + int(bw * margin))
|
||||
y2 = min(h, y2 + int(bh * margin))
|
||||
|
||||
return image[y1:y2, x1:x2]
|
||||
|
||||
# Usage
|
||||
face_crop = crop_face(image, face.bbox, margin=0.1)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Gaze Angles
|
||||
|
||||
Gaze estimation returns pitch and yaw in **radians**:
|
||||
|
||||
```python
|
||||
result = gaze_estimator.estimate(face_crop)
|
||||
|
||||
# Angles in radians
|
||||
pitch = result.pitch # Vertical: + = up, - = down
|
||||
yaw = result.yaw # Horizontal: + = right, - = left
|
||||
|
||||
# Convert to degrees
|
||||
import numpy as np
|
||||
pitch_deg = np.degrees(pitch)
|
||||
yaw_deg = np.degrees(yaw)
|
||||
```
|
||||
|
||||
**Angle Reference:**
|
||||
|
||||
```
|
||||
pitch = +90° (up)
|
||||
│
|
||||
│
|
||||
yaw = -90° ────┼──── yaw = +90°
|
||||
(left) │ (right)
|
||||
│
|
||||
pitch = -90° (down)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Face Alignment
|
||||
|
||||
Face alignment uses 5-point landmarks to normalize face orientation:
|
||||
|
||||
```python
|
||||
from uniface import face_alignment
|
||||
|
||||
# Align face to standard template
|
||||
aligned_face = face_alignment(image, face.landmarks)
|
||||
# Output: 112x112 aligned face image
|
||||
```
|
||||
|
||||
The alignment transforms faces to a canonical pose for better recognition accuracy.
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [Inputs & Outputs](inputs-outputs.md) - Data types reference
|
||||
- [Recognition Module](../modules/recognition.md) - Face recognition details
|
||||
204
docs/concepts/execution-providers.md
Normal file
@@ -0,0 +1,204 @@
|
||||
# Execution Providers
|
||||
|
||||
UniFace uses ONNX Runtime for model inference, which supports multiple hardware acceleration backends.
|
||||
|
||||
---
|
||||
|
||||
## Automatic Provider Selection
|
||||
|
||||
UniFace automatically selects the optimal execution provider based on available hardware:
|
||||
|
||||
```python
|
||||
from uniface import RetinaFace
|
||||
|
||||
# Automatically uses best available provider
|
||||
detector = RetinaFace()
|
||||
```
|
||||
|
||||
**Priority order:**
|
||||
|
||||
1. **CUDAExecutionProvider** - NVIDIA GPU
|
||||
2. **CoreMLExecutionProvider** - Apple Silicon
|
||||
3. **CPUExecutionProvider** - Fallback
|
||||
|
||||
---
|
||||
|
||||
## Check Available Providers
|
||||
|
||||
```python
|
||||
import onnxruntime as ort
|
||||
|
||||
providers = ort.get_available_providers()
|
||||
print("Available providers:", providers)
|
||||
```
|
||||
|
||||
**Example outputs:**
|
||||
|
||||
=== "macOS (Apple Silicon)"
|
||||
|
||||
```
|
||||
['CoreMLExecutionProvider', 'CPUExecutionProvider']
|
||||
```
|
||||
|
||||
=== "Linux (NVIDIA GPU)"
|
||||
|
||||
```
|
||||
['CUDAExecutionProvider', 'CPUExecutionProvider']
|
||||
```
|
||||
|
||||
=== "Windows (CPU)"
|
||||
|
||||
```
|
||||
['CPUExecutionProvider']
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Platform-Specific Setup
|
||||
|
||||
### Apple Silicon (M1/M2/M3/M4)
|
||||
|
||||
No additional setup required. ARM64 optimizations are built into `onnxruntime`:
|
||||
|
||||
```bash
|
||||
pip install uniface
|
||||
```
|
||||
|
||||
Verify ARM64:
|
||||
|
||||
```bash
|
||||
python -c "import platform; print(platform.machine())"
|
||||
# Should show: arm64
|
||||
```
|
||||
|
||||
!!! tip "Performance"
|
||||
Apple Silicon Macs use CoreML acceleration automatically, providing excellent performance for face analysis tasks.
|
||||
|
||||
---
|
||||
|
||||
### NVIDIA GPU (CUDA)
|
||||
|
||||
Install with GPU support:
|
||||
|
||||
```bash
|
||||
pip install uniface[gpu]
|
||||
```
|
||||
|
||||
**Requirements:**
|
||||
|
||||
- CUDA 11.x or 12.x
|
||||
- cuDNN 8.x
|
||||
- Compatible NVIDIA driver
|
||||
|
||||
Verify CUDA:
|
||||
|
||||
```python
|
||||
import onnxruntime as ort
|
||||
|
||||
if 'CUDAExecutionProvider' in ort.get_available_providers():
|
||||
print("CUDA is available!")
|
||||
else:
|
||||
print("CUDA not available, using CPU")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### CPU Fallback
|
||||
|
||||
CPU execution is always available:
|
||||
|
||||
```bash
|
||||
pip install uniface
|
||||
```
|
||||
|
||||
Works on all platforms without additional configuration.
|
||||
|
||||
---
|
||||
|
||||
## Internal API
|
||||
|
||||
For advanced use cases, you can access the provider utilities:
|
||||
|
||||
```python
|
||||
from uniface.onnx_utils import get_available_providers, create_onnx_session
|
||||
|
||||
# Check available providers
|
||||
providers = get_available_providers()
|
||||
print(f"Available: {providers}")
|
||||
|
||||
# Models use create_onnx_session() internally
|
||||
# which auto-selects the best provider
|
||||
```
|
||||
|
||||
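To pin a specific provider for a one-off experiment, you can also open a cached model directly with the standard ONNX Runtime API (a sketch; the path assumes the default cache location and file name):

```python
import onnxruntime as ort
from pathlib import Path

model_path = Path.home() / ".uniface" / "models" / "retinaface_mv2.onnx"

# Explicit provider list, highest priority first
session = ort.InferenceSession(str(model_path), providers=["CPUExecutionProvider"])
print(session.get_providers())
```
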
---
|
||||
|
||||
## Performance Tips
|
||||
|
||||
### 1. Use GPU When Available
|
||||
|
||||
For batch processing or real-time applications, GPU acceleration provides significant speedups:
|
||||
|
||||
```bash
|
||||
pip install uniface[gpu]
|
||||
```
|
||||
|
||||
### 2. Optimize Input Size
|
||||
|
||||
Smaller input sizes are faster but may reduce accuracy:
|
||||
|
||||
```python
|
||||
from uniface import RetinaFace
|
||||
|
||||
# Faster, lower accuracy
|
||||
detector = RetinaFace(input_size=(320, 320))
|
||||
|
||||
# Balanced (default)
|
||||
detector = RetinaFace(input_size=(640, 640))
|
||||
```
|
||||
|
||||
### 3. Batch Processing
|
||||
|
||||
Reuse one detector session across many images to keep the GPU utilized:
|
||||
|
||||
```python
|
||||
import cv2

# Reuse one detector session across many images
# (avoids per-image model initialization)
for image_path in image_paths:
    image = cv2.imread(image_path)
    faces = detector.detect(image)
    # ...
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### CUDA Not Detected
|
||||
|
||||
1. Verify CUDA installation:
|
||||
```bash
|
||||
nvidia-smi
|
||||
```
|
||||
|
||||
2. Check CUDA version compatibility with ONNX Runtime
|
||||
|
||||
3. Reinstall with GPU support:
|
||||
```bash
|
||||
pip uninstall onnxruntime onnxruntime-gpu
|
||||
pip install uniface[gpu]
|
||||
```
|
||||
|
||||
### Slow Performance on Mac
|
||||
|
||||
Verify you're using ARM64 Python (not Rosetta):
|
||||
|
||||
```bash
|
||||
python -c "import platform; print(platform.machine())"
|
||||
# Should show: arm64 (not x86_64)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [Model Cache & Offline](model-cache-offline.md) - Model management
|
||||
- [Thresholds & Calibration](thresholds-calibration.md) - Tuning parameters
|
||||
218
docs/concepts/inputs-outputs.md
Normal file
@@ -0,0 +1,218 @@
|
||||
# Inputs & Outputs
|
||||
|
||||
This page describes the data types used throughout UniFace.
|
||||
|
||||
---
|
||||
|
||||
## Input: Images
|
||||
|
||||
All models accept NumPy arrays in **BGR format** (OpenCV default):
|
||||
|
||||
```python
|
||||
import cv2
|
||||
|
||||
# Load image (BGR format)
|
||||
image = cv2.imread("photo.jpg")
|
||||
print(f"Shape: {image.shape}") # (H, W, 3)
|
||||
print(f"Dtype: {image.dtype}") # uint8
|
||||
```
|
||||
|
||||
!!! warning "Color Format"
|
||||
UniFace expects **BGR** format (OpenCV default). If using PIL or other libraries, convert first:
|
||||
|
||||
```python
|
||||
from PIL import Image
|
||||
import numpy as np
|
||||
|
||||
pil_image = Image.open("photo.jpg")
|
||||
bgr_image = np.array(pil_image)[:, :, ::-1] # RGB → BGR
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Output: Face Dataclass
|
||||
|
||||
Detection returns a list of `Face` objects:
|
||||
|
||||
```python
|
||||
from dataclasses import dataclass
|
||||
import numpy as np
|
||||
|
||||
@dataclass
|
||||
class Face:
|
||||
# Required (from detection)
|
||||
bbox: np.ndarray # [x1, y1, x2, y2]
|
||||
confidence: float # 0.0 to 1.0
|
||||
landmarks: np.ndarray # (5, 2) or (106, 2)
|
||||
|
||||
# Optional (enriched by analyzers)
|
||||
embedding: np.ndarray | None = None
|
||||
gender: int | None = None # 0=Female, 1=Male
|
||||
age: int | None = None # Years
|
||||
age_group: str | None = None # "20-29", etc.
|
||||
race: str | None = None # "East Asian", etc.
|
||||
emotion: str | None = None # "Happy", etc.
|
||||
emotion_confidence: float | None = None
|
||||
```
|
||||
|
||||
### Properties
|
||||
|
||||
```python
|
||||
face = faces[0]
|
||||
|
||||
# Bounding box formats
|
||||
face.bbox_xyxy # [x1, y1, x2, y2] - same as bbox
|
||||
face.bbox_xywh # [x1, y1, width, height]
|
||||
|
||||
# Gender as string
|
||||
face.sex # "Female" or "Male" (None if not predicted)
|
||||
```
|
||||
|
||||
### Methods
|
||||
|
||||
```python
|
||||
# Compute similarity with another face
|
||||
similarity = face1.compute_similarity(face2)
|
||||
|
||||
# Convert to dictionary
|
||||
face_dict = face.to_dict()
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Result Types
|
||||
|
||||
### GazeResult
|
||||
|
||||
```python
|
||||
from dataclasses import dataclass
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class GazeResult:
|
||||
pitch: float # Vertical angle (radians), + = up
|
||||
yaw: float # Horizontal angle (radians), + = right
|
||||
```
|
||||
|
||||
**Usage:**
|
||||
|
||||
```python
|
||||
import numpy as np
|
||||
|
||||
result = gaze_estimator.estimate(face_crop)
|
||||
print(f"Pitch: {np.degrees(result.pitch):.1f}°")
|
||||
print(f"Yaw: {np.degrees(result.yaw):.1f}°")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### SpoofingResult
|
||||
|
||||
```python
|
||||
@dataclass(frozen=True)
|
||||
class SpoofingResult:
|
||||
is_real: bool # True = real, False = fake
|
||||
confidence: float # 0.0 to 1.0
|
||||
```
|
||||
|
||||
**Usage:**
|
||||
|
||||
```python
|
||||
result = spoofer.predict(image, face.bbox)
|
||||
label = "Real" if result.is_real else "Fake"
|
||||
print(f"{label}: {result.confidence:.1%}")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### AttributeResult
|
||||
|
||||
```python
|
||||
@dataclass(frozen=True)
|
||||
class AttributeResult:
|
||||
gender: int # 0=Female, 1=Male
|
||||
age: int | None # Years (AgeGender model)
|
||||
age_group: str | None # "20-29" (FairFace model)
|
||||
race: str | None # Race label (FairFace model)
|
||||
|
||||
@property
|
||||
def sex(self) -> str:
|
||||
return "Female" if self.gender == 0 else "Male"
|
||||
```
|
||||
|
||||
**Usage:**
|
||||
|
||||
```python
|
||||
# AgeGender model
|
||||
result = age_gender.predict(image, face.bbox)
|
||||
print(f"{result.sex}, {result.age} years old")
|
||||
|
||||
# FairFace model
|
||||
result = fairface.predict(image, face.bbox)
|
||||
print(f"{result.sex}, {result.age_group}, {result.race}")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### EmotionResult
|
||||
|
||||
```python
|
||||
@dataclass(frozen=True)
|
||||
class EmotionResult:
|
||||
emotion: str # "Happy", "Sad", etc.
|
||||
confidence: float # 0.0 to 1.0
|
||||
```
|
||||
|
||||
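**Usage** (a sketch mirroring the other result types; assumes an `Emotion` predictor and a detected `face`):

```python
result = emotion_predictor.predict(image, face.landmarks)
if result.confidence > 0.6:
    print(f"Emotion: {result.emotion} ({result.confidence:.1%})")
```
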
---
|
||||
|
||||
## Embeddings
|
||||
|
||||
Face recognition models return normalized 512-dimensional embeddings:
|
||||
|
||||
```python
|
||||
embedding = recognizer.get_normalized_embedding(image, landmarks)
|
||||
print(f"Shape: {embedding.shape}") # (1, 512)
|
||||
print(f"Norm: {np.linalg.norm(embedding):.4f}") # ~1.0
|
||||
```
|
||||
|
||||
### Similarity Computation
|
||||
|
||||
```python
|
||||
from uniface import compute_similarity
|
||||
|
||||
similarity = compute_similarity(embedding1, embedding2)
|
||||
# Returns: float between -1 and 1 (cosine similarity)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Parsing Masks
|
||||
|
||||
Face parsing returns a segmentation mask:
|
||||
|
||||
```python
|
||||
mask = parser.parse(face_image)
|
||||
print(f"Shape: {mask.shape}") # (H, W)
|
||||
print(f"Classes: {np.unique(mask)}") # [0, 1, 2, ...]
|
||||
```
|
||||
|
||||
**19 Classes:**
|
||||
|
||||
| ID | Class | ID | Class |
|
||||
|----|-------|----|-------|
|
||||
| 0 | Background | 10 | Ear Ring |
|
||||
| 1 | Skin | 11 | Nose |
|
||||
| 2 | Left Eyebrow | 12 | Mouth |
|
||||
| 3 | Right Eyebrow | 13 | Upper Lip |
|
||||
| 4 | Left Eye | 14 | Lower Lip |
|
||||
| 5 | Right Eye | 15 | Neck |
|
||||
| 6 | Eye Glasses | 16 | Neck Lace |
|
||||
| 7 | Left Ear | 17 | Cloth |
|
||||
| 8 | Right Ear | 18 | Hair |
|
||||
| 9 | Hat | | |
|
||||
|
||||
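To isolate one component, turn its class ID into a binary mask (a small sketch using the IDs above):

```python
import numpy as np

hair_mask = (mask == 18).astype(np.uint8) * 255  # 18 = Hair
skin_mask = (mask == 1).astype(np.uint8) * 255   # 1 = Skin
```
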
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [Coordinate Systems](coordinate-systems.md) - Bbox and landmark formats
|
||||
- [Thresholds & Calibration](thresholds-calibration.md) - Tuning confidence thresholds
|
||||
220
docs/concepts/model-cache-offline.md
Normal file
@@ -0,0 +1,220 @@
|
||||
# Model Cache & Offline Use
|
||||
|
||||
UniFace automatically downloads and caches models. This page explains how model management works.
|
||||
|
||||
---
|
||||
|
||||
## Automatic Download
|
||||
|
||||
Models are downloaded on first use:
|
||||
|
||||
```python
|
||||
from uniface import RetinaFace
|
||||
|
||||
# First run: downloads model to cache
|
||||
detector = RetinaFace() # ~3.5 MB download
|
||||
|
||||
# Subsequent runs: loads from cache
|
||||
detector = RetinaFace() # Instant
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Cache Location
|
||||
|
||||
Default cache directory:
|
||||
|
||||
```
|
||||
~/.uniface/models/
|
||||
```
|
||||
|
||||
**Example structure:**
|
||||
|
||||
```
|
||||
~/.uniface/models/
|
||||
├── retinaface_mv2.onnx
|
||||
├── w600k_mbf.onnx
|
||||
├── 2d106det.onnx
|
||||
├── gaze_resnet34.onnx
|
||||
├── parsing_resnet18.onnx
|
||||
└── ...
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Custom Cache Directory
|
||||
|
||||
Specify a custom cache location:
|
||||
|
||||
```python
|
||||
from uniface.model_store import verify_model_weights
|
||||
from uniface.constants import RetinaFaceWeights
|
||||
|
||||
# Download to custom directory
|
||||
model_path = verify_model_weights(
|
||||
RetinaFaceWeights.MNET_V2,
|
||||
root='./my_models'
|
||||
)
|
||||
print(f"Model at: {model_path}")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Pre-Download Models
|
||||
|
||||
Download models before deployment:
|
||||
|
||||
```python
|
||||
from uniface.model_store import verify_model_weights
|
||||
from uniface.constants import (
|
||||
RetinaFaceWeights,
|
||||
ArcFaceWeights,
|
||||
AgeGenderWeights,
|
||||
)
|
||||
|
||||
# Download all needed models
|
||||
models = [
|
||||
RetinaFaceWeights.MNET_V2,
|
||||
ArcFaceWeights.MNET,
|
||||
AgeGenderWeights.DEFAULT,
|
||||
]
|
||||
|
||||
for model in models:
|
||||
path = verify_model_weights(model)
|
||||
print(f"Downloaded: {path}")
|
||||
```
|
||||
|
||||
Or use the CLI tool:
|
||||
|
||||
```bash
|
||||
python tools/download_model.py
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Offline Use
|
||||
|
||||
For air-gapped or offline environments:
|
||||
|
||||
### 1. Pre-download models
|
||||
|
||||
On a connected machine:
|
||||
|
||||
```python
|
||||
from uniface.model_store import verify_model_weights
|
||||
from uniface.constants import RetinaFaceWeights
|
||||
|
||||
path = verify_model_weights(RetinaFaceWeights.MNET_V2)
|
||||
print(f"Copy from: {path}")
|
||||
```
|
||||
|
||||
### 2. Copy to target machine
|
||||
|
||||
```bash
|
||||
# Copy the entire cache directory
|
||||
scp -r ~/.uniface/models/ user@offline-machine:~/.uniface/models/
|
||||
```
|
||||
|
||||
### 3. Use normally
|
||||
|
||||
```python
|
||||
# Models load from local cache
|
||||
from uniface import RetinaFace
|
||||
detector = RetinaFace() # No network required
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Model Verification
|
||||
|
||||
Models are verified with SHA-256 checksums:
|
||||
|
||||
```python
|
||||
from uniface.constants import MODEL_SHA256, RetinaFaceWeights
|
||||
|
||||
# Check expected checksum
|
||||
expected = MODEL_SHA256[RetinaFaceWeights.MNET_V2]
|
||||
print(f"Expected SHA256: {expected}")
|
||||
```
|
||||
|
||||
If a model fails verification, it's re-downloaded automatically.
|
||||
|
||||
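To check a cached file by hand, hash it with the standard library (a sketch continuing the snippet above; the path assumes the default cache location):

```python
import hashlib
from pathlib import Path

model_file = Path.home() / ".uniface" / "models" / "retinaface_mv2.onnx"
actual = hashlib.sha256(model_file.read_bytes()).hexdigest()
print(f"Match: {actual == expected}")
```
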
---
|
||||
|
||||
## Available Models
|
||||
|
||||
### Detection Models
|
||||
|
||||
| Model | Size | Download |
|
||||
|-------|------|----------|
|
||||
| RetinaFace MNET_025 | 1.7 MB | ✅ |
|
||||
| RetinaFace MNET_V2 | 3.5 MB | ✅ |
|
||||
| RetinaFace RESNET34 | 56 MB | ✅ |
|
||||
| SCRFD 500M | 2.5 MB | ✅ |
|
||||
| SCRFD 10G | 17 MB | ✅ |
|
||||
| YOLOv5n-Face | 11 MB | ✅ |
|
||||
| YOLOv5s-Face | 28 MB | ✅ |
|
||||
| YOLOv5m-Face | 82 MB | ✅ |
|
||||
| YOLOv8-Lite-S | 7.4 MB | ✅ |
|
||||
| YOLOv8n-Face | 12 MB | ✅ |
|
||||
|
||||
### Recognition Models
|
||||
|
||||
| Model | Size | Download |
|
||||
|-------|------|----------|
|
||||
| ArcFace MNET | 8 MB | ✅ |
|
||||
| ArcFace RESNET | 166 MB | ✅ |
|
||||
| MobileFace MNET_V2 | 4 MB | ✅ |
|
||||
| SphereFace SPHERE20 | 50 MB | ✅ |
|
||||
|
||||
### Other Models
|
||||
|
||||
| Model | Size | Download |
|
||||
|-------|------|----------|
|
||||
| Landmark106 | 14 MB | ✅ |
|
||||
| AgeGender | 8 MB | ✅ |
|
||||
| FairFace | 44 MB | ✅ |
|
||||
| Gaze ResNet34 | 82 MB | ✅ |
|
||||
| BiSeNet ResNet18 | 51 MB | ✅ |
|
||||
| MiniFASNet V2 | 1.2 MB | ✅ |
|
||||
|
||||
---
|
||||
|
||||
## Clear Cache
|
||||
|
||||
Remove cached models:
|
||||
|
||||
```bash
|
||||
# Remove all cached models
|
||||
rm -rf ~/.uniface/models/
|
||||
|
||||
# Remove specific model
|
||||
rm ~/.uniface/models/retinaface_mv2.onnx
|
||||
```
|
||||
|
||||
Models will be re-downloaded on next use.
|
||||
|
||||
---
|
||||
|
||||
## Environment Variables
|
||||
|
||||
Set custom cache location via environment variable:
|
||||
|
||||
```bash
|
||||
export UNIFACE_CACHE_DIR=/path/to/custom/cache
|
||||
```
|
||||
|
||||
```python
|
||||
import os
|
||||
os.environ['UNIFACE_CACHE_DIR'] = '/path/to/custom/cache'
|
||||
|
||||
from uniface import RetinaFace
|
||||
detector = RetinaFace() # Uses custom cache
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [Thresholds & Calibration](thresholds-calibration.md) - Tune model parameters
|
||||
- [Detection Module](../modules/detection.md) - Detection model details
|
||||
196
docs/concepts/overview.md
Normal file
@@ -0,0 +1,196 @@
|
||||
# Overview
|
||||
|
||||
UniFace is designed as a modular, production-ready face analysis library. This page explains the architecture and design principles.
|
||||
|
||||
---
|
||||
|
||||
## Architecture
|
||||
|
||||
UniFace follows a modular architecture where each face analysis task is handled by a dedicated module:
|
||||
|
||||
```mermaid
|
||||
graph TB
|
||||
subgraph Input
|
||||
IMG[Image/Frame]
|
||||
end
|
||||
|
||||
subgraph Detection
|
||||
DET[RetinaFace / SCRFD / YOLOv5Face / YOLOv8Face]
|
||||
end
|
||||
|
||||
subgraph Analysis
|
||||
REC[Recognition]
|
||||
LMK[Landmarks]
|
||||
ATTR[Attributes]
|
||||
GAZE[Gaze]
|
||||
PARSE[Parsing]
|
||||
SPOOF[Anti-Spoofing]
|
||||
PRIV[Privacy]
|
||||
end
|
||||
|
||||
subgraph Output
|
||||
FACE[Face Objects]
|
||||
end
|
||||
|
||||
IMG --> DET
|
||||
DET --> REC
|
||||
DET --> LMK
|
||||
DET --> ATTR
|
||||
DET --> GAZE
|
||||
DET --> PARSE
|
||||
DET --> SPOOF
|
||||
DET --> PRIV
|
||||
REC --> FACE
|
||||
LMK --> FACE
|
||||
ATTR --> FACE
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Design Principles
|
||||
|
||||
### 1. ONNX-First
|
||||
|
||||
All models use ONNX Runtime for inference:
|
||||
|
||||
- **Cross-platform**: Same models work on macOS, Linux, Windows
|
||||
- **Hardware acceleration**: Automatic selection of optimal provider
|
||||
- **Production-ready**: No Python-only dependencies for inference
|
||||
|
||||
### 2. Minimal Dependencies
|
||||
|
||||
Core dependencies are kept minimal:
|
||||
|
||||
```
|
||||
numpy # Array operations
|
||||
opencv-python # Image processing
|
||||
onnxruntime # Model inference
|
||||
requests # Model download
|
||||
tqdm # Progress bars
|
||||
```
|
||||
|
||||
### 3. Simple API
|
||||
|
||||
Factory functions and direct instantiation:
|
||||
|
||||
```python
|
||||
# Factory function
|
||||
detector = create_detector('retinaface')
|
||||
|
||||
# Direct instantiation (recommended)
|
||||
from uniface import RetinaFace
|
||||
detector = RetinaFace()
|
||||
```
|
||||
|
||||
### 4. Type Safety
|
||||
|
||||
Full type hints throughout:
|
||||
|
||||
```python
|
||||
def detect(self, image: np.ndarray) -> list[Face]:
|
||||
...
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Module Structure
|
||||
|
||||
```
|
||||
uniface/
|
||||
├── detection/ # Face detection (RetinaFace, SCRFD, YOLOv5Face, YOLOv8Face)
|
||||
├── recognition/ # Face recognition (AdaFace, ArcFace, MobileFace, SphereFace)
|
||||
├── landmark/ # 106-point landmarks
|
||||
├── attribute/ # Age, gender, emotion, race
|
||||
├── parsing/ # Face semantic segmentation
|
||||
├── gaze/ # Gaze estimation
|
||||
├── spoofing/ # Anti-spoofing
|
||||
├── privacy/ # Face anonymization
|
||||
├── types.py # Dataclasses (Face, GazeResult, etc.)
|
||||
├── constants.py # Model weights and URLs
|
||||
├── model_store.py # Model download and caching
|
||||
├── onnx_utils.py # ONNX Runtime utilities
|
||||
└── visualization.py # Drawing utilities
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Workflow
|
||||
|
||||
A typical face analysis workflow:
|
||||
|
||||
```python
|
||||
import cv2
|
||||
from uniface import RetinaFace, ArcFace, AgeGender
|
||||
|
||||
# 1. Initialize models
|
||||
detector = RetinaFace()
|
||||
recognizer = ArcFace()
|
||||
age_gender = AgeGender()
|
||||
|
||||
# 2. Load image
|
||||
image = cv2.imread("photo.jpg")
|
||||
|
||||
# 3. Detect faces
|
||||
faces = detector.detect(image)
|
||||
|
||||
# 4. Analyze each face
|
||||
for face in faces:
|
||||
# Recognition embedding
|
||||
embedding = recognizer.get_normalized_embedding(image, face.landmarks)
|
||||
|
||||
# Attributes
|
||||
attrs = age_gender.predict(image, face.bbox)
|
||||
|
||||
print(f"Face: {attrs.sex}, {attrs.age} years")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## FaceAnalyzer
|
||||
|
||||
For convenience, `FaceAnalyzer` combines multiple modules:
|
||||
|
||||
```python
|
||||
from uniface import FaceAnalyzer
|
||||
|
||||
analyzer = FaceAnalyzer(
|
||||
detect=True,
|
||||
recognize=True,
|
||||
attributes=True
|
||||
)
|
||||
|
||||
faces = analyzer.analyze(image)
|
||||
for face in faces:
|
||||
print(f"Age: {face.age}, Gender: {face.sex}")
|
||||
print(f"Embedding: {face.embedding.shape}")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Model Lifecycle
|
||||
|
||||
1. **First use**: Model is downloaded from GitHub releases
|
||||
2. **Cached**: Stored in `~/.uniface/models/`
|
||||
3. **Verified**: SHA-256 checksum validation
|
||||
4. **Loaded**: ONNX Runtime session created
|
||||
5. **Inference**: Hardware-accelerated execution
|
||||
|
||||
```python
|
||||
# Models auto-download on first use
|
||||
detector = RetinaFace() # Downloads if not cached
|
||||
|
||||
# Or manually pre-download
|
||||
from uniface.model_store import verify_model_weights
|
||||
from uniface.constants import RetinaFaceWeights
|
||||
|
||||
path = verify_model_weights(RetinaFaceWeights.MNET_V2)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [Inputs & Outputs](inputs-outputs.md) - Understand data types
|
||||
- [Execution Providers](execution-providers.md) - Hardware acceleration
|
||||
- [Detection Module](../modules/detection.md) - Start with face detection
|
||||
- [Image Pipeline Recipe](../recipes/image-pipeline.md) - Complete workflow
|
||||
234
docs/concepts/thresholds-calibration.md
Normal file
@@ -0,0 +1,234 @@
|
||||
# Thresholds & Calibration
|
||||
|
||||
This page explains how to tune detection and recognition thresholds for your use case.
|
||||
|
||||
---
|
||||
|
||||
## Detection Thresholds
|
||||
|
||||
### Confidence Threshold
|
||||
|
||||
Controls minimum confidence for face detection:
|
||||
|
||||
```python
|
||||
from uniface import RetinaFace
|
||||
|
||||
# Default (balanced)
|
||||
detector = RetinaFace(confidence_threshold=0.5)
|
||||
|
||||
# High precision (fewer false positives)
|
||||
detector = RetinaFace(confidence_threshold=0.8)
|
||||
|
||||
# High recall (catch more faces)
|
||||
detector = RetinaFace(confidence_threshold=0.3)
|
||||
```
|
||||
|
||||
**Guidelines:**
|
||||
|
||||
| Threshold | Use Case |
|
||||
|-----------|----------|
|
||||
| 0.3 - 0.4 | Maximum recall (research, analysis) |
|
||||
| 0.5 - 0.6 | Balanced (default, general use) |
|
||||
| 0.7 - 0.9 | High precision (production, security) |
|
||||
|
||||
---
|
||||
|
||||
### NMS Threshold
|
||||
|
||||
Non-Maximum Suppression removes overlapping detections:
|
||||
|
||||
```python
|
||||
# Default
|
||||
detector = RetinaFace(nms_threshold=0.4)
|
||||
|
||||
# Stricter (fewer overlapping boxes)
|
||||
detector = RetinaFace(nms_threshold=0.3)
|
||||
|
||||
# Looser (for crowded scenes)
|
||||
detector = RetinaFace(nms_threshold=0.5)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Input Size
|
||||
|
||||
Affects detection accuracy and speed:
|
||||
|
||||
```python
|
||||
# Faster, lower accuracy
|
||||
detector = RetinaFace(input_size=(320, 320))
|
||||
|
||||
# Balanced (default)
|
||||
detector = RetinaFace(input_size=(640, 640))
|
||||
|
||||
# Higher accuracy, slower
|
||||
detector = RetinaFace(input_size=(1280, 1280))
|
||||
```
|
||||
|
||||
!!! tip "Dynamic Size"
|
||||
For RetinaFace, enable dynamic input for variable image sizes:
|
||||
```python
|
||||
detector = RetinaFace(dynamic_size=True)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Recognition Thresholds
|
||||
|
||||
### Similarity Threshold
|
||||
|
||||
For identity verification (same person check):
|
||||
|
||||
```python
|
||||
import numpy as np
|
||||
from uniface import compute_similarity
|
||||
|
||||
similarity = compute_similarity(embedding1, embedding2)
|
||||
|
||||
# Threshold interpretation
|
||||
if similarity > 0.6:
|
||||
print("Same person (high confidence)")
|
||||
elif similarity > 0.4:
|
||||
print("Uncertain (manual review)")
|
||||
else:
|
||||
print("Different people")
|
||||
```
|
||||
|
||||
**Recommended thresholds:**
|
||||
|
||||
| Threshold | Decision | False Accept Rate |
|
||||
|-----------|----------|-------------------|
|
||||
| 0.4 | Low security | Higher FAR |
|
||||
| 0.5 | Balanced | Moderate FAR |
|
||||
| 0.6 | High security | Lower FAR |
|
||||
| 0.7 | Very strict | Very low FAR |
|
||||
|
||||
---
|
||||
|
||||
### Calibration for Your Dataset
|
||||
|
||||
Test on your data to find optimal thresholds:
|
||||
|
||||
```python
|
||||
import cv2
import numpy as np
|
||||
|
||||
def calibrate_threshold(same_pairs, diff_pairs, recognizer, detector):
|
||||
"""Find optimal threshold for your dataset."""
|
||||
same_scores = []
|
||||
diff_scores = []
|
||||
|
||||
# Compute similarities for same-person pairs
|
||||
for img1_path, img2_path in same_pairs:
|
||||
img1 = cv2.imread(img1_path)
|
||||
img2 = cv2.imread(img2_path)
|
||||
|
||||
faces1 = detector.detect(img1)
|
||||
faces2 = detector.detect(img2)
|
||||
|
||||
if faces1 and faces2:
|
||||
emb1 = recognizer.get_normalized_embedding(img1, faces1[0].landmarks)
|
||||
emb2 = recognizer.get_normalized_embedding(img2, faces2[0].landmarks)
|
||||
same_scores.append(np.dot(emb1, emb2.T)[0][0])
|
||||
|
||||
    # Compute similarities for different-person pairs (same steps as above)
    for img1_path, img2_path in diff_pairs:
        img1 = cv2.imread(img1_path)
        img2 = cv2.imread(img2_path)

        faces1 = detector.detect(img1)
        faces2 = detector.detect(img2)

        if faces1 and faces2:
            emb1 = recognizer.get_normalized_embedding(img1, faces1[0].landmarks)
            emb2 = recognizer.get_normalized_embedding(img2, faces2[0].landmarks)
            diff_scores.append(np.dot(emb1, emb2.T)[0][0])
|
||||
|
||||
# Find optimal threshold
|
||||
thresholds = np.arange(0.3, 0.8, 0.05)
|
||||
best_threshold = 0.5
|
||||
best_accuracy = 0
|
||||
|
||||
for thresh in thresholds:
|
||||
tp = sum(1 for s in same_scores if s >= thresh)
|
||||
tn = sum(1 for s in diff_scores if s < thresh)
|
||||
accuracy = (tp + tn) / (len(same_scores) + len(diff_scores))
|
||||
|
||||
if accuracy > best_accuracy:
|
||||
best_accuracy = accuracy
|
||||
best_threshold = thresh
|
||||
|
||||
return best_threshold, best_accuracy
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Anti-Spoofing Thresholds
|
||||
|
||||
The MiniFASNet model returns a confidence score:
|
||||
|
||||
```python
|
||||
from uniface.spoofing import MiniFASNet
|
||||
|
||||
spoofer = MiniFASNet()
|
||||
result = spoofer.predict(image, face.bbox)
|
||||
|
||||
# Default threshold (0.5)
|
||||
if result.is_real: # confidence > 0.5
|
||||
print("Real face")
|
||||
|
||||
# Custom threshold for high security
|
||||
SPOOF_THRESHOLD = 0.7
|
||||
if result.confidence > SPOOF_THRESHOLD:
|
||||
print("Real face (high confidence)")
|
||||
else:
|
||||
print("Potentially fake")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Attribute Model Confidence
|
||||
|
||||
### Emotion
|
||||
|
||||
```python
|
||||
result = emotion_predictor.predict(image, landmarks)
|
||||
|
||||
# Filter low-confidence predictions
|
||||
if result.confidence > 0.6:
|
||||
print(f"Emotion: {result.emotion}")
|
||||
else:
|
||||
print("Uncertain emotion")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Visualization Threshold
|
||||
|
||||
For drawing detections, filter by confidence:
|
||||
|
||||
```python
|
||||
from uniface.visualization import draw_detections
|
||||
|
||||
# Only draw high-confidence detections
|
||||
bboxes = [f.bbox for f in faces if f.confidence > 0.7]
|
||||
scores = [f.confidence for f in faces if f.confidence > 0.7]
|
||||
landmarks = [f.landmarks for f in faces if f.confidence > 0.7]
|
||||
|
||||
draw_detections(
|
||||
image=image,
|
||||
bboxes=bboxes,
|
||||
scores=scores,
|
||||
landmarks=landmarks,
|
||||
vis_threshold=0.6 # Additional visualization filter
|
||||
)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Summary
|
||||
|
||||
| Parameter | Default | Range | Lower = | Higher = |
|
||||
|-----------|---------|-------|---------|----------|
|
||||
| `confidence_threshold` | 0.5 | 0.1-0.9 | More detections | Fewer false positives |
|
||||
| `nms_threshold` | 0.4 | 0.1-0.7 | Fewer overlaps | More overlapping boxes |
|
||||
| Similarity threshold | 0.6 | 0.3-0.8 | More matches (FAR↑) | Fewer matches (FRR↑) |
|
||||
| Spoof confidence | 0.5 | 0.3-0.9 | More "real" | Stricter liveness |
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [Detection Module](../modules/detection.md) - Detection model options
|
||||
- [Recognition Module](../modules/recognition.md) - Recognition model options
|
||||
72
docs/contributing.md
Normal file
@@ -0,0 +1,72 @@
|
||||
# Contributing
|
||||
|
||||
Thank you for contributing to UniFace!
|
||||
|
||||
---
|
||||
|
||||
## Quick Start
|
||||
|
||||
```bash
|
||||
# Clone
|
||||
git clone https://github.com/yakhyo/uniface.git
|
||||
cd uniface
|
||||
|
||||
# Install dev dependencies
|
||||
pip install -e ".[dev]"
|
||||
|
||||
# Run tests
|
||||
pytest
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Code Style
|
||||
|
||||
We use [Ruff](https://docs.astral.sh/ruff/) for formatting:
|
||||
|
||||
```bash
|
||||
ruff format .
|
||||
ruff check . --fix
|
||||
```
|
||||
|
||||
**Guidelines:**
|
||||
|
||||
- Line length: 120
|
||||
- Python 3.10+ type hints
|
||||
- Google-style docstrings
|
||||
|
||||
---
|
||||
|
||||
## Pre-commit Hooks
|
||||
|
||||
```bash
|
||||
pip install pre-commit
|
||||
pre-commit install
|
||||
pre-commit run --all-files
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Pull Request Process
|
||||
|
||||
1. Fork the repository
|
||||
2. Create a feature branch
|
||||
3. Write tests for new features
|
||||
4. Ensure tests pass
|
||||
5. Submit PR with clear description
|
||||
|
||||
---
|
||||
|
||||
## Adding New Models
|
||||
|
||||
1. Create model class in appropriate submodule
|
||||
2. Add weight constants to `uniface/constants.py` (see the sketch after this list)
|
||||
3. Export in `__init__.py` files
|
||||
4. Write tests in `tests/`
|
||||
5. Add example in `tools/` or notebooks
|
||||
|
||||
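For step 2, a hedged sketch of what the constants might look like (`MODEL_SHA256` is the registry used for download verification; the enum style and exact base classes are assumptions):

```python
# uniface/constants.py (sketch, not the actual definitions)
from enum import Enum

class MyModelWeights(Enum):
    DEFAULT = "my_model_v1"

# verify_model_weights() checks downloads against this registry
MODEL_SHA256[MyModelWeights.DEFAULT] = "<sha256-of-the-onnx-file>"
```
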
---
|
||||
|
||||
## Questions?
|
||||
|
||||
Open an issue on [GitHub](https://github.com/yakhyo/uniface/issues).
|
||||
133
docs/index.md
Normal file
@@ -0,0 +1,133 @@
|
||||
---
|
||||
hide:
|
||||
- toc
|
||||
- navigation
|
||||
- edit
|
||||
template: home.html
|
||||
---
|
||||
|
||||
<div class="hero" markdown>
|
||||
|
||||
# UniFace { .hero-title }
|
||||
|
||||
<p class="hero-subtitle">A lightweight, production-ready face analysis library built on ONNX Runtime</p>
|
||||
|
||||
[](https://pypi.org/project/uniface/)
|
||||
[](https://www.python.org/)
|
||||
[](https://opensource.org/licenses/MIT)
|
||||
[](https://pepy.tech/projects/uniface)
|
||||
|
||||
[Get Started](quickstart.md){ .md-button .md-button--primary }
|
||||
[View on GitHub](https://github.com/yakhyo/uniface){ .md-button }
|
||||
|
||||
</div>
|
||||
|
||||
<div class="feature-grid" markdown>
|
||||
|
||||
<div class="feature-card" markdown>
|
||||
### :material-face-recognition: Face Detection
|
||||
ONNX-optimized detectors (RetinaFace, SCRFD, YOLO) with 5-point landmarks.
|
||||
</div>
|
||||
|
||||
<div class="feature-card" markdown>
|
||||
### :material-account-check: Face Recognition
|
||||
AdaFace, ArcFace, MobileFace, and SphereFace embeddings for identity verification.
|
||||
</div>
|
||||
|
||||
<div class="feature-card" markdown>
|
||||
### :material-map-marker: Landmarks
|
||||
Accurate 106-point facial landmark localization for detailed face analysis.
|
||||
</div>
|
||||
|
||||
<div class="feature-card" markdown>
|
||||
### :material-account-details: Attributes
|
||||
Age, gender, race (FairFace), and emotion detection from faces.
|
||||
</div>
|
||||
|
||||
<div class="feature-card" markdown>
|
||||
### :material-face-man-shimmer: Face Parsing
|
||||
BiSeNet semantic segmentation with 19 facial component classes.
|
||||
</div>
|
||||
|
||||
<div class="feature-card" markdown>
|
||||
### :material-eye: Gaze Estimation
|
||||
Real-time gaze direction prediction with MobileGaze models.
|
||||
</div>
|
||||
|
||||
<div class="feature-card" markdown>
|
||||
### :material-shield-check: Anti-Spoofing
|
||||
Face liveness detection with MiniFASNet to prevent fraud.
|
||||
</div>
|
||||
|
||||
<div class="feature-card" markdown>
|
||||
### :material-blur: Privacy
|
||||
Face anonymization with 5 blur methods for privacy protection.
|
||||
</div>
|
||||
|
||||
</div>
|
||||
|
||||
---
|
||||
|
||||
## Installation
|
||||
|
||||
=== "Standard"
|
||||
|
||||
```bash
|
||||
pip install uniface
|
||||
```
|
||||
|
||||
=== "GPU (CUDA)"
|
||||
|
||||
```bash
|
||||
pip install uniface[gpu]
|
||||
```
|
||||
|
||||
=== "From Source"
|
||||
|
||||
```bash
|
||||
git clone https://github.com/yakhyo/uniface.git
|
||||
cd uniface
|
||||
pip install -e .
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
<div class="next-steps-grid" markdown>
|
||||
|
||||
<div class="feature-card" markdown>
|
||||
### :material-rocket-launch: Quickstart
|
||||
Get up and running in 5 minutes with common use cases.
|
||||
|
||||
[Quickstart Guide →](quickstart.md)
|
||||
</div>
|
||||
|
||||
<div class="feature-card" markdown>
|
||||
### :material-school: Tutorials
|
||||
Step-by-step examples for common workflows.
|
||||
|
||||
[View Tutorials →](recipes/image-pipeline.md)
|
||||
</div>
|
||||
|
||||
<div class="feature-card" markdown>
|
||||
### :material-api: API Reference
|
||||
Explore individual modules and their APIs.
|
||||
|
||||
[Browse API →](modules/detection.md)
|
||||
</div>
|
||||
|
||||
<div class="feature-card" markdown>
|
||||
### :material-book-open-variant: Guides
|
||||
Learn about the architecture and design principles.
|
||||
|
||||
[Read Guides →](concepts/overview.md)
|
||||
</div>
|
||||
|
||||
</div>
|
||||
|
||||
---
|
||||
|
||||
## License
|
||||
|
||||
UniFace is released under the [MIT License](https://opensource.org/licenses/MIT).
|
||||
174
docs/installation.md
Normal file
@@ -0,0 +1,174 @@
|
||||
# Installation
|
||||
|
||||
This guide covers all installation options for UniFace.
|
||||
|
||||
---
|
||||
|
||||
## Requirements
|
||||
|
||||
- **Python**: 3.10 or higher
|
||||
- **Operating Systems**: macOS, Linux, Windows
|
||||
|
||||
---
|
||||
|
||||
## Quick Install
|
||||
|
||||
The simplest way to install UniFace:
|
||||
|
||||
```bash
|
||||
pip install uniface
|
||||
```
|
||||
|
||||
This installs the CPU version with all core dependencies.
|
||||
|
||||
---
|
||||
|
||||
## Platform-Specific Installation
|
||||
|
||||
### macOS (Apple Silicon - M1/M2/M3/M4)
|
||||
|
||||
For Apple Silicon Macs, the standard installation automatically includes ARM64 optimizations:
|
||||
|
||||
```bash
|
||||
pip install uniface
|
||||
```
|
||||
|
||||
!!! tip "Native Performance"
|
||||
The base `onnxruntime` package has native Apple Silicon support with ARM64 optimizations built-in since version 1.13+. No additional configuration needed.
|
||||
|
||||
Verify ARM64 installation:
|
||||
|
||||
```bash
|
||||
python -c "import platform; print(platform.machine())"
|
||||
# Should show: arm64
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Linux/Windows with NVIDIA GPU
|
||||
|
||||
For CUDA acceleration on NVIDIA GPUs:
|
||||
|
||||
```bash
|
||||
pip install uniface[gpu]
|
||||
```
|
||||
|
||||
**Requirements:**
|
||||
|
||||
- CUDA 11.x or 12.x
|
||||
- cuDNN 8.x
|
||||
|
||||
!!! info "CUDA Compatibility"
|
||||
See [ONNX Runtime GPU requirements](https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html) for detailed compatibility matrix.
|
||||
|
||||
Verify GPU installation:
|
||||
|
||||
```python
|
||||
import onnxruntime as ort
|
||||
print("Available providers:", ort.get_available_providers())
|
||||
# Should include: 'CUDAExecutionProvider'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### CPU-Only (All Platforms)
|
||||
|
||||
```bash
|
||||
pip install uniface
|
||||
```
|
||||
|
||||
Works on all platforms with automatic CPU fallback.
|
||||
|
||||
---
|
||||
|
||||
## Install from Source
|
||||
|
||||
For development or the latest features:
|
||||
|
||||
```bash
|
||||
git clone https://github.com/yakhyo/uniface.git
|
||||
cd uniface
|
||||
pip install -e .
|
||||
```
|
||||
|
||||
With development dependencies:
|
||||
|
||||
```bash
|
||||
pip install -e ".[dev]"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Dependencies
|
||||
|
||||
UniFace has minimal dependencies:
|
||||
|
||||
| Package | Purpose |
|
||||
|---------|---------|
|
||||
| `numpy` | Array operations |
|
||||
| `opencv-python` | Image processing |
|
||||
| `onnxruntime` | Model inference |
|
||||
| `requests` | Model download |
|
||||
| `tqdm` | Progress bars |
|
||||
|
||||
---
|
||||
|
||||
## Verify Installation
|
||||
|
||||
Test your installation:
|
||||
|
||||
```python
|
||||
import uniface
|
||||
print(f"UniFace version: {uniface.__version__}")
|
||||
|
||||
# Check available ONNX providers
|
||||
import onnxruntime as ort
|
||||
print(f"Available providers: {ort.get_available_providers()}")
|
||||
|
||||
# Quick test
|
||||
from uniface import RetinaFace
|
||||
detector = RetinaFace()
|
||||
print("Installation successful!")
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Import Errors
|
||||
|
||||
If you encounter import errors, ensure you're using Python 3.11+:
|
||||
|
||||
```bash
|
||||
python --version
|
||||
# Should show: Python 3.11.x or higher
|
||||
```
|
||||
|
||||
### Model Download Issues
|
||||
|
||||
Models are automatically downloaded on first use. If downloads fail:
|
||||
|
||||
```python
|
||||
from uniface.model_store import verify_model_weights
|
||||
from uniface.constants import RetinaFaceWeights
|
||||
|
||||
# Manually download a model
|
||||
model_path = verify_model_weights(RetinaFaceWeights.MNET_V2)
|
||||
print(f"Model downloaded to: {model_path}")
|
||||
```
|
||||
|
||||
### Performance Issues on Mac
|
||||
|
||||
Verify you're using the ARM64 build (not x86_64 via Rosetta):
|
||||
|
||||
```bash
|
||||
python -c "import platform; print(platform.machine())"
|
||||
# Should show: arm64 (not x86_64)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Next Steps
|
||||
|
||||
- [Quickstart Guide](quickstart.md) - Get started in 5 minutes
|
||||
- [Execution Providers](concepts/execution-providers.md) - Hardware acceleration setup
|
||||
24
docs/license-attribution.md
Normal file
@@ -0,0 +1,24 @@
# Licenses & Attribution

## UniFace License

UniFace is released under the [MIT License](https://opensource.org/licenses/MIT).

---

## Model Credits

| Model | Source | License |
|-------|--------|---------|
| RetinaFace | [yakhyo/retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) | MIT |
| SCRFD | [InsightFace](https://github.com/deepinsight/insightface) | MIT |
| YOLOv5-Face | [yakhyo/yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference) | GPL-3.0 |
| YOLOv8-Face | [yakhyo/yolov8-face-onnx-inference](https://github.com/yakhyo/yolov8-face-onnx-inference) | GPL-3.0 |
| AdaFace | [yakhyo/adaface-onnx](https://github.com/yakhyo/adaface-onnx) | MIT |
| ArcFace | [InsightFace](https://github.com/deepinsight/insightface) | MIT |
| MobileFace | [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) | MIT |
| SphereFace | [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) | MIT |
| BiSeNet | [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing) | MIT |
| MobileGaze | [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation) | MIT |
| MiniFASNet | [yakhyo/face-anti-spoofing](https://github.com/yakhyo/face-anti-spoofing) | Apache-2.0 |
| FairFace | [yakhyo/fairface-onnx](https://github.com/yakhyo/fairface-onnx) | CC BY 4.0 |
358
docs/models.md
Normal file
@@ -0,0 +1,358 @@
# Model Zoo

Complete guide to all available models and their performance characteristics.

---

## Face Detection Models

### RetinaFace Family

RetinaFace models are trained on the WIDER FACE dataset.

| Model Name | Params | Size | Easy | Medium | Hard |
| -------------- | ------ | ----- | ------ | ------ | ------ |
| `MNET_025` | 0.4M | 1.7MB | 88.48% | 87.02% | 80.61% |
| `MNET_050` | 1.0M | 2.6MB | 89.42% | 87.97% | 82.40% |
| `MNET_V1` | 3.5M | 3.8MB | 90.59% | 89.14% | 84.13% |
| `MNET_V2` :material-check-circle: | 3.2M | 3.5MB | 91.70% | 91.03% | 86.60% |
| `RESNET18` | 11.7M | 27MB | 92.50% | 91.02% | 86.63% |
| `RESNET34` | 24.8M | 56MB | 94.16% | 93.12% | 88.90% |

!!! info "Accuracy & Benchmarks"
    **Accuracy**: WIDER FACE validation set (Easy/Medium/Hard subsets) - from the [RetinaFace paper](https://arxiv.org/abs/1905.00641)

    **Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --iterations 100`

---

### SCRFD Family

SCRFD (Sample and Computation Redistribution for Efficient Face Detection) models, trained on the WIDER FACE dataset.

| Model Name | Params | Size | Easy | Medium | Hard |
| ---------------- | ------ | ----- | ------ | ------ | ------ |
| `SCRFD_500M` | 0.6M | 2.5MB | 90.57% | 88.12% | 68.51% |
| `SCRFD_10G` :material-check-circle: | 4.2M | 17MB | 95.16% | 93.87% | 83.05% |

!!! info "Accuracy & Benchmarks"
    **Accuracy**: WIDER FACE validation set - from the [SCRFD paper](https://arxiv.org/abs/2105.04714)

    **Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --iterations 100`

---

### YOLOv5-Face Family

YOLOv5-Face models provide detection with 5-point facial landmarks and are trained on the WIDER FACE dataset.

| Model Name | Size | Easy | Medium | Hard |
| -------------- | ---- | ------ | ------ | ------ |
| `YOLOV5N` | 11MB | 93.61% | 91.52% | 80.53% |
| `YOLOV5S` :material-check-circle: | 28MB | 94.33% | 92.61% | 83.15% |
| `YOLOV5M` | 82MB | 95.30% | 93.76% | 85.28% |

!!! info "Accuracy & Benchmarks"
    **Accuracy**: WIDER FACE validation set - from the [YOLOv5-Face paper](https://arxiv.org/abs/2105.12931)

    **Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --iterations 100`

!!! note "Fixed Input Size"
    All YOLOv5-Face models use a fixed input size of 640×640.

---

### YOLOv8-Face Family

YOLOv8-Face models use an anchor-free design with DFL (Distribution Focal Loss) for bbox regression, and provide detection with 5-point facial landmarks.

| Model Name | Size | Easy | Medium | Hard |
| ---------------- | ------ | ------ | ------ | ------ |
| `YOLOV8_LITE_S` | 7.4MB | 93.4% | 91.2% | 78.6% |
| `YOLOV8N` :material-check-circle: | 12MB | 94.6% | 92.3% | 79.6% |

!!! info "Accuracy & Benchmarks"
    **Accuracy**: WIDER FACE validation set (Easy/Medium/Hard subsets)

    **Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --method yolov8face`

!!! note "Fixed Input Size"
    All YOLOv8-Face models use a fixed input size of 640×640.

---

## Face Recognition Models

### AdaFace

Face recognition using an adaptive margin based on image quality.

| Model Name | Backbone | Dataset | Size | IJB-B TAR | IJB-C TAR |
| ----------- | -------- | ----------- | ------ | --------- | --------- |
| `IR_18` :material-check-circle: | IR-18 | WebFace4M | 92 MB | 93.03% | 94.99% |
| `IR_101` | IR-101 | WebFace12M | 249 MB | - | 97.66% |

!!! info "Training Data & Accuracy"
    **Dataset**: WebFace4M (4M images) / WebFace12M (12M images)

    **Accuracy**: IJB-B and IJB-C benchmarks, TAR@FAR=0.01%

!!! tip "Key Innovation"
    AdaFace introduces an adaptive margin that adjusts to image quality, giving better performance on low-quality images than fixed-margin approaches.

---

### ArcFace

Face recognition using additive angular margin loss.

| Model Name | Backbone | Params | Size | LFW | CFP-FP | AgeDB-30 | IJB-C |
| ----------- | --------- | ------ | ----- | ------ | ------ | -------- | ----- |
| `MNET` :material-check-circle: | MobileNet | 2.0M | 8MB | 99.70% | 98.00% | 96.58% | 95.02% |
| `RESNET` | ResNet50 | 43.6M | 166MB | 99.83% | 99.33% | 98.23% | 97.25% |

!!! info "Training Data"
    **Dataset**: Trained on WebFace600K (600K images)

    **Accuracy**: IJB-C accuracy reported as TAR@FAR=1e-4
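Whichever recognition model produces them, embeddings are compared the same way: cosine similarity between two vectors. Below is a minimal sketch using plain NumPy on two placeholder embeddings; the 512-dim size and the 0.35 decision threshold are illustrative assumptions, not library defaults, so calibrate the threshold on your own data.

```python
import numpy as np

def cosine_similarity(emb_a: np.ndarray, emb_b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    emb_a = emb_a / np.linalg.norm(emb_a)
    emb_b = emb_b / np.linalg.norm(emb_b)
    return float(np.dot(emb_a, emb_b))

# In practice emb_a / emb_b come from a recognition model (e.g. ArcFace);
# random vectors are used here only to keep the sketch self-contained.
emb_a = np.random.rand(512).astype(np.float32)
emb_b = np.random.rand(512).astype(np.float32)

score = cosine_similarity(emb_a, emb_b)
if score > 0.35:  # example threshold only
    print(f"Same identity (score={score:.2f})")
else:
    print(f"Different identity (score={score:.2f})")
```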
---

### MobileFace

Lightweight face recognition models with MobileNet backbones.

| Model Name | Backbone | Params | Size | LFW | CALFW | CPLFW | AgeDB-30 |
| ----------------- | ---------------- | ------ | ---- | ------ | ------ | ------ | -------- |
| `MNET_025` | MobileNetV1 0.25 | 0.36M | 1MB | 98.76% | 92.02% | 82.37% | 90.02% |
| `MNET_V2` :material-check-circle: | MobileNetV2 | 2.29M | 4MB | 99.55% | 94.87% | 86.89% | 95.16% |
| `MNET_V3_SMALL` | MobileNetV3-S | 1.25M | 3MB | 99.30% | 93.77% | 85.29% | 92.79% |
| `MNET_V3_LARGE` | MobileNetV3-L | 3.52M | 10MB | 99.53% | 94.56% | 86.79% | 95.13% |

!!! info "Training Data"
    **Dataset**: Trained on MS1M-V2 (5.8M images, 85K identities)

    **Accuracy**: Evaluated on the LFW, CALFW, CPLFW, and AgeDB-30 benchmarks

---

### SphereFace

Face recognition using angular softmax loss.

| Model Name | Backbone | Params | Size | LFW | CALFW | CPLFW | AgeDB-30 |
| ------------ | -------- | ------ | ---- | ------ | ------ | ------ | -------- |
| `SPHERE20` | Sphere20 | 24.5M | 50MB | 99.67% | 95.61% | 88.75% | 96.58% |
| `SPHERE36` | Sphere36 | 34.6M | 92MB | 99.72% | 95.64% | 89.92% | 96.83% |

!!! info "Training Data"
    **Dataset**: Trained on MS1M-V2 (5.8M images, 85K identities)

    **Accuracy**: Evaluated on the LFW, CALFW, CPLFW, and AgeDB-30 benchmarks

!!! note "Architecture"
    SphereFace uses angular softmax loss, an earlier approach that predates ArcFace. These models provide good accuracy with moderate resource requirements.

---

## Facial Landmark Models

### 106-Point Landmark Detection

Facial landmark localization model.

| Model Name | Points | Params | Size |
| ---------- | ------ | ------ | ---- |
| `2D106` | 106 | 3.7M | 14MB |

**Landmark Groups:**

| Group | Points | Count |
|-------|--------|-------|
| Face contour | 0-32 | 33 points |
| Eyebrows | 33-50 | 18 points |
| Nose | 51-62 | 12 points |
| Eyes | 63-86 | 24 points |
| Mouth | 87-105 | 19 points |
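Since the model returns a (106, 2) array of (x, y) points, each group can be sliced directly from the index ranges above. A minimal sketch (the `landmarks` array is a placeholder standing in for real model output):

```python
import numpy as np

# Placeholder for a (106, 2) array returned by the 106-point model
landmarks = np.zeros((106, 2), dtype=np.float32)

contour = landmarks[0:33]    # face contour
eyebrows = landmarks[33:51]  # both eyebrows
nose = landmarks[51:63]
eyes = landmarks[63:87]      # both eyes
mouth = landmarks[87:106]
```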
---

## Attribute Analysis Models

### Age & Gender Detection

| Model Name | Attributes | Params | Size |
| ----------- | ----------- | ------ | ---- |
| `AgeGender` | Age, Gender | 2.1M | 8MB |

!!! info "Training Data"
    **Dataset**: Trained on CelebA

!!! warning "Accuracy Note"
    Accuracy varies by demographic and image quality. Test on your specific use case.

---

### FairFace Attributes

| Model Name | Attributes | Params | Size |
| ----------- | --------------------- | ------ | ----- |
| `FairFace` | Race, Gender, Age Group | - | 44MB |

!!! info "Training Data"
    **Dataset**: Trained on the FairFace dataset with balanced demographics

!!! tip "Equitable Predictions"
    FairFace provides more equitable predictions across different racial and gender groups.

**Race Categories (7):** White, Black, Latino Hispanic, East Asian, Southeast Asian, Indian, Middle Eastern

**Age Groups (9):** 0-2, 3-9, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70+

---

### Emotion Detection

| Model Name | Classes | Params | Size |
| ------------- | ------- | ------ | ---- |
| `AFFECNET7` | 7 | 0.5M | 2MB |
| `AFFECNET8` | 8 | 0.5M | 2MB |

**Classes (7)**: Neutral, Happy, Sad, Surprise, Fear, Disgust, Anger

**Classes (8)**: The above, plus Contempt

!!! info "Training Data"
    **Dataset**: Trained on AffectNet

!!! note "Accuracy Note"
    Emotion detection accuracy depends heavily on facial expression clarity and cultural context.

---

## Gaze Estimation Models

### MobileGaze Family

Gaze direction prediction models trained on the Gaze360 dataset. They return pitch (vertical) and yaw (horizontal) angles in radians.

| Model Name | Params | Size | MAE* |
| -------------- | ------ | ------- | ----- |
| `RESNET18` | 11.7M | 43 MB | 12.84 |
| `RESNET34` :material-check-circle: | 24.8M | 81.6 MB | 11.33 |
| `RESNET50` | 25.6M | 91.3 MB | 11.34 |
| `MOBILENET_V2` | 3.5M | 9.59 MB | 13.07 |
| `MOBILEONE_S0` | 2.1M | 4.8 MB | 12.58 |

*MAE (Mean Absolute Error) in degrees on the Gaze360 test set - lower is better

!!! info "Training Data"
    **Dataset**: Trained on Gaze360 (indoor/outdoor scenes with diverse head poses)

    **Training**: 200 epochs with a classification-based approach (binned angles)

!!! note "Input Requirements"
    Requires a face crop as input. Run face detection first to obtain bounding boxes.

---

## Face Parsing Models

### BiSeNet Family

BiSeNet (Bilateral Segmentation Network) models for semantic face parsing. They segment face images into 19 facial component classes.

| Model Name | Params | Size | Classes |
| -------------- | ------ | ------- | ------- |
| `RESNET18` :material-check-circle: | 13.3M | 50.7 MB | 19 |
| `RESNET34` | 24.1M | 89.2 MB | 19 |

!!! info "Training Data"
    **Dataset**: Trained on CelebAMask-HQ

    **Architecture**: BiSeNet with a ResNet backbone

    **Input Size**: 512×512 (automatically resized)

**19 Facial Component Classes:**

| # | Class | # | Class | # | Class |
|---|-------|---|-------|---|-------|
| 1 | Background | 8 | Left Ear | 15 | Neck |
| 2 | Skin | 9 | Right Ear | 16 | Neck Lace |
| 3 | Left Eyebrow | 10 | Ear Ring | 17 | Cloth |
| 4 | Right Eyebrow | 11 | Nose | 18 | Hair |
| 5 | Left Eye | 12 | Mouth | 19 | Hat |
| 6 | Right Eye | 13 | Upper Lip | | |
| 7 | Eye Glasses | 14 | Lower Lip | | |

**Applications:**

- Face makeup and beauty applications
- Virtual try-on systems
- Face editing and manipulation
- Facial feature extraction
- Portrait segmentation

!!! note "Input Requirements"
    Input should be a cropped face image. For a full pipeline, run face detection first to obtain face crops.

---

## Anti-Spoofing Models

### MiniFASNet Family

Face anti-spoofing models for liveness detection. They detect whether a face is real (live) or fake (photo, video replay, mask).

| Model Name | Size | Scale |
| ---------- | ------ | ----- |
| `V1SE` | 1.2 MB | 4.0 |
| `V2` :material-check-circle: | 1.2 MB | 2.7 |

!!! info "Output Format"
    **Output**: Returns `SpoofingResult(is_real, confidence)`, where `is_real` is `True` for a real (live) face and `False` for a fake

!!! note "Input Requirements"
    Requires a face bounding box from a detector.
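The intended flow is detect first, then check liveness per face. The sketch below assumes a `MiniFASNet` class exposed at the package root with a `predict(image, bbox)`-style method; both the import path and the method name are assumptions, so check the Anti-Spoofing module reference for the exact API.

```python
import cv2
from uniface import RetinaFace
# Assumed import path and method names; see the Anti-Spoofing module docs.
from uniface import MiniFASNet

detector = RetinaFace()
spoof_checker = MiniFASNet()

image = cv2.imread("photo.jpg")
faces = detector.detect(image)

for face in faces:
    # Returns SpoofingResult(is_real, confidence) per the output format above
    result = spoof_checker.predict(image, face.bbox)  # assumed signature
    label = "Real" if result.is_real else "Fake"
    print(f"{label} ({result.confidence:.2%})")
```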
---

## Model Management

Models are automatically downloaded and cached on first use.

- **Cache location**: `~/.uniface/models/`
- **Verification**: Models are verified with SHA-256 checksums
- **Manual download**: Use `python tools/download_model.py` to pre-download models, or pre-fetch from Python as shown below
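For scripted pre-fetching, the `verify_model_weights` helper used in the installation guide downloads a model (verifying its checksum) and returns the cached path. A minimal sketch, useful when baking a container image with no network access at runtime:

```python
from uniface.model_store import verify_model_weights
from uniface.constants import RetinaFaceWeights, SCRFDWeights

# Download (or reuse the cached copy of) each model ahead of time
for weights in (RetinaFaceWeights.MNET_V2, SCRFDWeights.SCRFD_10G_KPS):
    path = verify_model_weights(weights)
    print(f"Cached at: {path}")
```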
---

## References

### Model Training & Architectures

- **RetinaFace Training**: [yakhyo/retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) - PyTorch implementation and training code
- **YOLOv5-Face Original**: [deepcam-cn/yolov5-face](https://github.com/deepcam-cn/yolov5-face) - Original PyTorch implementation
- **YOLOv5-Face ONNX**: [yakhyo/yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference) - ONNX inference implementation
- **YOLOv8-Face Original**: [derronqi/yolov8-face](https://github.com/derronqi/yolov8-face) - Original PyTorch implementation
- **YOLOv8-Face ONNX**: [yakhyo/yolov8-face-onnx-inference](https://github.com/yakhyo/yolov8-face-onnx-inference) - ONNX inference implementation
- **AdaFace Original**: [mk-minchul/AdaFace](https://github.com/mk-minchul/AdaFace) - Original PyTorch implementation
- **AdaFace ONNX**: [yakhyo/adaface-onnx](https://github.com/yakhyo/adaface-onnx) - ONNX export and inference
- **Face Recognition Training**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) - ArcFace, MobileFace, SphereFace training code
- **Gaze Estimation Training**: [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation) - MobileGaze training code and pretrained weights
- **Face Parsing Training**: [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing) - BiSeNet training code and pretrained weights
- **Face Anti-Spoofing**: [yakhyo/face-anti-spoofing](https://github.com/yakhyo/face-anti-spoofing) - MiniFASNet ONNX inference (weights from [minivision-ai/Silent-Face-Anti-Spoofing](https://github.com/minivision-ai/Silent-Face-Anti-Spoofing))
- **FairFace**: [yakhyo/fairface-onnx](https://github.com/yakhyo/fairface-onnx) - FairFace ONNX inference for race, gender, and age prediction
- **InsightFace**: [deepinsight/insightface](https://github.com/deepinsight/insightface) - Model architectures and pretrained weights

### Papers

- **RetinaFace**: [Single-Shot Multi-Level Face Localisation in the Wild](https://arxiv.org/abs/1905.00641)
- **SCRFD**: [Sample and Computation Redistribution for Efficient Face Detection](https://arxiv.org/abs/2105.04714)
- **YOLOv5-Face**: [YOLO5Face: Why Reinventing a Face Detector](https://arxiv.org/abs/2105.12931)
- **AdaFace**: [AdaFace: Quality Adaptive Margin for Face Recognition](https://arxiv.org/abs/2204.00964)
- **ArcFace**: [Additive Angular Margin Loss for Deep Face Recognition](https://arxiv.org/abs/1801.07698)
- **SphereFace**: [Deep Hypersphere Embedding for Face Recognition](https://arxiv.org/abs/1704.08063)
- **BiSeNet**: [Bilateral Segmentation Network for Real-time Semantic Segmentation](https://arxiv.org/abs/1808.00897)
279
docs/modules/attributes.md
Normal file
@@ -0,0 +1,279 @@
# Attributes

Facial attribute analysis for age, gender, race, and emotion detection.

---

## Available Models

| Model | Attributes | Size | Notes |
|-------|------------|------|-------|
| **AgeGender** | Age, Gender | 8 MB | Exact age prediction |
| **FairFace** | Gender, Age Group, Race | 44 MB | Balanced demographics |
| **Emotion** | 7-8 emotions | 2 MB | Requires PyTorch |

---

## AgeGender

Predicts exact age and binary gender.

### Basic Usage

```python
import cv2

from uniface import RetinaFace, AgeGender

detector = RetinaFace()
age_gender = AgeGender()

image = cv2.imread("photo.jpg")
faces = detector.detect(image)

for face in faces:
    result = age_gender.predict(image, face.bbox)
    print(f"Gender: {result.sex}")  # "Female" or "Male"
    print(f"Age: {result.age} years")
```

### Output

```python
# AttributeResult fields
result.gender     # 0=Female, 1=Male
result.sex        # "Female" or "Male" (property)
result.age        # int, age in years
result.age_group  # None (not provided by this model)
result.race       # None (not provided by this model)
```

---

## FairFace

Predicts gender, age group, and race with balanced demographics.

### Basic Usage

```python
import cv2

from uniface import RetinaFace, FairFace

detector = RetinaFace()
fairface = FairFace()

image = cv2.imread("photo.jpg")
faces = detector.detect(image)

for face in faces:
    result = fairface.predict(image, face.bbox)
    print(f"Gender: {result.sex}")
    print(f"Age Group: {result.age_group}")
    print(f"Race: {result.race}")
```

### Output

```python
# AttributeResult fields
result.gender     # 0=Female, 1=Male
result.sex        # "Female" or "Male"
result.age        # None (not provided by this model)
result.age_group  # "20-29", "30-39", etc.
result.race       # Race/ethnicity label
```

### Race Categories

| Label |
|-------|
| White |
| Black |
| Latino Hispanic |
| East Asian |
| Southeast Asian |
| Indian |
| Middle Eastern |

### Age Groups

| Group |
|-------|
| 0-2 |
| 3-9 |
| 10-19 |
| 20-29 |
| 30-39 |
| 40-49 |
| 50-59 |
| 60-69 |
| 70+ |

---

## Emotion

Predicts facial emotions. Requires PyTorch.

!!! warning "Optional Dependency"
    Emotion detection requires PyTorch. Install with:

    ```bash
    pip install torch
    ```

### Basic Usage

```python
import cv2

from uniface import RetinaFace
from uniface.attribute import Emotion
from uniface.constants import DDAMFNWeights

detector = RetinaFace()
emotion = Emotion(model_name=DDAMFNWeights.AFFECNET7)

image = cv2.imread("photo.jpg")
faces = detector.detect(image)

for face in faces:
    result = emotion.predict(image, face.landmarks)
    print(f"Emotion: {result.emotion}")
    print(f"Confidence: {result.confidence:.2%}")
```

### Emotion Classes

=== "7-Class (AFFECNET7)"

    | Label |
    |-------|
    | Neutral |
    | Happy |
    | Sad |
    | Surprise |
    | Fear |
    | Disgust |
    | Anger |

=== "8-Class (AFFECNET8)"

    | Label |
    |-------|
    | Neutral |
    | Happy |
    | Sad |
    | Surprise |
    | Fear |
    | Disgust |
    | Anger |
    | Contempt |

### Model Variants

```python
from uniface.attribute import Emotion
from uniface.constants import DDAMFNWeights

# 7-class emotion
emotion = Emotion(model_name=DDAMFNWeights.AFFECNET7)

# 8-class emotion
emotion = Emotion(model_name=DDAMFNWeights.AFFECNET8)
```

---

## Combining Models

### Full Attribute Analysis

```python
import cv2

from uniface import RetinaFace, AgeGender, FairFace

detector = RetinaFace()
age_gender = AgeGender()
fairface = FairFace()

image = cv2.imread("photo.jpg")
faces = detector.detect(image)

for face in faces:
    # Get exact age from AgeGender
    ag_result = age_gender.predict(image, face.bbox)

    # Get race from FairFace
    ff_result = fairface.predict(image, face.bbox)

    print(f"Gender: {ag_result.sex}")
    print(f"Exact Age: {ag_result.age}")
    print(f"Age Group: {ff_result.age_group}")
    print(f"Race: {ff_result.race}")
```

### Using FaceAnalyzer

```python
from uniface import FaceAnalyzer

analyzer = FaceAnalyzer(
    detect=True,
    recognize=False,
    attributes=True  # Uses AgeGender
)

faces = analyzer.analyze(image)

for face in faces:
    print(f"Age: {face.age}, Gender: {face.sex}")
```

---

## Visualization

```python
import cv2

def draw_attributes(image, face, result):
    """Draw attributes on image."""
    x1, y1, x2, y2 = map(int, face.bbox)

    # Draw bounding box
    cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)

    # Build label
    label = f"{result.sex}"
    if result.age:
        label += f", {result.age}y"
    if result.age_group:
        label += f", {result.age_group}"
    if result.race:
        label += f", {result.race}"

    # Draw label
    cv2.putText(
        image, label, (x1, y1 - 10),
        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2
    )

    return image

# Usage
for face in faces:
    result = age_gender.predict(image, face.bbox)
    image = draw_attributes(image, face, result)

cv2.imwrite("attributes.jpg", image)
```

---

## Accuracy Notes

!!! note "Model Limitations"
    - **AgeGender**: Trained on CelebA; accuracy varies by demographic
    - **FairFace**: Trained for balanced demographics; better cross-racial accuracy
    - **Emotion**: Accuracy depends on facial expression clarity

Always test on your specific use case and consider cultural context.

---

## Next Steps

- [Parsing](parsing.md) - Face semantic segmentation
- [Gaze](gaze.md) - Gaze estimation
- [Image Pipeline Recipe](../recipes/image-pipeline.md) - Complete workflow
305
docs/modules/detection.md
Normal file
@@ -0,0 +1,305 @@
# Detection

Face detection is the first step in any face analysis pipeline. UniFace provides four detection models.

---

## Available Models

| Model | Backbone | Size | Easy | Medium | Hard | Landmarks |
|-------|----------|------|------|--------|------|:---------:|
| **RetinaFace** | MobileNet V2 | 3.5 MB | 91.7% | 91.0% | 86.6% | :material-check: |
| **SCRFD** | SCRFD-10G | 17 MB | 95.2% | 93.9% | 83.1% | :material-check: |
| **YOLOv5-Face** | YOLOv5s | 28 MB | 94.3% | 92.6% | 83.2% | :material-check: |
| **YOLOv8-Face** | YOLOv8n | 12 MB | 94.6% | 92.3% | 79.6% | :material-check: |

!!! note "Dataset"
    All models are trained on the WIDER FACE dataset.

---

## RetinaFace

Single-shot face detector with a multi-scale feature pyramid.

### Basic Usage

```python
import cv2

from uniface import RetinaFace

detector = RetinaFace()

image = cv2.imread("photo.jpg")
faces = detector.detect(image)

for face in faces:
    print(f"Confidence: {face.confidence:.2f}")
    print(f"BBox: {face.bbox}")
    print(f"Landmarks: {face.landmarks.shape}")  # (5, 2)
```

### Model Variants

```python
from uniface import RetinaFace
from uniface.constants import RetinaFaceWeights

# Lightweight (mobile/edge)
detector = RetinaFace(model_name=RetinaFaceWeights.MNET_025)

# Balanced (default)
detector = RetinaFace(model_name=RetinaFaceWeights.MNET_V2)

# High accuracy
detector = RetinaFace(model_name=RetinaFaceWeights.RESNET34)
```

| Variant | Params | Size | Easy | Medium | Hard |
|---------|--------|------|------|--------|------|
| MNET_025 | 0.4M | 1.7 MB | 88.5% | 87.0% | 80.6% |
| MNET_050 | 1.0M | 2.6 MB | 89.4% | 88.0% | 82.4% |
| MNET_V1 | 3.5M | 3.8 MB | 90.6% | 89.1% | 84.1% |
| **MNET_V2** :material-check-circle: | 3.2M | 3.5 MB | 91.7% | 91.0% | 86.6% |
| RESNET18 | 11.7M | 27 MB | 92.5% | 91.0% | 86.6% |
| RESNET34 | 24.8M | 56 MB | 94.2% | 93.1% | 88.9% |

### Configuration

```python
detector = RetinaFace(
    model_name=RetinaFaceWeights.MNET_V2,
    confidence_threshold=0.5,  # Min confidence
    nms_threshold=0.4,         # NMS IoU threshold
    input_size=(640, 640),     # Input resolution
    dynamic_size=False         # Enable dynamic input size
)
```

---

## SCRFD

State-of-the-art detection with an excellent accuracy-speed tradeoff.

### Basic Usage

```python
from uniface import SCRFD

detector = SCRFD()
faces = detector.detect(image)
```

### Model Variants

```python
from uniface import SCRFD
from uniface.constants import SCRFDWeights

# Real-time (lightweight)
detector = SCRFD(model_name=SCRFDWeights.SCRFD_500M_KPS)

# High accuracy (default)
detector = SCRFD(model_name=SCRFDWeights.SCRFD_10G_KPS)
```

| Variant | Params | Size | Easy | Medium | Hard |
|---------|--------|------|------|--------|------|
| SCRFD_500M_KPS | 0.6M | 2.5 MB | 90.6% | 88.1% | 68.5% |
| **SCRFD_10G_KPS** :material-check-circle: | 4.2M | 17 MB | 95.2% | 93.9% | 83.1% |

### Configuration

```python
detector = SCRFD(
    model_name=SCRFDWeights.SCRFD_10G_KPS,
    confidence_threshold=0.5,
    nms_threshold=0.4,
    input_size=(640, 640)
)
```

---

## YOLOv5-Face

YOLO-based detection optimized for faces.

### Basic Usage

```python
from uniface import YOLOv5Face

detector = YOLOv5Face()
faces = detector.detect(image)
```

### Model Variants

```python
from uniface import YOLOv5Face
from uniface.constants import YOLOv5FaceWeights

# Lightweight
detector = YOLOv5Face(model_name=YOLOv5FaceWeights.YOLOV5N)

# Balanced (default)
detector = YOLOv5Face(model_name=YOLOv5FaceWeights.YOLOV5S)

# High accuracy
detector = YOLOv5Face(model_name=YOLOv5FaceWeights.YOLOV5M)
```

| Variant | Size | Easy | Medium | Hard |
|---------|------|------|--------|------|
| YOLOV5N | 11 MB | 93.6% | 91.5% | 80.5% |
| **YOLOV5S** :material-check-circle: | 28 MB | 94.3% | 92.6% | 83.2% |
| YOLOV5M | 82 MB | 95.3% | 93.8% | 85.3% |

!!! note "Fixed Input Size"
    YOLOv5-Face uses a fixed input size of 640×640.

### Configuration

```python
detector = YOLOv5Face(
    model_name=YOLOv5FaceWeights.YOLOV5S,
    confidence_threshold=0.6,
    nms_threshold=0.5,
    nms_mode='numpy'  # or 'torchvision' for faster NMS
)
```

---

## YOLOv8-Face

Anchor-free detection with DFL (Distribution Focal Loss) for accurate bbox regression.

### Basic Usage

```python
from uniface import YOLOv8Face

detector = YOLOv8Face()
faces = detector.detect(image)
```

### Model Variants

```python
from uniface import YOLOv8Face
from uniface.constants import YOLOv8FaceWeights

# Lightweight
detector = YOLOv8Face(model_name=YOLOv8FaceWeights.YOLOV8_LITE_S)

# Recommended (default)
detector = YOLOv8Face(model_name=YOLOv8FaceWeights.YOLOV8N)
```

| Variant | Size | Easy | Medium | Hard |
|---------|------|------|--------|------|
| YOLOV8_LITE_S | 7.4 MB | 93.4% | 91.2% | 78.6% |
| **YOLOV8N** :material-check-circle: | 12 MB | 94.6% | 92.3% | 79.6% |

!!! note "Fixed Input Size"
    YOLOv8-Face uses a fixed input size of 640×640.

### Configuration

```python
detector = YOLOv8Face(
    model_name=YOLOv8FaceWeights.YOLOV8N,
    confidence_threshold=0.5,
    nms_threshold=0.45,
    nms_mode='numpy'  # or 'torchvision' for faster NMS
)
```

---

## Factory Function

Create detectors dynamically:

```python
from uniface import create_detector

detector = create_detector('retinaface')
# or
detector = create_detector('scrfd')
# or
detector = create_detector('yolov5face')
# or
detector = create_detector('yolov8face')
```

---

## High-Level API

One-line detection:

```python
from uniface import detect_faces

# Using RetinaFace (default)
faces = detect_faces(image, method='retinaface', confidence_threshold=0.5)

# Using YOLOv8-Face
faces = detect_faces(image, method='yolov8face', confidence_threshold=0.5)
```

---

## Output Format

All detectors return `list[Face]`:

```python
for face in faces:
    # Bounding box [x1, y1, x2, y2]
    bbox = face.bbox

    # Detection confidence (0-1)
    confidence = face.confidence

    # 5-point landmarks (5, 2)
    landmarks = face.landmarks
    # [left_eye, right_eye, nose, left_mouth, right_mouth]
```

---

## Visualization

```python
import cv2

from uniface.visualization import draw_detections

draw_detections(
    image=image,
    bboxes=[f.bbox for f in faces],
    scores=[f.confidence for f in faces],
    landmarks=[f.landmarks for f in faces],
    vis_threshold=0.6
)

cv2.imwrite("result.jpg", image)
```

---

## Performance Comparison

Benchmark on your hardware:

```bash
python tools/detection.py --source image.jpg --iterations 100
```

---

## See Also

- [Recognition Module](recognition.md) - Extract embeddings from detected faces
- [Landmarks Module](landmarks.md) - Get 106-point landmarks
- [Image Pipeline Recipe](../recipes/image-pipeline.md) - Complete detection workflow
- [Concepts: Thresholds](../concepts/thresholds-calibration.md) - Tuning detection parameters
270
docs/modules/gaze.md
Normal file
@@ -0,0 +1,270 @@
# Gaze Estimation

Gaze estimation predicts where a person is looking (pitch and yaw angles).

---

## Available Models

| Model | Backbone | Size | MAE* |
|-------|----------|------|------|
| ResNet18 | ResNet18 | 43 MB | 12.84° |
| **ResNet34** :material-check-circle: | ResNet34 | 82 MB | 11.33° |
| ResNet50 | ResNet50 | 91 MB | 11.34° |
| MobileNetV2 | MobileNetV2 | 9.6 MB | 13.07° |
| MobileOne-S0 | MobileOne | 4.8 MB | 12.58° |

*MAE = Mean Absolute Error on the Gaze360 test set (lower is better)

---

## Basic Usage

```python
import cv2
import numpy as np

from uniface import RetinaFace, MobileGaze

detector = RetinaFace()
gaze_estimator = MobileGaze()

image = cv2.imread("photo.jpg")
faces = detector.detect(image)

for face in faces:
    # Crop face
    x1, y1, x2, y2 = map(int, face.bbox)
    face_crop = image[y1:y2, x1:x2]

    if face_crop.size > 0:
        # Estimate gaze
        result = gaze_estimator.estimate(face_crop)

        # Convert to degrees
        pitch_deg = np.degrees(result.pitch)
        yaw_deg = np.degrees(result.yaw)

        print(f"Pitch: {pitch_deg:.1f}°, Yaw: {yaw_deg:.1f}°")
```

---

## Model Variants

```python
from uniface import MobileGaze
from uniface.constants import GazeWeights

# Default (ResNet34, recommended)
gaze = MobileGaze()

# Lightweight for mobile/edge
gaze = MobileGaze(model_name=GazeWeights.MOBILEONE_S0)

# Higher accuracy
gaze = MobileGaze(model_name=GazeWeights.RESNET50)
```

---

## Output Format

```python
result = gaze_estimator.estimate(face_crop)

# GazeResult dataclass
result.pitch  # Vertical angle in radians
result.yaw    # Horizontal angle in radians
```

### Angle Convention

```
              pitch = +90° (looking up)
                         │
                         │
yaw = -90° ──────────────┼────────────── yaw = +90°
(looking left)           │           (looking right)
                         │
              pitch = -90° (looking down)
```

- **Pitch**: Vertical gaze angle
    - Positive = looking up
    - Negative = looking down

- **Yaw**: Horizontal gaze angle
    - Positive = looking right
    - Negative = looking left
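To turn these pitch/yaw angles into a 3D gaze direction, a standard spherical-coordinates conversion can be used. The sketch below matches the arrow math in the Custom Visualization section further down; the axis convention (+x right, +y down in image coordinates, +z away from the camera) is an illustrative assumption, not a documented property of the model output.

```python
import numpy as np

def gaze_to_vector(pitch: float, yaw: float) -> np.ndarray:
    """Convert pitch/yaw (radians) into a unit 3D gaze direction.

    Axis convention is an assumption for illustration:
    +x right, +y down (image coordinates), +z into the scene.
    """
    x = -np.cos(pitch) * np.sin(yaw)  # same term as dx in draw_gaze_custom
    y = -np.sin(pitch)                # same term as dy in draw_gaze_custom
    z = -np.cos(pitch) * np.cos(yaw)
    v = np.array([x, y, z], dtype=np.float32)
    return v / np.linalg.norm(v)

# Example: looking slightly up and to the right
print(gaze_to_vector(np.radians(10.0), np.radians(20.0)))
```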
---

## Visualization

```python
import cv2

from uniface.visualization import draw_gaze

# Detect faces
faces = detector.detect(image)

for face in faces:
    x1, y1, x2, y2 = map(int, face.bbox)
    face_crop = image[y1:y2, x1:x2]

    if face_crop.size > 0:
        result = gaze_estimator.estimate(face_crop)

        # Draw gaze arrow on image
        draw_gaze(image, face.bbox, result.pitch, result.yaw)

cv2.imwrite("gaze_output.jpg", image)
```

### Custom Visualization

```python
import cv2
import numpy as np

def draw_gaze_custom(image, bbox, pitch, yaw, length=100, color=(0, 255, 0)):
    """Draw custom gaze arrow."""
    x1, y1, x2, y2 = map(int, bbox)

    # Face center
    cx = (x1 + x2) // 2
    cy = (y1 + y2) // 2

    # Calculate endpoint
    dx = -length * np.sin(yaw) * np.cos(pitch)
    dy = -length * np.sin(pitch)

    # Draw arrow
    end_x = int(cx + dx)
    end_y = int(cy + dy)

    cv2.arrowedLine(image, (cx, cy), (end_x, end_y), color, 2, tipLength=0.3)

    return image
```

---

## Real-Time Gaze Tracking

```python
import cv2
import numpy as np

from uniface import RetinaFace, MobileGaze
from uniface.visualization import draw_gaze

detector = RetinaFace()
gaze_estimator = MobileGaze()

cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    faces = detector.detect(frame)

    for face in faces:
        x1, y1, x2, y2 = map(int, face.bbox)
        face_crop = frame[y1:y2, x1:x2]

        if face_crop.size > 0:
            result = gaze_estimator.estimate(face_crop)

            # Draw bounding box
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)

            # Draw gaze
            draw_gaze(frame, face.bbox, result.pitch, result.yaw)

            # Display angles
            pitch_deg = np.degrees(result.pitch)
            yaw_deg = np.degrees(result.yaw)
            label = f"P:{pitch_deg:.0f} Y:{yaw_deg:.0f}"
            cv2.putText(frame, label, (x1, y1 - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

    cv2.imshow("Gaze Estimation", frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```

---

## Use Cases

### Attention Detection

```python
import numpy as np

def is_looking_at_camera(result, threshold=15):
    """Check if person is looking at camera."""
    pitch_deg = abs(np.degrees(result.pitch))
    yaw_deg = abs(np.degrees(result.yaw))

    return pitch_deg < threshold and yaw_deg < threshold

# Usage
result = gaze_estimator.estimate(face_crop)
if is_looking_at_camera(result):
    print("Looking at camera")
else:
    print("Looking away")
```

### Gaze Direction Classification

```python
import numpy as np

def classify_gaze_direction(result, threshold=20):
    """Classify gaze into directions."""
    pitch_deg = np.degrees(result.pitch)
    yaw_deg = np.degrees(result.yaw)

    directions = []

    if pitch_deg > threshold:
        directions.append("up")
    elif pitch_deg < -threshold:
        directions.append("down")

    if yaw_deg > threshold:
        directions.append("right")
    elif yaw_deg < -threshold:
        directions.append("left")

    if not directions:
        return "center"

    return " ".join(directions)

# Usage
result = gaze_estimator.estimate(face_crop)
direction = classify_gaze_direction(result)
print(f"Looking: {direction}")
```

---

## Factory Function

```python
from uniface import create_gaze_estimator

gaze = create_gaze_estimator()  # Returns MobileGaze
```

---

## Next Steps

- [Anti-Spoofing](spoofing.md) - Face liveness detection
- [Privacy](privacy.md) - Face anonymization
- [Video Recipe](../recipes/video-webcam.md) - Real-time processing
251
docs/modules/landmarks.md
Normal file
@@ -0,0 +1,251 @@
# Landmarks

Facial landmark detection provides precise localization of facial features.

---

## Available Models

| Model | Points | Size |
|-------|--------|------|
| **Landmark106** | 106 | 14 MB |

!!! info "5-Point Landmarks"
    Basic 5-point landmarks are included with all detection models (RetinaFace, SCRFD, YOLOv5-Face, YOLOv8-Face).

---

## 106-Point Landmarks

### Basic Usage

```python
import cv2

from uniface import RetinaFace, Landmark106

detector = RetinaFace()
landmarker = Landmark106()

image = cv2.imread("photo.jpg")

# Detect face
faces = detector.detect(image)

# Get detailed landmarks
if faces:
    landmarks = landmarker.get_landmarks(image, faces[0].bbox)
    print(f"Landmarks shape: {landmarks.shape}")  # (106, 2)
```

### Landmark Groups

| Range | Group | Points |
|-------|-------|--------|
| 0-32 | Face Contour | 33 |
| 33-50 | Eyebrows | 18 |
| 51-62 | Nose | 12 |
| 63-86 | Eyes | 24 |
| 87-105 | Mouth | 19 |

### Extract Specific Features

```python
landmarks = landmarker.get_landmarks(image, face.bbox)

# Face contour
contour = landmarks[0:33]

# Left eyebrow
left_eyebrow = landmarks[33:42]

# Right eyebrow
right_eyebrow = landmarks[42:51]

# Nose
nose = landmarks[51:63]

# Left eye (approximate sub-range within the 63-86 eye group)
left_eye = landmarks[63:72]

# Right eye (approximate sub-range within the 63-86 eye group)
right_eye = landmarks[76:84]

# Mouth
mouth = landmarks[87:106]
```

---

## 5-Point Landmarks (Detection)

All detection models provide 5-point landmarks:

```python
from uniface import RetinaFace

detector = RetinaFace()
faces = detector.detect(image)

if faces:
    landmarks_5 = faces[0].landmarks
    print(f"Shape: {landmarks_5.shape}")  # (5, 2)

    left_eye = landmarks_5[0]
    right_eye = landmarks_5[1]
    nose = landmarks_5[2]
    left_mouth = landmarks_5[3]
    right_mouth = landmarks_5[4]
```

---

## Visualization

### Draw 106 Landmarks

```python
import cv2

def draw_landmarks(image, landmarks, color=(0, 255, 0), radius=2):
    """Draw landmarks on image."""
    for x, y in landmarks.astype(int):
        cv2.circle(image, (x, y), radius, color, -1)
    return image

# Usage
landmarks = landmarker.get_landmarks(image, face.bbox)
image_with_landmarks = draw_landmarks(image.copy(), landmarks)
cv2.imwrite("landmarks.jpg", image_with_landmarks)
```

### Draw with Connections

```python
def draw_landmarks_with_connections(image, landmarks):
    """Draw landmarks with facial feature connections."""
    landmarks = landmarks.astype(int)

    # Face contour (0-32)
    for i in range(32):
        cv2.line(image, tuple(landmarks[i]), tuple(landmarks[i+1]), (255, 255, 0), 1)

    # Left eyebrow (33-41)
    for i in range(33, 41):
        cv2.line(image, tuple(landmarks[i]), tuple(landmarks[i+1]), (0, 255, 0), 1)

    # Right eyebrow (42-50)
    for i in range(42, 50):
        cv2.line(image, tuple(landmarks[i]), tuple(landmarks[i+1]), (0, 255, 0), 1)

    # Nose (51-62)
    for i in range(51, 62):
        cv2.line(image, tuple(landmarks[i]), tuple(landmarks[i+1]), (0, 0, 255), 1)

    # Draw points
    for x, y in landmarks:
        cv2.circle(image, (x, y), 2, (0, 255, 255), -1)

    return image
```

---

## Use Cases

### Face Alignment

```python
from uniface import face_alignment

# Align face using 5-point landmarks
aligned = face_alignment(image, faces[0].landmarks)
# Returns: 112x112 aligned face
```

### Eye Aspect Ratio (Blink Detection)

```python
import numpy as np

def eye_aspect_ratio(eye_landmarks):
    """Calculate eye aspect ratio for blink detection."""
    # Vertical distances
    v1 = np.linalg.norm(eye_landmarks[1] - eye_landmarks[5])
    v2 = np.linalg.norm(eye_landmarks[2] - eye_landmarks[4])

    # Horizontal distance
    h = np.linalg.norm(eye_landmarks[0] - eye_landmarks[3])

    ear = (v1 + v2) / (2.0 * h)
    return ear

# Usage with 106-point landmarks
left_eye = landmarks[63:72]  # Approximate eye points
ear = eye_aspect_ratio(left_eye)

if ear < 0.2:
    print("Eye closed (blink detected)")
```

### Head Pose Estimation

```python
import cv2
import numpy as np

def estimate_head_pose(landmarks, image_shape):
    """Estimate head pose from facial landmarks."""
    # 3D model points (generic face model)
    model_points = np.array([
        (0.0, 0.0, 0.0),           # Nose tip
        (0.0, -330.0, -65.0),      # Chin
        (-225.0, 170.0, -135.0),   # Left eye corner
        (225.0, 170.0, -135.0),    # Right eye corner
        (-150.0, -150.0, -125.0),  # Left mouth corner
        (150.0, -150.0, -125.0)    # Right mouth corner
    ], dtype=np.float64)

    # 2D image points (from 106 landmarks; indices are approximate)
    image_points = np.array([
        landmarks[51],  # Nose tip
        landmarks[16],  # Chin
        landmarks[63],  # Left eye corner
        landmarks[76],  # Right eye corner
        landmarks[87],  # Left mouth corner
        landmarks[93]   # Right mouth corner
    ], dtype=np.float64)

    # Camera matrix
    h, w = image_shape[:2]
    focal_length = w
    center = (w / 2, h / 2)
    camera_matrix = np.array([
        [focal_length, 0, center[0]],
        [0, focal_length, center[1]],
        [0, 0, 1]
    ], dtype=np.float64)

    # Solve PnP
    dist_coeffs = np.zeros((4, 1))
    success, rotation_vector, translation_vector = cv2.solvePnP(
        model_points, image_points, camera_matrix, dist_coeffs
    )

    return rotation_vector, translation_vector
```

---

## Factory Function

```python
from uniface import create_landmarker

landmarker = create_landmarker()  # Returns Landmark106
```

---

## See Also

- [Detection Module](detection.md) - Face detection with 5-point landmarks
- [Attributes Module](attributes.md) - Age, gender, emotion
- [Gaze Module](gaze.md) - Gaze estimation
- [Concepts: Coordinate Systems](../concepts/coordinate-systems.md) - Landmark formats
265
docs/modules/parsing.md
Normal file
@@ -0,0 +1,265 @@
# Parsing
|
||||
|
||||
Face parsing segments faces into semantic components (skin, eyes, nose, mouth, hair, etc.).
|
||||
|
||||
---
|
||||
|
||||
## Available Models
|
||||
|
||||
| Model | Backbone | Size | Classes |
|
||||
|-------|----------|------|---------|
|
||||
| **BiSeNet ResNet18** :material-check-circle: | ResNet18 | 51 MB | 19 |
|
||||
| BiSeNet ResNet34 | ResNet34 | 89 MB | 19 |
|
||||
|
||||
---
|
||||
|
||||
## Basic Usage
|
||||
|
||||
```python
|
||||
import cv2
|
||||
from uniface.parsing import BiSeNet
|
||||
from uniface.visualization import vis_parsing_maps
|
||||
|
||||
# Initialize parser
|
||||
parser = BiSeNet()
|
||||
|
||||
# Load face image (cropped)
|
||||
face_image = cv2.imread("face.jpg")
|
||||
|
||||
# Parse face
|
||||
mask = parser.parse(face_image)
|
||||
print(f"Mask shape: {mask.shape}") # (H, W)
|
||||
|
||||
# Visualize
|
||||
face_rgb = cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)
|
||||
vis_result = vis_parsing_maps(face_rgb, mask, save_image=False)
|
||||
|
||||
# Save result
|
||||
vis_bgr = cv2.cvtColor(vis_result, cv2.COLOR_RGB2BGR)
|
||||
cv2.imwrite("parsed.jpg", vis_bgr)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 19 Facial Component Classes
|
||||
|
||||
| ID | Class | ID | Class |
|
||||
|----|-------|----|-------|
|
||||
| 0 | Background | 10 | Ear Ring |
|
||||
| 1 | Skin | 11 | Nose |
|
||||
| 2 | Left Eyebrow | 12 | Mouth |
|
||||
| 3 | Right Eyebrow | 13 | Upper Lip |
|
||||
| 4 | Left Eye | 14 | Lower Lip |
|
||||
| 5 | Right Eye | 15 | Neck |
|
||||
| 6 | Eye Glasses | 16 | Neck Lace |
|
||||
| 7 | Left Ear | 17 | Cloth |
|
||||
| 8 | Right Ear | 18 | Hair |
|
||||
| 9 | Hat | | |
|
||||
|
||||
---
|
||||
|
||||
## Model Variants
|
||||
|
||||
```python
|
||||
from uniface.parsing import BiSeNet
|
||||
from uniface.constants import ParsingWeights
|
||||
|
||||
# Default (ResNet18)
|
||||
parser = BiSeNet()
|
||||
|
||||
# Higher accuracy (ResNet34)
|
||||
parser = BiSeNet(model_name=ParsingWeights.RESNET34)
|
||||
```
|
||||
|
||||
| Variant | Params | Size |
|
||||
|---------|--------|------|
|
||||
| **RESNET18** :material-check-circle: | 13.3M | 51 MB |
|
||||
| RESNET34 | 24.1M | 89 MB |
|
||||
|
||||
---
|
||||
|
||||
## Full Pipeline
|
||||
|
||||
### With Face Detection
|
||||
|
||||
```python
|
||||
import cv2
|
||||
from uniface import RetinaFace
|
||||
from uniface.parsing import BiSeNet
|
||||
from uniface.visualization import vis_parsing_maps
|
||||
|
||||
detector = RetinaFace()
|
||||
parser = BiSeNet()
|
||||
|
||||
image = cv2.imread("photo.jpg")
|
||||
faces = detector.detect(image)
|
||||
|
||||
for i, face in enumerate(faces):
|
||||
# Crop face
|
||||
x1, y1, x2, y2 = map(int, face.bbox)
|
||||
face_crop = image[y1:y2, x1:x2]
|
||||
|
||||
# Parse
|
||||
mask = parser.parse(face_crop)
|
||||
|
||||
# Visualize
|
||||
face_rgb = cv2.cvtColor(face_crop, cv2.COLOR_BGR2RGB)
|
||||
vis_result = vis_parsing_maps(face_rgb, mask, save_image=False)
|
||||
|
||||
# Save
|
||||
vis_bgr = cv2.cvtColor(vis_result, cv2.COLOR_RGB2BGR)
|
||||
cv2.imwrite(f"face_{i}_parsed.jpg", vis_bgr)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Extract Specific Components
|
||||
|
||||
### Get Single Component Mask
|
||||
|
||||
```python
|
||||
import numpy as np
|
||||
|
||||
# Parse face
|
||||
mask = parser.parse(face_image)
|
||||
|
||||
# Extract specific component
|
||||
SKIN = 1
|
||||
HAIR = 18
|
||||
LEFT_EYE = 4
|
||||
RIGHT_EYE = 5
|
||||
|
||||
# Binary mask for skin
|
||||
skin_mask = (mask == SKIN).astype(np.uint8) * 255
|
||||
|
||||
# Binary mask for hair
|
||||
hair_mask = (mask == HAIR).astype(np.uint8) * 255
|
||||
|
||||
# Binary mask for eyes
|
||||
eyes_mask = ((mask == LEFT_EYE) | (mask == RIGHT_EYE)).astype(np.uint8) * 255
|
||||
```
|
||||
|
||||
### Count Pixels per Component
|
||||
|
||||
```python
|
||||
import numpy as np
|
||||
|
||||
mask = parser.parse(face_image)
|
||||
|
||||
component_names = {
|
||||
0: 'Background', 1: 'Skin', 2: 'L-Eyebrow', 3: 'R-Eyebrow',
|
||||
4: 'L-Eye', 5: 'R-Eye', 6: 'Glasses', 7: 'L-Ear', 8: 'R-Ear',
|
||||
9: 'Hat', 10: 'Earring', 11: 'Nose', 12: 'Mouth',
|
||||
13: 'U-Lip', 14: 'L-Lip', 15: 'Neck', 16: 'Necklace',
|
||||
17: 'Cloth', 18: 'Hair'
|
||||
}
|
||||
|
||||
for class_id in np.unique(mask):
|
||||
pixel_count = np.sum(mask == class_id)
|
||||
name = component_names.get(class_id, f'Class {class_id}')
|
||||
print(f"{name}: {pixel_count} pixels")
|
||||
```
|
||||
|
||||
---

## Applications

### Face Makeup

Apply virtual makeup using component masks:

```python
import cv2
import numpy as np

def apply_lip_color(image, mask, color=(180, 50, 50)):
    """Apply lip color using parsing mask."""
    result = image.copy()

    # Get lip mask (upper + lower lip)
    lip_mask = ((mask == 13) | (mask == 14)).astype(np.uint8)

    # Create color overlay
    overlay = np.zeros_like(image)
    overlay[:] = color

    # Alpha-blend the overlay with the original, then copy only the
    # blended lip pixels back (cv2.addWeighted needs scalar weights)
    alpha = 0.4
    blended = cv2.addWeighted(image, 1 - alpha, overlay, alpha, 0)
    result[lip_mask == 1] = blended[lip_mask == 1]

    return result
```
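
To try it end to end, a quick usage sketch, assuming a cropped `face_image` and a `parser` as in the pipeline above:

```python
mask = parser.parse(face_image)
tinted = apply_lip_color(face_image, mask)
cv2.imwrite("lips_tinted.jpg", tinted)
```
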
### Background Replacement

```python
def replace_background(image, mask, background):
    """Replace background using parsing mask."""
    # Create foreground mask (everything except background)
    foreground_mask = (mask != 0).astype(np.uint8)

    # Resize background to match image
    background = cv2.resize(background, (image.shape[1], image.shape[0]))

    # Combine
    result = image.copy()
    result[foreground_mask == 0] = background[foreground_mask == 0]

    return result
```
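
Usage, assuming a cropped `face_image`, a `parser`, and any backdrop image on disk (the filename here is a placeholder):

```python
backdrop = cv2.imread("backdrop.jpg")  # placeholder path
mask = parser.parse(face_image)
composited = replace_background(face_image, mask, backdrop)
cv2.imwrite("new_background.jpg", composited)
```
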
### Hair Segmentation

```python
def get_hair_mask(mask):
    """Extract a clean hair mask."""
    hair_mask = (mask == 18).astype(np.uint8) * 255

    # Clean up with morphological operations
    kernel = np.ones((5, 5), np.uint8)
    hair_mask = cv2.morphologyEx(hair_mask, cv2.MORPH_CLOSE, kernel)
    hair_mask = cv2.morphologyEx(hair_mask, cv2.MORPH_OPEN, kernel)

    return hair_mask
```

---

## Visualization Options

```python
from uniface.visualization import vis_parsing_maps

# Default visualization
vis_result = vis_parsing_maps(face_rgb, mask)

# With different parameters
vis_result = vis_parsing_maps(
    face_rgb,
    mask,
    save_image=False,  # Don't save to file
)
```

---

## Factory Function

```python
from uniface import create_face_parser

parser = create_face_parser()  # Returns BiSeNet
```

---

## Next Steps

- [Gaze](gaze.md) - Gaze estimation
- [Privacy](privacy.md) - Face anonymization
- [Detection](detection.md) - Face detection

277
docs/modules/privacy.md
Normal file
@@ -0,0 +1,277 @@

# Privacy

Face anonymization protects privacy by blurring or obscuring faces in images and videos.

---

## Available Methods

| Method | Description |
|--------|-------------|
| **pixelate** | Blocky pixelation |
| **gaussian** | Smooth blur |
| **blackout** | Solid color fill |
| **elliptical** | Oval-shaped blur |
| **median** | Edge-preserving blur |

---

## Quick Start

### One-Line Anonymization

```python
from uniface.privacy import anonymize_faces
import cv2

image = cv2.imread("group_photo.jpg")
anonymized = anonymize_faces(image, method='pixelate')
cv2.imwrite("anonymized.jpg", anonymized)
```

---

## BlurFace Class

For more control, use the `BlurFace` class:

```python
from uniface import RetinaFace
from uniface.privacy import BlurFace
import cv2

detector = RetinaFace()
blurrer = BlurFace(method='gaussian', blur_strength=5.0)

image = cv2.imread("photo.jpg")
faces = detector.detect(image)
anonymized = blurrer.anonymize(image, faces)

cv2.imwrite("anonymized.jpg", anonymized)
```

---

## Blur Methods

### Pixelate

Blocky pixelation effect (common in news media):

```python
blurrer = BlurFace(method='pixelate', pixel_blocks=10)
```

| Parameter | Default | Description |
|-----------|---------|-------------|
| `pixel_blocks` | 10 | Number of blocks (lower = more pixelated) |

### Gaussian

Smooth, natural-looking blur:

```python
blurrer = BlurFace(method='gaussian', blur_strength=3.0)
```

| Parameter | Default | Description |
|-----------|---------|-------------|
| `blur_strength` | 3.0 | Blur intensity (higher = more blur) |

### Blackout

Solid color fill for maximum privacy:

```python
blurrer = BlurFace(method='blackout', color=(0, 0, 0))
```

| Parameter | Default | Description |
|-----------|---------|-------------|
| `color` | (0, 0, 0) | Fill color (BGR format) |

### Elliptical

Oval-shaped blur matching the natural face shape:

```python
blurrer = BlurFace(method='elliptical', blur_strength=3.0, margin=20)
```

| Parameter | Default | Description |
|-----------|---------|-------------|
| `blur_strength` | 3.0 | Blur intensity |
| `margin` | 20 | Margin around face |

### Median

Edge-preserving blur with an artistic effect:

```python
blurrer = BlurFace(method='median', blur_strength=3.0)
```

| Parameter | Default | Description |
|-----------|---------|-------------|
| `blur_strength` | 3.0 | Blur intensity |

---

## In-Place Processing

Modify the image directly (faster, saves memory):

```python
blurrer = BlurFace(method='pixelate')

# In-place modification
result = blurrer.anonymize(image, faces, inplace=True)
# 'image' and 'result' point to the same array
```

---

## Real-Time Anonymization

### Webcam

```python
import cv2
from uniface import RetinaFace
from uniface.privacy import BlurFace

detector = RetinaFace()
blurrer = BlurFace(method='pixelate')

cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    faces = detector.detect(frame)
    frame = blurrer.anonymize(frame, faces, inplace=True)

    cv2.imshow('Anonymized', frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```

### Video File

```python
import cv2
from uniface import RetinaFace
from uniface.privacy import BlurFace

detector = RetinaFace()
blurrer = BlurFace(method='gaussian')

cap = cv2.VideoCapture("input_video.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter('output_video.mp4', fourcc, fps, (width, height))

while True:
    ret, frame = cap.read()
    if not ret:
        break

    faces = detector.detect(frame)
    frame = blurrer.anonymize(frame, faces, inplace=True)
    out.write(frame)

cap.release()
out.release()
```

---

## Selective Anonymization
### Exclude Specific Faces

```python
import numpy as np

def anonymize_except(image, all_faces, exclude_embeddings, recognizer, blurrer, threshold=0.6):
    """Anonymize all faces except those matching exclude_embeddings."""
    faces_to_blur = []

    for face in all_faces:
        # Get embedding
        embedding = recognizer.get_normalized_embedding(image, face.landmarks)

        # Check whether this face should be excluded
        should_exclude = False
        for ref_emb in exclude_embeddings:
            similarity = np.dot(embedding, ref_emb.T)[0][0]
            if similarity > threshold:
                should_exclude = True
                break

        if not should_exclude:
            faces_to_blur.append(face)

    # Blur the remaining faces
    return blurrer.anonymize(image, faces_to_blur)
```
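
Building `exclude_embeddings` from reference photos can look like the following sketch (file names are placeholders):

```python
import cv2
from uniface import RetinaFace, ArcFace
from uniface.privacy import BlurFace

detector = RetinaFace()
recognizer = ArcFace()
blurrer = BlurFace(method='pixelate')

# One clear reference photo per person who should stay visible
exclude_embeddings = []
for path in ["reference_1.jpg", "reference_2.jpg"]:  # placeholder paths
    ref = cv2.imread(path)
    ref_faces = detector.detect(ref)
    if ref_faces:
        exclude_embeddings.append(
            recognizer.get_normalized_embedding(ref, ref_faces[0].landmarks)
        )

image = cv2.imread("group_photo.jpg")
faces = detector.detect(image)
result = anonymize_except(image, faces, exclude_embeddings, recognizer, blurrer)
```
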
### Confidence-Based

```python
def anonymize_low_confidence(image, faces, blurrer, confidence_threshold=0.8):
    """Anonymize faces below the confidence threshold."""
    faces_to_blur = [f for f in faces if f.confidence < confidence_threshold]
    return blurrer.anonymize(image, faces_to_blur)
```

---

## Comparison

```python
import cv2
from uniface import RetinaFace
from uniface.privacy import BlurFace

detector = RetinaFace()
image = cv2.imread("photo.jpg")
faces = detector.detect(image)

methods = ['pixelate', 'gaussian', 'blackout', 'elliptical', 'median']

for method in methods:
    blurrer = BlurFace(method=method)
    result = blurrer.anonymize(image.copy(), faces)
    cv2.imwrite(f"anonymized_{method}.jpg", result)
```

---

## Command-Line Tool

```bash
# Anonymize image with pixelation
python tools/face_anonymize.py --source photo.jpg

# Real-time webcam
python tools/face_anonymize.py --source 0 --method gaussian

# Custom blur strength
python tools/face_anonymize.py --source photo.jpg --method gaussian --blur-strength 5.0
```

---

## Next Steps

- [Anonymize Stream Recipe](../recipes/anonymize-stream.md) - Video pipeline
- [Detection](detection.md) - Face detection options
- [Batch Processing Recipe](../recipes/batch-processing.md) - Process multiple files

292
docs/modules/recognition.md
Normal file
@@ -0,0 +1,292 @@

# Recognition

Face recognition extracts embeddings for identity verification and face search.

---

## Available Models

| Model | Backbone | Size | Embedding Dim |
|-------|----------|------|---------------|
| **AdaFace** | IR-18/IR-101 | 92-249 MB | 512 |
| **ArcFace** | MobileNet/ResNet | 8-166 MB | 512 |
| **MobileFace** | MobileNet V2/V3 | 1-10 MB | 512 |
| **SphereFace** | Sphere20/36 | 50-92 MB | 512 |

---

## AdaFace

Face recognition using an adaptive margin based on image quality.

### Basic Usage

```python
import cv2
from uniface import RetinaFace, AdaFace

detector = RetinaFace()
recognizer = AdaFace()

# Detect face
image = cv2.imread("photo.jpg")
faces = detector.detect(image)

# Extract embedding
if faces:
    embedding = recognizer.get_normalized_embedding(image, faces[0].landmarks)
    print(f"Embedding shape: {embedding.shape}")  # (1, 512)
```

### Model Variants

```python
from uniface import AdaFace
from uniface.constants import AdaFaceWeights

# Lightweight (default)
recognizer = AdaFace(model_name=AdaFaceWeights.IR_18)

# High accuracy
recognizer = AdaFace(model_name=AdaFaceWeights.IR_101)
```

| Variant | Dataset | Size | IJB-B | IJB-C |
|---------|---------|------|-------|-------|
| **IR_18** :material-check-circle: | WebFace4M | 92 MB | 93.03% | 94.99% |
| IR_101 | WebFace12M | 249 MB | - | 97.66% |

!!! info "Benchmark Metrics"
    IJB-B and IJB-C accuracy reported as TAR@FAR=0.01%

---

## ArcFace

Face recognition using additive angular margin loss.

### Basic Usage

```python
import cv2
from uniface import RetinaFace, ArcFace

detector = RetinaFace()
recognizer = ArcFace()

# Detect face
image = cv2.imread("photo.jpg")
faces = detector.detect(image)

# Extract embedding
if faces:
    embedding = recognizer.get_normalized_embedding(image, faces[0].landmarks)
    print(f"Embedding shape: {embedding.shape}")  # (1, 512)
```

### Model Variants

```python
from uniface import ArcFace
from uniface.constants import ArcFaceWeights

# Lightweight (default)
recognizer = ArcFace(model_name=ArcFaceWeights.MNET)

# High accuracy
recognizer = ArcFace(model_name=ArcFaceWeights.RESNET)
```

| Variant | Backbone | Size | LFW | CFP-FP | AgeDB-30 | IJB-C |
|---------|----------|------|-----|--------|----------|-------|
| **MNET** :material-check-circle: | MobileNet | 8 MB | 99.70% | 98.00% | 96.58% | 95.02% |
| RESNET | ResNet50 | 166 MB | 99.83% | 99.33% | 98.23% | 97.25% |

!!! info "Training Data & Metrics"
    **Dataset**: Trained on WebFace600K (600K images)

    **Accuracy**: IJB-C reported as TAR@FAR=1e-4

---

## MobileFace

Lightweight face recognition models with MobileNet backbones.

### Basic Usage

```python
from uniface import MobileFace

recognizer = MobileFace()
embedding = recognizer.get_normalized_embedding(image, landmarks)
```

### Model Variants

```python
from uniface import MobileFace
from uniface.constants import MobileFaceWeights

# Ultra-lightweight
recognizer = MobileFace(model_name=MobileFaceWeights.MNET_025)

# Balanced (default)
recognizer = MobileFace(model_name=MobileFaceWeights.MNET_V2)

# Higher accuracy
recognizer = MobileFace(model_name=MobileFaceWeights.MNET_V3_LARGE)
```

| Variant | Params | Size | LFW | CALFW | CPLFW | AgeDB-30 |
|---------|--------|------|-----|-------|-------|----------|
| MNET_025 | 0.36M | 1 MB | 98.76% | 92.02% | 82.37% | 90.02% |
| **MNET_V2** :material-check-circle: | 2.29M | 4 MB | 99.55% | 94.87% | 86.89% | 95.16% |
| MNET_V3_SMALL | 1.25M | 3 MB | 99.30% | 93.77% | 85.29% | 92.79% |
| MNET_V3_LARGE | 3.52M | 10 MB | 99.53% | 94.56% | 86.79% | 95.13% |

---

## SphereFace

Face recognition using angular softmax loss (A-Softmax).

### Basic Usage

```python
from uniface import SphereFace
from uniface.constants import SphereFaceWeights

recognizer = SphereFace(model_name=SphereFaceWeights.SPHERE20)
embedding = recognizer.get_normalized_embedding(image, landmarks)
```

| Variant | Params | Size | LFW | CALFW | CPLFW | AgeDB-30 |
|---------|--------|------|-----|-------|-------|----------|
| SPHERE20 | 24.5M | 50 MB | 99.67% | 95.61% | 88.75% | 96.58% |
| SPHERE36 | 34.6M | 92 MB | 99.72% | 95.64% | 89.92% | 96.83% |

---

## Face Comparison

### Compute Similarity

```python
from uniface import compute_similarity
import numpy as np

# Extract embeddings
emb1 = recognizer.get_normalized_embedding(image1, landmarks1)
emb2 = recognizer.get_normalized_embedding(image2, landmarks2)

# Method 1: Using utility function
similarity = compute_similarity(emb1, emb2)

# Method 2: Direct computation
similarity = np.dot(emb1, emb2.T)[0][0]

print(f"Similarity: {similarity:.4f}")
```

### Threshold Guidelines

| Threshold | Decision | Use Case |
|-----------|----------|----------|
| > 0.7 | Very high confidence | Security-critical |
| > 0.6 | Same person | General verification |
| 0.4 - 0.6 | Uncertain | Manual review needed |
| < 0.4 | Different people | Rejection |
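
These bands translate directly into code. A small helper as a sketch; the cut-offs mirror the table above and should be calibrated on your own data:

```python
def interpret_similarity(similarity: float) -> str:
    """Map a cosine similarity score to a verification decision."""
    if similarity > 0.7:
        return "same person (very high confidence)"
    if similarity > 0.6:
        return "same person"
    if similarity >= 0.4:
        return "uncertain - manual review"
    return "different people"

print(interpret_similarity(0.65))  # same person
```
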
---

## Face Alignment

Recognition models require aligned faces. UniFace handles this internally:

```python
# Alignment is done automatically
embedding = recognizer.get_normalized_embedding(image, landmarks)

# Or manually align
from uniface import face_alignment

aligned_face = face_alignment(image, landmarks)
# Returns: 112x112 aligned face image
```

---

## Building a Face Database

```python
import cv2
import numpy as np
from uniface import RetinaFace, ArcFace

detector = RetinaFace()
recognizer = ArcFace()

# person_images maps person_id -> image path (placeholder entries)
person_images = {"person_1": "faces/person_1.jpg", "person_2": "faces/person_2.jpg"}

# Build database
database = {}
for person_id, image_path in person_images.items():
    image = cv2.imread(image_path)
    faces = detector.detect(image)

    if faces:
        embedding = recognizer.get_normalized_embedding(image, faces[0].landmarks)
        database[person_id] = embedding

# Save for later use
np.savez('face_database.npz', **database)

# Load database
data = np.load('face_database.npz')
database = {key: data[key] for key in data.files}
```

---

## Face Search

Find a person in a database:

```python
def search_face(query_embedding, database, threshold=0.6):
    """Find the best match in the database."""
    best_match = None
    best_similarity = -1

    for person_id, db_embedding in database.items():
        similarity = np.dot(query_embedding, db_embedding.T)[0][0]

        if similarity > best_similarity and similarity > threshold:
            best_similarity = similarity
            best_match = person_id

    return best_match, best_similarity

# Usage
query_embedding = recognizer.get_normalized_embedding(query_image, landmarks)
match, similarity = search_face(query_embedding, database)

if match:
    print(f"Found: {match} (similarity: {similarity:.4f})")
else:
    print("No match found")
```

---

## Factory Function

```python
from uniface import create_recognizer

# Available methods: 'arcface', 'adaface', 'mobileface', 'sphereface'
recognizer = create_recognizer('arcface')
recognizer = create_recognizer('adaface')
```

---

## See Also

- [Detection Module](detection.md) - Detect faces first
- [Face Search Recipe](../recipes/face-search.md) - Complete search system
- [Thresholds](../concepts/thresholds-calibration.md) - Calibration guide

266
docs/modules/spoofing.md
Normal file
@@ -0,0 +1,266 @@

# Anti-Spoofing

Face anti-spoofing detects whether a face is real (live) or fake (photo, video replay, mask).

---

## Available Models

| Model | Size |
|-------|------|
| MiniFASNet V1SE | 1.2 MB |
| **MiniFASNet V2** :material-check-circle: | 1.2 MB |

---

## Basic Usage

```python
import cv2
from uniface import RetinaFace
from uniface.spoofing import MiniFASNet

detector = RetinaFace()
spoofer = MiniFASNet()

image = cv2.imread("photo.jpg")
faces = detector.detect(image)

for face in faces:
    result = spoofer.predict(image, face.bbox)

    label = "Real" if result.is_real else "Fake"
    print(f"{label}: {result.confidence:.1%}")
```

---

## Output Format

```python
result = spoofer.predict(image, face.bbox)

# SpoofingResult dataclass
result.is_real     # True = real, False = fake
result.confidence  # 0.0 to 1.0
```

---

## Model Variants

```python
from uniface.spoofing import MiniFASNet
from uniface.constants import MiniFASNetWeights

# Default (V2, recommended)
spoofer = MiniFASNet()

# V1SE variant
spoofer = MiniFASNet(model_name=MiniFASNetWeights.V1SE)
```

| Variant | Size | Scale Factor |
|---------|------|--------------|
| V1SE | 1.2 MB | 4.0 |
| **V2** :material-check-circle: | 1.2 MB | 2.7 |

---

## Confidence Thresholds

The default threshold is 0.5. Adjust it for your use case:

```python
result = spoofer.predict(image, face.bbox)

# High security (fewer false accepts)
HIGH_THRESHOLD = 0.7
if result.confidence > HIGH_THRESHOLD:
    print("Real (high confidence)")
else:
    print("Suspicious")

# Balanced
if result.is_real:  # Uses default 0.5 threshold
    print("Real")
else:
    print("Fake")
```

---

## Visualization

```python
import cv2

def draw_spoofing_result(image, face, result):
    """Draw spoofing result on image."""
    x1, y1, x2, y2 = map(int, face.bbox)

    # Color based on result
    color = (0, 255, 0) if result.is_real else (0, 0, 255)
    label = "Real" if result.is_real else "Fake"

    # Draw bounding box
    cv2.rectangle(image, (x1, y1), (x2, y2), color, 2)

    # Draw label
    text = f"{label}: {result.confidence:.1%}"
    cv2.putText(image, text, (x1, y1 - 10),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)

    return image

# Usage
for face in faces:
    result = spoofer.predict(image, face.bbox)
    image = draw_spoofing_result(image, face, result)

cv2.imwrite("spoofing_result.jpg", image)
```

---

## Real-Time Liveness Detection

```python
import cv2
from uniface import RetinaFace
from uniface.spoofing import MiniFASNet

detector = RetinaFace()
spoofer = MiniFASNet()

cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    faces = detector.detect(frame)

    for face in faces:
        result = spoofer.predict(frame, face.bbox)

        # Draw result
        x1, y1, x2, y2 = map(int, face.bbox)
        color = (0, 255, 0) if result.is_real else (0, 0, 255)
        label = f"{'Real' if result.is_real else 'Fake'}: {result.confidence:.0%}"

        cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)
        cv2.putText(frame, label, (x1, y1 - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)

    cv2.imshow("Liveness Detection", frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```

---

## Use Cases

### Access Control

```python
def verify_liveness(image, face, spoofer, threshold=0.6):
    """Verify face is real for access control."""
    result = spoofer.predict(image, face.bbox)

    if result.is_real and result.confidence > threshold:
        return True, result.confidence
    return False, result.confidence

# Usage
is_live, confidence = verify_liveness(image, face, spoofer)
if is_live:
    print(f"Access granted (confidence: {confidence:.1%})")
else:
    print("Access denied - possible spoof attempt")
```

### Multi-Frame Verification

For higher security, verify across multiple frames:

```python
def verify_liveness_multiframe(frames, detector, spoofer, min_real=3):
    """Verify liveness across multiple frames."""
    real_count = 0

    for frame in frames:
        faces = detector.detect(frame)
        if not faces:
            continue

        result = spoofer.predict(frame, faces[0].bbox)
        if result.is_real:
            real_count += 1

    return real_count >= min_real

# Collect frames and verify
cap = cv2.VideoCapture(0)
frames = []
for _ in range(5):
    ret, frame = cap.read()
    if ret:
        frames.append(frame)

is_verified = verify_liveness_multiframe(frames, detector, spoofer)
```

---

## Attack Types Detected

MiniFASNet can detect various spoof attacks:

| Attack Type | Detection |
|-------------|-----------|
| Printed photos | ✅ |
| Screen replay | ✅ |
| Video replay | ✅ |
| Paper masks | ✅ |
| 3D masks | Limited |

!!! warning "Limitations"
    - High-quality 3D masks may not be detected
    - Performance varies with lighting and image quality
    - Always combine with other verification methods for high-security applications

---

## Command-Line Tool

```bash
# Image
python tools/spoofing.py --source photo.jpg

# Webcam
python tools/spoofing.py --source 0
```

---

## Factory Function

```python
from uniface import create_spoofer

spoofer = create_spoofer()  # Returns MiniFASNet
```

---

## Next Steps

- [Privacy](privacy.md) - Face anonymization
- [Detection](detection.md) - Face detection
- [Recognition](recognition.md) - Face recognition

57
docs/notebooks.md
Normal file
@@ -0,0 +1,57 @@

# Interactive Notebooks

Run UniFace examples directly in your browser with Google Colab, or download and run locally with Jupyter.

---

## Available Notebooks

| Notebook | Colab | Description |
|----------|:-----:|-------------|
| [Face Detection](https://github.com/yakhyo/uniface/blob/main/examples/01_face_detection.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/01_face_detection.ipynb) | Detect faces and 5-point landmarks |
| [Face Alignment](https://github.com/yakhyo/uniface/blob/main/examples/02_face_alignment.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/02_face_alignment.ipynb) | Align faces for recognition |
| [Face Verification](https://github.com/yakhyo/uniface/blob/main/examples/03_face_verification.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/03_face_verification.ipynb) | Compare faces for identity |
| [Face Search](https://github.com/yakhyo/uniface/blob/main/examples/04_face_search.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/04_face_search.ipynb) | Find a person in group photos |
| [Face Analyzer](https://github.com/yakhyo/uniface/blob/main/examples/05_face_analyzer.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/05_face_analyzer.ipynb) | All-in-one face analysis |
| [Face Parsing](https://github.com/yakhyo/uniface/blob/main/examples/06_face_parsing.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/06_face_parsing.ipynb) | Semantic face segmentation |
| [Face Anonymization](https://github.com/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | Privacy-preserving blur |
| [Gaze Estimation](https://github.com/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | Gaze direction estimation |

---

## Running Locally

Download and run notebooks on your machine:

```bash
# Clone the repository
git clone https://github.com/yakhyo/uniface.git
cd uniface

# Install dependencies
pip install uniface jupyter

# Launch Jupyter
jupyter notebook examples/
```

---

## Running on Google Colab

Click any **"Open in Colab"** badge above. The notebooks automatically:

1. Install UniFace via pip
2. Clone the repository to access test images
3. Set up the correct working directory (see the sketch below)

!!! tip "GPU Acceleration"
    In Colab, go to **Runtime → Change runtime type → GPU** for faster inference.
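
For orientation, that setup usually boils down to a single cell. A rough sketch of such a cell, not copied verbatim from the notebooks:

```python
# Sketch of a typical Colab setup cell (assumed, not the notebooks' exact code)
!pip install -q uniface
!git clone https://github.com/yakhyo/uniface.git
%cd uniface
```
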
---

## Next Steps

- [Quickstart](quickstart.md) - Code snippets for common use cases
- [Tutorials](recipes/image-pipeline.md) - Step-by-step workflow guides
- [API Reference](modules/detection.md) - Detailed module documentation

5
docs/overrides/home.html
Normal file
@@ -0,0 +1,5 @@

{% extends "main.html" %}

{% block source %}
<!-- Hide edit/view source on home page -->
{% endblock %}

426
docs/quickstart.md
Normal file
@@ -0,0 +1,426 @@

# Quickstart

Get up and running with UniFace in 5 minutes. This guide covers the most common use cases.

---

## Face Detection

Detect faces in an image:

```python
import cv2
from uniface import RetinaFace

# Load image
image = cv2.imread("photo.jpg")

# Initialize detector (models auto-download on first use)
detector = RetinaFace()

# Detect faces
faces = detector.detect(image)

# Print results
for i, face in enumerate(faces):
    print(f"Face {i+1}:")
    print(f"  Confidence: {face.confidence:.2f}")
    print(f"  BBox: {face.bbox}")
    print(f"  Landmarks: {len(face.landmarks)} points")
```

**Output:**

```
Face 1:
  Confidence: 0.99
  BBox: [120.5, 85.3, 245.8, 210.6]
  Landmarks: 5 points
```

---

## Visualize Detections

Draw bounding boxes and landmarks:

```python
import cv2
from uniface import RetinaFace
from uniface.visualization import draw_detections

# Detect faces
detector = RetinaFace()
image = cv2.imread("photo.jpg")
faces = detector.detect(image)

# Extract visualization data
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]

# Draw on image
draw_detections(
    image=image,
    bboxes=bboxes,
    scores=scores,
    landmarks=landmarks,
    vis_threshold=0.6,
)

# Save result
cv2.imwrite("output.jpg", image)
```

---

## Face Recognition

Compare two faces:

```python
import cv2
import numpy as np
from uniface import RetinaFace, ArcFace

# Initialize models
detector = RetinaFace()
recognizer = ArcFace()

# Load two images
image1 = cv2.imread("person1.jpg")
image2 = cv2.imread("person2.jpg")

# Detect faces
faces1 = detector.detect(image1)
faces2 = detector.detect(image2)

if faces1 and faces2:
    # Extract embeddings
    emb1 = recognizer.get_normalized_embedding(image1, faces1[0].landmarks)
    emb2 = recognizer.get_normalized_embedding(image2, faces2[0].landmarks)

    # Compute similarity (cosine similarity)
    similarity = np.dot(emb1, emb2.T)[0][0]

    # Interpret result
    if similarity > 0.6:
        print(f"Same person (similarity: {similarity:.3f})")
    else:
        print(f"Different people (similarity: {similarity:.3f})")
```

!!! tip "Similarity Thresholds"
    - `> 0.6`: Same person (high confidence)
    - `0.4 - 0.6`: Uncertain (manual review)
    - `< 0.4`: Different people

---

## Age & Gender Detection

```python
import cv2
from uniface import RetinaFace, AgeGender

# Initialize models
detector = RetinaFace()
age_gender = AgeGender()

# Load image
image = cv2.imread("photo.jpg")
faces = detector.detect(image)

# Predict attributes
for i, face in enumerate(faces):
    result = age_gender.predict(image, face.bbox)
    print(f"Face {i+1}: {result.sex}, {result.age} years old")
```

**Output:**

```
Face 1: Male, 32 years old
Face 2: Female, 28 years old
```

---

## FairFace Attributes

Detect race, gender, and age group:

```python
import cv2
from uniface import RetinaFace, FairFace

detector = RetinaFace()
fairface = FairFace()

image = cv2.imread("photo.jpg")
faces = detector.detect(image)

for i, face in enumerate(faces):
    result = fairface.predict(image, face.bbox)
    print(f"Face {i+1}: {result.sex}, {result.age_group}, {result.race}")
```

**Output:**

```
Face 1: Male, 30-39, East Asian
Face 2: Female, 20-29, White
```

---

## Facial Landmarks (106 Points)

```python
import cv2
from uniface import RetinaFace, Landmark106

detector = RetinaFace()
landmarker = Landmark106()

image = cv2.imread("photo.jpg")
faces = detector.detect(image)

if faces:
    landmarks = landmarker.get_landmarks(image, faces[0].bbox)
    print(f"Detected {len(landmarks)} landmarks")

    # Draw landmarks
    for x, y in landmarks.astype(int):
        cv2.circle(image, (x, y), 2, (0, 255, 0), -1)

    cv2.imwrite("landmarks.jpg", image)
```

---

## Gaze Estimation

```python
import cv2
import numpy as np
from uniface import RetinaFace, MobileGaze
from uniface.visualization import draw_gaze

detector = RetinaFace()
gaze_estimator = MobileGaze()

image = cv2.imread("photo.jpg")
faces = detector.detect(image)

for i, face in enumerate(faces):
    x1, y1, x2, y2 = map(int, face.bbox[:4])
    face_crop = image[y1:y2, x1:x2]

    if face_crop.size > 0:
        result = gaze_estimator.estimate(face_crop)
        print(f"Face {i+1}: pitch={np.degrees(result.pitch):.1f}°, yaw={np.degrees(result.yaw):.1f}°")

        # Draw gaze direction
        draw_gaze(image, face.bbox, result.pitch, result.yaw)

cv2.imwrite("gaze_output.jpg", image)
```

---

## Face Parsing

Segment the face into semantic components:

```python
import cv2
import numpy as np
from uniface.parsing import BiSeNet
from uniface.visualization import vis_parsing_maps

parser = BiSeNet()

# Load face image (already cropped)
face_image = cv2.imread("face.jpg")

# Parse face into 19 components
mask = parser.parse(face_image)

# Visualize with overlay
face_rgb = cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)
vis_result = vis_parsing_maps(face_rgb, mask, save_image=False)

print(f"Detected {len(np.unique(mask))} facial components")
```

---

## Face Anonymization

Blur faces for privacy protection:

```python
from uniface.privacy import anonymize_faces
import cv2

# One-liner: automatic detection and blurring
image = cv2.imread("group_photo.jpg")
anonymized = anonymize_faces(image, method='pixelate')
cv2.imwrite("anonymized.jpg", anonymized)
```

**Manual control:**

```python
from uniface import RetinaFace
from uniface.privacy import BlurFace

detector = RetinaFace()
blurrer = BlurFace(method='gaussian', blur_strength=5.0)

faces = detector.detect(image)
anonymized = blurrer.anonymize(image, faces)
```

**Available methods:**

| Method | Description |
|--------|-------------|
| `pixelate` | Blocky effect (news media standard) |
| `gaussian` | Smooth, natural blur |
| `blackout` | Solid color boxes (maximum privacy) |
| `elliptical` | Soft oval blur (natural face shape) |
| `median` | Edge-preserving blur |

---

## Face Anti-Spoofing

Detect real vs. fake faces:

```python
import cv2
from uniface import RetinaFace
from uniface.spoofing import MiniFASNet

detector = RetinaFace()
spoofer = MiniFASNet()

image = cv2.imread("photo.jpg")
faces = detector.detect(image)

for i, face in enumerate(faces):
    result = spoofer.predict(image, face.bbox)
    label = 'Real' if result.is_real else 'Fake'
    print(f"Face {i+1}: {label} ({result.confidence:.1%})")
```

---

## Webcam Demo

Real-time face detection:

```python
import cv2
from uniface import RetinaFace
from uniface.visualization import draw_detections

detector = RetinaFace()
cap = cv2.VideoCapture(0)

print("Press 'q' to quit")

while True:
    ret, frame = cap.read()
    if not ret:
        break

    faces = detector.detect(frame)

    bboxes = [f.bbox for f in faces]
    scores = [f.confidence for f in faces]
    landmarks = [f.landmarks for f in faces]
    draw_detections(image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks)

    cv2.imshow("UniFace - Press 'q' to quit", frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```

---

## Model Selection

For detailed model comparisons and benchmarks, see the [Model Zoo](models.md).

**Available models by task:**

| Task | Available Models |
|------|------------------|
| Detection | `RetinaFace`, `SCRFD`, `YOLOv5Face`, `YOLOv8Face` |
| Recognition | `ArcFace`, `AdaFace`, `MobileFace`, `SphereFace` |
| Gaze | `MobileGaze` (ResNet18/34/50, MobileNetV2, MobileOneS0) |
| Parsing | `BiSeNet` (ResNet18/34) |
| Attributes | `AgeGender`, `FairFace`, `Emotion` |
| Anti-Spoofing | `MiniFASNet` (V1SE, V2) |

---

## Common Issues

### Models Not Downloading

```python
from uniface.model_store import verify_model_weights
from uniface.constants import RetinaFaceWeights

# Manually download a model
model_path = verify_model_weights(RetinaFaceWeights.MNET_V2)
print(f"Model downloaded to: {model_path}")
```

### Check Hardware Acceleration

```python
import onnxruntime as ort
print("Available providers:", ort.get_available_providers())

# macOS M-series should show: ['CoreMLExecutionProvider', ...]
# NVIDIA GPU should show: ['CUDAExecutionProvider', ...]
```

### Slow Performance on Mac

Verify you're using the ARM64 build of Python:

```bash
python -c "import platform; print(platform.machine())"
# Should show: arm64 (not x86_64)
```

### Import Errors

```python
# Correct imports
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.landmark import Landmark106

# Also works (re-exported at package level)
from uniface import RetinaFace, ArcFace, Landmark106
```

---

## Next Steps

- [Model Zoo](models.md) - All models, benchmarks, and selection guide
- [API Reference](modules/detection.md) - Explore individual modules and their APIs
- [Tutorials](recipes/image-pipeline.md) - Step-by-step examples for common workflows
- [Guides](concepts/overview.md) - Learn about the architecture and design principles

99
docs/recipes/anonymize-stream.md
Normal file
@@ -0,0 +1,99 @@

# Anonymize Stream

Blur faces in real-time video streams for privacy protection.

!!! note "Work in Progress"
    This page contains example code patterns. Test thoroughly before using in production.

---

## Webcam Anonymization

```python
import cv2
from uniface import RetinaFace
from uniface.privacy import BlurFace

detector = RetinaFace()
blurrer = BlurFace(method='pixelate')
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        break

    faces = detector.detect(frame)
    frame = blurrer.anonymize(frame, faces, inplace=True)

    cv2.imshow('Anonymized', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```

---

## Video File Anonymization

```python
import cv2
from uniface import RetinaFace
from uniface.privacy import BlurFace

detector = RetinaFace()
blurrer = BlurFace(method='gaussian')

cap = cv2.VideoCapture("input.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

out = cv2.VideoWriter('output.mp4', cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))

while True:
    ret, frame = cap.read()
    if not ret:
        break

    faces = detector.detect(frame)
    blurrer.anonymize(frame, faces, inplace=True)
    out.write(frame)

cap.release()
out.release()
```

---

## One-Liner for Images

```python
from uniface.privacy import anonymize_faces
import cv2

image = cv2.imread("photo.jpg")
result = anonymize_faces(image, method='pixelate')
cv2.imwrite("anonymized.jpg", result)
```

---

## Available Blur Methods

| Method | Usage |
|--------|-------|
| Pixelate | `BlurFace(method='pixelate', pixel_blocks=10)` |
| Gaussian | `BlurFace(method='gaussian', blur_strength=3.0)` |
| Blackout | `BlurFace(method='blackout', color=(0,0,0))` |
| Elliptical | `BlurFace(method='elliptical', margin=20)` |
| Median | `BlurFace(method='median', blur_strength=3.0)` |

---

## See Also

- [Privacy Module](../modules/privacy.md) - Privacy protection details
- [Video & Webcam](video-webcam.md) - Real-time processing
- [Detection Module](../modules/detection.md) - Face detection

83
docs/recipes/batch-processing.md
Normal file
@@ -0,0 +1,83 @@

# Batch Processing

Process multiple images efficiently.

!!! note "Work in Progress"
    This page contains example code patterns. Test thoroughly before using in production.

---

## Basic Batch Processing

```python
import cv2
from pathlib import Path
from uniface import RetinaFace

detector = RetinaFace()

def process_directory(input_dir, output_dir):
    """Process all images in a directory."""
    input_path = Path(input_dir)
    output_path = Path(output_dir)
    output_path.mkdir(parents=True, exist_ok=True)

    for image_path in input_path.glob("*.jpg"):
        print(f"Processing {image_path.name}...")

        image = cv2.imread(str(image_path))
        faces = detector.detect(image)

        print(f"  Found {len(faces)} face(s)")

        # Process and save results
        # ... your code here ...

# Usage
process_directory("input_images/", "output_images/")
```
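
The `# ... your code here ...` hook is where the per-image work goes. One possible completion that draws detections and saves an annotated copy, as a sketch built on the visualization helper used elsewhere in these docs:

```python
from uniface.visualization import draw_detections

# Inside the loop, after `faces = detector.detect(image)`:
draw_detections(
    image=image,
    bboxes=[f.bbox for f in faces],
    scores=[f.confidence for f in faces],
    landmarks=[f.landmarks for f in faces],
)
cv2.imwrite(str(output_path / image_path.name), image)
```
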
---

## With Progress Bar

```python
from pathlib import Path
from tqdm import tqdm

image_files = list(Path("input_images/").glob("*.jpg"))

for image_path in tqdm(image_files, desc="Processing"):
    # ... process image ...
    pass
```

---

## Extract Embeddings

```python
import cv2
import numpy as np
from pathlib import Path
from uniface import RetinaFace, ArcFace

detector = RetinaFace()
recognizer = ArcFace()

embeddings = {}
for image_path in Path("faces/").glob("*.jpg"):
    image = cv2.imread(str(image_path))
    faces = detector.detect(image)

    if faces:
        embedding = recognizer.get_normalized_embedding(image, faces[0].landmarks)
        embeddings[image_path.stem] = embedding

# Save embeddings
np.savez("embeddings.npz", **embeddings)
```

---

## See Also

- [Video & Webcam](video-webcam.md) - Real-time processing
- [Face Search](face-search.md) - Search through embeddings
- [Image Pipeline](image-pipeline.md) - Full analysis pipeline
- [Detection Module](../modules/detection.md) - Detection options

114
docs/recipes/custom-models.md
Normal file
@@ -0,0 +1,114 @@

# Custom Models

Add your own ONNX models to UniFace.

!!! note "Work in Progress"
    This page contains example code patterns for advanced users. Test thoroughly before using in production.

---

## Overview

UniFace is designed to be extensible. You can add custom ONNX models by:

1. Creating a class that inherits from the appropriate base class
2. Implementing the required methods
3. Using the ONNX Runtime utilities provided by UniFace

---

## Add Custom Detection Model

```python
from uniface.detection.base import BaseDetector
from uniface.onnx_utils import create_onnx_session
from uniface.types import Face
import numpy as np

class MyDetector(BaseDetector):
    def __init__(self, model_path: str, confidence_threshold: float = 0.5):
        self.session = create_onnx_session(model_path)
        self.threshold = confidence_threshold

    def detect(self, image: np.ndarray) -> list[Face]:
        # 1. Preprocess image
        input_tensor = self._preprocess(image)

        # 2. Run inference
        outputs = self.session.run(None, {'input': input_tensor})

        # 3. Postprocess outputs to Face objects
        faces = self._postprocess(outputs, image.shape)
        return faces

    def _preprocess(self, image):
        # Your preprocessing logic
        # e.g., resize, normalize, transpose
        pass

    def _postprocess(self, outputs, shape):
        # Your postprocessing logic
        # e.g., decode boxes, apply NMS, create Face objects
        pass
```
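
For orientation, here is one way `_preprocess` might look for a detector that expects a 640x640 RGB, CHW, float32 input. The input size, normalization, and tensor layout are assumptions; match them to your own model:

```python
import cv2
import numpy as np

def _preprocess(self, image: np.ndarray) -> np.ndarray:
    # Resize to the model's fixed input size (assumed 640x640 here)
    resized = cv2.resize(image, (640, 640))

    # BGR -> RGB and scale to [0, 1] (assumed normalization)
    rgb = cv2.cvtColor(resized, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0

    # HWC -> CHW, plus a batch dimension: (1, 3, 640, 640)
    return np.transpose(rgb, (2, 0, 1))[np.newaxis, ...]
```
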
---

## Add Custom Recognition Model

```python
from uniface.recognition.base import BaseRecognizer
from uniface.onnx_utils import create_onnx_session
from uniface import face_alignment
import numpy as np

class MyRecognizer(BaseRecognizer):
    def __init__(self, model_path: str):
        self.session = create_onnx_session(model_path)

    def get_normalized_embedding(
        self,
        image: np.ndarray,
        landmarks: np.ndarray
    ) -> np.ndarray:
        # 1. Align face
        aligned = face_alignment(image, landmarks)

        # 2. Preprocess
        input_tensor = self._preprocess(aligned)

        # 3. Run inference
        embedding = self.session.run(None, {'input': input_tensor})[0]

        # 4. Normalize
        embedding = embedding / np.linalg.norm(embedding)
        return embedding

    def _preprocess(self, image):
        # Your preprocessing logic
        pass
```

---

## Usage

```python
from my_module import MyDetector, MyRecognizer

# Use custom models
detector = MyDetector("path/to/detection_model.onnx")
recognizer = MyRecognizer("path/to/recognition_model.onnx")

# Use like built-in models
faces = detector.detect(image)
embedding = recognizer.get_normalized_embedding(image, faces[0].landmarks)
```

---

## See Also

- [Detection Module](../modules/detection.md) - Built-in detection models
- [Recognition Module](../modules/recognition.md) - Built-in recognition models
- [Concepts: Overview](../concepts/overview.md) - Architecture overview

178
docs/recipes/face-search.md
Normal file
@@ -0,0 +1,178 @@

# Face Search

Build a face search system for finding people in images.

!!! note "Work in Progress"
    This page contains example code patterns. Test thoroughly before using in production.

---

## Basic Face Database

```python
import numpy as np
import cv2
from pathlib import Path
from uniface import RetinaFace, ArcFace

class FaceDatabase:
    def __init__(self):
        self.detector = RetinaFace()
        self.recognizer = ArcFace()
        self.embeddings = {}

    def add_face(self, person_id, image):
        """Add a face to the database."""
        faces = self.detector.detect(image)
        if not faces:
            raise ValueError(f"No face found for {person_id}")

        face = max(faces, key=lambda f: f.confidence)
        embedding = self.recognizer.get_normalized_embedding(image, face.landmarks)
        self.embeddings[person_id] = embedding
        return True

    def search(self, image, threshold=0.6):
        """Search for faces in an image."""
        faces = self.detector.detect(image)
        results = []

        for face in faces:
            embedding = self.recognizer.get_normalized_embedding(image, face.landmarks)

            best_match = None
            best_similarity = -1

            for person_id, db_embedding in self.embeddings.items():
                similarity = np.dot(embedding, db_embedding.T)[0][0]
                if similarity > best_similarity:
                    best_similarity = similarity
                    best_match = person_id

            results.append({
                'bbox': face.bbox,
                'match': best_match if best_similarity >= threshold else None,
                'similarity': best_similarity
            })

        return results

    def save(self, path):
        """Save database to file."""
        np.savez(path, embeddings=dict(self.embeddings))

    def load(self, path):
        """Load database from file."""
        data = np.load(path, allow_pickle=True)
        self.embeddings = data['embeddings'].item()

# Usage
db = FaceDatabase()

# Add faces
for image_path in Path("known_faces/").glob("*.jpg"):
    person_id = image_path.stem
    image = cv2.imread(str(image_path))
    try:
        db.add_face(person_id, image)
        print(f"Added: {person_id}")
    except ValueError as e:
        print(f"Skipped: {e}")

# Save database
db.save("face_database.npz")

# Search
query_image = cv2.imread("group_photo.jpg")
results = db.search(query_image)

for r in results:
    if r['match']:
        print(f"Found: {r['match']} (similarity: {r['similarity']:.3f})")
```

---

## Visualization

```python
import cv2

def visualize_search_results(image, results):
    """Draw search results on image."""
    for r in results:
        x1, y1, x2, y2 = map(int, r['bbox'])

        if r['match']:
            color = (0, 255, 0)  # Green for match
            label = f"{r['match']} ({r['similarity']:.2f})"
        else:
            color = (0, 0, 255)  # Red for unknown
            label = f"Unknown ({r['similarity']:.2f})"

        cv2.rectangle(image, (x1, y1), (x2, y2), color, 2)
        cv2.putText(image, label, (x1, y1 - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)

    return image

# Usage
results = db.search(image)
annotated = visualize_search_results(image.copy(), results)
cv2.imwrite("search_result.jpg", annotated)
```

---

## Real-Time Search

```python
import cv2

def realtime_search(db):
    """Real-time face search from webcam."""
    cap = cv2.VideoCapture(0)

    while True:
        ret, frame = cap.read()
        if not ret:
            break

        results = db.search(frame, threshold=0.5)

        for r in results:
            x1, y1, x2, y2 = map(int, r['bbox'])

            if r['match']:
                color = (0, 255, 0)
                label = r['match']
            else:
                color = (0, 0, 255)
                label = "Unknown"

            cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)
            cv2.putText(frame, label, (x1, y1 - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)

        cv2.imshow("Face Search", frame)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()

# Usage
db = FaceDatabase()
db.load("face_database.npz")
realtime_search(db)
```

---

## See Also

- [Recognition Module](../modules/recognition.md) - Face recognition details
- [Batch Processing](batch-processing.md) - Process multiple files
- [Video & Webcam](video-webcam.md) - Real-time processing
- [Concepts: Thresholds](../concepts/thresholds-calibration.md) - Tuning similarity thresholds

281
docs/recipes/image-pipeline.md
Normal file
@@ -0,0 +1,281 @@
# Image Pipeline

A complete pipeline for processing images with detection, recognition, and attribute analysis.

---

## Basic Pipeline

```python
import cv2
from uniface import RetinaFace, ArcFace, AgeGender
from uniface.visualization import draw_detections

# Initialize models
detector = RetinaFace()
recognizer = ArcFace()
age_gender = AgeGender()

def process_image(image_path):
    """Process a single image through the full pipeline."""
    # Load image
    image = cv2.imread(image_path)

    # Step 1: Detect faces
    faces = detector.detect(image)
    print(f"Found {len(faces)} face(s)")

    results = []

    for i, face in enumerate(faces):
        # Step 2: Extract embedding
        embedding = recognizer.get_normalized_embedding(image, face.landmarks)

        # Step 3: Predict attributes
        attrs = age_gender.predict(image, face.bbox)

        results.append({
            'face_id': i,
            'bbox': face.bbox,
            'confidence': face.confidence,
            'embedding': embedding,
            'gender': attrs.sex,
            'age': attrs.age
        })

        print(f"  Face {i+1}: {attrs.sex}, {attrs.age} years old")

    # Visualize
    draw_detections(
        image=image,
        bboxes=[f.bbox for f in faces],
        scores=[f.confidence for f in faces],
        landmarks=[f.landmarks for f in faces]
    )

    return image, results

# Usage
result_image, results = process_image("photo.jpg")
cv2.imwrite("result.jpg", result_image)
```

---

## Using FaceAnalyzer

For convenience, use the built-in `FaceAnalyzer`:

```python
from uniface import FaceAnalyzer
import cv2

# Initialize with desired modules
analyzer = FaceAnalyzer(
    detect=True,
    recognize=True,
    attributes=True
)

# Process image
image = cv2.imread("photo.jpg")
faces = analyzer.analyze(image)

# Access enriched Face objects
for face in faces:
    print(f"Confidence: {face.confidence:.2f}")
    print(f"Embedding: {face.embedding.shape}")
    print(f"Age: {face.age}, Gender: {face.sex}")
```
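
The enriched `Face` objects make one-to-one comparison straightforward. A minimal sketch, assuming `compute_similarity` (exported from `uniface.face_utils`) accepts two embeddings and returns a cosine-similarity score; the image paths are placeholders:

```python
from uniface import FaceAnalyzer, compute_similarity
import cv2

analyzer = FaceAnalyzer(detect=True, recognize=True)

faces_a = analyzer.analyze(cv2.imread("person_a.jpg"))  # placeholder paths
faces_b = analyzer.analyze(cv2.imread("person_b.jpg"))

if faces_a and faces_b:
    # Compare the first detected face in each image
    score = compute_similarity(faces_a[0].embedding, faces_b[0].embedding)
    print(f"Similarity: {score:.3f}")  # Calibrate the accept threshold per model
```
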
---

## Full Analysis Pipeline

Complete pipeline with all modules:

```python
import cv2
import numpy as np
from uniface import (
    RetinaFace, ArcFace, AgeGender, FairFace,
    Landmark106, MobileGaze
)
from uniface.parsing import BiSeNet
from uniface.spoofing import MiniFASNet
from uniface.visualization import draw_detections, draw_gaze

class FaceAnalysisPipeline:
    def __init__(self):
        # Initialize all models
        self.detector = RetinaFace()
        self.recognizer = ArcFace()
        self.age_gender = AgeGender()
        self.fairface = FairFace()
        self.landmarker = Landmark106()
        self.gaze = MobileGaze()
        self.parser = BiSeNet()
        self.spoofer = MiniFASNet()

    def analyze(self, image):
        """Run full analysis pipeline."""
        faces = self.detector.detect(image)
        results = []

        for face in faces:
            result = {
                'bbox': face.bbox,
                'confidence': face.confidence,
                'landmarks_5': face.landmarks
            }

            # Recognition embedding
            result['embedding'] = self.recognizer.get_normalized_embedding(
                image, face.landmarks
            )

            # Attributes
            ag_result = self.age_gender.predict(image, face.bbox)
            result['age'] = ag_result.age
            result['gender'] = ag_result.sex

            # FairFace attributes
            ff_result = self.fairface.predict(image, face.bbox)
            result['age_group'] = ff_result.age_group
            result['race'] = ff_result.race

            # 106-point landmarks
            result['landmarks_106'] = self.landmarker.get_landmarks(
                image, face.bbox
            )

            # Gaze estimation
            x1, y1, x2, y2 = map(int, face.bbox)
            face_crop = image[y1:y2, x1:x2]
            if face_crop.size > 0:
                gaze_result = self.gaze.estimate(face_crop)
                result['gaze_pitch'] = gaze_result.pitch
                result['gaze_yaw'] = gaze_result.yaw

            # Face parsing
            if face_crop.size > 0:
                result['parsing_mask'] = self.parser.parse(face_crop)

            # Anti-spoofing
            spoof_result = self.spoofer.predict(image, face.bbox)
            result['is_real'] = spoof_result.is_real
            result['spoof_confidence'] = spoof_result.confidence

            results.append(result)

        return results

# Usage
pipeline = FaceAnalysisPipeline()
results = pipeline.analyze(cv2.imread("photo.jpg"))

for i, r in enumerate(results):
    print(f"\nFace {i+1}:")
    print(f"  Gender: {r['gender']}, Age: {r['age']}")
    print(f"  Race: {r['race']}, Age Group: {r['age_group']}")
    print(f"  Gaze: pitch={np.degrees(r['gaze_pitch']):.1f}°")
    print(f"  Real: {r['is_real']} ({r['spoof_confidence']:.1%})")
```

---

## Visualization Pipeline

```python
import cv2
import numpy as np
from uniface import RetinaFace, AgeGender, MobileGaze
from uniface.visualization import draw_detections, draw_gaze

def visualize_analysis(image_path, output_path):
    """Create annotated visualization of face analysis."""
    detector = RetinaFace()
    age_gender = AgeGender()
    gaze = MobileGaze()

    image = cv2.imread(image_path)
    faces = detector.detect(image)

    for face in faces:
        x1, y1, x2, y2 = map(int, face.bbox)

        # Draw bounding box
        cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)

        # Age and gender
        attrs = age_gender.predict(image, face.bbox)
        label = f"{attrs.sex}, {attrs.age}y"
        cv2.putText(image, label, (x1, y1 - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)

        # Gaze
        face_crop = image[y1:y2, x1:x2]
        if face_crop.size > 0:
            gaze_result = gaze.estimate(face_crop)
            draw_gaze(image, face.bbox, gaze_result.pitch, gaze_result.yaw)

        # Confidence
        conf_label = f"{face.confidence:.0%}"
        cv2.putText(image, conf_label, (x1, y2 + 20),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)

    cv2.imwrite(output_path, image)
    print(f"Saved to {output_path}")

# Usage
visualize_analysis("input.jpg", "output.jpg")
```

---

## JSON Output

Export results to JSON:

```python
import json
import numpy as np

def results_to_json(results):
    """Convert analysis results to JSON-serializable format."""
    output = []

    for r in results:
        item = {
            'bbox': r['bbox'].tolist(),
            'confidence': float(r['confidence']),
            'age': int(r['age']) if r.get('age') else None,
            'gender': r.get('gender'),
            'race': r.get('race'),
            'is_real': r.get('is_real'),
            'gaze': {
                'pitch_deg': float(np.degrees(r['gaze_pitch'])) if 'gaze_pitch' in r else None,
                'yaw_deg': float(np.degrees(r['gaze_yaw'])) if 'gaze_yaw' in r else None
            }
        }
        output.append(item)

    return output

# Usage
results = pipeline.analyze(image)
json_data = results_to_json(results)

with open('results.json', 'w') as f:
    json.dump(json_data, f, indent=2)
```
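
Reading the export back is plain `json`; a short sketch against the `results.json` written above:

```python
import json

with open('results.json') as f:
    records = json.load(f)

# Keep only faces that passed anti-spoofing
live = [rec for rec in records if rec['is_real']]
print(f"{len(live)}/{len(records)} faces passed the spoof check")
```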

---

## Next Steps

- [Batch Processing](batch-processing.md) - Process multiple images
- [Video & Webcam](video-webcam.md) - Real-time processing
- [Face Search](face-search.md) - Build a search system
- [Detection Module](../modules/detection.md) - Detection options
- [Recognition Module](../modules/recognition.md) - Recognition details
125 docs/recipes/video-webcam.md Normal file
@@ -0,0 +1,125 @@
# Video & Webcam

Real-time face analysis for video streams.

!!! note "Work in Progress"
    This page contains example code patterns. Test thoroughly before using in production.

---

## Webcam Detection

```python
import cv2
from uniface import RetinaFace
from uniface.visualization import draw_detections

detector = RetinaFace()
cap = cv2.VideoCapture(0)

print("Press 'q' to quit")

while True:
    ret, frame = cap.read()
    if not ret:
        break

    faces = detector.detect(frame)

    draw_detections(
        image=frame,
        bboxes=[f.bbox for f in faces],
        scores=[f.confidence for f in faces],
        landmarks=[f.landmarks for f in faces]
    )

    cv2.imshow("Face Detection", frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```

---

## Video File Processing

```python
import cv2
from uniface import RetinaFace

def process_video(input_path, output_path):
    """Process a video file."""
    detector = RetinaFace()
    cap = cv2.VideoCapture(input_path)

    # Get video properties
    fps = cap.get(cv2.CAP_PROP_FPS)
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

    # Setup output
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))

    while True:
        ret, frame = cap.read()
        if not ret:
            break

        faces = detector.detect(frame)
        # ... process and draw ...

        out.write(frame)

    cap.release()
    out.release()

# Usage
process_video("input.mp4", "output.mp4")
```
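
For long files, a progress readout helps; this sketch drops into the read loop above and uses OpenCV's frame-count property (which can report 0 for some containers, hence the guard):

```python
total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
processed = 0

while True:
    ret, frame = cap.read()
    if not ret:
        break
    processed += 1
    if total > 0 and processed % 100 == 0:
        print(f"{processed}/{total} frames ({processed / total:.0%})")
```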

---

## Performance Tips

### Skip Frames

```python
PROCESS_EVERY_N = 3  # Process every 3rd frame
frame_count = 0
last_faces = []

while True:
    ret, frame = cap.read()
    if not ret:
        break
    if frame_count % PROCESS_EVERY_N == 0:
        last_faces = detector.detect(frame)
    frame_count += 1
    # Draw last_faces...
```

### FPS Counter

```python
import time

prev_time = time.time()
while True:
    curr_time = time.time()
    fps = 1 / (curr_time - prev_time)
    prev_time = curr_time

    cv2.putText(frame, f"FPS: {fps:.1f}", (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
```

---

## See Also

- [Anonymize Stream](anonymize-stream.md) - Privacy protection in video
- [Batch Processing](batch-processing.md) - Process multiple files
- [Detection Module](../modules/detection.md) - Detection options
- [Gaze Module](../modules/gaze.md) - Gaze tracking
225 docs/stylesheets/extra.css Normal file
@@ -0,0 +1,225 @@
/* UniFace Documentation - Custom Styles */

/* ===== Hero Section ===== */

.md-content .hero {
  text-align: center;
  padding: 3rem 1rem 2rem;
  margin: 0 auto;
  max-width: 900px;
}

.hero-title {
  font-size: 3.5rem !important;
  font-weight: 800 !important;
  margin-bottom: 0.5rem !important;
  background: linear-gradient(135deg, var(--md-primary-fg-color) 0%, #7c4dff 100%);
  -webkit-background-clip: text;
  -webkit-text-fill-color: transparent;
  background-clip: text;
}

.hero-tagline {
  font-size: 1.5rem;
  color: var(--md-default-fg-color);
  margin-bottom: 0.5rem !important;
  font-weight: 500;
}

.hero-subtitle {
  font-size: 1rem;
  color: var(--md-default-fg-color--light);
  margin-bottom: 1.5rem !important;
  font-weight: 400;
  letter-spacing: 0.5px;
}

.hero .md-button {
  margin: 0.5rem 0.25rem;
  padding: 0.7rem 1.5rem;
  font-weight: 600;
  border-radius: 8px;
  transition: all 0.2s ease;
}

.hero .md-button--primary {
  background: linear-gradient(135deg, var(--md-primary-fg-color) 0%, #5c6bc0 100%);
  border: none;
  box-shadow: 0 4px 14px rgba(63, 81, 181, 0.4);
}

.hero .md-button--primary:hover {
  transform: translateY(-2px);
  box-shadow: 0 6px 20px rgba(63, 81, 181, 0.5);
}

.hero .md-button:not(.md-button--primary) {
  border: 2px solid var(--md-primary-fg-color);
  background: transparent;
  color: var(--md-primary-fg-color);
}

.hero .md-button:not(.md-button--primary):hover {
  background: var(--md-primary-fg-color);
  border-color: var(--md-primary-fg-color);
  color: white;
  transform: translateY(-2px);
}

/* Badge styling in hero */
.hero p a img {
  margin: 0 3px;
  height: 24px !important;
}

/* ===== Feature Grid ===== */
.feature-grid {
  display: grid;
  grid-template-columns: repeat(4, 1fr);
  gap: 1.25rem;
  margin: 2rem 0;
}

.feature-card {
  padding: 1.5rem;
  border-radius: 12px;
  background: var(--md-code-bg-color);
  border: 1px solid var(--md-default-fg-color--lightest);
  transition: all 0.3s ease;
  position: relative;
  overflow: hidden;
}

.feature-card::before {
  content: '';
  position: absolute;
  top: 0;
  left: 0;
  right: 0;
  height: 3px;
  background: linear-gradient(90deg, var(--md-primary-fg-color), #7c4dff);
  opacity: 0;
  transition: opacity 0.3s ease;
}

.feature-card:hover {
  transform: translateY(-4px);
  box-shadow: 0 12px 24px rgba(0, 0, 0, 0.1);
  border-color: var(--md-primary-fg-color--light);
}

.feature-card:hover::before {
  opacity: 1;
}

.feature-card h3 {
  margin-top: 0 !important;
  margin-bottom: 0.75rem !important;
  font-size: 1rem !important;
  font-weight: 600;
  display: flex;
  align-items: center;
  gap: 0.5rem;
}

.feature-card p {
  margin: 0;
  font-size: 0.875rem;
  color: var(--md-default-fg-color--light);
  line-height: 1.5;
}

.feature-card a {
  display: inline-block;
  margin-top: 0.75rem;
  font-weight: 500;
  font-size: 0.875rem;
}

/* ===== Next Steps Grid (2 columns) ===== */
.next-steps-grid {
  display: grid;
  grid-template-columns: repeat(2, 1fr);
  gap: 1.25rem;
  margin: 2rem 0;
}

.next-steps-grid .feature-card {
  padding: 2rem;
}

.next-steps-grid .feature-card h3 {
  font-size: 1.1rem !important;
}

/* ===== Dark Mode Adjustments ===== */
[data-md-color-scheme="slate"] .hero-title {
  background: linear-gradient(135deg, #7c4dff 0%, #b388ff 100%);
  -webkit-background-clip: text;
  -webkit-text-fill-color: transparent;
  background-clip: text;
}

[data-md-color-scheme="slate"] .feature-card:hover {
  box-shadow: 0 12px 24px rgba(0, 0, 0, 0.3);
}

[data-md-color-scheme="slate"] .hero .md-button--primary {
  background: linear-gradient(135deg, #7c4dff 0%, #b388ff 100%);
  box-shadow: 0 4px 14px rgba(124, 77, 255, 0.4);
}

[data-md-color-scheme="slate"] .hero .md-button--primary:hover {
  box-shadow: 0 6px 20px rgba(124, 77, 255, 0.5);
}

[data-md-color-scheme="slate"] .hero .md-button:not(.md-button--primary) {
  border: 2px solid rgba(255, 255, 255, 0.3);
  background: rgba(255, 255, 255, 0.05);
  color: rgba(255, 255, 255, 0.9);
}

[data-md-color-scheme="slate"] .hero .md-button:not(.md-button--primary):hover {
  background: rgba(255, 255, 255, 0.1);
  border-color: rgba(255, 255, 255, 0.5);
  color: white;
  transform: translateY(-2px);
}

/* ===== Responsive Design ===== */
@media (max-width: 1200px) {
  .feature-grid {
    grid-template-columns: repeat(2, 1fr);
  }
}

@media (max-width: 768px) {
  .hero-title {
    font-size: 2.5rem !important;
  }

  .hero-subtitle {
    font-size: 1.1rem;
  }

  .feature-grid,
  .next-steps-grid {
    grid-template-columns: 1fr;
  }

  .hero .md-button {
    display: block;
    margin: 0.5rem auto;
    max-width: 200px;
  }
}

@media (max-width: 480px) {
  .hero-title {
    font-size: 2rem !important;
  }

  .feature-card {
    padding: 1.25rem;
  }
}
@@ -25,7 +25,14 @@
}
],
"source": [
"%pip install -q uniface"
"%pip install -q uniface\n",
"\n",
"# Clone repo for assets (Colab only)\n",
"import os\n",
"if 'COLAB_GPU' in os.environ or 'COLAB_RELEASE_TAG' in os.environ:\n",
"    if not os.path.exists('uniface'):\n",
"        !git clone --depth 1 https://github.com/yakhyo/uniface.git\n",
"    os.chdir('uniface/examples')"
]
},
{
@@ -71,15 +78,7 @@
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"✓ Model loaded (CoreML (Apple Silicon))\n"
]
}
],
"outputs": [],
"source": [
"detector = RetinaFace(\n",
"    confidence_threshold=0.5,\n",

@@ -29,7 +29,14 @@
}
],
"source": [
"%pip install -q uniface"
"%pip install -q uniface\n",
"\n",
"# Clone repo for assets (Colab only)\n",
"import os\n",
"if 'COLAB_GPU' in os.environ or 'COLAB_RELEASE_TAG' in os.environ:\n",
"    if not os.path.exists('uniface'):\n",
"        !git clone --depth 1 https://github.com/yakhyo/uniface.git\n",
"    os.chdir('uniface/examples')"
]
},
{
@@ -76,15 +83,7 @@
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"✓ Model loaded (CoreML (Apple Silicon))\n"
]
}
],
"outputs": [],
"source": [
"detector = RetinaFace(\n",
"    confidence_threshold=0.5,\n",

@@ -25,7 +25,14 @@
}
],
"source": [
"%pip install -q uniface"
"%pip install -q uniface\n",
"\n",
"# Clone repo for assets (Colab only)\n",
"import os\n",
"if 'COLAB_GPU' in os.environ or 'COLAB_RELEASE_TAG' in os.environ:\n",
"    if not os.path.exists('uniface'):\n",
"        !git clone --depth 1 https://github.com/yakhyo/uniface.git\n",
"    os.chdir('uniface/examples')"
]
},
{
@@ -66,16 +73,7 @@
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"✓ Model loaded (CoreML (Apple Silicon))\n",
"✓ Model loaded (CoreML (Apple Silicon))\n"
]
}
],
"outputs": [],
"source": [
"analyzer = FaceAnalyzer(\n",
"    detector=RetinaFace(confidence_threshold=0.5),\n",

@@ -11,7 +11,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"metadata": {},
"outputs": [
{
@@ -23,7 +23,14 @@
}
],
"source": [
"%pip install -q uniface"
"%pip install -q uniface\n",
"\n",
"# Clone repo for assets (Colab only)\n",
"import os\n",
"if 'COLAB_GPU' in os.environ or 'COLAB_RELEASE_TAG' in os.environ:\n",
"    if not os.path.exists('uniface'):\n",
"        !git clone --depth 1 https://github.com/yakhyo/uniface.git\n",
"    os.chdir('uniface/examples')"
]
},
{

@@ -25,7 +25,14 @@
}
],
"source": [
"%pip install -q uniface"
"%pip install -q uniface\n",
"\n",
"# Clone repo for assets (Colab only)\n",
"import os\n",
"if 'COLAB_GPU' in os.environ or 'COLAB_RELEASE_TAG' in os.environ:\n",
"    if not os.path.exists('uniface'):\n",
"        !git clone --depth 1 https://github.com/yakhyo/uniface.git\n",
"    os.chdir('uniface/examples')"
]
},
{
@@ -75,17 +82,7 @@
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"✓ Model loaded (CoreML (Apple Silicon))\n",
"✓ Model loaded (CoreML (Apple Silicon))\n",
"✓ Model loaded (CoreML (Apple Silicon))\n"
]
}
],
"outputs": [],
"source": [
"analyzer = FaceAnalyzer(\n",
"    detector=RetinaFace(confidence_threshold=0.5),\n",

@@ -15,7 +15,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"metadata": {},
"outputs": [
{
@@ -27,7 +27,14 @@
}
],
"source": [
"%pip install -q uniface"
"%pip install -q uniface\n",
"\n",
"# Clone repo for assets (Colab only)\n",
"import os\n",
"if 'COLAB_GPU' in os.environ or 'COLAB_RELEASE_TAG' in os.environ:\n",
"    if not os.path.exists('uniface'):\n",
"        !git clone --depth 1 https://github.com/yakhyo/uniface.git\n",
"    os.chdir('uniface/examples')"
]
},
{

File diff suppressed because one or more lines are too long
@@ -25,7 +25,14 @@
}
],
"source": [
"%pip install -q uniface"
"%pip install -q uniface\n",
"\n",
"# Clone repo for assets (Colab only)\n",
"import os\n",
"if 'COLAB_GPU' in os.environ or 'COLAB_RELEASE_TAG' in os.environ:\n",
"    if not os.path.exists('uniface'):\n",
"        !git clone --depth 1 https://github.com/yakhyo/uniface.git\n",
"    os.chdir('uniface/examples')"
]
},
{
@@ -74,16 +81,7 @@
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"✓ Model loaded (CoreML (Apple Silicon))\n",
"✓ Model loaded (CoreML (Apple Silicon))\n"
]
}
],
"outputs": [],
"source": [
"# Initialize face detector\n",
"detector = RetinaFace(confidence_threshold=0.5)\n",
@@ -103,7 +101,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 4,
"metadata": {},
"outputs": [
{
@@ -156,14 +154,14 @@
"    face_crop = image[y1:y2, x1:x2]\n",
"\n",
"    if face_crop.size > 0:\n",
"        pitch, yaw = gaze_estimator.estimate(face_crop)\n",
"        pitch_deg = np.degrees(pitch)\n",
"        yaw_deg = np.degrees(yaw)\n",
"        gaze = gaze_estimator.estimate(face_crop)\n",
"        pitch_deg = np.degrees(gaze.pitch)\n",
"        yaw_deg = np.degrees(gaze.yaw)\n",
"\n",
"        print(f'  Face {i+1}: pitch={pitch_deg:.1f}°, yaw={yaw_deg:.1f}°')\n",
"\n",
"        # Draw gaze without angle text\n",
"        draw_gaze(image, face.bbox, pitch, yaw, draw_angles=False)\n",
"        draw_gaze(image, face.bbox, gaze.pitch, gaze.yaw, draw_angles=False)\n",
"\n",
"    # Convert BGR to RGB for display\n",
"    original_rgb = cv2.cvtColor(original, cv2.COLOR_BGR2RGB)\n",
@@ -234,7 +232,7 @@
"## Notes\n",
"\n",
"- **Input**: Gaze estimation requires a face crop (obtained from face detection)\n",
"- **Output**: Returns (pitch, yaw) angles in radians\n",
"- **Output**: Returns a `GazeResult` object with `pitch` and `yaw` attributes (angles in radians)\n",
"- **Visualization**: `draw_gaze()` automatically draws bounding box and gaze arrow\n",
"- **Models**: Trained on Gaze360 dataset with diverse head poses\n",
"- **Performance**: MAE (Mean Absolute Error) ranges from 11-13 degrees\n",
164 mkdocs.yml Normal file
@@ -0,0 +1,164 @@
site_name: UniFace
site_description: All-in-One Face Analysis Library with ONNX Runtime
site_author: Yakhyokhuja Valikhujaev
site_url: https://yakhyo.github.io/uniface

repo_name: yakhyo/uniface
repo_url: https://github.com/yakhyo/uniface
edit_uri: edit/main/docs/

copyright: Copyright © 2025 Yakhyokhuja Valikhujaev

theme:
  name: material
  custom_dir: docs/overrides
  palette:
    - media: "(prefers-color-scheme)"
      toggle:
        icon: material/link
        name: Switch to light mode
    - media: "(prefers-color-scheme: light)"
      scheme: default
      primary: indigo
      accent: indigo
      toggle:
        icon: material/toggle-switch
        name: Switch to dark mode
    - media: "(prefers-color-scheme: dark)"
      scheme: slate
      primary: black
      accent: indigo
      toggle:
        icon: material/toggle-switch-off-outline
        name: Switch to system preference
  font:
    text: Roboto
    code: Roboto Mono
  features:
    - navigation.tabs
    - navigation.top
    - navigation.footer
    - navigation.indexes
    - navigation.instant
    - navigation.tracking
    - search.suggest
    - search.highlight
    - content.code.copy
    - content.code.annotate
    - content.action.edit
    - content.action.view
    - content.tabs.link
    - toc.follow

  icon:
    logo: material/book-open-page-variant
    repo: fontawesome/brands/git-alt
    admonition:
      note: octicons/tag-16
      abstract: octicons/checklist-16
      info: octicons/info-16
      tip: octicons/squirrel-16
      success: octicons/check-16
      question: octicons/question-16
      warning: octicons/alert-16
      failure: octicons/x-circle-16
      danger: octicons/zap-16
      bug: octicons/bug-16
      example: octicons/beaker-16
      quote: octicons/quote-16

extra:
  social:
    - icon: fontawesome/brands/github
      link: https://github.com/yakhyo
    - icon: fontawesome/brands/python
      link: https://pypi.org/project/uniface/
    - icon: fontawesome/brands/x-twitter
      link: https://x.com/y_valikhujaev
  analytics:
    provider: google
    property: G-FGEHR2K5ZE

extra_css:
  - stylesheets/extra.css

markdown_extensions:
  - admonition
  - footnotes
  - attr_list
  - md_in_html
  - def_list
  - tables
  - toc:
      permalink: false
      toc_depth: 3
  - pymdownx.superfences:
      custom_fences:
        - name: mermaid
          class: mermaid
          format: !!python/name:pymdownx.superfences.fence_code_format
  - pymdownx.details
  - pymdownx.highlight:
      anchor_linenums: true
      line_spans: __span
      pygments_lang_class: true
  - pymdownx.inlinehilite
  - pymdownx.snippets
  - pymdownx.tabbed:
      alternate_style: true
  - pymdownx.emoji:
      emoji_index: !!python/name:material.extensions.emoji.twemoji
      emoji_generator: !!python/name:material.extensions.emoji.to_svg
  - pymdownx.tasklist:
      custom_checkbox: true
  - pymdownx.keys
  - pymdownx.mark
  - pymdownx.critic
  - pymdownx.caret
  - pymdownx.tilde

plugins:
  - search
  - git-committers:
      repository: yakhyo/uniface
      branch: main
      token: !ENV MKDOCS_GIT_COMMITTERS_APIKEY
  - git-revision-date-localized:
      enable_creation_date: true
      type: timeago

nav:
  - Home: index.md
  - Getting Started:
      - Installation: installation.md
      - Quickstart: quickstart.md
      - Notebooks: notebooks.md
      - Model Zoo: models.md
  - Tutorials:
      - Image Pipeline: recipes/image-pipeline.md
      - Video & Webcam: recipes/video-webcam.md
      - Face Search: recipes/face-search.md
      - Batch Processing: recipes/batch-processing.md
      - Anonymize Stream: recipes/anonymize-stream.md
      - Custom Models: recipes/custom-models.md
  - API Reference:
      - Detection: modules/detection.md
      - Recognition: modules/recognition.md
      - Landmarks: modules/landmarks.md
      - Attributes: modules/attributes.md
      - Parsing: modules/parsing.md
      - Gaze: modules/gaze.md
      - Anti-Spoofing: modules/spoofing.md
      - Privacy: modules/privacy.md
  - Guides:
      - Overview: concepts/overview.md
      - Inputs & Outputs: concepts/inputs-outputs.md
      - Coordinate Systems: concepts/coordinate-systems.md
      - Execution Providers: concepts/execution-providers.md
      - Model Cache: concepts/model-cache-offline.md
      - Thresholds: concepts/thresholds-calibration.md
  - Resources:
      - Contributing: contributing.md
      - License: license-attribution.md
      - Releases: https://github.com/yakhyo/uniface/releases
      - Discussions: https://github.com/yakhyo/uniface/discussions
@@ -1,15 +1,15 @@
[project]
name = "uniface"
version = "2.0.0"
version = "2.2.0"
description = "UniFace: A Comprehensive Library for Face Detection, Recognition, Landmark Analysis, Face Parsing, Gaze Estimation, Age, and Gender Detection"
readme = "README.md"
license = { text = "MIT" }
license = "MIT"
authors = [{ name = "Yakhyokhuja Valikhujaev", email = "yakhyo9696@gmail.com" }]
maintainers = [
{ name = "Yakhyokhuja Valikhujaev", email = "yakhyo9696@gmail.com" },
]

requires-python = ">=3.11,<3.14"
requires-python = ">=3.10,<3.14"
keywords = [
"face-detection",
"face-recognition",
@@ -31,9 +31,9 @@ classifiers = [
"Development Status :: 4 - Beta",
"Intended Audience :: Developers",
"Intended Audience :: Science/Research",
"License :: OSI Approved :: MIT License",
"Operating System :: OS Independent",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
@@ -72,7 +72,7 @@ uniface = ["py.typed"]

[tool.ruff]
line-length = 120
target-version = "py311"
target-version = "py310"
exclude = [
".git",
".ruff_cache",
@@ -128,15 +128,6 @@ section-order = [
[tool.ruff.lint.pydocstyle]
convention = "google"

[tool.mypy]
python_version = "3.11"
warn_return_any = false
warn_unused_ignores = true
ignore_missing_imports = true
exclude = ["tests/", "scripts/", "examples/"]
# Disable strict return type checking for numpy operations
disable_error_code = ["no-any-return"]

[tool.bandit]
exclude_dirs = ["tests", "scripts", "examples"]
skips = ["B101", "B614"] # B101: assert, B614: torch.jit.load (models are SHA256 verified)
@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -18,7 +18,7 @@ from pathlib import Path

import cv2

from uniface.detection import SCRFD, RetinaFace, YOLOv5Face
from uniface.detection import SCRFD, RetinaFace, YOLOv5Face, YOLOv8Face
from uniface.visualization import draw_detections

IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
@@ -157,7 +157,9 @@ def run_camera(detector, camera_id: int = 0, threshold: float = 0.6):
def main():
    parser = argparse.ArgumentParser(description='Run face detection')
    parser.add_argument('--source', type=str, required=True, help='Image/video path or camera ID (0, 1, ...)')
    parser.add_argument('--method', type=str, default='retinaface', choices=['retinaface', 'scrfd', 'yolov5face'])
    parser.add_argument(
        '--method', type=str, default='retinaface', choices=['retinaface', 'scrfd', 'yolov5face', 'yolov8face']
    )
    parser.add_argument('--threshold', type=float, default=0.25, help='Visualization threshold')
    parser.add_argument('--save-dir', type=str, default='outputs', help='Output directory')
    args = parser.parse_args()
@@ -167,10 +169,14 @@ def main():
        detector = RetinaFace()
    elif args.method == 'scrfd':
        detector = SCRFD()
    else:
    elif args.method == 'yolov5face':
        from uniface.constants import YOLOv5FaceWeights

        detector = YOLOv5Face(model_name=YOLOv5FaceWeights.YOLOV5M)
    else:  # yolov8face
        from uniface.constants import YOLOv8FaceWeights

        detector = YOLOv8Face(model_name=YOLOv8FaceWeights.YOLOV8N)

    # Determine source type and process
    source_type = get_source_type(args.source)

@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
#
# Licensed under the MIT License.
# You may obtain a copy of the License at
@@ -14,8 +14,8 @@
"""UniFace: A comprehensive library for face analysis.

This library provides unified APIs for:
- Face detection (RetinaFace, SCRFD, YOLOv5Face)
- Face recognition (ArcFace, MobileFace, SphereFace)
- Face detection (RetinaFace, SCRFD, YOLOv5Face, YOLOv8Face)
- Face recognition (AdaFace, ArcFace, MobileFace, SphereFace)
- Facial landmarks (106-point detection)
- Face parsing (semantic segmentation)
- Gaze estimation
@@ -28,7 +28,7 @@ from __future__ import annotations

__license__ = 'MIT'
__author__ = 'Yakhyokhuja Valikhujaev'
__version__ = '2.0.0'
__version__ = '2.2.0'

from uniface.face_utils import compute_similarity, face_alignment
from uniface.log import Logger, enable_logging
@@ -41,6 +41,7 @@ from .detection import (
    SCRFD,
    RetinaFace,
    YOLOv5Face,
    YOLOv8Face,
    create_detector,
    detect_faces,
    list_available_detectors,
@@ -49,7 +50,7 @@ from .gaze import MobileGaze, create_gaze_estimator
from .landmark import Landmark106, create_landmarker
from .parsing import BiSeNet, create_face_parser
from .privacy import BlurFace, anonymize_faces
from .recognition import ArcFace, MobileFace, SphereFace, create_recognizer
from .recognition import AdaFace, ArcFace, MobileFace, SphereFace, create_recognizer
from .spoofing import MiniFASNet, create_spoofer
from .types import AttributeResult, EmotionResult, Face, GazeResult, SpoofingResult

@@ -81,7 +82,9 @@ __all__ = [
    'RetinaFace',
    'SCRFD',
    'YOLOv5Face',
    'YOLOv8Face',
    # Recognition models
    'AdaFace',
    'ArcFace',
    'MobileFace',
    'SphereFace',

@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -32,6 +32,15 @@ class ArcFaceWeights(str, Enum):
    MNET = "arcface_mnet"
    RESNET = "arcface_resnet"


class AdaFaceWeights(str, Enum):
    """
    AdaFace model weights trained on WebFace datasets.
    https://github.com/yakhyo/adaface-onnx
    """
    IR_18 = "adaface_ir_18"
    IR_101 = "adaface_ir_101"

class RetinaFaceWeights(str, Enum):
    """
    Trained on WIDER FACE dataset.
@@ -70,6 +79,20 @@ class YOLOv5FaceWeights(str, Enum):
    YOLOV5M = "yolov5m"


class YOLOv8FaceWeights(str, Enum):
    """
    YOLOv8-Face models trained on WIDER FACE dataset.
    Uses anchor-free design with DFL (Distribution Focal Loss) for bbox regression.
    Exported to ONNX from: https://github.com/yakhyo/yolov8-face-onnx-inference

    Model Performance (WIDER FACE):
    - YOLOV8_LITE_S: 7.4MB, 93.4% Easy / 91.2% Medium / 78.6% Hard (lightweight)
    - YOLOV8N: 12MB, 94.6% Easy / 92.3% Medium / 79.6% Hard (recommended)
    """
    YOLOV8_LITE_S = "yolov8_lite_s"
    YOLOV8N = "yolov8n_face"


class DDAMFNWeights(str, Enum):
    """
    Trained on AffectNet dataset.
@@ -160,6 +183,9 @@ MODEL_URLS: dict[Enum, str] = {
    # ArcFace
    ArcFaceWeights.MNET: 'https://github.com/yakhyo/uniface/releases/download/weights/w600k_mbf.onnx',
    ArcFaceWeights.RESNET: 'https://github.com/yakhyo/uniface/releases/download/weights/w600k_r50.onnx',
    # AdaFace
    AdaFaceWeights.IR_18: 'https://github.com/yakhyo/adaface-onnx/releases/download/weights/adaface_ir_18.onnx',
    AdaFaceWeights.IR_101: 'https://github.com/yakhyo/adaface-onnx/releases/download/weights/adaface_ir_101.onnx',
    # SCRFD
    SCRFDWeights.SCRFD_10G_KPS: 'https://github.com/yakhyo/uniface/releases/download/weights/scrfd_10g_kps.onnx',
    SCRFDWeights.SCRFD_500M_KPS: 'https://github.com/yakhyo/uniface/releases/download/weights/scrfd_500m_kps.onnx',
@@ -167,6 +193,9 @@ MODEL_URLS: dict[Enum, str] = {
    YOLOv5FaceWeights.YOLOV5N: 'https://github.com/yakhyo/yolov5-face-onnx-inference/releases/download/weights/yolov5n_face.onnx',
    YOLOv5FaceWeights.YOLOV5S: 'https://github.com/yakhyo/yolov5-face-onnx-inference/releases/download/weights/yolov5s_face.onnx',
    YOLOv5FaceWeights.YOLOV5M: 'https://github.com/yakhyo/yolov5-face-onnx-inference/releases/download/weights/yolov5m_face.onnx',
    # YOLOv8-Face
    YOLOv8FaceWeights.YOLOV8_LITE_S: 'https://github.com/yakhyo/yolov8-face-onnx-inference/releases/download/weights/yolov8-lite-s.onnx',
    YOLOv8FaceWeights.YOLOV8N: 'https://github.com/yakhyo/yolov8-face-onnx-inference/releases/download/weights/yolov8n-face.onnx',
    # DDAFM
    DDAMFNWeights.AFFECNET7: 'https://github.com/yakhyo/uniface/releases/download/weights/affecnet7.script',
    DDAMFNWeights.AFFECNET8: 'https://github.com/yakhyo/uniface/releases/download/weights/affecnet8.script',
@@ -209,6 +238,9 @@ MODEL_SHA256: dict[Enum, str] = {
    # ArcFace
    ArcFaceWeights.MNET: '9cc6e4a75f0e2bf0b1aed94578f144d15175f357bdc05e815e5c4a02b319eb4f',
    ArcFaceWeights.RESNET: '4c06341c33c2ca1f86781dab0e829f88ad5b64be9fba56e56bc9ebdefc619e43',
    # AdaFace
    AdaFaceWeights.IR_18: '6b6a35772fb636cdd4fa86520c1a259d0c41472a76f70f802b351837a00d9870',
    AdaFaceWeights.IR_101: 'f2eb07d03de0af560a82e1214df799fec5e09375d43521e2868f9dc387e5a43e',
    # SCRFD
    SCRFDWeights.SCRFD_10G_KPS: '5838f7fe053675b1c7a08b633df49e7af5495cee0493c7dcf6697200b85b5b91',
    SCRFDWeights.SCRFD_500M_KPS: '5e4447f50245bbd7966bd6c0fa52938c61474a04ec7def48753668a9d8b4ea3a',
@@ -216,6 +248,9 @@ MODEL_SHA256: dict[Enum, str] = {
    YOLOv5FaceWeights.YOLOV5N: 'eb244a06e36999db732b317c2b30fa113cd6cfc1a397eaf738f2d6f33c01f640',
    YOLOv5FaceWeights.YOLOV5S: 'fc682801cd5880e1e296184a14aea0035486b5146ec1a1389d2e7149cb134bb2',
    YOLOv5FaceWeights.YOLOV5M: '04302ce27a15bde3e20945691b688e2dd018a10e92dd8932146bede6a49207b2',
    # YOLOv8-Face
    YOLOv8FaceWeights.YOLOV8_LITE_S: '11bc496be01356d2d960085bfd8abb8f103199900a034f239a8a1705a1b31dba',
    YOLOv8FaceWeights.YOLOV8N: '33f3951af7fc0c4d9b321b29cdcd8c9a59d0a29a8d4bdc01fcb5507d5c714809',
    # DDAFM
    DDAMFNWeights.AFFECNET7: '10535bf8b6afe8e9d6ae26cea6c3add9a93036e9addb6adebfd4a972171d015d',
    DDAMFNWeights.AFFECNET8: '8c66963bc71db42796a14dfcbfcd181b268b65a3fc16e87147d6a3a3d7e0f487',

@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo

@@ -14,6 +14,7 @@ from .base import BaseDetector
from .retinaface import RetinaFace
from .scrfd import SCRFD
from .yolov5 import YOLOv5Face
from .yolov8 import YOLOv8Face

# Global cache for detector instances (keyed by method name + config hash)
_detector_cache: dict[str, BaseDetector] = {}
@@ -27,7 +28,7 @@ def detect_faces(image: np.ndarray, method: str = 'retinaface', **kwargs: Any) -

    Args:
        image: Input image as numpy array with shape (H, W, C) in BGR format.
        method: Detection method to use. Options: 'retinaface', 'scrfd', 'yolov5face'.
        method: Detection method to use. Options: 'retinaface', 'scrfd', 'yolov5face', 'yolov8face'.
        **kwargs: Additional arguments passed to the detector.

    Returns:
@@ -66,6 +67,7 @@ def create_detector(method: str = 'retinaface', **kwargs: Any) -> BaseDetector:
            - 'retinaface': RetinaFace detector (default)
            - 'scrfd': SCRFD detector (fast and accurate)
            - 'yolov5face': YOLOv5-Face detector (accurate with landmarks)
            - 'yolov8face': YOLOv8-Face detector (anchor-free, accurate)
        **kwargs: Detector-specific parameters.

    Returns:
@@ -84,11 +86,9 @@ def create_detector(method: str = 'retinaface', **kwargs: Any) -> BaseDetector:
        ...     'scrfd', model_name=SCRFDWeights.SCRFD_10G_KPS, confidence_threshold=0.8, input_size=(640, 640)
        ... )

        >>> # RetinaFace detector
        >>> from uniface.constants import RetinaFaceWeights
        >>> detector = create_detector(
        ...     'retinaface', model_name=RetinaFaceWeights.MNET_V2, confidence_threshold=0.8, nms_threshold=0.4
        ... )
        >>> # YOLOv8-Face detector
        >>> from uniface.constants import YOLOv8FaceWeights
        >>> detector = create_detector('yolov8face', model_name=YOLOv8FaceWeights.YOLOV8N, confidence_threshold=0.5)
    """
    method = method.lower()

@@ -101,8 +101,11 @@ def create_detector(method: str = 'retinaface', **kwargs: Any) -> BaseDetector:
    elif method == 'yolov5face':
        return YOLOv5Face(**kwargs)

    elif method == 'yolov8face':
        return YOLOv8Face(**kwargs)

    else:
        available_methods = ['retinaface', 'scrfd', 'yolov5face']
        available_methods = ['retinaface', 'scrfd', 'yolov5face', 'yolov8face']
        raise ValueError(f"Unsupported detection method: '{method}'. Available methods: {available_methods}")


@@ -147,6 +150,17 @@ def list_available_detectors() -> dict[str, dict[str, Any]]:
            'input_size': 640,
        },
    },
    'yolov8face': {
        'description': 'YOLOv8-Face detector - anchor-free design with DFL for accurate detection',
        'supports_landmarks': True,
        'paper': 'https://github.com/derronqi/yolov8-face',
        'default_params': {
            'model_name': 'yolov8n_face',
            'confidence_threshold': 0.5,
            'nms_threshold': 0.45,
            'input_size': 640,
        },
    },
}


@@ -155,6 +169,7 @@ __all__ = [
    'BaseDetector',
    'RetinaFace',
    'YOLOv5Face',
    'YOLOv8Face',
    'create_detector',
    'detect_faces',
    'list_available_detectors',

@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo


@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo

@@ -16,6 +16,15 @@ from uniface.types import Face

from .base import BaseDetector

# Optional torchvision import for faster NMS
try:
    import torch
    import torchvision

    TORCHVISION_AVAILABLE = True
except ImportError:
    TORCHVISION_AVAILABLE = False

__all__ = ['YOLOv5Face']


@@ -34,6 +43,8 @@ class YOLOv5Face(BaseDetector):
        nms_threshold (float): Non-Maximum Suppression threshold. Defaults to 0.5.
        input_size (int): Input image size. Defaults to 640.
            Note: ONNX model is fixed at 640. Changing this will cause inference errors.
        nms_mode (str): NMS calculation method. Options: 'torchvision' (faster, requires torch)
            or 'numpy' (no dependencies). Defaults to 'torchvision' if available.
        **kwargs: Advanced options:
            max_det (int): Maximum number of detections to return. Defaults to 750.

@@ -42,6 +53,7 @@ class YOLOv5Face(BaseDetector):
        confidence_threshold (float): Threshold used to filter low-confidence detections.
        nms_threshold (float): Threshold used during NMS to suppress overlapping boxes.
        input_size (int): Image size to which inputs are resized before inference.
        nms_mode (str): NMS calculation method being used.
        max_det (int): Maximum number of detections to return.
        _model_path (str): Absolute path to the downloaded/verified model weights.

@@ -57,6 +69,7 @@ class YOLOv5Face(BaseDetector):
        confidence_threshold: float = 0.6,
        nms_threshold: float = 0.5,
        input_size: int = 640,
        nms_mode: Literal['torchvision', 'numpy'] = 'numpy',
        **kwargs: Any,
    ) -> None:
        super().__init__(
@@ -64,6 +77,7 @@ class YOLOv5Face(BaseDetector):
            confidence_threshold=confidence_threshold,
            nms_threshold=nms_threshold,
            input_size=input_size,
            nms_mode=nms_mode,
            **kwargs,
        )
        self._supports_landmarks = True  # YOLOv5-Face supports landmarks
@@ -79,12 +93,19 @@ class YOLOv5Face(BaseDetector):
        self.nms_threshold = nms_threshold
        self.input_size = input_size

        # Set NMS mode with automatic fallback
        if nms_mode == 'torchvision' and not TORCHVISION_AVAILABLE:
            Logger.warning('torchvision not available, falling back to numpy NMS')
            self.nms_mode = 'numpy'
        else:
            self.nms_mode = nms_mode

        # Advanced options from kwargs
        self.max_det = kwargs.get('max_det', 750)

        Logger.info(
            f'Initializing YOLOv5Face with model={self.model_name}, confidence_threshold={self.confidence_threshold}, '
            f'nms_threshold={self.nms_threshold}, input_size={self.input_size}'
            f'nms_threshold={self.nms_threshold}, input_size={self.input_size}, nms_mode={self.nms_mode}'
        )

        # Get path to model weights
@@ -205,9 +226,16 @@ class YOLOv5Face(BaseDetector):
        # Get landmarks (5 points, 10 coordinates)
        landmarks = predictions[:, 5:15].copy()

        # Apply NMS
        detections_for_nms = np.hstack((boxes, scores[:, None])).astype(np.float32, copy=False)
        keep = non_max_suppression(detections_for_nms, self.nms_threshold)
        # Apply NMS based on selected mode
        if self.nms_mode == 'torchvision':
            keep = torchvision.ops.nms(
                torch.tensor(boxes, dtype=torch.float32),
                torch.tensor(scores, dtype=torch.float32),
                self.nms_threshold,
            ).numpy()
        else:
            detections = np.hstack((boxes, scores[:, None])).astype(np.float32, copy=False)
            keep = non_max_suppression(detections, self.nms_threshold)

        if len(keep) == 0:
            return np.array([]), np.array([])

419 uniface/detection/yolov8.py Normal file
@@ -0,0 +1,419 @@
|
||||
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
"""
|
||||
YOLOv8-Face detector implementation.
|
||||
|
||||
Uses anchor-free design with DFL (Distribution Focal Loss) for bbox regression.
|
||||
Reference: https://github.com/yakhyo/yolov8-face-onnx-inference
|
||||
"""
|
||||
|
||||
from typing import Any, Literal
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
from uniface.common import non_max_suppression
|
||||
from uniface.constants import YOLOv8FaceWeights
|
||||
from uniface.log import Logger
|
||||
from uniface.model_store import verify_model_weights
|
||||
from uniface.onnx_utils import create_onnx_session
|
||||
from uniface.types import Face
|
||||
|
||||
from .base import BaseDetector
|
||||
|
||||
# Optional torchvision import for faster NMS
|
||||
try:
|
||||
import torch
|
||||
import torchvision
|
||||
|
||||
TORCHVISION_AVAILABLE = True
|
||||
except ImportError:
|
||||
TORCHVISION_AVAILABLE = False
|
||||
|
||||
__all__ = ['YOLOv8Face']
|
||||
|
||||
|
||||
class YOLOv8Face(BaseDetector):
    """
    Face detector based on the YOLOv8-Face architecture.

    Uses anchor-free design with DFL (Distribution Focal Loss) for bbox regression.
    Outputs 3 feature maps at different scales for multi-scale detection.

    Reference: https://github.com/yakhyo/yolov8-face-onnx-inference

    Args:
        model_name (YOLOv8FaceWeights): Predefined model enum (e.g., `YOLOV8N`).
            Specifies the YOLOv8-Face variant to load. Defaults to YOLOV8N.
        confidence_threshold (float): Confidence threshold for filtering detections. Defaults to 0.5.
        nms_threshold (float): Non-Maximum Suppression threshold. Defaults to 0.45.
        input_size (int): Input image size. Defaults to 640.
            Note: ONNX model is fixed at 640. Changing this will cause inference errors.
        nms_mode (str): NMS calculation method. Options: 'torchvision' (faster, requires torch)
            or 'numpy' (no extra dependencies). Defaults to 'numpy'. If 'torchvision' is
            requested but torchvision is not installed, the detector falls back to 'numpy'.
        **kwargs: Advanced options:
            max_det (int): Maximum number of detections to return. Defaults to 750.

    Attributes:
        model_name (YOLOv8FaceWeights): Selected model variant.
        confidence_threshold (float): Threshold used to filter low-confidence detections.
        nms_threshold (float): Threshold used during NMS to suppress overlapping boxes.
        input_size (int): Image size to which inputs are resized before inference.
        nms_mode (str): NMS calculation method being used.
        max_det (int): Maximum number of detections to return.
        _model_path (str): Absolute path to the downloaded/verified model weights.

    Raises:
        ValueError: If the model weights are invalid or not found.
        RuntimeError: If the ONNX model fails to load or initialize.
    """

    def __init__(
        self,
        *,
        model_name: YOLOv8FaceWeights = YOLOv8FaceWeights.YOLOV8N,
        confidence_threshold: float = 0.5,
        nms_threshold: float = 0.45,
        input_size: int = 640,
        nms_mode: Literal['torchvision', 'numpy'] = 'numpy',
        **kwargs: Any,
    ) -> None:
        super().__init__(
            model_name=model_name,
            confidence_threshold=confidence_threshold,
            nms_threshold=nms_threshold,
            input_size=input_size,
            nms_mode=nms_mode,
            **kwargs,
        )
        self._supports_landmarks = True  # YOLOv8-Face supports landmarks

        # Validate input size
        if input_size != 640:
            raise ValueError(
                f'YOLOv8Face only supports input_size=640 (got {input_size}). The ONNX model has a fixed input shape.'
            )

        self.model_name = model_name
        self.confidence_threshold = confidence_threshold
        self.nms_threshold = nms_threshold
        self.input_size = input_size

        # Set NMS mode with automatic fallback
        if nms_mode == 'torchvision' and not TORCHVISION_AVAILABLE:
            Logger.warning('torchvision not available, falling back to numpy NMS')
            self.nms_mode = 'numpy'
        else:
            self.nms_mode = nms_mode

        # Advanced options from kwargs
        self.max_det = kwargs.get('max_det', 750)

        # YOLOv8 strides for 640x640 input (3 feature maps: 80x80, 40x40, 20x20)
        self.strides = [8, 16, 32]

        Logger.info(
            f'Initializing YOLOv8Face with model={self.model_name}, confidence_threshold={self.confidence_threshold}, '
            f'nms_threshold={self.nms_threshold}, input_size={self.input_size}, nms_mode={self.nms_mode}'
        )

        # Get path to model weights
        self._model_path = verify_model_weights(self.model_name)
        Logger.info(f'Verified model weights located at: {self._model_path}')

        # Initialize model
        self._initialize_model(self._model_path)

    def _initialize_model(self, model_path: str) -> None:
        """
        Initializes an ONNX model session from the given path.

        Args:
            model_path (str): The file path to the ONNX model.

        Raises:
            RuntimeError: If the model fails to load, logs an error and raises an exception.
        """
        try:
            self.session = create_onnx_session(model_path)
            self.input_names = self.session.get_inputs()[0].name
            self.output_names = [x.name for x in self.session.get_outputs()]
            Logger.info(f'Successfully initialized the model from {model_path}')
        except Exception as e:
            Logger.error(f"Failed to load model from '{model_path}': {e}", exc_info=True)
            raise RuntimeError(f"Failed to initialize model session for '{model_path}'") from e

    def preprocess(self, image: np.ndarray) -> tuple[np.ndarray, float, tuple[int, int]]:
        """
        Preprocess image for inference (letterbox resize with center padding).

        Args:
            image (np.ndarray): Input image (BGR format)

        Returns:
            Tuple[np.ndarray, float, Tuple[int, int]]: Preprocessed image, scale ratio, and padding (pad_w, pad_h)
        """
        # Get original image shape
        img_h, img_w = image.shape[:2]

        # Calculate scale ratio
        scale = min(self.input_size / img_h, self.input_size / img_w)
        new_h, new_w = int(img_h * scale), int(img_w * scale)

        # Resize image
        img_resized = cv2.resize(image, (new_w, new_h), interpolation=cv2.INTER_LINEAR)

        # Create padded image with gray background (114, 114, 114)
        img_padded = np.full((self.input_size, self.input_size, 3), 114, dtype=np.uint8)

        # Calculate padding (center the image)
        pad_h = (self.input_size - new_h) // 2
        pad_w = (self.input_size - new_w) // 2
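        # Illustrative walk-through (worked numbers, not from the original file):
        # for a 1280x720 frame, scale = min(640/720, 640/1280) = 0.5, so
        # new_h, new_w = 360, 640 and pad_h, pad_w = 140, 0, i.e. a 140 px
        # gray band above and below. postprocess() later inverts this mapping:
        # x_orig = (x_model - pad_w) / scale.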

        # Place resized image in center
        img_padded[pad_h : pad_h + new_h, pad_w : pad_w + new_w] = img_resized

        # Convert BGR to RGB and normalize
        img_rgb = cv2.cvtColor(img_padded, cv2.COLOR_BGR2RGB)
        img_normalized = img_rgb.astype(np.float32) / 255.0

        # Transpose to CHW format (HWC -> CHW) and add batch dimension
        img_transposed = np.transpose(img_normalized, (2, 0, 1))
        img_batch = np.expand_dims(img_transposed, axis=0)
        img_batch = np.ascontiguousarray(img_batch)

        return img_batch, scale, (pad_w, pad_h)

    def inference(self, input_tensor: np.ndarray) -> list[np.ndarray]:
        """Perform model inference on the preprocessed image tensor.

        Args:
            input_tensor (np.ndarray): Preprocessed input tensor.

        Returns:
            List[np.ndarray]: Raw model outputs (3 feature maps).
        """
        return self.session.run(self.output_names, {self.input_names: input_tensor})

    @staticmethod
    def _softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
        """Compute softmax values for array x along specified axis."""
        exp_x = np.exp(x - np.max(x, axis=axis, keepdims=True))
        return exp_x / np.sum(exp_x, axis=axis, keepdims=True)

    def postprocess(
        self,
        predictions: list[np.ndarray],
        scale: float,
        padding: tuple[int, int],
        original_shape: tuple[int, int],
    ) -> tuple[np.ndarray, np.ndarray]:
        """
        Postprocess model predictions with DFL decoding and coordinate scaling.

        Args:
            predictions (List[np.ndarray]): Raw model outputs (3 feature maps)
            scale (float): Scale ratio used in preprocessing
            padding (Tuple[int, int]): Padding (pad_w, pad_h) used in preprocessing
            original_shape (Tuple[int, int]): Original image shape (height, width)

        Returns:
            Tuple[np.ndarray, np.ndarray]: Filtered detections and landmarks
                - detections: [N, 5] as [x1, y1, x2, y2, conf]
                - landmarks: [N, 5, 2] for each detection
        """
        # YOLOv8-Face outputs 3 feature maps with Pose head
        # Each output: (1, 80, H, W) where 80 = 64 (bbox DFL) + 1 (class) + 15 (5 keypoints * 3)

        boxes_list = []
        scores_list = []
        landmarks_list = []

        for pred, stride in zip(predictions, self.strides, strict=False):
            # pred shape: (1, 80, H, W)
            batch_size, channels, height, width = pred.shape

            # Reshape: (1, 80, H, W) -> (1, 80, H*W) -> (1, H*W, 80) -> (H*W, 80)
            pred = pred.reshape(batch_size, channels, -1).transpose(0, 2, 1)[0]

            # Create grid with 0.5 offset (matching PyTorch's make_anchors)
            grid_y, grid_x = np.meshgrid(np.arange(height) + 0.5, np.arange(width) + 0.5, indexing='ij')
            grid_x = grid_x.flatten()
            grid_y = grid_y.flatten()

            # Extract components
            bbox_pred = pred[:, :64]  # DFL bbox prediction (64 channels = 4 * 16)
            cls_conf = pred[:, 64]  # Class confidence (1 channel)
            kpt_pred = pred[:, 65:]  # Keypoints (15 channels = 5 points * 3: x, y, visibility)

            # Decode bounding boxes from DFL
            bbox_pred = bbox_pred.reshape(-1, 4, 16)
            bbox_dist = self._softmax(bbox_pred, axis=-1) @ np.arange(16)
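            # Illustrative note (worked example, not from the original source):
            # each side's 16 logits form a softmax distribution over integer
            # offsets 0..15 measured in grid cells; the dot product with
            # arange(16) takes its expectation, so a distribution peaked at
            # bin 3 decodes to roughly 3 cells, i.e. about 3 * stride pixels
            # after the scaling below.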

            # Convert distances to xyxy format
            x1 = (grid_x - bbox_dist[:, 0]) * stride
            y1 = (grid_y - bbox_dist[:, 1]) * stride
            x2 = (grid_x + bbox_dist[:, 2]) * stride
            y2 = (grid_y + bbox_dist[:, 3]) * stride
            boxes = np.stack([x1, y1, x2, y2], axis=-1)

            # Decode keypoints: kpt = (kpt * 2.0 + grid) * stride
            kpt_grid_y, kpt_grid_x = np.meshgrid(np.arange(height), np.arange(width), indexing='ij')
            kpt_grid_x = kpt_grid_x.flatten()
            kpt_grid_y = kpt_grid_y.flatten()

            kpt_pred = kpt_pred.reshape(-1, 5, 3)  # 5 points * (x, y, visibility)
            kpt_x = (kpt_pred[:, :, 0] * 2.0 + kpt_grid_x[:, None]) * stride
            kpt_y = (kpt_pred[:, :, 1] * 2.0 + kpt_grid_y[:, None]) * stride
            # Ignore visibility (kpt_pred[:, :, 2]) for uniface compatibility
            landmarks = np.stack([kpt_x, kpt_y], axis=-1).reshape(-1, 10)

            # Apply sigmoid to class confidence
            scores = 1 / (1 + np.exp(-cls_conf))

            boxes_list.append(boxes)
            scores_list.append(scores)
            landmarks_list.append(landmarks)

        # Concatenate all predictions from all feature maps
        boxes = np.concatenate(boxes_list, axis=0)
        scores = np.concatenate(scores_list, axis=0)
        landmarks = np.concatenate(landmarks_list, axis=0)

        # Filter by confidence threshold
        mask = scores >= self.confidence_threshold
        boxes = boxes[mask]
        scores = scores[mask]
        landmarks = landmarks[mask]

        if len(boxes) == 0:
            return np.array([]), np.array([])

        # Apply NMS based on selected mode
        if self.nms_mode == 'torchvision':
            keep = torchvision.ops.nms(
                torch.tensor(boxes, dtype=torch.float32),
                torch.tensor(scores, dtype=torch.float32),
                self.nms_threshold,
            ).numpy()
        else:
            detections_for_nms = np.hstack((boxes, scores[:, None])).astype(np.float32, copy=False)
            keep = non_max_suppression(detections_for_nms, self.nms_threshold)

        if len(keep) == 0:
            return np.array([]), np.array([])

        # Limit to max_det
        keep = keep[: self.max_det]
        boxes = boxes[keep]
        scores = scores[keep]
        landmarks = landmarks[keep]

        # === SCALE TO ORIGINAL IMAGE COORDINATES ===
        pad_w, pad_h = padding

        # Scale boxes back to original image coordinates
        boxes[:, [0, 2]] = (boxes[:, [0, 2]] - pad_w) / scale
        boxes[:, [1, 3]] = (boxes[:, [1, 3]] - pad_h) / scale

        # Clip boxes to image boundaries
        boxes[:, [0, 2]] = np.clip(boxes[:, [0, 2]], 0, original_shape[1])
        boxes[:, [1, 3]] = np.clip(boxes[:, [1, 3]], 0, original_shape[0])

        # Scale landmarks back to original image coordinates
        landmarks[:, 0::2] = (landmarks[:, 0::2] - pad_w) / scale  # x coordinates
        landmarks[:, 1::2] = (landmarks[:, 1::2] - pad_h) / scale  # y coordinates

        # Reshape landmarks to (N, 5, 2)
        landmarks = landmarks.reshape(-1, 5, 2)

        # Combine box and score
        detections = np.concatenate([boxes, scores[:, None]], axis=1)

        return detections, landmarks

    def detect(
        self,
        image: np.ndarray,
        *,
        max_num: int = 0,
        metric: Literal['default', 'max'] = 'max',
        center_weight: float = 2.0,
    ) -> list[Face]:
        """
        Perform face detection on an input image and return bounding boxes and facial landmarks.

        Args:
            image (np.ndarray): Input image as a NumPy array of shape (H, W, C) in BGR format.
            max_num (int): Maximum number of detections to return. Use 0 to return all detections. Defaults to 0.
            metric (Literal["default", "max"]): Metric for ranking detections when `max_num` is limited.
                - "default": Prioritize detections closer to the image center.
                - "max": Prioritize detections with larger bounding box areas.
            center_weight (float): Weight for penalizing detections farther from the image center
                when using the "default" metric. Defaults to 2.0.

        Returns:
            List[Face]: List of Face objects, each containing:
                - bbox (np.ndarray): Bounding box coordinates with shape (4,) as [x1, y1, x2, y2]
                - confidence (float): Detection confidence score (0.0 to 1.0)
                - landmarks (np.ndarray): 5-point facial landmarks with shape (5, 2)

        Example:
            >>> faces = detector.detect(image)
            >>> for face in faces:
            ...     bbox = face.bbox  # np.ndarray with shape (4,)
            ...     confidence = face.confidence  # float
            ...     landmarks = face.landmarks  # np.ndarray with shape (5, 2)
        """
        original_height, original_width = image.shape[:2]

        # Preprocess
        image_tensor, scale, padding = self.preprocess(image)

        # ONNXRuntime inference
        outputs = self.inference(image_tensor)

        # Postprocess with original image shape for clipping
        detections, landmarks = self.postprocess(outputs, scale, padding, (original_height, original_width))

        # Handle case when no faces are detected
        if len(detections) == 0:
            return []

        if 0 < max_num < detections.shape[0]:
            # Calculate area of detections
            area = (detections[:, 2] - detections[:, 0]) * (detections[:, 3] - detections[:, 1])

            # Calculate offsets from image center
            center = (original_height // 2, original_width // 2)
            offsets = np.vstack(
                [
                    (detections[:, 0] + detections[:, 2]) / 2 - center[1],
                    (detections[:, 1] + detections[:, 3]) / 2 - center[0],
                ]
            )

            # Calculate scores based on the chosen metric
            offset_dist_squared = np.sum(np.power(offsets, 2.0), axis=0)
            if metric == 'max':
                values = area
            else:
                values = area - offset_dist_squared * center_weight
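            # Illustrative note (worked numbers, not from the original source):
            # with center_weight=2.0, a face whose center lies 100 px from the
            # image center is penalized by 100**2 * 2.0 = 20000 area units, so
            # 'default' trades box area against centrality, while 'max' ranks
            # purely by area.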

            # Sort by scores and select top `max_num`
            sorted_indices = np.argsort(values)[::-1][:max_num]
            detections = detections[sorted_indices]
            landmarks = landmarks[sorted_indices]

        faces = []
        for i in range(detections.shape[0]):
            face = Face(
                bbox=detections[i, :4],
                confidence=float(detections[i, 4]),
                landmarks=landmarks[i],
            )
            faces.append(face)

        return faces
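A minimal usage sketch for the detector added above. This is illustrative, not part of the diff: it assumes YOLOv8Face and YOLOv8FaceWeights are importable from the paths shown in this file, and that a local test.jpg exists.

import cv2

from uniface.constants import YOLOv8FaceWeights
from uniface.detection.yolov8 import YOLOv8Face

detector = YOLOv8Face(model_name=YOLOv8FaceWeights.YOLOV8N, confidence_threshold=0.5)
image = cv2.imread('test.jpg')  # detect() expects BGR, as returned by cv2.imread
faces = detector.detect(image, max_num=5, metric='max')
for face in faces:
    x1, y1, x2, y2 = face.bbox.astype(int)
    print(f'conf={face.confidence:.2f} box=({x1}, {y1}, {x2}, {y2})')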
@@ -1,4 +1,4 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
Some files were not shown because too many files have changed in this diff