8 Commits

Author SHA1 Message Date
Yakhyokhuja Valikhujaev
3682a2124f release: Release UniFace version v3.1.0 (#91)
* release: Release UniFace version v3.1.0

* docs: Change classifiers to stable from beta
2026-03-11 12:21:33 +09:00
Yakhyokhuja Valikhujaev
2ef6a1ebe8 refactor: Use dataclass-based model info in model management (#90)
- Refactor model management section: Using data classes for more robust model management.
2026-03-11 12:05:43 +09:00
Yakhyokhuja Valikhujaev
78a2dba7c7 feat: Add FAISS vectore database for fast face search (#88) 2026-03-05 22:46:03 +09:00
Yakhyokhuja Valikhujaev
87e496d1f5 feat: Add FAISS vector DB support for fast search (#86)
* feat: Add FAISS: VectorDB for face embedding search

* docs: Update Documentation
2026-03-03 12:12:05 +09:00
Yakhyokhuja Valikhujaev
5604ebf4f1 docs: Add datasets information in the docs (#85) 2026-02-18 16:02:37 +09:00
Yakhyokhuja Valikhujaev
971775b2e8 feat: Update API format and gaze estimation models (#82)
* docs: Update documentation

* fix: Update several missing docs and tests

* docs: Clean up and remove redundants

* fix: Fix the gaze output formula and change the output order

* chore: Update model weights for gaze estimation

* release: Update release version to v3.0.0
2026-02-14 23:54:51 +09:00
Yakhyokhuja Valikhujaev
c520ea2df2 faet: Add ByteTrack - Multi-Object Tracking by Associating Every Detection Box (#81)
* feat: Add BYTETrack for face/person tracking

* docs: Update documentation

* ref: Update tools folder file naming and imports

* docs: Update jupyter notebook examples

* ref: Rename the file and remove duplicate codes

* docs: Update README.md

* chore: Update description in mkdocs, add keywords for face tracking

* docs: Add announcement section

* feat: Remove expand bbox for tracking and update docs
2026-02-12 00:20:23 +09:00
Yakhyokhuja Valikhujaev
2a8cb54d31 feat: Add get and set for cache dir (#80) 2026-02-09 23:32:02 +09:00
112 changed files with 5286 additions and 2590 deletions

Binary file not shown.

Before

Width:  |  Height:  |  Size: 716 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 673 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 826 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 563 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 33 KiB

View File

Before

Width:  |  Height:  |  Size: 427 KiB

After

Width:  |  Height:  |  Size: 427 KiB

View File

Before

Width:  |  Height:  |  Size: 1.7 MiB

After

Width:  |  Height:  |  Size: 1.7 MiB

View File

Before

Width:  |  Height:  |  Size: 1.8 MiB

After

Width:  |  Height:  |  Size: 1.8 MiB

View File

Before

Width:  |  Height:  |  Size: 1.9 MiB

After

Width:  |  Height:  |  Size: 1.9 MiB

View File

Before

Width:  |  Height:  |  Size: 872 KiB

After

Width:  |  Height:  |  Size: 872 KiB

View File

Before

Width:  |  Height:  |  Size: 62 KiB

After

Width:  |  Height:  |  Size: 62 KiB

View File

@@ -82,23 +82,23 @@ def process(items: List[str], config: Optional[Dict[str, int]] = None) -> Tuple[
Use [Google-style docstrings](https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings) for all public APIs:
```python
def detect_faces(image: np.ndarray, threshold: float = 0.5) -> list[Face]:
"""Detect faces in an image.
def create_detector(method: str = 'retinaface', **kwargs: Any) -> BaseDetector:
"""Factory function to create face detectors.
Args:
image: Input image as a numpy array with shape (H, W, C) in BGR format.
threshold: Confidence threshold for filtering detections. Defaults to 0.5.
method: Detection method. Options: 'retinaface', 'scrfd', 'yolov5face', 'yolov8face'.
**kwargs: Detector-specific parameters.
Returns:
List of Face objects containing bounding boxes, confidence scores,
and facial landmarks.
Initialized detector instance.
Raises:
ValueError: If the input image has invalid dimensions.
ValueError: If method is not supported.
Example:
>>> from uniface import detect_faces
>>> faces = detect_faces(image, threshold=0.8)
>>> from uniface import create_detector
>>> detector = create_detector('retinaface', confidence_threshold=0.8)
>>> faces = detector.detect(image)
>>> print(f"Found {len(faces)} faces")
"""
```
@@ -174,16 +174,16 @@ When adding a new model or feature:
Example notebooks demonstrating library usage:
| Example | Notebook |
|---------|----------|
| Face Detection | [01_face_detection.ipynb](examples/01_face_detection.ipynb) |
| Face Alignment | [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) |
| Face Verification | [03_face_verification.ipynb](examples/03_face_verification.ipynb) |
| Face Search | [04_face_search.ipynb](examples/04_face_search.ipynb) |
| Face Analyzer | [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) |
| Face Parsing | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) |
| Example | Notebook |
| ------------------ | ------------------------------------------------------------------- |
| Face Detection | [01_face_detection.ipynb](examples/01_face_detection.ipynb) |
| Face Alignment | [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) |
| Face Verification | [03_face_verification.ipynb](examples/03_face_verification.ipynb) |
| Face Search | [04_face_search.ipynb](examples/04_face_search.ipynb) |
| Face Analyzer | [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) |
| Face Parsing | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) |
| Face Anonymization | [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) |
| Gaze Estimation | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) |
| Gaze Estimation | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) |
## Questions?

View File

@@ -9,11 +9,12 @@
[![PyPI Downloads](https://static.pepy.tech/personalized-badge/uniface?period=total&units=INTERNATIONAL_SYSTEM&left_color=GRAY&right_color=BLUE&left_text=Downloads)](https://pepy.tech/projects/uniface)
[![UniFace Documentation](https://img.shields.io/badge/Docs-UniFace-blue.svg)](https://yakhyo.github.io/uniface/)
[![Kaggle Badge](https://img.shields.io/badge/Notebooks-Kaggle?label=Kaggle&color=blue)](https://www.kaggle.com/yakhyokhuja/code)
[![Discord](https://img.shields.io/badge/Discord-Join%20Server-5865F2?logo=discord&logoColor=white)](https://discord.gg/wdzrjr7R5j)
</div>
<div align="center">
<img src="https://raw.githubusercontent.com/yakhyo/uniface/main/.github/logos/new/uniface_rounded_q80.webp" width="90%" alt="UniFace - All-in-One Open-Source Face Analysis Library">
<img src="https://raw.githubusercontent.com/yakhyo/uniface/main/.github/logos/uniface_rounded_q80.webp" width="90%" alt="UniFace - All-in-One Open-Source Face Analysis Library">
</div>
---
@@ -26,10 +27,12 @@
- **Face Detection** — RetinaFace, SCRFD, YOLOv5-Face, and YOLOv8-Face with 5-point landmarks
- **Face Recognition** — ArcFace, MobileFace, and SphereFace embeddings
- **Face Tracking** — Multi-object tracking with [BYTETracker](https://github.com/yakhyo/bytetrack-tracker) for persistent IDs across video frames
- **Facial Landmarks** — 106-point landmark localization module (separate from 5-point detector landmarks)
- **Face Parsing** — BiSeNet semantic segmentation (19 classes), XSeg face masking
- **Gaze Estimation** — Real-time gaze direction with MobileGaze
- **Attribute Analysis** — Age, gender, race (FairFace), and emotion
- **Vector Indexing** — FAISS-backed embedding store for fast multi-identity search
- **Anti-Spoofing** — Face liveness detection with MiniFASNet
- **Face Anonymization** — 5 blur methods for privacy protection
- **Hardware Acceleration** — ARM64 (Apple Silicon), CUDA (NVIDIA), CPU
@@ -57,6 +60,12 @@ git clone https://github.com/yakhyo/uniface.git
cd uniface && pip install -e .
```
**FAISS vector indexing**
```bash
pip install faiss-cpu # or faiss-gpu for CUDA
```
**Optional dependencies**
- Emotion model uses TorchScript and requires `torch`:
`pip install torch` (choose the correct build for your OS/CUDA)
@@ -71,7 +80,18 @@ Models are downloaded automatically on first use and verified via SHA-256.
Default cache location: `~/.uniface/models`
You can override it with `UNIFACE_CACHE_DIR=/your/cache/path`
Override with the programmatic API or environment variable:
```python
from uniface.model_store import get_cache_dir, set_cache_dir
set_cache_dir('/data/models')
print(get_cache_dir()) # /data/models
```
```bash
export UNIFACE_CACHE_DIR=/data/models
```
---
@@ -79,7 +99,7 @@ You can override it with `UNIFACE_CACHE_DIR=/your/cache/path`
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
detector = RetinaFace()
@@ -106,7 +126,9 @@ for face in faces:
```python
import cv2
from uniface import RetinaFace, ArcFace, FaceAnalyzer
from uniface.analyzer import FaceAnalyzer
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
detector = RetinaFace()
recognizer = ArcFace()
@@ -128,7 +150,7 @@ for face in faces:
## Execution Providers (ONNX Runtime)
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
# Force CPU-only inference
detector = RetinaFace(providers=["CPUExecutionProvider"])
@@ -150,6 +172,23 @@ Full documentation: https://yakhyo.github.io/uniface/
| [API Reference](https://yakhyo.github.io/uniface/modules/detection/) | Detailed module documentation |
| [Tutorials](https://yakhyo.github.io/uniface/recipes/image-pipeline/) | Step-by-step workflow examples |
| [Guides](https://yakhyo.github.io/uniface/concepts/overview/) | Architecture and design principles |
| [Datasets](https://yakhyo.github.io/uniface/datasets/) | Training data and evaluation benchmarks |
---
## Datasets
| Task | Training Dataset | Models |
|------|-----------------|--------|
| Detection | WIDER FACE | RetinaFace, SCRFD, YOLOv5-Face, YOLOv8-Face |
| Recognition | MS1MV2 | MobileFace, SphereFace |
| Recognition | WebFace600K | ArcFace |
| Recognition | WebFace4M / 12M | AdaFace |
| Gaze | Gaze360 | MobileGaze |
| Parsing | CelebAMask-HQ | BiSeNet |
| Attributes | CelebA, FairFace, AffectNet | AgeGender, FairFace, Emotion |
> See [Datasets documentation](https://yakhyo.github.io/uniface/datasets/) for download links, benchmarks, and details.
---
@@ -166,6 +205,7 @@ Full documentation: https://yakhyo.github.io/uniface/
| [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | Privacy-preserving blur |
| [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | Gaze direction estimation |
| [09_face_segmentation.ipynb](examples/09_face_segmentation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/09_face_segmentation.ipynb) | Face segmentation with XSeg |
| [10_face_vector_store.ipynb](examples/10_face_vector_store.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/10_face_vector_store.ipynb) | FAISS-backed face database |
---
@@ -189,6 +229,7 @@ If you plan commercial use, verify model license compatibility.
| Detection | [retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) | ✓ | RetinaFace PyTorch Training & Export |
| Detection | [yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference) | - | YOLOv5-Face ONNX Inference |
| Detection | [yolov8-face-onnx-inference](https://github.com/yakhyo/yolov8-face-onnx-inference) | - | YOLOv8-Face ONNX Inference |
| Tracking | [bytetrack-tracker](https://github.com/yakhyo/bytetrack-tracker) | - | BYTETracker Multi-Object Tracking |
| Recognition | [face-recognition](https://github.com/yakhyo/face-recognition) | ✓ | MobileFace, SphereFace Training |
| Parsing | [face-parsing](https://github.com/yakhyo/face-parsing) | ✓ | BiSeNet Face Parsing |
| Parsing | [face-segmentation](https://github.com/yakhyo/face-segmentation) | - | XSeg Face Segmentation |
@@ -209,6 +250,7 @@ Contributions are welcome. Please see [CONTRIBUTING.md](CONTRIBUTING.md).
If you find this project useful, consider giving it a ⭐ on GitHub — it helps others discover it!
Questions or feedback:
- Discord: https://discord.gg/wdzrjr7R5j
- GitHub Issues: https://github.com/yakhyo/uniface/issues
- DeepWiki Q&A: https://deepwiki.com/yakhyo/uniface

BIN
assets/einstein/img_0.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 99 KiB

View File

@@ -93,7 +93,7 @@ landmarks = face.landmarks # Shape: (5, 2)
Returned by `Landmark106`:
```python
from uniface import Landmark106
from uniface.landmark import Landmark106
landmarker = Landmark106()
landmarks = landmarker.get_landmarks(image, face.bbox)
@@ -174,7 +174,7 @@ yaw = -90° ────┼──── yaw = +90°
Face alignment uses 5-point landmarks to normalize face orientation:
```python
from uniface import face_alignment
from uniface.face_utils import face_alignment
# Align face to standard template
aligned_face = face_alignment(image, face.landmarks)

View File

@@ -9,7 +9,7 @@ UniFace uses ONNX Runtime for model inference, which supports multiple hardware
UniFace automatically selects the optimal execution provider based on available hardware:
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
# Automatically uses best available provider
detector = RetinaFace()
@@ -28,7 +28,8 @@ detector = RetinaFace()
You can specify which execution provider to use by passing the `providers` parameter:
```python
from uniface import RetinaFace, ArcFace
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
# Force CPU execution (even if GPU is available)
detector = RetinaFace(providers=['CPUExecutionProvider'])
@@ -174,7 +175,7 @@ pip install uniface[gpu]
Smaller input sizes are faster but may reduce accuracy:
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
# Faster, lower accuracy
detector = RetinaFace(input_size=(320, 320))

View File

@@ -53,6 +53,7 @@ class Face:
race: str | None = None # "East Asian", etc.
emotion: str | None = None # "Happy", etc.
emotion_confidence: float | None = None
track_id: int | None = None # Persistent ID from tracker
```
### Properties
@@ -177,7 +178,7 @@ print(f"Norm: {np.linalg.norm(embedding):.4f}") # ~1.0
### Similarity Computation
```python
from uniface import compute_similarity
from uniface.face_utils import compute_similarity
similarity = compute_similarity(embedding1, embedding2)
# Returns: float between -1 and 1 (cosine similarity)

View File

@@ -9,7 +9,7 @@ UniFace automatically downloads and caches models. This page explains how model
Models are downloaded on first use:
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
# First run: downloads model to cache
detector = RetinaFace() # ~3.5 MB download
@@ -44,44 +44,57 @@ Default cache directory:
## Custom Cache Directory
Specify a custom cache location:
Use the programmatic API to change the cache location at runtime:
```python
from uniface.model_store import verify_model_weights
from uniface.constants import RetinaFaceWeights
from uniface.model_store import get_cache_dir, set_cache_dir
# Download to custom directory
model_path = verify_model_weights(
RetinaFaceWeights.MNET_V2,
root='./my_models'
)
print(f"Model at: {model_path}")
# Set a custom cache directory
set_cache_dir('/data/models')
# Verify the current path
print(get_cache_dir()) # /data/models
# All subsequent model loads use the new directory
from uniface.detection import RetinaFace
detector = RetinaFace() # Downloads to /data/models/
```
Or set the `UNIFACE_CACHE_DIR` environment variable (see [Environment Variables](#environment-variables) below).
---
## Pre-Download Models
Download models before deployment:
Download models before deployment using the concurrent downloader:
```python
from uniface.model_store import verify_model_weights
from uniface.model_store import download_models
from uniface.constants import (
RetinaFaceWeights,
ArcFaceWeights,
AgeGenderWeights,
)
# Download all needed models
models = [
# Download multiple models concurrently (up to 4 threads by default)
paths = download_models([
RetinaFaceWeights.MNET_V2,
ArcFaceWeights.MNET,
AgeGenderWeights.DEFAULT,
]
])
for model in models:
path = verify_model_weights(model)
print(f"Downloaded: {path}")
for model, path in paths.items():
print(f"{model.value} -> {path}")
```
Or download one at a time:
```python
from uniface.model_store import verify_model_weights
from uniface.constants import RetinaFaceWeights
path = verify_model_weights(RetinaFaceWeights.MNET_V2)
print(f"Downloaded: {path}")
```
Or use the CLI tool:
@@ -115,11 +128,20 @@ print(f"Copy from: {path}")
scp -r ~/.uniface/models/ user@offline-machine:~/.uniface/models/
```
### 3. Use normally
### 3. Point to the cache (if non-default location)
```python
from uniface.model_store import set_cache_dir
# Only needed if the models are not at ~/.uniface/models/
set_cache_dir('/path/to/copied/models')
```
### 4. Use normally
```python
# Models load from local cache
from uniface import RetinaFace
from uniface.detection import RetinaFace
detector = RetinaFace() # No network required
```
@@ -182,7 +204,12 @@ If a model fails verification, it's re-downloaded automatically.
## Clear Cache
Remove cached models:
Find and remove cached models:
```python
from uniface.model_store import get_cache_dir
print(get_cache_dir()) # shows the active cache path
```
```bash
# Remove all cached models
@@ -198,20 +225,35 @@ Models will be re-downloaded on next use.
## Environment Variables
Set custom cache location via environment variable:
There are three equivalent ways to configure the cache directory:
```bash
export UNIFACE_CACHE_DIR=/path/to/custom/cache
**1. Programmatic API (recommended)**
```python
from uniface.model_store import get_cache_dir, set_cache_dir
set_cache_dir('/path/to/custom/cache')
print(get_cache_dir()) # /path/to/custom/cache
```
**2. Direct environment variable (Python)**
```python
import os
os.environ['UNIFACE_CACHE_DIR'] = '/path/to/custom/cache'
from uniface import RetinaFace
from uniface.detection import RetinaFace
detector = RetinaFace() # Uses custom cache
```
**3. Shell environment variable**
```bash
export UNIFACE_CACHE_DIR=/path/to/custom/cache
```
All three methods set the same `UNIFACE_CACHE_DIR` environment variable under the hood. `get_cache_dir()` always returns the resolved path.
---
## Next Steps

View File

@@ -28,6 +28,14 @@ graph TB
PRIV[Privacy]
end
subgraph Tracking
TRK[BYTETracker]
end
subgraph Indexing
IDX[FAISS Vector Store]
end
subgraph Output
FACE[Face Objects]
end
@@ -40,9 +48,12 @@ graph TB
DET --> PARSE
DET --> SPOOF
DET --> PRIV
DET --> TRK
REC --> IDX
REC --> FACE
LMK --> FACE
ATTR --> FACE
TRK --> FACE
```
---
@@ -51,12 +62,14 @@ graph TB
### 1. ONNX-First
All models use ONNX Runtime for inference:
UniFace runs inference primarily via ONNX Runtime for core components:
- **Cross-platform**: Same models work on macOS, Linux, Windows
- **Hardware acceleration**: Automatic selection of optimal provider
- **Production-ready**: No Python-only dependencies for inference
Some optional components (e.g., emotion TorchScript, torchvision NMS) require PyTorch.
### 2. Minimal Dependencies
Core dependencies are kept minimal:
@@ -74,12 +87,14 @@ tqdm # Progress bars
Factory functions and direct instantiation:
```python
# Factory function
detector = create_detector('retinaface')
from uniface.detection import RetinaFace
# Direct instantiation (recommended)
from uniface import RetinaFace
detector = RetinaFace()
# Or via factory function
from uniface.detection import create_detector
detector = create_detector('retinaface')
```
### 4. Type Safety
@@ -99,17 +114,19 @@ def detect(self, image: np.ndarray) -> list[Face]:
uniface/
├── detection/ # Face detection (RetinaFace, SCRFD, YOLOv5Face, YOLOv8Face)
├── recognition/ # Face recognition (AdaFace, ArcFace, MobileFace, SphereFace)
├── tracking/ # Multi-object tracking (BYTETracker)
├── landmark/ # 106-point landmarks
├── attribute/ # Age, gender, emotion, race
├── parsing/ # Face semantic segmentation
├── gaze/ # Gaze estimation
├── spoofing/ # Anti-spoofing
├── privacy/ # Face anonymization
├── indexing/ # Vector indexing (FAISS)
├── types.py # Dataclasses (Face, GazeResult, etc.)
├── constants.py # Model weights and URLs
├── model_store.py # Model download and caching
├── onnx_utils.py # ONNX Runtime utilities
└── visualization.py # Drawing utilities
└── draw.py # Drawing utilities
```
---
@@ -120,7 +137,9 @@ A typical face analysis workflow:
```python
import cv2
from uniface import RetinaFace, ArcFace, AgeGender
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
# 1. Initialize models
detector = RetinaFace()
@@ -151,7 +170,10 @@ for face in faces:
For convenience, `FaceAnalyzer` combines multiple modules:
```python
from uniface import FaceAnalyzer, RetinaFace, ArcFace, AgeGender, FairFace
from uniface.analyzer import FaceAnalyzer
from uniface.attribute import AgeGender, FairFace
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
detector = RetinaFace()
recognizer = ArcFace()
@@ -176,7 +198,7 @@ for face in faces:
## Model Lifecycle
1. **First use**: Model is downloaded from GitHub releases
2. **Cached**: Stored in `~/.uniface/models/`
2. **Cached**: Stored in `~/.uniface/models/` (configurable via `set_cache_dir()` or `UNIFACE_CACHE_DIR`)
3. **Verified**: SHA-256 checksum validation
4. **Loaded**: ONNX Runtime session created
5. **Inference**: Hardware-accelerated execution
@@ -185,6 +207,11 @@ for face in faces:
# Models auto-download on first use
detector = RetinaFace() # Downloads if not cached
# Optionally configure cache location
from uniface.model_store import get_cache_dir, set_cache_dir
set_cache_dir('/data/models')
print(get_cache_dir()) # /data/models
# Or manually pre-download
from uniface.model_store import verify_model_weights
from uniface.constants import RetinaFaceWeights

View File

@@ -11,7 +11,7 @@ This page explains how to tune detection and recognition thresholds for your use
Controls minimum confidence for face detection:
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
# Default (balanced)
detector = RetinaFace(confidence_threshold=0.5)
@@ -81,7 +81,7 @@ For identity verification (same person check):
```python
import numpy as np
from uniface import compute_similarity
from uniface.face_utils import compute_similarity
similarity = compute_similarity(embedding1, embedding2)
@@ -199,7 +199,7 @@ else:
For drawing detections, filter by confidence:
```python
from uniface.visualization import draw_detections
from uniface.draw import draw_detections
# Only draw high-confidence detections
bboxes = [f.bbox for f in faces if f.confidence > 0.7]

324
docs/datasets.md Normal file
View File

@@ -0,0 +1,324 @@
# Datasets
Overview of all training datasets and evaluation benchmarks used by UniFace models.
---
## Quick Reference
| Task | Dataset | Scale | Models |
| ----------- | ------------------------------------------------ | ---------------------- | ------------------------------------------- |
| Detection | [WIDER FACE](#wider-face) | 32K images | RetinaFace, SCRFD, YOLOv5-Face, YOLOv8-Face |
| Recognition | [MS1MV2](#ms1mv2) | 5.8M images, 85.7K IDs | MobileFace, SphereFace |
| Recognition | [WebFace600K](#webface600k) | 600K images | ArcFace |
| Recognition | [WebFace4M / WebFace12M](#webface4m--webface12m) | 4M / 12M images | AdaFace |
| Gaze | [Gaze360](#gaze360) | 238 subjects | MobileGaze |
| Parsing | [CelebAMask-HQ](#celebamask-hq) | 30K images | BiSeNet |
| Attributes | [CelebA](#celeba) | 200K images | AgeGender |
| Attributes | [FairFace](#fairface) | Balanced demographics | FairFace |
| Attributes | [AffectNet](#affectnet) | Emotion labels | Emotion |
---
## Training Datasets
### Face Detection
#### WIDER FACE
Large-scale face detection benchmark with images across 61 event categories. Contains faces with a high degree of variability in scale, pose, occlusion, expression, and illumination.
| Property | Value |
| -------- | ------------------------------------------- |
| Images | ~32,000 (train/val/test split) |
| Faces | ~394,000 annotated |
| Subsets | Easy, Medium, Hard |
| Used by | RetinaFace, SCRFD, YOLOv5-Face, YOLOv8-Face |
!!! info "Download & References"
**Paper**: [WIDER FACE: A Face Detection Benchmark](https://arxiv.org/abs/1511.06523)
**Download**: [http://shuoyang1213.me/WIDERFACE/](http://shuoyang1213.me/WIDERFACE/)
---
### Face Recognition
#### MS1MV2
Refined version of the MS-Celeb-1M dataset, cleaned by InsightFace. Widely used for training face recognition models.
| Property | Value |
| ---------- | ------------------------------ |
| Identities | 85.7K |
| Images | 5.8M |
| Format | Aligned and cropped to 112x112 |
| Used by | MobileFace, SphereFace |
!!! info "Download"
**Kaggle (aligned 112x112)**: [ms1m-arcface-dataset](https://www.kaggle.com/datasets/yakhyokhuja/ms1m-arcface-dataset) (from InsightFace)
**Training code**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition)
---
#### WebFace600K
Medium-scale face recognition dataset from the WebFace series.
| Property | Value |
| -------- | ------- |
| Images | ~600K |
| Used by | ArcFace |
!!! info "Source"
**Origin**: [InsightFace](https://github.com/deepinsight/insightface)
**Paper**: [ArcFace: Additive Angular Margin Loss for Deep Face Recognition](https://arxiv.org/abs/1801.07698)
---
#### WebFace4M / WebFace12M
Large-scale face recognition datasets from the WebFace260M collection. Used for training AdaFace models with adaptive quality-aware margin.
| Property | WebFace4M | WebFace12M |
| -------- | ------------- | -------------- |
| Images | ~4M | ~12M |
| Used by | AdaFace IR_18 | AdaFace IR_101 |
!!! info "Source"
**Paper**: [AdaFace: Quality Adaptive Margin for Face Recognition](https://arxiv.org/abs/2204.00964)
**Original code**: [mk-minchul/AdaFace](https://github.com/mk-minchul/AdaFace)
---
#### CASIA-WebFace
Smaller-scale face recognition dataset suitable for academic research and lighter training runs.
| Property | Value |
| ---------- | ------------------------------ |
| Identities | 10.6K |
| Images | 491K |
| Format | Aligned and cropped to 112x112 |
| Used by | Alternative training set |
!!! info "Download"
**Kaggle (aligned 112x112)**: [webface-112x112](https://www.kaggle.com/datasets/yakhyokhuja/webface-112x112) (from OpenSphere)
---
#### VGGFace2
Large-scale dataset with wide variations in pose, age, illumination, ethnicity, and profession.
| Property | Value |
| ---------- | ------------------------------ |
| Identities | 8.6K |
| Images | 3.1M |
| Format | Aligned and cropped to 112x112 |
| Used by | Alternative training set |
!!! info "Download"
**Kaggle (aligned 112x112)**: [vggface2-112x112](https://www.kaggle.com/datasets/yakhyokhuja/vggface2-112x112) (from OpenSphere)
---
### Gaze Estimation
#### Gaze360
Large-scale gaze estimation dataset collected in indoor and outdoor environments with diverse head poses and wide gaze ranges (up to 360 degrees).
| Property | Value |
| ----------- | --------------------- |
| Subjects | 238 |
| Environment | Indoor and outdoor |
| Used by | All MobileGaze models |
!!! info "Download & Preprocessing"
**Download**: [gaze360.csail.mit.edu/download.php](https://gaze360.csail.mit.edu/download.php)
**Preprocessing**: [GazeHub - Gaze360](https://phi-ai.buaa.edu.cn/Gazehub/3D-dataset/#gaze360)
!!! note "UniFace Models"
All MobileGaze models shipped with UniFace are trained exclusively on Gaze360 for 200 epochs.
**Dataset structure:**
```
data/
└── Gaze360/
├── Image/
└── Label/
```
---
#### MPIIFaceGaze
Dataset for appearance-based gaze estimation from laptop webcam images of participants during everyday laptop usage. Supported by the gaze estimation training code but not used for the UniFace pretrained weights.
| Property | Value |
| ----------- | ---------------------------------------- |
| Subjects | 15 |
| Environment | Everyday laptop usage |
| Used by | Supported (not used for UniFace weights) |
!!! info "Download & Preprocessing"
**Download**: [MPIIFaceGaze download page](https://www.mpi-inf.mpg.de/departments/computer-vision-and-machine-learning/research/gaze-based-human-computer-interaction/its-written-all-over-your-face-full-face-appearance-based-gaze-estimation)
**Preprocessing**: [GazeHub - MPIIFaceGaze](https://phi-ai.buaa.edu.cn/Gazehub/3D-dataset/#mpiifacegaze)
**Dataset structure:**
```
data/
└── MPIIFaceGaze/
├── Image/
└── Label/
```
---
### Face Parsing
#### CelebAMask-HQ
High-quality face parsing dataset with pixel-level annotations for 19 facial component classes.
| Property | Value |
| ---------- | ---------------------------- |
| Images | 30,000 |
| Classes | 19 facial components |
| Resolution | High quality |
| Used by | BiSeNet (ResNet18, ResNet34) |
!!! info "Source"
**GitHub**: [switchablenorms/CelebAMask-HQ](https://github.com/switchablenorms/CelebAMask-HQ)
**Training code**: [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing)
**Dataset structure:**
```
dataset/
├── images/ # Input face images
│ ├── image1.jpg
│ └── ...
└── labels/ # Segmentation masks
├── image1.png
└── ...
```
---
### Attribute Analysis
#### CelebA
Large-scale face attributes dataset widely used for training age and gender prediction models.
| Property | Value |
| ---------- | -------------------- |
| Images | ~200K |
| Attributes | 40 binary attributes |
| Used by | AgeGender |
!!! info "Reference"
**Paper**: [Deep Learning Face Attributes in the Wild](https://arxiv.org/abs/1411.7766)
---
#### FairFace
Face attribute dataset designed for balanced representation across race, gender, and age groups. Provides more equitable predictions compared to imbalanced datasets.
| Property | Value |
| ---------- | ----------------------------------- |
| Attributes | Race (7), Gender (2), Age Group (9) |
| Used by | FairFace |
| License | CC BY 4.0 |
!!! info "Reference"
**Paper**: [FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age](https://arxiv.org/abs/1908.04913)
**ONNX inference**: [yakhyo/fairface-onnx](https://github.com/yakhyo/fairface-onnx)
---
#### AffectNet
Large-scale facial expression dataset for emotion recognition training.
| Property | Value |
| -------- | ----------------------------------------------------------------------- |
| Classes | 7 or 8 (Neutral, Happy, Sad, Surprise, Fear, Disgust, Angry + Contempt) |
| Used by | Emotion (AFFECNET7, AFFECNET8) |
!!! info "Reference"
**Paper**: [AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild](https://ieeexplore.ieee.org/document/8013713)
---
## Evaluation Benchmarks
### Face Detection
#### WIDER FACE Validation Set
The standard benchmark for face detection models. Results are reported across three difficulty subsets.
| Subset | Criteria |
| ------ | --------------------------------------------- |
| Easy | Large, clear, unoccluded faces |
| Medium | Moderate scale and occlusion |
| Hard | Small, heavily occluded, or challenging faces |
See [Model Zoo - Detection](models.md#face-detection-models) for per-model accuracy on each subset.
---
### Face Recognition
Recognition models are evaluated across multiple benchmarks. Aligned 112x112 validation datasets are available as a single download.
!!! info "Download"
**Kaggle**: [agedb-30-calfw-cplfw-lfw-aligned-112x112](https://www.kaggle.com/datasets/yakhyokhuja/agedb-30-calfw-cplfw-lfw-aligned-112x112)
| Benchmark | Description | Used by |
| ------------ | ----------------------------------------------------------------- | ------------------------------- |
| **LFW** | Labeled Faces in the Wild - standard face verification benchmark | ArcFace, MobileFace, SphereFace |
| **CALFW** | Cross-Age LFW - face verification across age gaps | MobileFace, SphereFace |
| **CPLFW** | Cross-Pose LFW - face verification across pose variations | MobileFace, SphereFace |
| **AgeDB-30** | Age database with 30-year age gaps | ArcFace, MobileFace, SphereFace |
| **CFP-FP** | Celebrities in Frontal-Profile - frontal vs. profile verification | ArcFace |
| **IJB-B** | IARPA Janus Benchmark B - TAR@FAR=0.01% | AdaFace |
| **IJB-C** | IARPA Janus Benchmark C - TAR@FAR=1e-4 | AdaFace, ArcFace |
See [Model Zoo - Recognition](models.md#face-recognition-models) for per-model accuracy on each benchmark.
---
### Gaze Estimation
| Benchmark | Metric | Description |
| -------------------- | ------------- | -------------------------------------------- |
| **Gaze360 test set** | MAE (degrees) | Mean Absolute Error in gaze angle prediction |
See [Model Zoo - Gaze](models.md#gaze-estimation-models) for per-model MAE scores.
---
## Training Repositories
For training your own models or reproducing results, see the following repositories:
| Task | Repository | Datasets Supported |
| ----------- | ------------------------------------------------------------------------- | ------------------------------- |
| Detection | [yakhyo/retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) | WIDER FACE |
| Recognition | [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) | MS1MV2, CASIA-WebFace, VGGFace2 |
| Gaze | [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation) | Gaze360, MPIIFaceGaze |
| Parsing | [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing) | CelebAMask-HQ |

View File

@@ -18,8 +18,9 @@ template: home.html
[![Github Build Status](https://github.com/yakhyo/uniface/actions/workflows/ci.yml/badge.svg)](https://github.com/yakhyo/uniface/actions)
[![PyPI Downloads](https://static.pepy.tech/personalized-badge/uniface?period=total&units=INTERNATIONAL_SYSTEM&left_color=GRAY&right_color=BLUE&left_text=Downloads)](https://pepy.tech/projects/uniface)
[![Kaggle Badge](https://img.shields.io/badge/Notebooks-Kaggle?label=Kaggle&color=blue)](https://www.kaggle.com/yakhyokhuja/code)
[![Discord](https://img.shields.io/badge/Discord-Join%20Server-5865F2?logo=discord&logoColor=white)](https://discord.gg/wdzrjr7R5j)
<img src="https://raw.githubusercontent.com/yakhyo/uniface/main/.github/logos/new/uniface_rounded_q80.webp" alt="UniFace - All-in-One Open-Source Face Analysis Library" style="max-width: 90%; margin: 1rem 0;">
<!-- <img src="https://raw.githubusercontent.com/yakhyo/uniface/main/.github/logos/uniface_rounded_q80.webp" alt="UniFace - All-in-One Open-Source Face Analysis Library" style="max-width: 70%; margin: 1rem 0;"> -->
[Get Started](quickstart.md){ .md-button .md-button--primary }
[View on GitHub](https://github.com/yakhyo/uniface){ .md-button }
@@ -58,6 +59,11 @@ BiSeNet semantic segmentation with 19 facial component classes.
Real-time gaze direction prediction with MobileGaze models.
</div>
<div class="feature-card" markdown>
### :material-motion-play: Tracking
Multi-object tracking with BYTETracker for persistent face IDs across video frames.
</div>
<div class="feature-card" markdown>
### :material-shield-check: Anti-Spoofing
Face liveness detection with MiniFASNet to prevent fraud.
@@ -68,31 +74,35 @@ Face liveness detection with MiniFASNet to prevent fraud.
Face anonymization with 5 blur methods for privacy protection.
</div>
<div class="feature-card" markdown>
### :material-database-search: Vector Indexing
FAISS-backed embedding store for fast multi-identity face search.
</div>
</div>
---
## Installation
=== "Standard"
UniFace runs inference primarily via **ONNX Runtime**; some optional components (e.g., emotion TorchScript, torchvision NMS) require **PyTorch**.
```bash
pip install uniface
```
**Standard**
```bash
pip install uniface
```
=== "GPU (CUDA)"
**GPU (CUDA)**
```bash
pip install uniface[gpu]
```
```bash
pip install uniface[gpu]
```
=== "From Source"
```bash
git clone https://github.com/yakhyo/uniface.git
cd uniface
pip install -e .
```
**From Source**
```bash
git clone https://github.com/yakhyo/uniface.git
cd uniface
pip install -e .
```
---

View File

@@ -55,11 +55,10 @@ pip install uniface[gpu]
**Requirements:**
- CUDA 11.x or 12.x
- cuDNN 8.x
- `uniface[gpu]` automatically installs `onnxruntime-gpu`. Requirements depend on the ORT version and execution provider.
!!! info "CUDA Compatibility"
See [ONNX Runtime GPU requirements](https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html) for detailed compatibility matrix.
See the [ONNX Runtime GPU compatibility matrix](https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html) for matching CUDA and cuDNN versions.
Verify GPU installation:
@@ -71,6 +70,19 @@ print("Available providers:", ort.get_available_providers())
---
### FAISS Vector Indexing
For fast multi-identity face search using a FAISS index:
```bash
pip install faiss-cpu # CPU
pip install faiss-gpu # NVIDIA GPU (CUDA)
```
See the [Indexing module](modules/indexing.md) for usage.
---
### CPU-Only (All Platforms)
```bash
@@ -107,12 +119,20 @@ UniFace has minimal dependencies:
|---------|---------|
| `numpy` | Array operations |
| `opencv-python` | Image processing |
| `onnx` | ONNX model format support |
| `onnxruntime` | Model inference |
| `scikit-image` | Geometric transforms |
| `requests` | Model download |
| `tqdm` | Progress bars |
**Optional:**
| Package | Install extra | Purpose |
|---------|---------------|---------|
| `faiss-cpu` / `faiss-gpu` | `pip install faiss-cpu` | FAISS vector indexing |
| `onnxruntime-gpu` | `uniface[gpu]` | CUDA acceleration |
| `torch` | `pip install torch` | Emotion model uses TorchScript |
| `torchvision` | `pip install torchvision` | Faster NMS for YOLO detectors |
---
## Verify Installation
@@ -128,7 +148,7 @@ import onnxruntime as ort
print(f"Available providers: {ort.get_available_providers()}")
# Quick test
from uniface import RetinaFace
from uniface.detection import RetinaFace
detector = RetinaFace()
print("Installation successful!")
```

View File

@@ -8,7 +8,7 @@ Complete guide to all available models and their performance characteristics.
### RetinaFace Family
RetinaFace models are trained on the WIDER FACE dataset.
RetinaFace models are trained on the [WIDER FACE](datasets.md#wider-face) dataset.
| Model Name | Params | Size | Easy | Medium | Hard |
| -------------- | ------ | ----- | ------ | ------ | ------ |
@@ -22,13 +22,13 @@ RetinaFace models are trained on the WIDER FACE dataset.
!!! info "Accuracy & Benchmarks"
**Accuracy**: WIDER FACE validation set (Easy/Medium/Hard subsets) - from [RetinaFace paper](https://arxiv.org/abs/1905.00641)
**Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image>`
**Speed**: Benchmark on your own hardware using `python tools/detect.py --source <image>`
---
### SCRFD Family
SCRFD (Sample and Computation Redistribution for Efficient Face Detection) models trained on WIDER FACE dataset.
SCRFD (Sample and Computation Redistribution for Efficient Face Detection) models trained on [WIDER FACE](datasets.md#wider-face) dataset.
| Model Name | Params | Size | Easy | Medium | Hard |
| ---------------- | ------ | ----- | ------ | ------ | ------ |
@@ -38,13 +38,13 @@ SCRFD (Sample and Computation Redistribution for Efficient Face Detection) model
!!! info "Accuracy & Benchmarks"
**Accuracy**: WIDER FACE validation set - from [SCRFD paper](https://arxiv.org/abs/2105.04714)
**Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image>`
**Speed**: Benchmark on your own hardware using `python tools/detect.py --source <image>`
---
### YOLOv5-Face Family
YOLOv5-Face models provide detection with 5-point facial landmarks, trained on WIDER FACE dataset.
YOLOv5-Face models provide detection with 5-point facial landmarks, trained on [WIDER FACE](datasets.md#wider-face) dataset.
| Model Name | Size | Easy | Medium | Hard |
| -------------- | ---- | ------ | ------ | ------ |
@@ -55,7 +55,7 @@ YOLOv5-Face models provide detection with 5-point facial landmarks, trained on W
!!! info "Accuracy & Benchmarks"
**Accuracy**: WIDER FACE validation set - from [YOLOv5-Face paper](https://arxiv.org/abs/2105.12931)
**Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image>`
**Speed**: Benchmark on your own hardware using `python tools/detect.py --source <image>`
!!! note "Fixed Input Size"
All YOLOv5-Face models use a fixed input size of 640×640.
@@ -74,7 +74,7 @@ YOLOv8-Face models use anchor-free design with DFL (Distribution Focal Loss) for
!!! info "Accuracy & Benchmarks"
**Accuracy**: WIDER FACE validation set (Easy/Medium/Hard subsets)
**Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --method yolov8face`
**Speed**: Benchmark on your own hardware using `python tools/detect.py --source <image> --method yolov8face`
!!! note "Fixed Input Size"
All YOLOv8-Face models use a fixed input size of 640×640.
@@ -93,7 +93,7 @@ Face recognition using adaptive margin based on image quality.
| `IR_101` | IR-101 | WebFace12M | 249 MB | - | 97.66% |
!!! info "Training Data & Accuracy"
**Dataset**: WebFace4M (4M images) / WebFace12M (12M images)
**Dataset**: [WebFace4M / WebFace12M](datasets.md#webface4m--webface12m) (4M / 12M images)
**Accuracy**: IJB-B and IJB-C benchmarks, TAR@FAR=0.01%
@@ -113,7 +113,7 @@ Face recognition using additive angular margin loss.
| `RESNET` | ResNet50 | 43.6M | 166MB | 99.83% | 99.33% | 98.23% | 97.25% |
!!! info "Training Data"
**Dataset**: Trained on WebFace600K (600K images)
**Dataset**: Trained on [WebFace600K](datasets.md#webface600k) (600K images)
**Accuracy**: IJB-C accuracy reported as TAR@FAR=1e-4
@@ -131,7 +131,7 @@ Lightweight face recognition models with MobileNet backbones.
| `MNET_V3_LARGE` | MobileNetV3-L | 3.52M | 10MB | 99.53% | 94.56% | 86.79% | 95.13% |
!!! info "Training Data"
**Dataset**: Trained on MS1M-V2 (5.8M images, 85K identities)
**Dataset**: Trained on [MS1MV2](datasets.md#ms1mv2) (5.8M images, 85K identities)
**Accuracy**: Evaluated on LFW, CALFW, CPLFW, and AgeDB-30 benchmarks
@@ -147,7 +147,7 @@ Face recognition using angular softmax loss.
| `SPHERE36` | Sphere36 | 34.6M | 92MB | 99.72% | 95.64% | 89.92% | 96.83% |
!!! info "Training Data"
**Dataset**: Trained on MS1M-V2 (5.8M images, 85K identities)
**Dataset**: Trained on [MS1MV2](datasets.md#ms1mv2) (5.8M images, 85K identities)
**Accuracy**: Evaluated on LFW, CALFW, CPLFW, and AgeDB-30 benchmarks
@@ -187,7 +187,7 @@ Facial landmark localization model.
| `AgeGender` | Age, Gender | 2.1M | 8MB |
!!! info "Training Data"
**Dataset**: Trained on CelebA
**Dataset**: Trained on [CelebA](datasets.md#celeba)
!!! warning "Accuracy Note"
Accuracy varies by demographic and image quality. Test on your specific use case.
@@ -201,7 +201,7 @@ Facial landmark localization model.
| `FairFace` | Race, Gender, Age Group | - | 44MB |
!!! info "Training Data"
**Dataset**: Trained on FairFace dataset with balanced demographics
**Dataset**: Trained on [FairFace](datasets.md#fairface) dataset with balanced demographics
!!! tip "Equitable Predictions"
FairFace provides more equitable predictions across different racial and gender groups.
@@ -219,12 +219,12 @@ Facial landmark localization model.
| `AFFECNET7` | 7 | 0.5M | 2MB |
| `AFFECNET8` | 8 | 0.5M | 2MB |
**Classes (7)**: Neutral, Happy, Sad, Surprise, Fear, Disgust, Anger
**Classes (7)**: Neutral, Happy, Sad, Surprise, Fear, Disgust, Angry
**Classes (8)**: Above + Contempt
!!! info "Training Data"
**Dataset**: Trained on AffectNet
**Dataset**: Trained on [AffectNet](datasets.md#affectnet)
!!! note "Accuracy Note"
Emotion detection accuracy depends heavily on facial expression clarity and cultural context.
@@ -235,7 +235,7 @@ Facial landmark localization model.
### MobileGaze Family
Gaze direction prediction models trained on Gaze360 dataset. Returns pitch (vertical) and yaw (horizontal) angles in radians.
Gaze direction prediction models trained on [Gaze360](datasets.md#gaze360) dataset. Returns pitch (vertical) and yaw (horizontal) angles in radians.
| Model Name | Params | Size | MAE* |
| -------------- | ------ | ------- | ----- |
@@ -248,7 +248,7 @@ Gaze direction prediction models trained on Gaze360 dataset. Returns pitch (vert
*MAE (Mean Absolute Error) in degrees on Gaze360 test set - lower is better
!!! info "Training Data"
**Dataset**: Trained on Gaze360 (indoor/outdoor scenes with diverse head poses)
**Dataset**: Trained on [Gaze360](datasets.md#gaze360) (indoor/outdoor scenes with diverse head poses)
**Training**: 200 epochs with classification-based approach (binned angles)
@@ -269,7 +269,7 @@ BiSeNet (Bilateral Segmentation Network) models for semantic face parsing. Segme
| `RESNET34` | 24.1M | 89.2 MB | 19 |
!!! info "Training Data"
**Dataset**: Trained on CelebAMask-HQ
**Dataset**: Trained on [CelebAMask-HQ](datasets.md#celebamask-hq)
**Architecture**: BiSeNet with ResNet backbone
@@ -279,13 +279,13 @@ BiSeNet (Bilateral Segmentation Network) models for semantic face parsing. Segme
| # | Class | # | Class | # | Class |
|---|-------|---|-------|---|-------|
| 1 | Background | 8 | Left Ear | 15 | Neck |
| 2 | Skin | 9 | Right Ear | 16 | Neck Lace |
| 3 | Left Eyebrow | 10 | Ear Ring | 17 | Cloth |
| 4 | Right Eyebrow | 11 | Nose | 18 | Hair |
| 5 | Left Eye | 12 | Mouth | 19 | Hat |
| 6 | Right Eye | 13 | Upper Lip | | |
| 7 | Eye Glasses | 14 | Lower Lip | | |
| 0 | Background | 7 | Left Ear | 14 | Neck |
| 1 | Skin | 8 | Right Ear | 15 | Neck Lace |
| 2 | Left Eyebrow | 9 | Ear Ring | 16 | Cloth |
| 3 | Right Eyebrow | 10 | Nose | 17 | Hair |
| 4 | Left Eye | 11 | Mouth | 18 | Hat |
| 5 | Right Eye | 12 | Upper Lip | | |
| 6 | Eye Glasses | 13 | Lower Lip | | |
**Applications:**
@@ -349,10 +349,14 @@ Face anti-spoofing models for liveness detection. Detect if a face is real (live
Models are automatically downloaded and cached on first use.
- **Cache location**: `~/.uniface/models/`
- **Cache location**: `~/.uniface/models/` (configurable via `set_cache_dir()` or `UNIFACE_CACHE_DIR` env var)
- **Inspect cache path**: `get_cache_dir()` returns the resolved active path
- **Verification**: Models are verified with SHA-256 checksums
- **Concurrent download**: `download_models([...])` fetches multiple models in parallel
- **Manual download**: Use `python tools/download_model.py` to pre-download models
See [Model Cache & Offline Use](concepts/model-cache-offline.md) for full details.
---
## References

View File

@@ -21,7 +21,8 @@ Predicts exact age and binary gender.
### Basic Usage
```python
from uniface import RetinaFace, AgeGender
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
detector = RetinaFace()
age_gender = AgeGender()
@@ -54,7 +55,8 @@ Predicts gender, age group, and race with balanced demographics.
### Basic Usage
```python
from uniface import RetinaFace, FairFace
from uniface.attribute import FairFace
from uniface.detection import RetinaFace
detector = RetinaFace()
fairface = FairFace()
@@ -120,12 +122,12 @@ Predicts facial emotions. Requires PyTorch.
### Basic Usage
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.attribute import Emotion
from uniface.constants import DDAMFNWeights
detector = RetinaFace()
emotion = Emotion(model_weights=DDAMFNWeights.AFFECNET7)
emotion = Emotion(model_name=DDAMFNWeights.AFFECNET7)
faces = detector.detect(image)
@@ -147,7 +149,7 @@ for face in faces:
| Surprise |
| Fear |
| Disgust |
| Anger |
| Angry |
=== "8-Class (AFFECNET8)"
@@ -159,7 +161,7 @@ for face in faces:
| Surprise |
| Fear |
| Disgust |
| Anger |
| Angry |
| Contempt |
### Model Variants
@@ -169,10 +171,10 @@ from uniface.attribute import Emotion
from uniface.constants import DDAMFNWeights
# 7-class emotion
emotion = Emotion(model_weights=DDAMFNWeights.AFFECNET7)
emotion = Emotion(model_name=DDAMFNWeights.AFFECNET7)
# 8-class emotion
emotion = Emotion(model_weights=DDAMFNWeights.AFFECNET8)
emotion = Emotion(model_name=DDAMFNWeights.AFFECNET8)
```
---
@@ -182,7 +184,8 @@ emotion = Emotion(model_weights=DDAMFNWeights.AFFECNET8)
### Full Attribute Analysis
```python
from uniface import RetinaFace, AgeGender, FairFace
from uniface.attribute import AgeGender, FairFace
from uniface.detection import RetinaFace
detector = RetinaFace()
age_gender = AgeGender()
@@ -206,7 +209,9 @@ for face in faces:
### Using FaceAnalyzer
```python
from uniface import FaceAnalyzer, RetinaFace, AgeGender
from uniface.analyzer import FaceAnalyzer
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
analyzer = FaceAnalyzer(
RetinaFace(),

View File

@@ -24,7 +24,7 @@ Single-shot face detector with multi-scale feature pyramid.
### Basic Usage
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
detector = RetinaFace()
faces = detector.detect(image)
@@ -38,7 +38,7 @@ for face in faces:
### Model Variants
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.constants import RetinaFaceWeights
# Lightweight (mobile/edge)
@@ -82,7 +82,7 @@ State-of-the-art detection with excellent accuracy-speed tradeoff.
### Basic Usage
```python
from uniface import SCRFD
from uniface.detection import SCRFD
detector = SCRFD()
faces = detector.detect(image)
@@ -91,7 +91,7 @@ faces = detector.detect(image)
### Model Variants
```python
from uniface import SCRFD
from uniface.detection import SCRFD
from uniface.constants import SCRFDWeights
# Real-time (lightweight)
@@ -127,7 +127,7 @@ YOLO-based detection optimized for faces.
### Basic Usage
```python
from uniface import YOLOv5Face
from uniface.detection import YOLOv5Face
detector = YOLOv5Face()
faces = detector.detect(image)
@@ -136,7 +136,7 @@ faces = detector.detect(image)
### Model Variants
```python
from uniface import YOLOv5Face
from uniface.detection import YOLOv5Face
from uniface.constants import YOLOv5FaceWeights
# Lightweight
@@ -179,7 +179,7 @@ Anchor-free detection with DFL (Distribution Focal Loss) for accurate bbox regre
### Basic Usage
```python
from uniface import YOLOv8Face
from uniface.detection import YOLOv8Face
detector = YOLOv8Face()
faces = detector.detect(image)
@@ -188,7 +188,7 @@ faces = detector.detect(image)
### Model Variants
```python
from uniface import YOLOv8Face
from uniface.detection import YOLOv8Face
from uniface.constants import YOLOv8FaceWeights
# Lightweight
@@ -225,7 +225,7 @@ detector = YOLOv8Face(
Create detectors dynamically:
```python
from uniface import create_detector
from uniface.detection import create_detector
detector = create_detector('retinaface')
# or
@@ -238,22 +238,6 @@ detector = create_detector('yolov8face')
---
## High-Level API
One-line detection:
```python
from uniface import detect_faces
# Using RetinaFace (default)
faces = detect_faces(image, method='retinaface', confidence_threshold=0.5)
# Using YOLOv8-Face
faces = detect_faces(image, method='yolov8face', confidence_threshold=0.5)
```
---
## Output Format
All detectors return `list[Face]`:
@@ -276,7 +260,7 @@ for face in faces:
## Visualization
```python
from uniface.visualization import draw_detections
from uniface.draw import draw_detections
draw_detections(
image=image,
@@ -296,7 +280,7 @@ cv2.imwrite("result.jpg", image)
Benchmark on your hardware:
```bash
python tools/detection.py --source image.jpg
python tools/detect.py --source image.jpg
```
---

View File

@@ -23,7 +23,8 @@ Gaze estimation predicts where a person is looking (pitch and yaw angles).
```python
import cv2
import numpy as np
from uniface import RetinaFace, MobileGaze
from uniface.detection import RetinaFace
from uniface.gaze import MobileGaze
detector = RetinaFace()
gaze_estimator = MobileGaze()
@@ -52,7 +53,7 @@ for face in faces:
## Model Variants
```python
from uniface import MobileGaze
from uniface.gaze import MobileGaze
from uniface.constants import GazeWeights
# Default (ResNet34, recommended)
@@ -102,7 +103,7 @@ yaw = -90° ────┼──── yaw = +90°
## Visualization
```python
from uniface.visualization import draw_gaze
from uniface.draw import draw_gaze
# Detect faces
faces = detector.detect(image)
@@ -154,8 +155,9 @@ def draw_gaze_custom(image, bbox, pitch, yaw, length=100, color=(0, 255, 0)):
```python
import cv2
import numpy as np
from uniface import RetinaFace, MobileGaze
from uniface.visualization import draw_gaze
from uniface.detection import RetinaFace
from uniface.gaze import MobileGaze
from uniface.draw import draw_gaze
detector = RetinaFace()
gaze_estimator = MobileGaze()
@@ -256,7 +258,7 @@ print(f"Looking: {direction}")
## Factory Function
```python
from uniface import create_gaze_estimator
from uniface.gaze import create_gaze_estimator
gaze = create_gaze_estimator() # Returns MobileGaze
```

172
docs/modules/indexing.md Normal file
View File

@@ -0,0 +1,172 @@
# Indexing
FAISS-backed vector store for fast similarity search over embeddings.
!!! info "Optional dependency"
```bash
pip install faiss-cpu
```
---
## FAISS
```python
from uniface.indexing import FAISS
```
A thin wrapper around a FAISS `IndexFlatIP` (inner-product) index. Vectors
**must** be L2-normalised before adding so that inner product equals cosine
similarity. The store does not normalise internally.
Each vector is paired with a metadata `dict` that can carry any
JSON-serialisable payload (person ID, name, source path, etc.).
### Constructor
```python
store = FAISS(embedding_size=512, db_path="./vector_index")
```
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `embedding_size` | `int` | `512` | Dimension of embedding vectors |
| `db_path` | `str` | `"./vector_index"` | Directory for persisting index and metadata |
---
### Methods
#### `add(embedding, metadata)`
Add a single embedding with associated metadata.
```python
store.add(embedding, {"person_id": "alice", "source": "photo.jpg"})
```
| Parameter | Type | Description |
|-----------|------|-------------|
| `embedding` | `np.ndarray` | L2-normalised embedding vector |
| `metadata` | `dict[str, Any]` | Arbitrary JSON-serialisable key-value pairs |
---
#### `search(embedding, threshold=0.4)`
Find the closest match for a query embedding.
```python
result, similarity = store.search(query_embedding, threshold=0.4)
if result:
print(result["person_id"], similarity)
```
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `embedding` | `np.ndarray` | — | L2-normalised query vector |
| `threshold` | `float` | `0.4` | Minimum cosine similarity to accept a match |
**Returns:** `(metadata, similarity)` if a match is found, or `(None, similarity)` when below threshold or the index is empty.
---
#### `remove(key, value)`
Remove all entries where `metadata[key] == value` and rebuild the index.
```python
removed = store.remove("person_id", "bob")
print(f"Removed {removed} entries")
```
| Parameter | Type | Description |
|-----------|------|-------------|
| `key` | `str` | Metadata key to match |
| `value` | `Any` | Value to match |
**Returns:** Number of entries removed.
---
#### `save()`
Persist the FAISS index and metadata to disk.
```python
store.save()
```
Writes two files to `db_path`:
- `faiss_index.bin` — binary FAISS index
- `metadata.json` — JSON array of metadata dicts
---
#### `load()`
Load a previously saved index and metadata.
```python
store = FAISS(db_path="./vector_index")
loaded = store.load() # True if files exist
```
**Returns:** `True` if loaded successfully, `False` if files are missing.
**Raises:** `RuntimeError` if files exist but cannot be read.
---
### Properties
| Property | Type | Description |
|----------|------|-------------|
| `size` | `int` | Number of vectors in the index |
| `len(store)` | `int` | Same as `size` |
---
## Example: End-to-End
```python
import cv2
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.indexing import FAISS
detector = RetinaFace()
recognizer = ArcFace()
# Build
store = FAISS(db_path="./my_index")
image = cv2.imread("alice.jpg")
faces = detector.detect(image)
embedding = recognizer.get_normalized_embedding(image, faces[0].landmarks)
store.add(embedding, {"person_id": "alice"})
store.save()
# Search
store2 = FAISS(db_path="./my_index")
store2.load()
query = cv2.imread("unknown.jpg")
faces = detector.detect(query)
emb = recognizer.get_normalized_embedding(query, faces[0].landmarks)
result, sim = store2.search(emb)
if result:
print(f"Matched: {result['person_id']} (similarity: {sim:.3f})")
else:
print(f"No match (similarity: {sim:.3f})")
```
---
## See Also
- [Face Search Recipe](../recipes/face-search.md) - Building and querying indexes
- [Recognition Module](recognition.md) - Embedding extraction
- [Thresholds Guide](../concepts/thresholds-calibration.md) - Tuning similarity thresholds

View File

@@ -20,7 +20,8 @@ Facial landmark detection provides precise localization of facial features.
### Basic Usage
```python
from uniface import RetinaFace, Landmark106
from uniface.detection import RetinaFace
from uniface.landmark import Landmark106
detector = RetinaFace()
landmarker = Landmark106()
@@ -78,7 +79,7 @@ mouth = landmarks[87:106]
All detection models provide 5-point landmarks:
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
detector = RetinaFace()
faces = detector.detect(image)
@@ -152,7 +153,7 @@ def draw_landmarks_with_connections(image, landmarks):
### Face Alignment
```python
from uniface import face_alignment
from uniface.face_utils import face_alignment
# Align face using 5-point landmarks
aligned = face_alignment(image, faces[0].landmarks)
@@ -236,7 +237,7 @@ def estimate_head_pose(landmarks, image_shape):
## Factory Function
```python
from uniface import create_landmarker
from uniface.landmark import create_landmarker
landmarker = create_landmarker() # Returns Landmark106
```

View File

@@ -19,7 +19,7 @@ Face parsing segments faces into semantic components or face regions.
```python
import cv2
from uniface.parsing import BiSeNet
from uniface.visualization import vis_parsing_maps
from uniface.draw import vis_parsing_maps
# Initialize parser
parser = BiSeNet()
@@ -85,9 +85,9 @@ parser = BiSeNet(model_name=ParsingWeights.RESNET34)
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.parsing import BiSeNet
from uniface.visualization import vis_parsing_maps
from uniface.draw import vis_parsing_maps
detector = RetinaFace()
parser = BiSeNet()
@@ -177,23 +177,19 @@ def apply_lip_color(image, mask, color=(180, 50, 50)):
"""Apply lip color using parsing mask."""
result = image.copy()
# Get lip mask (upper + lower lip)
lip_mask = ((mask == 13) | (mask == 14)).astype(np.uint8)
# Get lip mask (upper lip=12, lower lip=13)
lip_mask = ((mask == 12) | (mask == 13)).astype(np.uint8)
# Create color overlay
overlay = np.zeros_like(image)
overlay[:] = color
# Blend with original
lip_region = cv2.bitwise_and(overlay, overlay, mask=lip_mask)
non_lip = cv2.bitwise_and(result, result, mask=1 - lip_mask)
# Combine with alpha blending
# Alpha blend lip region
alpha = 0.4
result = cv2.addWeighted(result, 1 - alpha * lip_mask[:,:,np.newaxis] / 255,
lip_region, alpha, 0)
mask_3ch = lip_mask[:, :, np.newaxis]
result = np.where(mask_3ch, (image * (1 - alpha) + overlay * alpha).astype(np.uint8), result)
return result.astype(np.uint8)
return result
```
### Background Replacement
@@ -234,7 +230,7 @@ def get_hair_mask(mask):
## Visualization Options
```python
from uniface.visualization import vis_parsing_maps
from uniface.draw import vis_parsing_maps
# Default visualization
vis_result = vis_parsing_maps(face_rgb, mask)
@@ -257,7 +253,7 @@ XSeg outputs a mask for face regions. Unlike BiSeNet which works on bbox crops,
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.parsing import XSeg
detector = RetinaFace()
@@ -268,7 +264,7 @@ faces = detector.detect(image)
for face in faces:
if face.landmarks is not None:
mask = parser.parse(image, face.landmarks)
mask = parser.parse(image, landmarks=face.landmarks)
print(f"Mask shape: {mask.shape}") # (H, W), values in [0, 1]
```
@@ -296,7 +292,7 @@ parser = XSeg(
```python
# Full pipeline: align -> segment -> warp back to original space
mask = parser.parse(image, landmarks)
mask = parser.parse(image, landmarks=landmarks)
# For pre-aligned face crops
mask = parser.parse_aligned(face_crop)
@@ -318,7 +314,7 @@ mask, face_crop, inverse_matrix = parser.parse_with_inverse(image, landmarks)
## Factory Function
```python
from uniface import create_face_parser
from uniface.parsing import create_face_parser
from uniface.constants import ParsingWeights, XSegWeights
# BiSeNet (default)

View File

@@ -18,25 +18,8 @@ Face anonymization protects privacy by blurring or obscuring faces in images and
## Quick Start
### One-Line Anonymization
```python
from uniface.privacy import anonymize_faces
import cv2
image = cv2.imread("group_photo.jpg")
anonymized = anonymize_faces(image, method='pixelate')
cv2.imwrite("anonymized.jpg", anonymized)
```
---
## BlurFace Class
For more control, use the `BlurFace` class:
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
import cv2
@@ -59,7 +42,7 @@ cv2.imwrite("anonymized.jpg", anonymized)
Blocky pixelation effect (common in news media):
```python
blurrer = BlurFace(method='pixelate', pixel_blocks=10)
blurrer = BlurFace(method='pixelate', pixel_blocks=15)
```
| Parameter | Default | Description |
@@ -137,7 +120,7 @@ result = blurrer.anonymize(image, faces, inplace=True)
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
detector = RetinaFace()
@@ -166,7 +149,7 @@ cv2.destroyAllWindows()
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
detector = RetinaFace()
@@ -238,7 +221,7 @@ def anonymize_low_confidence(image, faces, blurrer, confidence_threshold=0.8):
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
detector = RetinaFace()
@@ -259,13 +242,13 @@ for method in methods:
```bash
# Anonymize image with pixelation
python tools/face_anonymize.py --source photo.jpg
python tools/anonymize.py --source photo.jpg
# Real-time webcam
python tools/face_anonymize.py --source 0 --method gaussian
python tools/anonymize.py --source 0 --method gaussian
# Custom blur strength
python tools/face_anonymize.py --source photo.jpg --method gaussian --blur-strength 5.0
python tools/anonymize.py --source photo.jpg --method gaussian --blur-strength 5.0
```
---

View File

@@ -22,7 +22,8 @@ Face recognition using adaptive margin based on image quality.
### Basic Usage
```python
from uniface import RetinaFace, AdaFace
from uniface.detection import RetinaFace
from uniface.recognition import AdaFace
detector = RetinaFace()
recognizer = AdaFace()
@@ -39,7 +40,7 @@ if faces:
### Model Variants
```python
from uniface import AdaFace
from uniface.recognition import AdaFace
from uniface.constants import AdaFaceWeights
# Lightweight (default)
@@ -69,7 +70,8 @@ Face recognition using additive angular margin loss.
### Basic Usage
```python
from uniface import RetinaFace, ArcFace
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
detector = RetinaFace()
recognizer = ArcFace()
@@ -86,7 +88,7 @@ if faces:
### Model Variants
```python
from uniface import ArcFace
from uniface.recognition import ArcFace
from uniface.constants import ArcFaceWeights
# Lightweight (default)
@@ -118,7 +120,7 @@ Lightweight face recognition models with MobileNet backbones.
### Basic Usage
```python
from uniface import MobileFace
from uniface.recognition import MobileFace
recognizer = MobileFace()
embedding = recognizer.get_normalized_embedding(image, landmarks)
@@ -127,7 +129,7 @@ embedding = recognizer.get_normalized_embedding(image, landmarks)
### Model Variants
```python
from uniface import MobileFace
from uniface.recognition import MobileFace
from uniface.constants import MobileFaceWeights
# Ultra-lightweight
@@ -156,7 +158,7 @@ Face recognition using angular softmax loss (A-Softmax).
### Basic Usage
```python
from uniface import SphereFace
from uniface.recognition import SphereFace
from uniface.constants import SphereFaceWeights
recognizer = SphereFace(model_name=SphereFaceWeights.SPHERE20)
@@ -175,7 +177,7 @@ embedding = recognizer.get_normalized_embedding(image, landmarks)
### Compute Similarity
```python
from uniface import compute_similarity
from uniface.face_utils import compute_similarity
import numpy as np
# Extract embeddings
@@ -211,7 +213,7 @@ Recognition models require aligned faces. UniFace handles this internally:
embedding = recognizer.get_normalized_embedding(image, landmarks)
# Or manually align
from uniface import face_alignment
from uniface.face_utils import face_alignment
aligned_face = face_alignment(image, landmarks)
# Returns: 112x112 aligned face image
@@ -223,7 +225,8 @@ aligned_face = face_alignment(image, landmarks)
```python
import numpy as np
from uniface import RetinaFace, ArcFace
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
detector = RetinaFace()
recognizer = ArcFace()
@@ -282,7 +285,7 @@ else:
## Factory Function
```python
from uniface import create_recognizer
from uniface.recognition import create_recognizer
# Available methods: 'arcface', 'adaface', 'mobileface', 'sphereface'
recognizer = create_recognizer('arcface')

View File

@@ -17,7 +17,7 @@ Face anti-spoofing detects whether a face is real (live) or fake (photo, video r
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.spoofing import MiniFASNet
detector = RetinaFace()
@@ -128,7 +128,7 @@ cv2.imwrite("spoofing_result.jpg", image)
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.spoofing import MiniFASNet
detector = RetinaFace()
@@ -253,7 +253,7 @@ python tools/spoofing.py --source 0
## Factory Function
```python
from uniface import create_spoofer
from uniface.spoofing import create_spoofer
spoofer = create_spoofer() # Returns MiniFASNet
```

263
docs/modules/tracking.md Normal file
View File

@@ -0,0 +1,263 @@
# Tracking
Multi-object tracking using [BYTETracker](https://github.com/yakhyo/bytetrack-tracker) with Kalman filtering and IoU-based association. The tracker assigns persistent IDs to detected objects across video frames using a two-stage association strategy — first matching high-confidence detections, then low-confidence ones.
---
## How It Works
BYTETracker takes detection bounding boxes as input and returns tracked bounding boxes with persistent IDs. It does not depend on any specific detector — any source of `[x1, y1, x2, y2, score]` arrays will work.
Each frame, the tracker:
1. Splits detections into high-confidence and low-confidence groups
2. Matches high-confidence detections to existing tracks using IoU
3. Matches remaining tracks to low-confidence detections (second chance)
4. Starts new tracks for unmatched high-confidence detections
5. Removes tracks that have been lost for too long
The Kalman filter predicts where each track will be in the next frame, which helps maintain associations even when detections are noisy.
---
## Basic Usage
```python
import cv2
import numpy as np
from uniface.common import xyxy_to_cxcywh
from uniface.detection import SCRFD
from uniface.tracking import BYTETracker
from uniface.draw import draw_tracks
detector = SCRFD()
tracker = BYTETracker(track_thresh=0.5, track_buffer=30)
cap = cv2.VideoCapture("video.mp4")
while cap.isOpened():
ret, frame = cap.read()
if not ret:
break
# 1. Detect faces
faces = detector.detect(frame)
# 2. Build detections array: [x1, y1, x2, y2, score]
dets = np.array([[*f.bbox, f.confidence] for f in faces])
dets = dets if len(dets) > 0 else np.empty((0, 5))
# 3. Update tracker
tracks = tracker.update(dets)
# 4. Map track IDs back to face objects
if len(tracks) > 0 and len(faces) > 0:
face_bboxes = np.array([f.bbox for f in faces], dtype=np.float32)
track_ids = tracks[:, 4].astype(int)
face_centers = xyxy_to_cxcywh(face_bboxes)[:, :2]
track_centers = xyxy_to_cxcywh(tracks[:, :4])[:, :2]
for ti in range(len(tracks)):
dists = (track_centers[ti, 0] - face_centers[:, 0]) ** 2 + (track_centers[ti, 1] - face_centers[:, 1]) ** 2
faces[int(np.argmin(dists))].track_id = track_ids[ti]
# 5. Draw
tracked_faces = [f for f in faces if f.track_id is not None]
draw_tracks(image=frame, faces=tracked_faces)
cv2.imshow("Tracking", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
```
Each track ID gets a deterministic color via golden-ratio hue stepping, so the same person keeps the same color across the entire video.
---
## Webcam Tracking
```python
import cv2
import numpy as np
from uniface.common import xyxy_to_cxcywh
from uniface.detection import SCRFD
from uniface.tracking import BYTETracker
from uniface.draw import draw_tracks
detector = SCRFD()
tracker = BYTETracker(track_thresh=0.5, track_buffer=30)
cap = cv2.VideoCapture(0)
while True:
ret, frame = cap.read()
if not ret:
break
faces = detector.detect(frame)
dets = np.array([[*f.bbox, f.confidence] for f in faces])
dets = dets if len(dets) > 0 else np.empty((0, 5))
tracks = tracker.update(dets)
if len(tracks) > 0 and len(faces) > 0:
face_bboxes = np.array([f.bbox for f in faces], dtype=np.float32)
track_ids = tracks[:, 4].astype(int)
face_centers = xyxy_to_cxcywh(face_bboxes)[:, :2]
track_centers = xyxy_to_cxcywh(tracks[:, :4])[:, :2]
for ti in range(len(tracks)):
dists = (track_centers[ti, 0] - face_centers[:, 0]) ** 2 + (track_centers[ti, 1] - face_centers[:, 1]) ** 2
faces[int(np.argmin(dists))].track_id = track_ids[ti]
draw_tracks(image=frame, faces=[f for f in faces if f.track_id is not None])
cv2.imshow("Face Tracking - Press 'q' to quit", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
```
---
## Parameters
```python
from uniface.tracking import BYTETracker
tracker = BYTETracker(
track_thresh=0.5,
track_buffer=30,
match_thresh=0.8,
low_thresh=0.1,
)
```
| Parameter | Default | Description |
|-----------|---------|-------------|
| `track_thresh` | 0.5 | Detections above this score go through first-pass association |
| `track_buffer` | 30 | How many frames to keep a lost track before removing it |
| `match_thresh` | 0.8 | IoU threshold for matching tracks to detections |
| `low_thresh` | 0.1 | Detections below this score are discarded entirely |
---
## Input / Output
**Input**`(N, 5)` numpy array with `[x1, y1, x2, y2, confidence]` per detection:
```python
detections = np.array([
[100, 50, 200, 160, 0.95],
[300, 80, 380, 200, 0.87],
])
```
**Output**`(M, 5)` numpy array with `[x1, y1, x2, y2, track_id]` per active track:
```python
tracks = tracker.update(detections)
# array([[101.2, 51.3, 199.8, 159.8, 1.],
# [300.5, 80.2, 379.7, 200.1, 2.]])
```
The output bounding boxes come from the Kalman filter prediction, so they may differ slightly from the input. Track IDs are integers that persist across frames for the same object.
---
## Resetting the Tracker
When switching to a different video or scene, reset the tracker to clear all internal state:
```python
tracker.reset()
```
This clears all active, lost, and removed tracks, resets the frame counter, and resets the ID counter back to zero.
---
## Visualization
`draw_tracks` draws bounding boxes color-coded by track ID:
```python
from uniface.draw import draw_tracks
draw_tracks(
image=frame,
faces=tracked_faces,
draw_landmarks=True,
draw_id=True,
corner_bbox=True,
)
```
---
## Small Face Performance
!!! warning "Tracking performance with small faces"
The tracker relies on IoU (Intersection over Union) to match detections across
frames. When faces occupy a small portion of the image — for example in
surveillance footage or wide-angle cameras — even slight movement between frames
can cause a large drop in IoU. This makes it harder for the tracker to maintain
consistent IDs, and you may see IDs switching or resetting more often than expected.
This is not specific to BYTETracker; it applies to any IoU-based tracker. A few
things that can help:
- **Lower `match_thresh`** (e.g. `0.5` or `0.6`) so the tracker accepts lower
overlap as a valid match.
- **Increase `track_buffer`** (e.g. `60` or higher) to hold onto lost tracks
longer before discarding them.
- **Use a higher-resolution input** if possible, so face bounding boxes are
larger in pixel terms.
```python
tracker = BYTETracker(
track_thresh=0.4,
track_buffer=60,
match_thresh=0.6,
)
```
---
## CLI Tool
```bash
# Track faces in a video
python tools/track.py --source video.mp4
# Webcam
python tools/track.py --source 0
# Save output
python tools/track.py --source video.mp4 --output tracked.mp4
# Use RetinaFace instead of SCRFD
python tools/track.py --source video.mp4 --detector retinaface
# Keep lost tracks longer
python tools/track.py --source video.mp4 --track-buffer 60
```
---
## References
- [yakhyo/bytetrack-tracker](https://github.com/yakhyo/bytetrack-tracker) — standalone BYTETracker implementation used in UniFace
- [ByteTrack paper](https://arxiv.org/abs/2110.06864) — Zhang et al., "ByteTrack: Multi-Object Tracking by Associating Every Detection Box"
---
## See Also
- [Detection](detection.md) — face detection models
- [Video & Webcam](../recipes/video-webcam.md) — video processing patterns
- [Inputs & Outputs](../concepts/inputs-outputs.md) — data types and formats

View File

@@ -17,6 +17,7 @@ Run UniFace examples directly in your browser with Google Colab, or download and
| [Face Anonymization](https://github.com/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | Privacy-preserving blur |
| [Gaze Estimation](https://github.com/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | Gaze direction estimation |
| [Face Segmentation](https://github.com/yakhyo/uniface/blob/main/examples/09_face_segmentation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/09_face_segmentation.ipynb) | Face segmentation with XSeg |
| [Face Vector Store](https://github.com/yakhyo/uniface/blob/main/examples/10_face_vector_store.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/10_face_vector_store.ipynb) | FAISS-backed face database |
---

7
docs/overrides/main.html Normal file
View File

@@ -0,0 +1,7 @@
{% extends "base.html" %}
{% block announce %}
<a href="https://github.com/yakhyo/uniface" target="_blank" rel="noopener">
Support our work &mdash; give UniFace a <span class="twemoji">{% include ".icons/octicons/star-fill-16.svg" %}</span> on <strong>GitHub</strong> and help us reach more developers!
</a>
{% endblock %}

View File

@@ -10,7 +10,7 @@ Detect faces in an image:
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
# Load image
image = cv2.imread("photo.jpg")
@@ -46,8 +46,8 @@ Draw bounding boxes and landmarks:
```python
import cv2
from uniface import RetinaFace
from uniface.visualization import draw_detections
from uniface.detection import RetinaFace
from uniface.draw import draw_detections
# Detect faces
detector = RetinaFace()
@@ -81,7 +81,8 @@ Compare two faces:
```python
import cv2
import numpy as np
from uniface import RetinaFace, ArcFace
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
# Initialize models
detector = RetinaFace()
@@ -121,7 +122,8 @@ if faces1 and faces2:
```python
import cv2
from uniface import RetinaFace, AgeGender
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
# Initialize models
detector = RetinaFace()
@@ -152,7 +154,8 @@ Detect race, gender, and age group:
```python
import cv2
from uniface import RetinaFace, FairFace
from uniface.attribute import FairFace
from uniface.detection import RetinaFace
detector = RetinaFace()
fairface = FairFace()
@@ -178,7 +181,8 @@ Face 2: Female, 20-29, White
```python
import cv2
from uniface import RetinaFace, Landmark106
from uniface.detection import RetinaFace
from uniface.landmark import Landmark106
detector = RetinaFace()
landmarker = Landmark106()
@@ -204,8 +208,9 @@ if faces:
```python
import cv2
import numpy as np
from uniface import RetinaFace, MobileGaze
from uniface.visualization import draw_gaze
from uniface.detection import RetinaFace
from uniface.gaze import MobileGaze
from uniface.draw import draw_gaze
detector = RetinaFace()
gaze_estimator = MobileGaze()
@@ -237,7 +242,7 @@ Segment face into semantic components:
import cv2
import numpy as np
from uniface.parsing import BiSeNet
from uniface.visualization import vis_parsing_maps
from uniface.draw import vis_parsing_maps
parser = BiSeNet()
@@ -261,26 +266,24 @@ print(f"Detected {len(np.unique(mask))} facial components")
Blur faces for privacy protection:
```python
from uniface.privacy import anonymize_faces
import cv2
# One-liner: automatic detection and blurring
image = cv2.imread("group_photo.jpg")
anonymized = anonymize_faces(image, method='pixelate')
cv2.imwrite("anonymized.jpg", anonymized)
```
**Manual control:**
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
detector = RetinaFace()
blurrer = BlurFace(method='gaussian', blur_strength=5.0)
blurrer = BlurFace(method='pixelate')
image = cv2.imread("group_photo.jpg")
faces = detector.detect(image)
anonymized = blurrer.anonymize(image, faces)
cv2.imwrite("anonymized.jpg", anonymized)
```
**Custom blur settings:**
```python
blurrer = BlurFace(method='gaussian', blur_strength=5.0)
anonymized = blurrer.anonymize(image, faces)
```
**Available methods:**
@@ -301,7 +304,7 @@ Detect real vs. fake faces:
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.spoofing import MiniFASNet
detector = RetinaFace()
@@ -324,8 +327,8 @@ Real-time face detection:
```python
import cv2
from uniface import RetinaFace
from uniface.visualization import draw_detections
from uniface.detection import RetinaFace
from uniface.draw import draw_detections
detector = RetinaFace()
cap = cv2.VideoCapture(0)
@@ -355,6 +358,60 @@ cv2.destroyAllWindows()
---
## Face Tracking
Track faces across video frames with persistent IDs:
```python
import cv2
import numpy as np
from uniface.common import xyxy_to_cxcywh
from uniface.detection import SCRFD
from uniface.tracking import BYTETracker
from uniface.draw import draw_tracks
detector = SCRFD()
tracker = BYTETracker(track_thresh=0.5, track_buffer=30)
cap = cv2.VideoCapture("video.mp4")
while cap.isOpened():
ret, frame = cap.read()
if not ret:
break
faces = detector.detect(frame)
dets = np.array([[*f.bbox, f.confidence] for f in faces])
dets = dets if len(dets) > 0 else np.empty((0, 5))
tracks = tracker.update(dets)
# Assign track IDs to faces
if len(tracks) > 0 and len(faces) > 0:
face_bboxes = np.array([f.bbox for f in faces], dtype=np.float32)
track_ids = tracks[:, 4].astype(int)
face_centers = xyxy_to_cxcywh(face_bboxes)[:, :2]
track_centers = xyxy_to_cxcywh(tracks[:, :4])[:, :2]
for ti in range(len(tracks)):
dists = (track_centers[ti, 0] - face_centers[:, 0]) ** 2 + (track_centers[ti, 1] - face_centers[:, 1]) ** 2
faces[int(np.argmin(dists))].track_id = track_ids[ti]
tracked_faces = [f for f in faces if f.track_id is not None]
draw_tracks(image=frame, faces=tracked_faces)
cv2.imshow("Tracking", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
```
For more details, see the [Tracking module](modules/tracking.md).
---
## Model Selection
For detailed model comparisons and benchmarks, see the [Model Zoo](models.md).
@@ -365,6 +422,7 @@ For detailed model comparisons and benchmarks, see the [Model Zoo](models.md).
|------|------------------|
| Detection | `RetinaFace`, `SCRFD`, `YOLOv5Face`, `YOLOv8Face` |
| Recognition | `ArcFace`, `AdaFace`, `MobileFace`, `SphereFace` |
| Tracking | `BYTETracker` |
| Gaze | `MobileGaze` (ResNet18/34/50, MobileNetV2, MobileOneS0) |
| Parsing | `BiSeNet` (ResNet18/34) |
| Attributes | `AgeGender`, `FairFace`, `Emotion` |
@@ -407,13 +465,18 @@ python -c "import platform; print(platform.machine())"
### Import Errors
```python
# Correct imports
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.detection import RetinaFace, SCRFD
from uniface.recognition import ArcFace, AdaFace
from uniface.attribute import AgeGender, FairFace
from uniface.landmark import Landmark106
# Also works (re-exported at package level)
from uniface import RetinaFace, ArcFace, Landmark106
from uniface.gaze import MobileGaze
from uniface.parsing import BiSeNet, XSeg
from uniface.privacy import BlurFace
from uniface.spoofing import MiniFASNet
from uniface.tracking import BYTETracker
from uniface.analyzer import FaceAnalyzer
from uniface.indexing import FAISS # pip install faiss-cpu
from uniface.draw import draw_detections, draw_tracks
```
---

View File

@@ -11,7 +11,7 @@ Blur faces in real-time video streams for privacy protection.
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
detector = RetinaFace()
@@ -40,7 +40,7 @@ cv2.destroyAllWindows()
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
detector = RetinaFace()
@@ -67,14 +67,19 @@ out.release()
---
## One-Liner for Images
## Single Image
```python
from uniface.privacy import anonymize_faces
import cv2
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
detector = RetinaFace()
blurrer = BlurFace(method='pixelate')
image = cv2.imread("photo.jpg")
result = anonymize_faces(image, method='pixelate')
faces = detector.detect(image)
result = blurrer.anonymize(image, faces)
cv2.imwrite("anonymized.jpg", result)
```
@@ -84,7 +89,7 @@ cv2.imwrite("anonymized.jpg", result)
| Method | Usage |
|--------|-------|
| Pixelate | `BlurFace(method='pixelate', pixel_blocks=10)` |
| Pixelate | `BlurFace(method='pixelate', pixel_blocks=15)` |
| Gaussian | `BlurFace(method='gaussian', blur_strength=3.0)` |
| Blackout | `BlurFace(method='blackout', color=(0,0,0))` |
| Elliptical | `BlurFace(method='elliptical', margin=20)` |

View File

@@ -12,7 +12,7 @@ Process multiple images efficiently.
```python
import cv2
from pathlib import Path
from uniface import RetinaFace
from uniface.detection import RetinaFace
detector = RetinaFace()
@@ -54,7 +54,8 @@ for image_path in tqdm(image_files, desc="Processing"):
## Extract Embeddings
```python
from uniface import RetinaFace, ArcFace
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
import numpy as np
detector = RetinaFace()

View File

@@ -1,178 +1,166 @@
# Face Search
Build a face search system for finding people in images.
Find and identify people in images and video streams.
!!! note "Work in Progress"
This page contains example code patterns. Test thoroughly before using in production.
UniFace supports two search approaches:
| Approach | Use case | Tool |
| -------------------- | ------------------------------------------------ | ----------------------- |
| **Reference search** | "Is this specific person in the video?" | `tools/search.py` |
| **Vector search** | "Who is this?" against a database of known faces | `tools/faiss_search.py` |
---
## Basic Face Database
## Reference Search (single image)
Compare every detected face against a single reference photo:
```python
import cv2
import numpy as np
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.face_utils import compute_similarity
detector = RetinaFace()
recognizer = ArcFace()
ref_image = cv2.imread("reference.jpg")
ref_faces = detector.detect(ref_image)
ref_embedding = recognizer.get_normalized_embedding(ref_image, ref_faces[0].landmarks)
query_image = cv2.imread("group_photo.jpg")
faces = detector.detect(query_image)
for face in faces:
embedding = recognizer.get_normalized_embedding(query_image, face.landmarks)
sim = compute_similarity(ref_embedding, embedding)
label = f"Match ({sim:.2f})" if sim > 0.4 else f"Unknown ({sim:.2f})"
print(label)
```
**CLI tool:**
```bash
python tools/search.py --reference ref.jpg --source video.mp4
python tools/search.py --reference ref.jpg --source 0 # webcam
```
---
## Vector Search (FAISS index)
For identifying faces against a database of many known people, use the
[`FAISS`](../modules/indexing.md) vector store.
!!! info "Install extra"
`bash
pip install faiss-cpu
`
### Build an index
Organise face images in person sub-folders:
```
dataset/
├── alice/
│ ├── 001.jpg
│ └── 002.jpg
├── bob/
│ └── 001.jpg
└── charlie/
├── 001.jpg
└── 002.jpg
```
```python
import cv2
from pathlib import Path
from uniface import RetinaFace, ArcFace
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.indexing import FAISS
class FaceDatabase:
def __init__(self):
self.detector = RetinaFace()
self.recognizer = ArcFace()
self.embeddings = {}
detector = RetinaFace()
recognizer = ArcFace()
store = FAISS(db_path="./my_index")
def add_face(self, person_id, image):
"""Add a face to the database."""
faces = self.detector.detect(image)
if not faces:
raise ValueError(f"No face found for {person_id}")
for person_dir in sorted(Path("dataset").iterdir()):
if not person_dir.is_dir():
continue
for img_path in person_dir.glob("*.jpg"):
image = cv2.imread(str(img_path))
faces = detector.detect(image)
if faces:
emb = recognizer.get_normalized_embedding(image, faces[0].landmarks)
store.add(emb, {"person_id": person_dir.name, "source": str(img_path)})
face = max(faces, key=lambda f: f.confidence)
embedding = self.recognizer.get_normalized_embedding(image, face.landmarks)
self.embeddings[person_id] = embedding
return True
def search(self, image, threshold=0.6):
"""Search for faces in an image."""
faces = self.detector.detect(image)
results = []
for face in faces:
embedding = self.recognizer.get_normalized_embedding(image, face.landmarks)
best_match = None
best_similarity = -1
for person_id, db_embedding in self.embeddings.items():
similarity = np.dot(embedding, db_embedding.T)[0][0]
if similarity > best_similarity:
best_similarity = similarity
best_match = person_id
results.append({
'bbox': face.bbox,
'match': best_match if best_similarity >= threshold else None,
'similarity': best_similarity
})
return results
def save(self, path):
"""Save database to file."""
np.savez(path, embeddings=dict(self.embeddings))
def load(self, path):
"""Load database from file."""
data = np.load(path, allow_pickle=True)
self.embeddings = data['embeddings'].item()
# Usage
db = FaceDatabase()
# Add faces
for image_path in Path("known_faces/").glob("*.jpg"):
person_id = image_path.stem
image = cv2.imread(str(image_path))
try:
db.add_face(person_id, image)
print(f"Added: {person_id}")
except ValueError as e:
print(f"Skipped: {e}")
# Save database
db.save("face_database.npz")
# Search
query_image = cv2.imread("group_photo.jpg")
results = db.search(query_image)
for r in results:
if r['match']:
print(f"Found: {r['match']} (similarity: {r['similarity']:.3f})")
store.save()
print(f"Index saved: {store}")
```
---
**CLI tool:**
## Visualization
```bash
python tools/faiss_search.py build --faces-dir dataset/ --db-path ./my_index
```
### Search against the index
```python
import cv2
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.indexing import FAISS
def visualize_search_results(image, results):
"""Draw search results on image."""
for r in results:
x1, y1, x2, y2 = map(int, r['bbox'])
detector = RetinaFace()
recognizer = ArcFace()
if r['match']:
color = (0, 255, 0) # Green for match
label = f"{r['match']} ({r['similarity']:.2f})"
else:
color = (0, 0, 255) # Red for unknown
label = f"Unknown ({r['similarity']:.2f})"
store = FAISS(db_path="./my_index")
store.load()
cv2.rectangle(image, (x1, y1), (x2, y2), color, 2)
cv2.putText(image, label, (x1, y1 - 10),
cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
image = cv2.imread("query.jpg")
faces = detector.detect(image)
return image
for face in faces:
embedding = recognizer.get_normalized_embedding(image, face.landmarks)
result, similarity = store.search(embedding, threshold=0.4)
# Usage
results = db.search(image)
annotated = visualize_search_results(image.copy(), results)
cv2.imwrite("search_result.jpg", annotated)
if result:
print(f"Matched: {result['person_id']} ({similarity:.2f})")
else:
print(f"Unknown ({similarity:.2f})")
```
---
**CLI tool:**
## Real-Time Search
```bash
python tools/faiss_search.py run --db-path ./my_index --source video.mp4
python tools/faiss_search.py run --db-path ./my_index --source 0 # webcam
```
### Manage the index
```python
import cv2
from uniface.indexing import FAISS
def realtime_search(db):
"""Real-time face search from webcam."""
cap = cv2.VideoCapture(0)
store = FAISS(db_path="./my_index")
store.load()
while True:
ret, frame = cap.read()
if not ret:
break
print(f"Total vectors: {len(store)}")
results = db.search(frame, threshold=0.5)
removed = store.remove("person_id", "bob")
print(f"Removed {removed} entries")
for r in results:
x1, y1, x2, y2 = map(int, r['bbox'])
if r['match']:
color = (0, 255, 0)
label = r['match']
else:
color = (0, 0, 255)
label = "Unknown"
cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)
cv2.putText(frame, label, (x1, y1 - 10),
cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
cv2.imshow("Face Search", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
# Usage
db = FaceDatabase()
db.load("face_database.npz")
realtime_search(db)
store.save()
```
---
## See Also
- [Indexing Module](../modules/indexing.md) - Full `FAISS` API reference
- [Recognition Module](../modules/recognition.md) - Face recognition details
- [Batch Processing](batch-processing.md) - Process multiple files
- [Video & Webcam](video-webcam.md) - Real-time processing
- [Concepts: Thresholds](../concepts/thresholds-calibration.md) - Tuning similarity thresholds

View File

@@ -8,8 +8,10 @@ A complete pipeline for processing images with detection, recognition, and attri
```python
import cv2
from uniface import RetinaFace, ArcFace, AgeGender
from uniface.visualization import draw_detections
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.draw import draw_detections
# Initialize models
detector = RetinaFace()
@@ -67,7 +69,10 @@ cv2.imwrite("result.jpg", result_image)
For convenience, use the built-in `FaceAnalyzer`:
```python
from uniface import FaceAnalyzer, RetinaFace, ArcFace, AgeGender
from uniface.analyzer import FaceAnalyzer
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
import cv2
# Initialize with desired modules
@@ -101,13 +106,14 @@ Complete pipeline with all modules:
```python
import cv2
import numpy as np
from uniface import (
RetinaFace, ArcFace, AgeGender, FairFace,
Landmark106, MobileGaze
)
from uniface.attribute import AgeGender, FairFace
from uniface.detection import RetinaFace
from uniface.gaze import MobileGaze
from uniface.landmark import Landmark106
from uniface.recognition import ArcFace
from uniface.parsing import BiSeNet
from uniface.spoofing import MiniFASNet
from uniface.visualization import draw_detections, draw_gaze
from uniface.draw import draw_detections, draw_gaze
class FaceAnalysisPipeline:
def __init__(self):
@@ -193,8 +199,10 @@ for i, r in enumerate(results):
```python
import cv2
import numpy as np
from uniface import RetinaFace, AgeGender, MobileGaze
from uniface.visualization import draw_detections, draw_gaze
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
from uniface.gaze import MobileGaze
from uniface.draw import draw_detections, draw_gaze
def visualize_analysis(image_path, output_path):
"""Create annotated visualization of face analysis."""

View File

@@ -11,8 +11,8 @@ Real-time face analysis for video streams.
```python
import cv2
from uniface import RetinaFace
from uniface.visualization import draw_detections
from uniface.detection import RetinaFace
from uniface.draw import draw_detections
detector = RetinaFace()
cap = cv2.VideoCapture(0)
@@ -48,7 +48,7 @@ cv2.destroyAllWindows()
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
def process_video(input_path, output_path):
"""Process a video file."""
@@ -83,6 +83,57 @@ process_video("input.mp4", "output.mp4")
---
## Webcam Tracking
To track faces across frames with persistent IDs, pair a detector with `BYTETracker`:
```python
import cv2
import numpy as np
from uniface.common import xyxy_to_cxcywh
from uniface.detection import SCRFD
from uniface.tracking import BYTETracker
from uniface.draw import draw_tracks
detector = SCRFD()
tracker = BYTETracker(track_thresh=0.5, track_buffer=30)
cap = cv2.VideoCapture(0)
while True:
ret, frame = cap.read()
if not ret:
break
faces = detector.detect(frame)
dets = np.array([[*f.bbox, f.confidence] for f in faces])
dets = dets if len(dets) > 0 else np.empty((0, 5))
tracks = tracker.update(dets)
if len(tracks) > 0 and len(faces) > 0:
face_bboxes = np.array([f.bbox for f in faces], dtype=np.float32)
track_ids = tracks[:, 4].astype(int)
face_centers = xyxy_to_cxcywh(face_bboxes)[:, :2]
track_centers = xyxy_to_cxcywh(tracks[:, :4])[:, :2]
for ti in range(len(tracks)):
dists = (track_centers[ti, 0] - face_centers[:, 0]) ** 2 + (track_centers[ti, 1] - face_centers[:, 1]) ** 2
faces[int(np.argmin(dists))].track_id = track_ids[ti]
draw_tracks(image=frame, faces=[f for f in faces if f.track_id is not None])
cv2.imshow("Face Tracking", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
```
For more details on tracker parameters and tuning, see [Tracking](../modules/tracking.md).
---
## Performance Tips
### Skip Frames
@@ -119,7 +170,8 @@ while True:
## See Also
- [Tracking Module](../modules/tracking.md) - Face tracking with BYTETracker
- [Anonymize Stream](anonymize-stream.md) - Privacy protection in video
- [Batch Processing](batch-processing.md) - Process multiple files
- [Detection Module](../modules/detection.md) - Detection options
- [Gaze Module](../modules/gaze.md) - Gaze tracking
- [Gaze Module](../modules/gaze.md) - Gaze estimation

View File

@@ -51,7 +51,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"2.0.0\n"
"3.0.0\n"
]
}
],
@@ -62,7 +62,7 @@
"\n",
"import uniface\n",
"from uniface.detection import RetinaFace\n",
"from uniface.visualization import draw_detections\n",
"from uniface.draw import draw_detections\n",
"\n",
"print(uniface.__version__)"
]
@@ -162,7 +162,7 @@
"landmarks = [f.landmarks for f in faces]\n",
"\n",
"# Draw detections\n",
"draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.6, fancy_bbox=True)\n",
"draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.6, corner_bbox=True)\n",
"\n",
"# Display result\n",
"output_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n",
@@ -214,7 +214,7 @@
"scores = [f.confidence for f in faces]\n",
"landmarks = [f.landmarks for f in faces]\n",
"\n",
"draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.6, fancy_bbox=True)\n",
"draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.6, corner_bbox=True)\n",
"\n",
"output_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n",
"display.display(Image.fromarray(output_image))"
@@ -261,7 +261,7 @@
"scores = [f.confidence for f in faces]\n",
"landmarks = [f.landmarks for f in faces]\n",
"\n",
"draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.6, fancy_bbox=True)\n",
"draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.6, corner_bbox=True)\n",
"\n",
"output_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n",
"display.display(Image.fromarray(output_image))"

View File

@@ -55,7 +55,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"2.0.0\n"
"3.0.0\n"
]
}
],
@@ -67,7 +67,7 @@
"import uniface\n",
"from uniface.detection import RetinaFace\n",
"from uniface.face_utils import face_alignment\n",
"from uniface.visualization import draw_detections\n",
"from uniface.draw import draw_detections\n",
"\n",
"print(uniface.__version__)"
]
@@ -142,7 +142,7 @@
" bboxes = [f.bbox for f in faces]\n",
" scores = [f.confidence for f in faces]\n",
" landmarks = [f.landmarks for f in faces]\n",
" draw_detections(image=bbox_image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.6, fancy_bbox=True)\n",
" draw_detections(image=bbox_image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.6, corner_bbox=True)\n",
"\n",
" # Align first detected face (returns aligned image and inverse transform matrix)\n",
" first_landmarks = faces[0].landmarks\n",

View File

@@ -44,7 +44,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"2.0.0\n"
"3.0.0\n"
]
}
],
@@ -53,7 +53,7 @@
"import matplotlib.pyplot as plt\n",
"\n",
"import uniface\n",
"from uniface import FaceAnalyzer\n",
"from uniface.analyzer import FaceAnalyzer\n",
"from uniface.detection import RetinaFace\n",
"from uniface.recognition import ArcFace\n",
"\n",

View File

@@ -11,7 +11,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 1,
"metadata": {},
"outputs": [
{
@@ -49,7 +49,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"2.0.0\n"
"3.0.0\n"
]
}
],
@@ -58,7 +58,7 @@
"import matplotlib.pyplot as plt\n",
"\n",
"import uniface\n",
"from uniface import FaceAnalyzer\n",
"from uniface.analyzer import FaceAnalyzer\n",
"from uniface.detection import RetinaFace\n",
"from uniface.recognition import ArcFace\n",
"\n",
@@ -69,16 +69,7 @@
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"✓ Model loaded (CoreML (Apple Silicon))\n",
"✓ Model loaded (CoreML (Apple Silicon))\n"
]
}
],
"outputs": [],
"source": [
"analyzer = FaceAnalyzer(\n",
" detector=RetinaFace(confidence_threshold=0.5),\n",

View File

@@ -51,7 +51,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"2.0.0\n"
"3.0.0\n"
]
}
],
@@ -60,11 +60,11 @@
"import matplotlib.pyplot as plt\n",
"\n",
"import uniface\n",
"from uniface import FaceAnalyzer\n",
"from uniface.analyzer import FaceAnalyzer\n",
"from uniface.detection import RetinaFace\n",
"from uniface.recognition import ArcFace\n",
"from uniface.attribute import AgeGender\n",
"from uniface.visualization import draw_detections\n",
"from uniface.draw import draw_detections\n",
"\n",
"print(uniface.__version__)"
]
@@ -148,7 +148,7 @@
" bboxes = [f.bbox for f in faces]\n",
" scores = [f.confidence for f in faces]\n",
" landmarks = [f.landmarks for f in faces]\n",
" draw_detections(image=vis_image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.5, fancy_bbox=True)\n",
" draw_detections(image=vis_image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.5, corner_bbox=True)\n",
"\n",
" results.append((image_path, cv2.cvtColor(vis_image, cv2.COLOR_BGR2RGB), faces))"
]

View File

@@ -15,7 +15,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 1,
"metadata": {},
"outputs": [
{
@@ -53,7 +53,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"UniFace version: 2.0.0\n"
"UniFace version: 3.0.0\n"
]
}
],
@@ -66,7 +66,7 @@
"import uniface\n",
"from uniface.parsing import BiSeNet\n",
"from uniface.constants import ParsingWeights\n",
"from uniface.visualization import vis_parsing_maps\n",
"from uniface.draw import vis_parsing_maps\n",
"\n",
"print(f\"UniFace version: {uniface.__version__}\")"
]
@@ -82,15 +82,7 @@
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"✓ Model loaded (CoreML (Apple Silicon))\n"
]
}
],
"outputs": [],
"source": [
"# Initialize face parser (uses ResNet18 by default)\n",
"parser = BiSeNet(model_name=ParsingWeights.RESNET34) # use resnet34 for better accuracy"

File diff suppressed because one or more lines are too long

View File

@@ -51,7 +51,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"UniFace version: 2.0.0\n"
"UniFace version: 3.0.0\n"
]
}
],
@@ -65,7 +65,7 @@
"import uniface\n",
"from uniface.detection import RetinaFace\n",
"from uniface.gaze import MobileGaze\n",
"from uniface.visualization import draw_gaze\n",
"from uniface.draw import draw_gaze\n",
"\n",
"print(f\"UniFace version: {uniface.__version__}\")"
]
@@ -110,19 +110,19 @@
"text": [
"Processing: image0.jpg\n",
" Detected 1 face(s)\n",
" Face 1: pitch=-0.0°, yaw=7.1°\n",
" Face 1: pitch=7.1°, yaw=-0.0°\n",
"Processing: image1.jpg\n",
" Detected 1 face(s)\n",
" Face 1: pitch=-3.3°, yaw=-5.6°\n",
" Face 1: pitch=-5.6°, yaw=-3.3°\n",
"Processing: image2.jpg\n",
" Detected 1 face(s)\n",
" Face 1: pitch=-3.9°, yaw=-0.3°\n",
" Face 1: pitch=-0.3°, yaw=-3.9°\n",
"Processing: image3.jpg\n",
" Detected 1 face(s)\n",
" Face 1: pitch=-22.1°, yaw=1.0°\n",
" Face 1: pitch=1.0°, yaw=-22.1°\n",
"Processing: image4.jpg\n",
" Detected 1 face(s)\n",
" Face 1: pitch=2.1°, yaw=5.0°\n",
" Face 1: pitch=5.0°, yaw=2.1°\n",
"\n",
"Processed 5 images\n"
]

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -48,6 +48,7 @@ theme:
- content.action.edit
- content.action.view
- content.tabs.link
- announce.dismiss
- toc.follow
icon:
@@ -134,6 +135,7 @@ nav:
- Quickstart: quickstart.md
- Notebooks: notebooks.md
- Model Zoo: models.md
- Datasets: datasets.md
- Tutorials:
- Image Pipeline: recipes/image-pipeline.md
- Video & Webcam: recipes/video-webcam.md
@@ -144,12 +146,14 @@ nav:
- API Reference:
- Detection: modules/detection.md
- Recognition: modules/recognition.md
- Tracking: modules/tracking.md
- Landmarks: modules/landmarks.md
- Attributes: modules/attributes.md
- Parsing: modules/parsing.md
- Gaze: modules/gaze.md
- Anti-Spoofing: modules/spoofing.md
- Privacy: modules/privacy.md
- Indexing: modules/indexing.md
- Guides:
- Overview: concepts/overview.md
- Inputs & Outputs: concepts/inputs-outputs.md

View File

@@ -1,7 +1,7 @@
[project]
name = "uniface"
version = "2.3.0"
description = "UniFace: A Comprehensive Library for Face Detection, Recognition, Landmark Analysis, Face Parsing, Gaze Estimation, Age, and Gender Detection"
version = "3.1.0"
description = "UniFace: A Comprehensive Library for Face Detection, Recognition, Tracking, Landmark Analysis, Face Parsing, Gaze Estimation, Age, and Gender Detection"
readme = "README.md"
license = "MIT"
authors = [{ name = "Yakhyokhuja Valikhujaev", email = "yakhyo9696@gmail.com" }]
@@ -13,6 +13,7 @@ requires-python = ">=3.10,<3.14"
keywords = [
"face-detection",
"face-recognition",
"face-tracking",
"facial-landmarks",
"face-parsing",
"face-segmentation",
@@ -28,7 +29,7 @@ keywords = [
]
classifiers = [
"Development Status :: 4 - Beta",
"Development Status :: 5 - Production/Stable",
"Intended Audience :: Developers",
"Intended Audience :: Science/Research",
"Operating System :: OS Independent",
@@ -42,9 +43,9 @@ classifiers = [
dependencies = [
"numpy>=1.21.0",
"opencv-python>=4.5.0",
"onnx>=1.12.0",
"onnxruntime>=1.16.0",
"scikit-image>=0.19.0",
"scipy>=1.7.0",
"requests>=2.28.0",
"tqdm>=4.64.0",
]
@@ -56,9 +57,9 @@ gpu = ["onnxruntime-gpu>=1.16.0"]
[project.urls]
Homepage = "https://github.com/yakhyo/uniface"
Repository = "https://github.com/yakhyo/uniface"
Documentation = "https://github.com/yakhyo/uniface/blob/main/README.md"
"Quick Start" = "https://github.com/yakhyo/uniface/blob/main/QUICKSTART.md"
"Model Zoo" = "https://github.com/yakhyo/uniface/blob/main/MODELS.md"
Documentation = "https://yakhyo.github.io/uniface"
"Quick Start" = "https://yakhyo.github.io/uniface/quickstart/"
"Model Zoo" = "https://yakhyo.github.io/uniface/models/"
[build-system]
requires = ["setuptools>=64", "wheel"]

View File

@@ -1,8 +1,7 @@
numpy>=1.21.0
opencv-python>=4.5.0
onnx>=1.12.0
onnxruntime>=1.16.0
scikit-image>=0.19.0
scipy>=1.7.0
requests>=2.28.0
pytest>=7.0.0
tqdm>=4.64.0

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for AgeGender attribute predictor."""
from __future__ import annotations

61
tests/test_draw.py Normal file
View File

@@ -0,0 +1,61 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
import numpy as np
from uniface.draw import draw_gaze
def _compute_gaze_delta(bbox: np.ndarray, pitch: float, yaw: float) -> tuple[int, int]:
"""Replicate draw_gaze dx/dy math for verification."""
x_min, _, x_max, _ = map(int, bbox[:4])
length = x_max - x_min
dx = int(-length * np.sin(yaw) * np.cos(pitch))
dy = int(-length * np.sin(pitch))
return dx, dy
def test_draw_gaze_yaw_only_moves_horizontally():
"""Yaw-only input (pitch=0) should produce horizontal displacement only."""
image = np.zeros((200, 200, 3), dtype=np.uint8)
bbox = np.array([50, 50, 150, 150], dtype=np.float32)
yaw = 0.5
pitch = 0.0
dx, dy = _compute_gaze_delta(bbox, pitch, yaw)
assert dx != 0, 'Yaw-only should produce horizontal displacement'
assert dy == 0, 'Yaw-only should produce zero vertical displacement'
# Should not raise
draw_gaze(image, bbox, pitch, yaw, draw_bbox=False, draw_angles=False)
def test_draw_gaze_pitch_only_moves_vertically():
"""Pitch-only input (yaw=0) should produce vertical displacement only."""
image = np.zeros((200, 200, 3), dtype=np.uint8)
bbox = np.array([50, 50, 150, 150], dtype=np.float32)
yaw = 0.0
pitch = 0.5
dx, dy = _compute_gaze_delta(bbox, pitch, yaw)
assert dx == 0, 'Pitch-only should produce zero horizontal displacement'
assert dy != 0, 'Pitch-only should produce vertical displacement'
# Should not raise
draw_gaze(image, bbox, pitch, yaw, draw_bbox=False, draw_angles=False)
def test_draw_gaze_modifies_image():
"""draw_gaze should modify the image in place."""
image = np.zeros((200, 200, 3), dtype=np.uint8)
bbox = np.array([50, 50, 150, 150], dtype=np.float32)
original = image.copy()
draw_gaze(image, bbox, 0.3, 0.3)
assert not np.array_equal(image, original), 'draw_gaze should modify the image'

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for factory functions (create_detector, create_recognizer, etc.)."""
from __future__ import annotations
@@ -13,10 +12,10 @@ from uniface import (
create_detector,
create_landmarker,
create_recognizer,
detect_faces,
list_available_detectors,
)
from uniface.constants import RetinaFaceWeights, SCRFDWeights
from uniface.spoofing import MiniFASNet, create_spoofer
# create_detector tests
@@ -123,62 +122,6 @@ def test_create_landmarker_invalid_method():
create_landmarker('invalid_method')
# detect_faces tests
def test_detect_faces_retinaface():
"""
Test high-level detect_faces function with RetinaFace.
"""
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = detect_faces(mock_image, method='retinaface')
assert isinstance(faces, list), 'detect_faces should return a list'
def test_detect_faces_scrfd():
"""
Test high-level detect_faces function with SCRFD.
"""
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = detect_faces(mock_image, method='scrfd')
assert isinstance(faces, list), 'detect_faces should return a list'
def test_detect_faces_with_threshold():
"""
Test detect_faces with custom confidence threshold.
"""
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = detect_faces(mock_image, method='retinaface', confidence_threshold=0.8)
assert isinstance(faces, list), 'detect_faces should return a list'
# All detections should respect threshold
for face in faces:
assert face.confidence >= 0.8, 'All detections should meet confidence threshold'
def test_detect_faces_default_method():
"""
Test detect_faces with default method (should use retinaface).
"""
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = detect_faces(mock_image) # No method specified
assert isinstance(faces, list), 'detect_faces should return a list with default method'
def test_detect_faces_empty_image():
"""
Test detect_faces on a blank image.
"""
empty_image = np.zeros((640, 640, 3), dtype=np.uint8)
faces = detect_faces(empty_image, method='retinaface')
assert isinstance(faces, list), 'Should return a list even for empty image'
assert len(faces) == 0, 'Should detect no faces in blank image'
# list_available_detectors tests
def test_list_available_detectors():
"""
@@ -280,3 +223,16 @@ def test_factory_returns_correct_types():
assert isinstance(detector, RetinaFace), 'Should return RetinaFace instance'
assert isinstance(recognizer, ArcFace), 'Should return ArcFace instance'
assert isinstance(landmarker, Landmark106), 'Should return Landmark106 instance'
# create_spoofer tests
def test_create_spoofer_default():
"""Test creating a spoofer with default parameters."""
spoofer = create_spoofer()
assert isinstance(spoofer, MiniFASNet), 'Should return MiniFASNet instance'
def test_create_spoofer_with_providers():
"""Test that create_spoofer forwards providers kwarg without TypeError."""
spoofer = create_spoofer(providers=['CPUExecutionProvider'])
assert isinstance(spoofer, MiniFASNet), 'Should return MiniFASNet instance'

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for 106-point facial landmark detector."""
from __future__ import annotations

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for face parsing models (BiSeNet and XSeg)."""
from __future__ import annotations
@@ -209,7 +208,7 @@ def test_xseg_parse_with_landmarks():
)
# Parse
mask = parser.parse(image, landmarks)
mask = parser.parse(image, landmarks=landmarks)
assert mask.shape == (480, 640)
assert mask.dtype == np.float32
@@ -226,7 +225,7 @@ def test_xseg_parse_invalid_landmarks():
invalid_landmarks = np.array([[0, 0], [1, 1], [2, 2]])
with pytest.raises(ValueError, match='Landmarks must have shape'):
parser.parse(image, invalid_landmarks)
parser.parse(image, landmarks=invalid_landmarks)
def test_xseg_parse_with_inverse():

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for face recognition models (ArcFace, MobileFace, SphereFace)."""
from __future__ import annotations

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for RetinaFace detector."""
from __future__ import annotations

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for SCRFD detector."""
from __future__ import annotations

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for UniFace type definitions (dataclasses)."""
from __future__ import annotations

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for utility functions (compute_similarity, face_alignment, etc.)."""
from __future__ import annotations

View File

@@ -6,27 +6,27 @@ CLI utilities for testing and running UniFace features.
| Tool | Description |
|------|-------------|
| `detection.py` | Face detection on image, video, or webcam |
| `face_anonymize.py` | Face anonymization/blurring for privacy |
| `age_gender.py` | Age and gender prediction |
| `face_emotion.py` | Emotion detection (7 or 8 emotions) |
| `gaze_estimation.py` | Gaze direction estimation |
| `detect.py` | Face detection on image, video, or webcam |
| `track.py` | Face tracking on video with ByteTrack |
| `analyze.py` | Complete face analysis (detection + recognition + attributes) |
| `anonymize.py` | Face anonymization/blurring for privacy |
| `emotion.py` | Emotion detection (7 or 8 emotions) |
| `gaze.py` | Gaze direction estimation |
| `landmarks.py` | 106-point facial landmark detection |
| `recognition.py` | Face embedding extraction and comparison |
| `face_analyzer.py` | Complete face analysis (detection + recognition + attributes) |
| `face_search.py` | Real-time face matching against reference |
| `recognize.py` | Face embedding extraction and comparison |
| `search.py` | Real-time face matching against reference |
| `fairface.py` | FairFace attribute prediction (race, gender, age) |
| `attribute.py` | Age and gender prediction |
| `spoofing.py` | Face anti-spoofing detection |
| `face_parsing.py` | Face semantic segmentation (BiSeNet) |
| `parse.py` | Face semantic segmentation (BiSeNet) |
| `xseg.py` | Face segmentation (XSeg) |
| `video_detection.py` | Face detection on video files with progress bar |
| `batch_process.py` | Batch process folder of images |
| `download_model.py` | Download model weights |
| `sha256_generate.py` | Generate SHA256 hash for model files |
## Unified `--source` Pattern
All tools use a unified `--source` argument that accepts:
Most tools use a unified `--source` argument that accepts:
- **Image path**: `--source photo.jpg`
- **Video path**: `--source video.mp4`
- **Camera ID**: `--source 0` (default webcam), `--source 1` (external camera)
@@ -35,26 +35,31 @@ All tools use a unified `--source` argument that accepts:
```bash
# Face detection
python tools/detection.py --source assets/test.jpg # image
python tools/detection.py --source video.mp4 # video
python tools/detection.py --source 0 # webcam
python tools/detect.py --source assets/test.jpg # image
python tools/detect.py --source video.mp4 # video
python tools/detect.py --source 0 # webcam
# Face tracking
python tools/track.py --source video.mp4
python tools/track.py --source video.mp4 --output tracked.mp4
python tools/track.py --source 0 # webcam
# Face anonymization
python tools/face_anonymize.py --source assets/test.jpg --method pixelate
python tools/face_anonymize.py --source video.mp4 --method gaussian
python tools/face_anonymize.py --source 0 --method pixelate
python tools/anonymize.py --source assets/test.jpg --method pixelate
python tools/anonymize.py --source video.mp4 --method gaussian
python tools/anonymize.py --source 0 --method pixelate
# Age and gender
python tools/age_gender.py --source assets/test.jpg
python tools/age_gender.py --source 0
python tools/attribute.py --source assets/test.jpg
python tools/attribute.py --source 0
# Emotion detection
python tools/face_emotion.py --source assets/test.jpg
python tools/face_emotion.py --source 0
python tools/emotion.py --source assets/test.jpg
python tools/emotion.py --source 0
# Gaze estimation
python tools/gaze_estimation.py --source assets/test.jpg
python tools/gaze_estimation.py --source 0
python tools/gaze.py --source assets/test.jpg
python tools/gaze.py --source 0
# Landmarks
python tools/landmarks.py --source assets/test.jpg
@@ -65,8 +70,8 @@ python tools/fairface.py --source assets/test.jpg
python tools/fairface.py --source 0
# Face parsing (BiSeNet)
python tools/face_parsing.py --source assets/test.jpg
python tools/face_parsing.py --source 0
python tools/parse.py --source assets/test.jpg
python tools/parse.py --source 0
# Face segmentation (XSeg)
python tools/xseg.py --source assets/test.jpg
@@ -77,22 +82,18 @@ python tools/spoofing.py --source assets/test.jpg
python tools/spoofing.py --source 0
# Face analyzer
python tools/face_analyzer.py --source assets/test.jpg
python tools/face_analyzer.py --source 0
python tools/analyze.py --source assets/test.jpg
python tools/analyze.py --source 0
# Face recognition (extract embedding)
python tools/recognition.py --image assets/test.jpg
python tools/recognize.py --image assets/test.jpg
# Face comparison
python tools/recognition.py --image1 face1.jpg --image2 face2.jpg
python tools/recognize.py --image1 face1.jpg --image2 face2.jpg
# Face search (match against reference)
python tools/face_search.py --reference person.jpg --source 0
python tools/face_search.py --reference person.jpg --source video.mp4
# Video processing with progress bar
python tools/video_detection.py --source video.mp4
python tools/video_detection.py --source video.mp4 --output output.mp4
python tools/search.py --reference person.jpg --source 0
python tools/search.py --reference person.jpg --source video.mp4
# Batch processing
python tools/batch_process.py --input images/ --output results/
@@ -122,5 +123,5 @@ python tools/download_model.py # downloads all
## Quick Test
```bash
python tools/detection.py --source assets/test.jpg
python tools/detect.py --source assets/test.jpg
```

29
tools/_common.py Normal file
View File

@@ -0,0 +1,29 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
from pathlib import Path
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera.
Args:
source: File path or camera ID string (e.g. ``"0"``).
Returns:
One of ``"image"``, ``"video"``, ``"camera"``, or ``"unknown"``.
"""
if source.isdigit():
return 'camera'
suffix = Path(source).suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
if suffix in VIDEO_EXTENSIONS:
return 'video'
return 'unknown'

View File

@@ -5,9 +5,9 @@
"""Face analysis using FaceAnalyzer.
Usage:
python tools/face_analyzer.py --source path/to/image.jpg
python tools/face_analyzer.py --source path/to/video.mp4
python tools/face_analyzer.py --source 0 # webcam
python tools/analyze.py --source path/to/image.jpg
python tools/analyze.py --source path/to/video.mp4
python tools/analyze.py --source 0 # webcam
"""
from __future__ import annotations
@@ -16,28 +16,15 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
import numpy as np
from uniface import AgeGender, ArcFace, FaceAnalyzer, RetinaFace
from uniface.visualization import draw_detections
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
from uniface.analyzer import FaceAnalyzer
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
from uniface.draw import draw_detections
from uniface.recognition import ArcFace
def draw_face_info(image, face, face_id):
@@ -111,7 +98,7 @@ def process_image(analyzer, image_path: str, save_dir: str = 'outputs', show_sim
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, fancy_bbox=True)
draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, corner_bbox=True)
for i, face in enumerate(faces, 1):
draw_face_info(image, face, i)
@@ -153,7 +140,7 @@ def process_video(analyzer, video_path: str, save_dir: str = 'outputs'):
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, fancy_bbox=True)
draw_detections(image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, corner_bbox=True)
for i, face in enumerate(faces, 1):
draw_face_info(frame, face, i)
@@ -189,7 +176,7 @@ def run_camera(analyzer, camera_id: int = 0):
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, fancy_bbox=True)
draw_detections(image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, corner_bbox=True)
for i, face in enumerate(faces, 1):
draw_face_info(frame, face, i)

View File

@@ -5,9 +5,9 @@
"""Face anonymization/blurring for privacy.
Usage:
python tools/face_anonymize.py --source path/to/image.jpg --method pixelate
python tools/face_anonymize.py --source path/to/video.mp4 --method gaussian
python tools/face_anonymize.py --source 0 --method pixelate # webcam
python tools/anonymize.py --source path/to/image.jpg --method pixelate
python tools/anonymize.py --source path/to/video.mp4 --method gaussian
python tools/anonymize.py --source 0 --method pixelate # webcam
"""
from __future__ import annotations
@@ -16,28 +16,12 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
def process_image(
detector,
@@ -56,7 +40,7 @@ def process_image(
print(f'Detected {len(faces)} face(s)')
if show_detections and faces:
from uniface.visualization import draw_detections
from uniface.draw import draw_detections
preview = image.copy()
bboxes = [face.bbox for face in faces]
@@ -171,19 +155,19 @@ def main():
epilog="""
Examples:
# Anonymize image with pixelation (default)
python run_anonymization.py --source photo.jpg
python tools/anonymize.py --source photo.jpg
# Use Gaussian blur with custom strength
python run_anonymization.py --source photo.jpg --method gaussian --blur-strength 5.0
python tools/anonymize.py --source photo.jpg --method gaussian --blur-strength 5.0
# Real-time webcam anonymization
python run_anonymization.py --source 0 --method pixelate
python tools/anonymize.py --source 0 --method pixelate
# Black boxes for maximum privacy
python run_anonymization.py --source photo.jpg --method blackout
python tools/anonymize.py --source photo.jpg --method blackout
# Custom pixelation intensity
python run_anonymization.py --source photo.jpg --method pixelate --pixel-blocks 5
python tools/anonymize.py --source photo.jpg --method pixelate --pixel-blocks 5
""",
)

View File

@@ -5,9 +5,9 @@
"""Age and gender prediction on detected faces.
Usage:
python tools/age_gender.py --source path/to/image.jpg
python tools/age_gender.py --source path/to/video.mp4
python tools/age_gender.py --source 0 # webcam
python tools/attribute.py --source path/to/image.jpg
python tools/attribute.py --source path/to/video.mp4
python tools/attribute.py --source 0 # webcam
"""
from __future__ import annotations
@@ -16,27 +16,12 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
from uniface import SCRFD, AgeGender, RetinaFace
from uniface.visualization import draw_detections
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
from uniface.attribute import AgeGender
from uniface.detection import SCRFD, RetinaFace
from uniface.draw import draw_detections
def draw_age_gender_label(image, bbox, sex: str, age: int):
@@ -71,7 +56,7 @@ def process_image(
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for i, face in enumerate(faces):
@@ -123,7 +108,7 @@ def process_video(
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for face in faces:
@@ -162,7 +147,7 @@ def run_camera(detector, age_gender, camera_id: int = 0, threshold: float = 0.6)
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for face in faces:

View File

@@ -14,8 +14,8 @@ from pathlib import Path
import cv2
from tqdm import tqdm
from uniface import SCRFD, RetinaFace
from uniface.visualization import draw_detections
from uniface.detection import SCRFD, RetinaFace
from uniface.draw import draw_detections
def get_image_files(input_dir: Path, extensions: tuple) -> list:
@@ -39,7 +39,7 @@ def process_image(detector, image_path: Path, output_path: Path, threshold: floa
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
cv2.putText(

View File

@@ -5,9 +5,9 @@
"""Face detection on image, video, or webcam.
Usage:
python tools/detection.py --source path/to/image.jpg
python tools/detection.py --source path/to/video.mp4
python tools/detection.py --source 0 # webcam
python tools/detect.py --source path/to/image.jpg
python tools/detect.py --source path/to/video.mp4
python tools/detect.py --source 0 # webcam
"""
from __future__ import annotations
@@ -15,28 +15,14 @@ from __future__ import annotations
import argparse
import os
from pathlib import Path
import time
from _common import get_source_type
import cv2
from tqdm import tqdm
from uniface.detection import SCRFD, RetinaFace, YOLOv5Face, YOLOv8Face
from uniface.visualization import draw_detections
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
from uniface.draw import draw_detections
def process_image(detector, image_path: str, threshold: float = 0.6, save_dir: str = 'outputs'):
@@ -52,7 +38,7 @@ def process_image(detector, image_path: str, threshold: float = 0.6, save_dir: s
bboxes = [face.bbox for face in faces]
scores = [face.confidence for face in faces]
landmarks = [face.landmarks for face in faces]
draw_detections(image, bboxes, scores, landmarks, vis_threshold=threshold)
draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold)
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{os.path.splitext(os.path.basename(image_path))[0]}_out.jpg')
@@ -60,34 +46,48 @@ def process_image(detector, image_path: str, threshold: float = 0.6, save_dir: s
print(f'Detected {len(faces)} face(s). Output saved: {output_path}')
def process_video(detector, video_path: str, threshold: float = 0.6, save_dir: str = 'outputs'):
"""Process a video file."""
cap = cv2.VideoCapture(video_path)
def process_video(
detector,
input_path: str,
output_path: str,
threshold: float = 0.6,
show_preview: bool = False,
):
"""Process a video file with progress bar."""
cap = cv2.VideoCapture(input_path)
if not cap.isOpened():
print(f"Error: Cannot open video file '{video_path}'")
print(f"Error: Cannot open video file '{input_path}'")
return
# Get video properties
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(video_path).stem}_out.mp4')
print(f'Input: {input_path} ({width}x{height}, {fps:.1f} fps, {total_frames} frames)')
print(f'Output: {output_path}')
Path(output_path).parent.mkdir(parents=True, exist_ok=True)
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
print(f'Processing video: {video_path} ({total_frames} frames)')
frame_count = 0
if not out.isOpened():
print(f"Error: Cannot create output video '{output_path}'")
cap.release()
return
while True:
frame_count = 0
total_faces = 0
for _ in tqdm(range(total_frames), desc='Processing', unit='frames'):
ret, frame = cap.read()
if not ret:
break
t0 = time.perf_counter()
frame_count += 1
faces = detector.detect(frame)
total_faces += len(faces)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
@@ -99,19 +99,28 @@ def process_video(detector, video_path: str, threshold: float = 0.6, save_dir: s
landmarks=landmarks,
vis_threshold=threshold,
draw_score=True,
fancy_bbox=True,
corner_bbox=True,
)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
inference_fps = 1.0 / max(time.perf_counter() - t0, 1e-9)
cv2.putText(frame, f'FPS: {inference_fps:.1f}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 65), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
out.write(frame)
# Show progress
if frame_count % 100 == 0:
print(f' Processed {frame_count}/{total_frames} frames...')
if show_preview:
cv2.imshow("Processing - Press 'q' to cancel", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
print('\nCancelled by user')
break
cap.release()
out.release()
print(f'Done! Output saved: {output_path}')
if show_preview:
cv2.destroyAllWindows()
avg_faces = total_faces / frame_count if frame_count > 0 else 0
print(f'\nDone! {frame_count} frames, {total_faces} faces ({avg_faces:.1f} avg/frame)')
print(f'Saved: {output_path}')
def run_camera(detector, camera_id: int = 0, threshold: float = 0.6):
@@ -123,9 +132,10 @@ def run_camera(detector, camera_id: int = 0, threshold: float = 0.6):
print("Press 'q' to quit")
prev_time = time.perf_counter()
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1) # mirror for natural interaction
frame = cv2.flip(frame, 1)
if not ret:
break
@@ -141,10 +151,14 @@ def run_camera(detector, camera_id: int = 0, threshold: float = 0.6):
landmarks=landmarks,
vis_threshold=threshold,
draw_score=True,
fancy_bbox=True,
corner_bbox=True,
)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
curr_time = time.perf_counter()
fps = 1.0 / max(curr_time - prev_time, 1e-9)
prev_time = curr_time
cv2.putText(frame, f'FPS: {fps:.1f}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 65), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.imshow('Face Detection', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
@@ -158,18 +172,24 @@ def main():
parser = argparse.ArgumentParser(description='Run face detection')
parser.add_argument('--source', type=str, required=True, help='Image/video path or camera ID (0, 1, ...)')
parser.add_argument(
'--method', type=str, default='retinaface', choices=['retinaface', 'scrfd', 'yolov5face', 'yolov8face']
'--detector',
'--method',
type=str,
default='retinaface',
choices=['retinaface', 'scrfd', 'yolov5face', 'yolov8face'],
)
parser.add_argument('--threshold', type=float, default=0.25, help='Visualization threshold')
parser.add_argument('--preview', action='store_true', help='Show live preview during video processing')
parser.add_argument('--save-dir', type=str, default='outputs', help='Output directory')
parser.add_argument('--output', type=str, default=None, help='Output video path (auto-generated if not specified)')
args = parser.parse_args()
# Initialize detector
if args.method == 'retinaface':
if args.detector == 'retinaface':
detector = RetinaFace()
elif args.method == 'scrfd':
elif args.detector == 'scrfd':
detector = SCRFD()
elif args.method == 'yolov5face':
elif args.detector == 'yolov5face':
from uniface.constants import YOLOv5FaceWeights
detector = YOLOv5Face(model_name=YOLOv5FaceWeights.YOLOV5M)
@@ -178,7 +198,6 @@ def main():
detector = YOLOv8Face(model_name=YOLOv8FaceWeights.YOLOV8N)
# Determine source type and process
source_type = get_source_type(args.source)
if source_type == 'camera':
@@ -192,7 +211,12 @@ def main():
if not os.path.exists(args.source):
print(f'Error: Video not found: {args.source}')
return
process_video(detector, args.source, args.threshold, args.save_dir)
if args.output:
output_path = args.output
else:
os.makedirs(args.save_dir, exist_ok=True)
output_path = os.path.join(args.save_dir, f'{Path(args.source).stem}_detected.mp4')
process_video(detector, args.source, output_path, args.threshold, args.preview)
else:
print(f"Error: Unknown source type for '{args.source}'")
print('Supported formats: images (.jpg, .png, ...), videos (.mp4, .avi, ...), or camera ID (0, 1, ...)')

View File

@@ -5,9 +5,9 @@
"""Emotion detection on detected faces.
Usage:
python tools/face_emotion.py --source path/to/image.jpg
python tools/face_emotion.py --source path/to/video.mp4
python tools/face_emotion.py --source 0 # webcam
python tools/emotion.py --source path/to/image.jpg
python tools/emotion.py --source path/to/video.mp4
python tools/emotion.py --source 0 # webcam
"""
from __future__ import annotations
@@ -16,27 +16,12 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
from uniface import SCRFD, Emotion, RetinaFace
from uniface.visualization import draw_detections
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
from uniface.attribute import Emotion
from uniface.detection import SCRFD, RetinaFace
from uniface.draw import draw_detections
def draw_emotion_label(image, bbox, emotion: str, confidence: float):
@@ -71,7 +56,7 @@ def process_image(
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for i, face in enumerate(faces):
@@ -123,7 +108,7 @@ def process_video(
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for face in faces:
@@ -162,7 +147,7 @@ def run_camera(detector, emotion_predictor, camera_id: int = 0, threshold: float
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for face in faces:

View File

@@ -16,28 +16,12 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
from uniface import SCRFD, RetinaFace
from uniface.attribute import FairFace
from uniface.visualization import draw_detections
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
from uniface.detection import SCRFD, RetinaFace
from uniface.draw import draw_detections
def draw_fairface_label(image, bbox, sex: str, age_group: str, race: str):
@@ -72,7 +56,7 @@ def process_image(
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for i, face in enumerate(faces):
@@ -124,7 +108,7 @@ def process_video(
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for face in faces:
@@ -163,7 +147,7 @@ def run_camera(detector, fairface, camera_id: int = 0, threshold: float = 0.6):
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for face in faces:

208
tools/faiss_search.py Normal file
View File

@@ -0,0 +1,208 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""FAISS index build and multi-identity face search.
Build a vector index from a directory of person sub-folders, then search
against it in a video or webcam stream.
Usage:
python tools/faiss_search.py build --faces-dir dataset/ --db-path ./vector_index
python tools/faiss_search.py run --db-path ./vector_index --source video.mp4
python tools/faiss_search.py run --db-path ./vector_index --source 0 # webcam
"""
from __future__ import annotations
import argparse
import os
from pathlib import Path
from _common import IMAGE_EXTENSIONS, get_source_type
import cv2
from uniface import create_detector, create_recognizer
from uniface.draw import draw_corner_bbox, draw_text_label
from uniface.indexing import FAISS
def _draw_face(image, bbox, text: str, color: tuple[int, int, int]) -> None:
x1, y1, x2, y2 = map(int, bbox[:4])
thickness = max(round(sum(image.shape[:2]) / 2 * 0.003), 2)
font_scale = max(0.4, min(0.7, (y2 - y1) / 200))
draw_corner_bbox(image, (x1, y1, x2, y2), color=color, thickness=thickness)
draw_text_label(image, text, x1, y1, bg_color=color, font_scale=font_scale)
def process_frame(frame, detector, recognizer, store: FAISS, threshold: float = 0.4):
faces = detector.detect(frame)
if not faces:
return frame
for face in faces:
embedding = recognizer.get_normalized_embedding(frame, face.landmarks)
result, sim = store.search(embedding, threshold=threshold)
text = f'{result["person_id"]} ({sim:.2f})' if result else f'Unknown ({sim:.2f})'
color = (0, 255, 0) if result else (0, 0, 255)
_draw_face(frame, face.bbox, text, color)
return frame
def process_video(detector, recognizer, store: FAISS, video_path: str, save_dir: str, threshold: float = 0.4):
cap = cv2.VideoCapture(video_path)
if not cap.isOpened():
print(f"Error: Cannot open video file '{video_path}'")
return
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(video_path).stem}_faiss_search.mp4')
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
print(f'Processing video: {video_path} ({total_frames} frames)')
frame_count = 0
while True:
ret, frame = cap.read()
if not ret:
break
frame_count += 1
frame = process_frame(frame, detector, recognizer, store, threshold)
out.write(frame)
if frame_count % 100 == 0:
print(f' Processed {frame_count}/{total_frames} frames...')
cap.release()
out.release()
print(f'Done! Output saved: {output_path}')
def run_camera(detector, recognizer, store: FAISS, camera_id: int = 0, threshold: float = 0.4):
cap = cv2.VideoCapture(camera_id)
if not cap.isOpened():
print(f'Cannot open camera {camera_id}')
return
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
frame = process_frame(frame, detector, recognizer, store, threshold)
cv2.imshow('Vector Search', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
def build(args: argparse.Namespace) -> None:
faces_dir = Path(args.faces_dir)
if not faces_dir.is_dir():
print(f"Error: '{faces_dir}' is not a directory")
return
detector = create_detector()
recognizer = create_recognizer()
store = FAISS(db_path=args.db_path)
persons = sorted(p.name for p in faces_dir.iterdir() if p.is_dir())
if not persons:
print(f"Error: No sub-folders found in '{faces_dir}'")
return
print(f'Found {len(persons)} persons: {", ".join(persons)}')
total_added = 0
for person_id in persons:
person_dir = faces_dir / person_id
images = [f for f in person_dir.iterdir() if f.suffix.lower() in IMAGE_EXTENSIONS]
added = 0
for img_path in images:
image = cv2.imread(str(img_path))
if image is None:
print(f' Warning: Failed to read {img_path}, skipping')
continue
faces = detector.detect(image)
if not faces:
print(f' Warning: No face detected in {img_path}, skipping')
continue
embedding = recognizer.get_normalized_embedding(image, faces[0].landmarks)
store.add(embedding, {'person_id': person_id, 'source': str(img_path)})
added += 1
total_added += added
if added:
print(f' {person_id}: {added} embeddings added')
else:
print(f' {person_id}: no valid faces found')
store.save()
print(f'\nIndex saved to {args.db_path} ({total_added} vectors, {len(persons)} persons)')
def run(args: argparse.Namespace) -> None:
detector = create_detector()
recognizer = create_recognizer()
store = FAISS(db_path=args.db_path)
if not store.load():
print(f"Error: No index found at '{args.db_path}'")
return
print(f'Loaded FAISS index: {store}')
source_type = get_source_type(args.source)
if source_type == 'camera':
run_camera(detector, recognizer, store, int(args.source), args.threshold)
elif source_type == 'video':
if not os.path.exists(args.source):
print(f'Error: Video not found: {args.source}')
return
process_video(detector, recognizer, store, args.source, args.save_dir, args.threshold)
else:
print(f"Error: Source must be a video file or camera ID, not '{args.source}'")
def main():
parser = argparse.ArgumentParser(description='FAISS vector search')
sub = parser.add_subparsers(dest='command', required=True)
build_p = sub.add_parser('build', help='Build a FAISS index from person sub-folders')
build_p.add_argument('--faces-dir', type=str, required=True, help='Directory with person sub-folders')
build_p.add_argument('--db-path', type=str, default='./vector_index', help='Where to save the index')
run_p = sub.add_parser('run', help='Search faces against a FAISS index')
run_p.add_argument('--db-path', type=str, required=True, help='Path to saved FAISS index')
run_p.add_argument('--source', type=str, required=True, help='Video path or camera ID')
run_p.add_argument('--threshold', type=float, default=0.4, help='Similarity threshold')
run_p.add_argument('--save-dir', type=str, default='outputs', help='Output directory')
args = parser.parse_args()
if args.command == 'build':
build(args)
elif args.command == 'run':
run(args)
if __name__ == '__main__':
main()

View File

@@ -5,9 +5,9 @@
"""Gaze estimation on detected faces.
Usage:
python tools/gaze_estimation.py --source path/to/image.jpg
python tools/gaze_estimation.py --source path/to/video.mp4
python tools/gaze_estimation.py --source 0 # webcam
python tools/gaze.py --source path/to/image.jpg
python tools/gaze.py --source path/to/video.mp4
python tools/gaze.py --source 0 # webcam
"""
from __future__ import annotations
@@ -16,29 +16,13 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
import numpy as np
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.draw import draw_gaze
from uniface.gaze import MobileGaze
from uniface.visualization import draw_gaze
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
def process_image(detector, gaze_estimator, image_path: str, save_dir: str = 'outputs'):

View File

@@ -16,26 +16,11 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
from uniface import SCRFD, Landmark106, RetinaFace
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
from uniface.detection import SCRFD, RetinaFace
from uniface.landmark import Landmark106
def process_image(detector, landmarker, image_path: str, save_dir: str = 'outputs'):

View File

@@ -5,9 +5,9 @@
"""Face parsing on detected faces.
Usage:
python tools/face_parsing.py --source path/to/image.jpg
python tools/face_parsing.py --source path/to/video.mp4
python tools/face_parsing.py --source 0 # webcam
python tools/parse.py --source path/to/image.jpg
python tools/parse.py --source path/to/video.mp4
python tools/parse.py --source 0 # webcam
"""
from __future__ import annotations
@@ -16,30 +16,14 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
import numpy as np
from uniface import RetinaFace
from uniface.constants import ParsingWeights
from uniface.detection import RetinaFace
from uniface.draw import vis_parsing_maps
from uniface.parsing import BiSeNet
from uniface.visualization import vis_parsing_maps
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
def expand_bbox(
@@ -225,7 +209,7 @@ def main():
args = parser_arg.parse_args()
detector = RetinaFace()
parser = BiSeNet(model_name=ParsingWeights.RESNET34)
parser = BiSeNet(model_name=args.model)
source_type = get_source_type(args.source)

View File

@@ -5,8 +5,8 @@
"""Face recognition: extract embeddings or compare two faces.
Usage:
python tools/recognition.py --image path/to/image.jpg
python tools/recognition.py --image1 face1.jpg --image2 face2.jpg
python tools/recognize.py --image path/to/image.jpg
python tools/recognize.py --image1 face1.jpg --image2 face2.jpg
"""
import argparse
@@ -41,7 +41,7 @@ def run_inference(detector, recognizer, image_path: str):
print(f'Detected {len(faces)} face(s). Extracting embedding for the first face...')
landmarks = faces[0]['landmarks'] # 5-point landmarks for alignment (already np.ndarray)
landmarks = faces[0].landmarks # 5-point landmarks for alignment (already np.ndarray)
embedding = recognizer.get_embedding(image, landmarks)
norm_embedding = recognizer.get_normalized_embedding(image, landmarks) # L2 normalized
@@ -65,8 +65,8 @@ def compare_faces(detector, recognizer, image1_path: str, image2_path: str, thre
print('Error: No faces detected in one or both images')
return
landmarks1 = faces1[0]['landmarks']
landmarks2 = faces2[0]['landmarks']
landmarks1 = faces1[0].landmarks
landmarks2 = faces2[0].landmarks
embedding1 = recognizer.get_normalized_embedding(img1, landmarks1)
embedding2 = recognizer.get_normalized_embedding(img2, landmarks2)

View File

@@ -2,11 +2,14 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Real-time face search: match faces against a reference image.
"""Single-reference face search on video or webcam.
Given a reference face image, detects faces in the source and shows
whether each face matches the reference.
Usage:
python tools/face_search.py --reference person.jpg --source 0 # webcam
python tools/face_search.py --reference person.jpg --source video.mp4
python tools/search.py --reference ref.jpg --source video.mp4
python tools/search.py --reference ref.jpg --source 0 # webcam
"""
from __future__ import annotations
@@ -15,43 +18,16 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
import numpy as np
from uniface.detection import SCRFD, RetinaFace
from uniface import create_detector, create_recognizer
from uniface.draw import draw_corner_bbox, draw_text_label
from uniface.face_utils import compute_similarity
from uniface.recognition import ArcFace, MobileFace, SphereFace
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
def get_recognizer(name: str):
"""Get recognizer by name."""
if name == 'arcface':
return ArcFace()
elif name == 'mobileface':
return MobileFace()
else:
return SphereFace()
def extract_reference_embedding(detector, recognizer, image_path: str) -> np.ndarray:
"""Extract embedding from reference image."""
image = cv2.imread(image_path)
if image is None:
raise RuntimeError(f'Failed to load image: {image_path}')
@@ -60,33 +36,34 @@ def extract_reference_embedding(detector, recognizer, image_path: str) -> np.nda
if not faces:
raise RuntimeError('No faces found in reference image.')
landmarks = faces[0].landmarks
return recognizer.get_normalized_embedding(image, landmarks)
return recognizer.get_normalized_embedding(image, faces[0].landmarks)
def _draw_face(image, bbox, text: str, color: tuple[int, int, int]) -> None:
x1, y1, x2, y2 = map(int, bbox[:4])
thickness = max(round(sum(image.shape[:2]) / 2 * 0.003), 2)
font_scale = max(0.4, min(0.7, (y2 - y1) / 200))
draw_corner_bbox(image, (x1, y1, x2, y2), color=color, thickness=thickness)
draw_text_label(image, text, x1, y1, bg_color=color, font_scale=font_scale)
def process_frame(frame, detector, recognizer, ref_embedding: np.ndarray, threshold: float = 0.4):
"""Process a single frame and return annotated frame."""
faces = detector.detect(frame)
for face in faces:
bbox = face.bbox
landmarks = face.landmarks
x1, y1, x2, y2 = map(int, bbox)
embedding = recognizer.get_normalized_embedding(frame, landmarks)
embedding = recognizer.get_normalized_embedding(frame, face.landmarks)
sim = compute_similarity(ref_embedding, embedding)
label = f'Match ({sim:.2f})' if sim > threshold else f'Unknown ({sim:.2f})'
text = f'Match ({sim:.2f})' if sim > threshold else f'Unknown ({sim:.2f})'
color = (0, 255, 0) if sim > threshold else (0, 0, 255)
cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)
cv2.putText(frame, label, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
_draw_face(frame, face.bbox, text, color)
return frame
def process_video(detector, recognizer, ref_embedding: np.ndarray, video_path: str, save_dir: str, threshold: float):
"""Process a video file."""
def process_video(
detector, recognizer, video_path: str, save_dir: str, ref_embedding: np.ndarray, threshold: float = 0.4
):
cap = cv2.VideoCapture(video_path)
if not cap.isOpened():
print(f"Error: Cannot open video file '{video_path}'")
@@ -123,7 +100,6 @@ def process_video(detector, recognizer, ref_embedding: np.ndarray, video_path: s
def run_camera(detector, recognizer, ref_embedding: np.ndarray, camera_id: int = 0, threshold: float = 0.4):
"""Run real-time face search on webcam."""
cap = cv2.VideoCapture(camera_id)
if not cap.isOpened():
print(f'Cannot open camera {camera_id}')
@@ -139,7 +115,7 @@ def run_camera(detector, recognizer, ref_embedding: np.ndarray, camera_id: int =
frame = process_frame(frame, detector, recognizer, ref_embedding, threshold)
cv2.imshow('Face Recognition', frame)
cv2.imshow('Face Search', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
@@ -148,17 +124,10 @@ def run_camera(detector, recognizer, ref_embedding: np.ndarray, camera_id: int =
def main():
parser = argparse.ArgumentParser(description='Face search using a reference image')
parser = argparse.ArgumentParser(description='Single-reference face search')
parser.add_argument('--reference', type=str, required=True, help='Reference face image')
parser.add_argument('--source', type=str, required=True, help='Video path or camera ID (0, 1, ...)')
parser.add_argument('--threshold', type=float, default=0.4, help='Match threshold')
parser.add_argument('--detector', type=str, default='scrfd', choices=['retinaface', 'scrfd'])
parser.add_argument(
'--recognizer',
type=str,
default='arcface',
choices=['arcface', 'mobileface', 'sphereface'],
)
parser.add_argument('--source', type=str, required=True, help='Video path or camera ID')
parser.add_argument('--threshold', type=float, default=0.4, help='Similarity threshold')
parser.add_argument('--save-dir', type=str, default='outputs', help='Output directory')
args = parser.parse_args()
@@ -166,8 +135,8 @@ def main():
print(f'Error: Reference image not found: {args.reference}')
return
detector = RetinaFace() if args.detector == 'retinaface' else SCRFD()
recognizer = get_recognizer(args.recognizer)
detector = create_detector()
recognizer = create_recognizer()
print(f'Loading reference: {args.reference}')
ref_embedding = extract_reference_embedding(detector, recognizer, args.reference)
@@ -180,10 +149,9 @@ def main():
if not os.path.exists(args.source):
print(f'Error: Video not found: {args.source}')
return
process_video(detector, recognizer, ref_embedding, args.source, args.save_dir, args.threshold)
process_video(detector, recognizer, args.source, args.save_dir, ref_embedding, args.threshold)
else:
print(f"Error: Source must be a video file or camera ID, not '{args.source}'")
print('Supported formats: videos (.mp4, .avi, ...) or camera ID (0, 1, ...)')
if __name__ == '__main__':

View File

@@ -16,30 +16,14 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
import numpy as np
from uniface import RetinaFace
from uniface.constants import MiniFASNetWeights
from uniface.detection import RetinaFace
from uniface.spoofing import create_spoofer
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
def draw_spoofing_result(
image: np.ndarray,

199
tools/track.py Normal file
View File

@@ -0,0 +1,199 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Face tracking on video files using ByteTrack.
Usage:
python tools/track.py --source video.mp4
python tools/track.py --source video.mp4 --output outputs/tracked.mp4
python tools/track.py --source 0 # webcam
"""
from __future__ import annotations
import argparse
import os
from pathlib import Path
from _common import VIDEO_EXTENSIONS
import cv2
import numpy as np
from tqdm import tqdm
from uniface.common import xyxy_to_cxcywh
from uniface.detection import SCRFD, RetinaFace
from uniface.draw import draw_tracks
from uniface.tracking import BYTETracker
def _assign_track_ids(faces, tracks) -> list:
"""Match tracker outputs back to Face objects by center distance."""
if len(tracks) == 0 or len(faces) == 0:
return []
face_bboxes = np.array([f.bbox for f in faces], dtype=np.float32)
track_ids = tracks[:, 4].astype(int)
face_centers = xyxy_to_cxcywh(face_bboxes)[:, :2] # (N, 2) -> [cx, cy]
track_centers = xyxy_to_cxcywh(tracks[:, :4])[:, :2] # (M, 2) -> [cx, cy]
for ti in range(len(tracks)):
dists = (track_centers[ti, 0] - face_centers[:, 0]) ** 2 + (track_centers[ti, 1] - face_centers[:, 1]) ** 2
faces[int(np.argmin(dists))].track_id = track_ids[ti]
return [f for f in faces if f.track_id is not None]
def process_video(
detector,
tracker: BYTETracker,
input_path: str,
output_path: str,
threshold: float = 0.5,
show_preview: bool = False,
):
"""Process a video file with face tracking."""
cap = cv2.VideoCapture(input_path)
if not cap.isOpened():
print(f"Error: Cannot open video file '{input_path}'")
return
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
print(f'Input: {input_path} ({width}x{height}, {fps:.1f} fps, {total_frames} frames)')
print(f'Output: {output_path}')
Path(output_path).parent.mkdir(parents=True, exist_ok=True)
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
if not out.isOpened():
print(f"Error: Cannot create output video '{output_path}'")
cap.release()
return
frame_count = 0
total_tracks = 0
for _ in tqdm(range(total_frames), desc='Tracking', unit='frames'):
ret, frame = cap.read()
if not ret:
break
frame_count += 1
# Detect faces
faces = detector.detect(frame)
dets = np.array([[*f.bbox, f.confidence] for f in faces if f.confidence >= threshold])
dets = dets if len(dets) > 0 else np.empty((0, 5))
# Update tracker
tracks = tracker.update(dets)
tracked_faces = _assign_track_ids(faces, tracks)
total_tracks += len(tracked_faces)
# Draw tracked faces
draw_tracks(image=frame, faces=tracked_faces)
cv2.putText(frame, f'Tracks: {len(tracked_faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
out.write(frame)
if show_preview:
cv2.imshow("Tracking - Press 'q' to cancel", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
print('\nCancelled by user')
break
cap.release()
out.release()
if show_preview:
cv2.destroyAllWindows()
avg_tracks = total_tracks / frame_count if frame_count > 0 else 0
print(f'\nDone! {frame_count} frames, {total_tracks} tracks ({avg_tracks:.1f} avg/frame)')
print(f'Saved: {output_path}')
def run_camera(
detector,
tracker: BYTETracker,
camera_id: int = 0,
threshold: float = 0.5,
):
"""Run real-time face tracking on webcam."""
cap = cv2.VideoCapture(camera_id)
if not cap.isOpened():
print(f'Cannot open camera {camera_id}')
return
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
# Detect faces
faces = detector.detect(frame)
dets = np.array([[*f.bbox, f.confidence] for f in faces if f.confidence >= threshold])
dets = dets if len(dets) > 0 else np.empty((0, 5))
# Update tracker
tracks = tracker.update(dets)
tracked_faces = _assign_track_ids(faces, tracks)
# Draw tracked faces
draw_tracks(image=frame, faces=tracked_faces)
cv2.putText(frame, f'Tracks: {len(tracked_faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.imshow('Face Tracking', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
def main():
parser = argparse.ArgumentParser(description='Face tracking on video using ByteTrack')
parser.add_argument('--source', type=str, required=True, help='Video path or camera ID (0, 1, ...)')
parser.add_argument('--output', type=str, default=None, help='Output video path')
parser.add_argument('--detector', type=str, default='scrfd', choices=['retinaface', 'scrfd'])
parser.add_argument('--threshold', type=float, default=0.5, help='Detection confidence threshold')
parser.add_argument('--track-buffer', type=int, default=30, help='Max frames to keep lost tracks')
parser.add_argument('--preview', action='store_true', help='Show live preview')
parser.add_argument('--save-dir', type=str, default='outputs', help='Output directory')
args = parser.parse_args()
detector = RetinaFace() if args.detector == 'retinaface' else SCRFD()
tracker = BYTETracker(track_thresh=args.threshold, track_buffer=args.track_buffer)
if args.source.isdigit():
run_camera(detector, tracker, int(args.source), args.threshold)
else:
if not os.path.exists(args.source):
print(f'Error: Video not found: {args.source}')
return
ext = Path(args.source).suffix.lower()
if ext not in VIDEO_EXTENSIONS:
print(f"Error: Unsupported format '{ext}'. Supported: {VIDEO_EXTENSIONS}")
return
if args.output:
output_path = args.output
else:
os.makedirs(args.save_dir, exist_ok=True)
output_path = os.path.join(args.save_dir, f'{Path(args.source).stem}_tracked.mp4')
process_video(detector, tracker, args.source, output_path, args.threshold, args.preview)
if __name__ == '__main__':
main()

View File

@@ -1,180 +0,0 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Face detection on video files with progress tracking.
Usage:
python tools/video_detection.py --source video.mp4
python tools/video_detection.py --source video.mp4 --output output.mp4
python tools/video_detection.py --source 0 # webcam
"""
from __future__ import annotations
import argparse
import os
from pathlib import Path
import cv2
from tqdm import tqdm
from uniface import SCRFD, RetinaFace
from uniface.visualization import draw_detections
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
def process_video(
detector,
input_path: str,
output_path: str,
threshold: float = 0.6,
show_preview: bool = False,
):
"""Process a video file with progress bar."""
cap = cv2.VideoCapture(input_path)
if not cap.isOpened():
print(f"Error: Cannot open video file '{input_path}'")
return
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
print(f'Input: {input_path} ({width}x{height}, {fps:.1f} fps, {total_frames} frames)')
print(f'Output: {output_path}')
Path(output_path).parent.mkdir(parents=True, exist_ok=True)
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
if not out.isOpened():
print(f"Error: Cannot create output video '{output_path}'")
cap.release()
return
frame_count = 0
total_faces = 0
for _ in tqdm(range(total_frames), desc='Processing', unit='frames'):
ret, frame = cap.read()
if not ret:
break
frame_count += 1
faces = detector.detect(frame)
total_faces += len(faces)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
out.write(frame)
if show_preview:
cv2.imshow("Processing - Press 'q' to cancel", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
print('\nCancelled by user')
break
cap.release()
out.release()
if show_preview:
cv2.destroyAllWindows()
avg_faces = total_faces / frame_count if frame_count > 0 else 0
print(f'\nDone! {frame_count} frames, {total_faces} faces ({avg_faces:.1f} avg/frame)')
print(f'Saved: {output_path}')
def run_camera(detector, camera_id: int = 0, threshold: float = 0.6):
"""Run real-time detection on webcam."""
cap = cv2.VideoCapture(camera_id)
if not cap.isOpened():
print(f'Cannot open camera {camera_id}')
return
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
faces = detector.detect(frame)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.imshow('Face Detection', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
def main():
parser = argparse.ArgumentParser(description='Process video with face detection')
parser.add_argument('--source', type=str, required=True, help='Video path or camera ID (0, 1, ...)')
parser.add_argument('--output', type=str, default=None, help='Output video path (auto-generated if not specified)')
parser.add_argument('--detector', type=str, default='retinaface', choices=['retinaface', 'scrfd'])
parser.add_argument('--threshold', type=float, default=0.6, help='Visualization threshold')
parser.add_argument('--preview', action='store_true', help='Show live preview')
parser.add_argument('--save-dir', type=str, default='outputs', help='Output directory (if --output not specified)')
args = parser.parse_args()
detector = RetinaFace() if args.detector == 'retinaface' else SCRFD()
source_type = get_source_type(args.source)
if source_type == 'camera':
run_camera(detector, int(args.source), args.threshold)
elif source_type == 'video':
if not os.path.exists(args.source):
print(f'Error: Video not found: {args.source}')
return
# Determine output path
if args.output:
output_path = args.output
else:
os.makedirs(args.save_dir, exist_ok=True)
output_path = os.path.join(args.save_dir, f'{Path(args.source).stem}_detected.mp4')
process_video(detector, args.source, output_path, args.threshold, args.preview)
else:
print(f"Error: Unknown source type for '{args.source}'")
print('Supported formats: videos (.mp4, .avi, ...) or camera ID (0, 1, ...)')
if __name__ == '__main__':
main()

View File

@@ -16,27 +16,12 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
import numpy as np
from uniface import RetinaFace, XSeg
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
from uniface.detection import RetinaFace
from uniface.parsing import XSeg
def apply_mask_visualization(image: np.ndarray, mask: np.ndarray, alpha: float = 0.5) -> np.ndarray:
@@ -74,7 +59,7 @@ def process_image(
print(f' Face {i + 1}: skipped (no landmarks)')
continue
mask = parser.parse(image, face.landmarks)
mask = parser.parse(image, landmarks=face.landmarks)
full_mask = np.maximum(full_mask, mask)
print(f' Face {i + 1}: done')
@@ -136,7 +121,7 @@ def process_video(
for face in faces:
if face.landmarks is None:
continue
mask = parser.parse(frame, face.landmarks)
mask = parser.parse(frame, landmarks=face.landmarks)
full_mask = np.maximum(full_mask, mask)
# Apply visualization
@@ -184,7 +169,7 @@ def run_camera(
for face in faces:
if face.landmarks is None:
continue
mask = parser.parse(frame, face.landmarks)
mask = parser.parse(frame, landmarks=face.landmarks)
full_mask = np.maximum(full_mask, mask)
# Apply visualization

View File

@@ -16,6 +16,7 @@
This library provides unified APIs for:
- Face detection (RetinaFace, SCRFD, YOLOv5Face, YOLOv8Face)
- Face recognition (AdaFace, ArcFace, MobileFace, SphereFace)
- Face tracking (ByteTrack with Kalman filtering)
- Facial landmarks (106-point detection)
- Face parsing (semantic segmentation)
- Gaze estimation
@@ -28,38 +29,36 @@ from __future__ import annotations
__license__ = 'MIT'
__author__ = 'Yakhyokhuja Valikhujaev'
__version__ = '2.3.0'
__version__ = '3.1.0'
import contextlib
from uniface.face_utils import compute_similarity, face_alignment
from uniface.log import Logger, enable_logging
from uniface.model_store import verify_model_weights
from uniface.visualization import draw_detections, vis_parsing_maps
from uniface.model_store import download_models, get_cache_dir, set_cache_dir, verify_model_weights
from .analyzer import FaceAnalyzer
from .attribute import AgeGender, FairFace
from .attribute import AgeGender, Emotion, FairFace
from .detection import (
SCRFD,
RetinaFace,
YOLOv5Face,
YOLOv8Face,
create_detector,
detect_faces,
list_available_detectors,
)
from .gaze import MobileGaze, create_gaze_estimator
from .landmark import Landmark106, create_landmarker
from .parsing import BiSeNet, XSeg, create_face_parser
from .privacy import BlurFace, anonymize_faces
from .privacy import BlurFace
from .recognition import AdaFace, ArcFace, MobileFace, SphereFace, create_recognizer
from .spoofing import MiniFASNet, create_spoofer
from .tracking import BYTETracker
from .types import AttributeResult, EmotionResult, Face, GazeResult, SpoofingResult
# Optional: Emotion requires PyTorch
Emotion: type | None
try:
from .attribute import Emotion
except ImportError:
Emotion = None
# Optional: FAISS vector store (requires `pip install faiss-cpu`)
with contextlib.suppress(ImportError):
from .indexing import FAISS
__all__ = [
# Metadata
@@ -76,7 +75,6 @@ __all__ = [
'create_landmarker',
'create_recognizer',
'create_spoofer',
'detect_faces',
'list_available_detectors',
# Detection models
'RetinaFace',
@@ -105,15 +103,19 @@ __all__ = [
# Spoofing models
'MiniFASNet',
'SpoofingResult',
# Tracking
'BYTETracker',
# Privacy
'BlurFace',
'anonymize_faces',
# Indexing (optional)
'FAISS',
# Utilities
'Logger',
'compute_similarity',
'draw_detections',
'download_models',
'enable_logging',
'face_alignment',
'get_cache_dir',
'set_cache_dir',
'verify_model_weights',
'vis_parsing_maps',
]

View File

@@ -12,18 +12,27 @@ from uniface.attribute.age_gender import AgeGender
from uniface.attribute.base import Attribute
from uniface.attribute.fairface import FairFace
from uniface.constants import AgeGenderWeights, DDAMFNWeights, FairFaceWeights
from uniface.types import AttributeResult, EmotionResult, Face
from uniface.types import AttributeResult, EmotionResult
# Emotion requires PyTorch - make it optional
try:
from uniface.attribute.emotion import Emotion
_EMOTION_AVAILABLE = True
except ImportError:
Emotion = None
_EMOTION_AVAILABLE = False
# Public API for the attribute module
class Emotion(Attribute): # type: ignore[no-redef]
"""Stub for Emotion when PyTorch is not installed."""
def __init__(self, *args: Any, **kwargs: Any) -> None:
raise ImportError("Emotion requires optional dependency 'torch'. Install with: pip install torch")
def _initialize_model(self) -> None: ...
def preprocess(self, image: np.ndarray, *args: Any) -> Any: ...
def postprocess(self, prediction: Any) -> Any: ...
def predict(self, image: np.ndarray, *args: Any) -> Any: ...
__all__ = [
'AgeGender',
'AttributeResult',
@@ -31,16 +40,13 @@ __all__ = [
'EmotionResult',
'FairFace',
'create_attribute_predictor',
'predict_attributes',
]
# A mapping from model enums to their corresponding attribute classes
_ATTRIBUTE_MODELS = {
**dict.fromkeys(AgeGenderWeights, AgeGender),
**dict.fromkeys(FairFaceWeights, FairFace),
}
# Add Emotion models only if PyTorch is available
if _EMOTION_AVAILABLE:
_ATTRIBUTE_MODELS.update(dict.fromkeys(DDAMFNWeights, Emotion))
@@ -48,21 +54,16 @@ if _EMOTION_AVAILABLE:
def create_attribute_predictor(
model_name: AgeGenderWeights | DDAMFNWeights | FairFaceWeights, **kwargs: Any
) -> Attribute:
"""
Factory function to create an attribute predictor instance.
This high-level API simplifies the creation of attribute models by
dynamically selecting the correct class based on the provided model enum.
"""Factory function to create an attribute predictor instance.
Args:
model_name: The enum corresponding to the desired attribute model
(e.g., AgeGenderWeights.DEFAULT, DDAMFNWeights.AFFECNET7,
or FairFaceWeights.DEFAULT).
**kwargs: Additional keyword arguments to pass to the model's constructor.
(e.g., AgeGenderWeights.DEFAULT, DDAMFNWeights.AFFECNET7,
or FairFaceWeights.DEFAULT).
**kwargs: Additional keyword arguments passed to the model constructor.
Returns:
An initialized instance of an Attribute predictor class
(e.g., AgeGender, FairFace, or Emotion).
An initialized Attribute predictor (AgeGender, FairFace, or Emotion).
Raises:
ValueError: If the provided model_name is not a supported enum.
@@ -75,40 +76,4 @@ def create_attribute_predictor(
f'Please choose from AgeGenderWeights, FairFaceWeights, or DDAMFNWeights.'
)
# Pass model_name to the constructor, as some classes might need it
return model_class(model_name=model_name, **kwargs)
def predict_attributes(image: np.ndarray, faces: list[Face], predictor: Attribute) -> list[Face]:
"""
High-level API to predict attributes for multiple detected faces.
This function iterates through a list of Face objects, runs the
specified attribute predictor on each one, and updates the Face
objects with the predicted attributes.
Args:
image (np.ndarray): The full input image in BGR format.
faces (List[Face]): A list of Face objects from face detection.
predictor (Attribute): An initialized attribute predictor instance,
created by `create_attribute_predictor`.
Returns:
List[Face]: The list of Face objects with updated attribute fields.
"""
for face in faces:
if isinstance(predictor, AgeGender):
result = predictor(image, face.bbox)
face.gender = result.gender
face.age = result.age
elif isinstance(predictor, FairFace):
result = predictor(image, face.bbox)
face.gender = result.gender
face.age_group = result.age_group
face.race = result.race
elif isinstance(predictor, Emotion):
result = predictor(image, face.landmarks)
face.emotion = result.emotion
face.emotion_confidence = result.confidence
return faces

View File

@@ -28,17 +28,17 @@ class Emotion(Attribute):
def __init__(
self,
model_weights: DDAMFNWeights = DDAMFNWeights.AFFECNET7,
model_name: DDAMFNWeights = DDAMFNWeights.AFFECNET7,
input_size: tuple[int, int] = (112, 112),
) -> None:
"""
Initializes the emotion recognition model.
Args:
model_weights (DDAMFNWeights): The enum for the model weights to load.
model_name (DDAMFNWeights): The enum for the model weights to load.
input_size (Tuple[int, int]): The expected input size for the model.
"""
Logger.info(f'Initializing Emotion with model={model_weights.name}')
Logger.info(f'Initializing Emotion with model={model_name.name}')
if torch.backends.mps.is_available():
self.device = torch.device('mps')
@@ -48,7 +48,7 @@ class Emotion(Attribute):
self.device = torch.device('cpu')
self.input_size = input_size
self.model_path = verify_model_weights(model_weights)
self.model_path = verify_model_weights(model_name)
# Define emotion labels based on the selected model
self.emotion_labels = [
@@ -60,7 +60,7 @@ class Emotion(Attribute):
'Disgust',
'Angry',
]
if model_weights == DDAMFNWeights.AFFECNET8:
if model_name == DDAMFNWeights.AFFECNET8:
self.emotion_labels.append('Contempt')
self._initialize_model()

View File

@@ -18,6 +18,7 @@ __all__ = [
'generate_anchors',
'non_max_suppression',
'resize_image',
'xyxy_to_cxcywh',
]
@@ -61,6 +62,23 @@ def resize_image(
return image, resize_factor
def xyxy_to_cxcywh(bboxes: np.ndarray) -> np.ndarray:
"""Convert bounding boxes from ``[x1, y1, x2, y2]`` to ``[cx, cy, w, h]``.
Args:
bboxes: Array of shape (N, 4) or (4,) with ``[x1, y1, x2, y2]`` coordinates.
Returns:
Array of the same shape with ``[cx, cy, w, h]`` coordinates.
"""
out = np.empty_like(bboxes)
out[..., 0] = (bboxes[..., 0] + bboxes[..., 2]) / 2 # cx
out[..., 1] = (bboxes[..., 1] + bboxes[..., 3]) / 2 # cy
out[..., 2] = bboxes[..., 2] - bboxes[..., 0] # w
out[..., 3] = bboxes[..., 3] - bboxes[..., 1] # h
return out
def generate_anchors(image_size: tuple[int, int] = (640, 640)) -> np.ndarray:
"""Generate anchor boxes for a given image size (RetinaFace specific).

View File

@@ -2,9 +2,25 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
from dataclasses import dataclass
from enum import Enum
@dataclass(frozen=True, slots=True)
class ModelInfo:
"""Model metadata including download URL and SHA-256 hash.
Attributes:
url: Direct download link to the model weights.
sha256: SHA-256 checksum for integrity verification.
"""
url: str
sha256: str
# fmt: off
class SphereFaceWeights(str, Enum):
"""
@@ -166,125 +182,202 @@ class MiniFASNetWeights(str, Enum):
https://github.com/yakhyo/face-anti-spoofing
Model Variants:
- V1SE: Uses scale=4.0 for face crop (squeese-and-excitation version)
- V1SE: Uses scale=4.0 for face crop (squeeze-and-excitation version)
- V2: Uses scale=2.7 for face crop (improved version)
"""
V1SE = "minifasnet_v1se"
V2 = "minifasnet_v2"
MODEL_URLS: dict[Enum, str] = {
# Centralized Model Registry
MODEL_REGISTRY: dict[Enum, ModelInfo] = {
# RetinaFace
RetinaFaceWeights.MNET_025: 'https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv1_0.25.onnx',
RetinaFaceWeights.MNET_050: 'https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv1_0.50.onnx',
RetinaFaceWeights.MNET_V1: 'https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv1.onnx',
RetinaFaceWeights.MNET_V2: 'https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv2.onnx',
RetinaFaceWeights.RESNET18: 'https://github.com/yakhyo/uniface/releases/download/weights/retinaface_r18.onnx',
RetinaFaceWeights.RESNET34: 'https://github.com/yakhyo/uniface/releases/download/weights/retinaface_r34.onnx',
RetinaFaceWeights.MNET_025: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv1_0.25.onnx',
sha256='b7a7acab55e104dce6f32cdfff929bd83946da5cd869b9e2e9bdffafd1b7e4a5'
),
RetinaFaceWeights.MNET_050: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv1_0.50.onnx',
sha256='d8977186f6037999af5b4113d42ba77a84a6ab0c996b17c713cc3d53b88bfc37'
),
RetinaFaceWeights.MNET_V1: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv1.onnx',
sha256='75c961aaf0aff03d13c074e9ec656e5510e174454dd4964a161aab4fe5f04153'
),
RetinaFaceWeights.MNET_V2: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv2.onnx',
sha256='3ca44c045651cabeed1193a1fae8946ad1f3a55da8fa74b341feab5a8319f757'
),
RetinaFaceWeights.RESNET18: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/retinaface_r18.onnx',
sha256='e8b5ddd7d2c3c8f7c942f9f10cec09d8e319f78f09725d3f709631de34fb649d'
),
RetinaFaceWeights.RESNET34: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/retinaface_r34.onnx',
sha256='bd0263dc2a465d32859555cb1741f2d98991eb0053696e8ee33fec583d30e630'
),
# MobileFace
MobileFaceWeights.MNET_025: 'https://github.com/yakhyo/uniface/releases/download/weights/mobilenetv1_0.25.onnx',
MobileFaceWeights.MNET_V2: 'https://github.com/yakhyo/uniface/releases/download/weights/mobilenetv2.onnx',
MobileFaceWeights.MNET_V3_SMALL: 'https://github.com/yakhyo/uniface/releases/download/weights/mobilenetv3_small.onnx',
MobileFaceWeights.MNET_V3_LARGE: 'https://github.com/yakhyo/uniface/releases/download/weights/mobilenetv3_large.onnx',
MobileFaceWeights.MNET_025: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/mobilenetv1_0.25.onnx',
sha256='eeda7d23d9c2b40cf77fa8da8e895b5697465192648852216074679657f8ee8b'
),
MobileFaceWeights.MNET_V2: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/mobilenetv2.onnx',
sha256='38b148284dd48cc898d5d4453104252fbdcbacc105fe3f0b80e78954d9d20d89'
),
MobileFaceWeights.MNET_V3_SMALL: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/mobilenetv3_small.onnx',
sha256='d4acafa1039a82957aa8a9a1dac278a401c353a749c39df43de0e29cc1c127c3'
),
MobileFaceWeights.MNET_V3_LARGE: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/mobilenetv3_large.onnx',
sha256='0e48f8e11f070211716d03e5c65a3db35a5e917cfb5bc30552358629775a142a'
),
# SphereFace
SphereFaceWeights.SPHERE20: 'https://github.com/yakhyo/uniface/releases/download/weights/sphere20.onnx',
SphereFaceWeights.SPHERE36: 'https://github.com/yakhyo/uniface/releases/download/weights/sphere36.onnx',
SphereFaceWeights.SPHERE20: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/sphere20.onnx',
sha256='c02878cf658eb1861f580b7e7144b0d27cc29c440bcaa6a99d466d2854f14c9d'
),
SphereFaceWeights.SPHERE36: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/sphere36.onnx',
sha256='13b3890cd5d7dec2b63f7c36fd7ce07403e5a0bbb701d9647c0289e6cbe7bb20'
),
# ArcFace
ArcFaceWeights.MNET: 'https://github.com/yakhyo/uniface/releases/download/weights/w600k_mbf.onnx',
ArcFaceWeights.RESNET: 'https://github.com/yakhyo/uniface/releases/download/weights/w600k_r50.onnx',
ArcFaceWeights.MNET: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/w600k_mbf.onnx',
sha256='9cc6e4a75f0e2bf0b1aed94578f144d15175f357bdc05e815e5c4a02b319eb4f'
),
ArcFaceWeights.RESNET: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/w600k_r50.onnx',
sha256='4c06341c33c2ca1f86781dab0e829f88ad5b64be9fba56e56bc9ebdefc619e43'
),
# AdaFace
AdaFaceWeights.IR_18: 'https://github.com/yakhyo/adaface-onnx/releases/download/weights/adaface_ir_18.onnx',
AdaFaceWeights.IR_101: 'https://github.com/yakhyo/adaface-onnx/releases/download/weights/adaface_ir_101.onnx',
AdaFaceWeights.IR_18: ModelInfo(
url='https://github.com/yakhyo/adaface-onnx/releases/download/weights/adaface_ir_18.onnx',
sha256='6b6a35772fb636cdd4fa86520c1a259d0c41472a76f70f802b351837a00d9870'
),
AdaFaceWeights.IR_101: ModelInfo(
url='https://github.com/yakhyo/adaface-onnx/releases/download/weights/adaface_ir_101.onnx',
sha256='f2eb07d03de0af560a82e1214df799fec5e09375d43521e2868f9dc387e5a43e'
),
# SCRFD
SCRFDWeights.SCRFD_10G_KPS: 'https://github.com/yakhyo/uniface/releases/download/weights/scrfd_10g_kps.onnx',
SCRFDWeights.SCRFD_500M_KPS: 'https://github.com/yakhyo/uniface/releases/download/weights/scrfd_500m_kps.onnx',
SCRFDWeights.SCRFD_10G_KPS: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/scrfd_10g_kps.onnx',
sha256='5838f7fe053675b1c7a08b633df49e7af5495cee0493c7dcf6697200b85b5b91'
),
SCRFDWeights.SCRFD_500M_KPS: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/scrfd_500m_kps.onnx',
sha256='5e4447f50245bbd7966bd6c0fa52938c61474a04ec7def48753668a9d8b4ea3a'
),
# YOLOv5-Face
YOLOv5FaceWeights.YOLOV5N: 'https://github.com/yakhyo/yolov5-face-onnx-inference/releases/download/weights/yolov5n_face.onnx',
YOLOv5FaceWeights.YOLOV5S: 'https://github.com/yakhyo/yolov5-face-onnx-inference/releases/download/weights/yolov5s_face.onnx',
YOLOv5FaceWeights.YOLOV5M: 'https://github.com/yakhyo/yolov5-face-onnx-inference/releases/download/weights/yolov5m_face.onnx',
YOLOv5FaceWeights.YOLOV5N: ModelInfo(
url='https://github.com/yakhyo/yolov5-face-onnx-inference/releases/download/weights/yolov5n_face.onnx',
sha256='eb244a06e36999db732b317c2b30fa113cd6cfc1a397eaf738f2d6f33c01f640'
),
YOLOv5FaceWeights.YOLOV5S: ModelInfo(
url='https://github.com/yakhyo/yolov5-face-onnx-inference/releases/download/weights/yolov5s_face.onnx',
sha256='fc682801cd5880e1e296184a14aea0035486b5146ec1a1389d2e7149cb134bb2'
),
YOLOv5FaceWeights.YOLOV5M: ModelInfo(
url='https://github.com/yakhyo/yolov5-face-onnx-inference/releases/download/weights/yolov5m_face.onnx',
sha256='04302ce27a15bde3e20945691b688e2dd018a10e92dd8932146bede6a49207b2'
),
# YOLOv8-Face
YOLOv8FaceWeights.YOLOV8_LITE_S: 'https://github.com/yakhyo/yolov8-face-onnx-inference/releases/download/weights/yolov8-lite-s.onnx',
YOLOv8FaceWeights.YOLOV8N: 'https://github.com/yakhyo/yolov8-face-onnx-inference/releases/download/weights/yolov8n-face.onnx',
YOLOv8FaceWeights.YOLOV8_LITE_S: ModelInfo(
url='https://github.com/yakhyo/yolov8-face-onnx-inference/releases/download/weights/yolov8-lite-s.onnx',
sha256='11bc496be01356d2d960085bfd8abb8f103199900a034f239a8a1705a1b31dba'
),
YOLOv8FaceWeights.YOLOV8N: ModelInfo(
url='https://github.com/yakhyo/yolov8-face-onnx-inference/releases/download/weights/yolov8n-face.onnx',
sha256='33f3951af7fc0c4d9b321b29cdcd8c9a59d0a29a8d4bdc01fcb5507d5c714809'
),
# DDAFM
DDAMFNWeights.AFFECNET7: 'https://github.com/yakhyo/uniface/releases/download/weights/affecnet7.script',
DDAMFNWeights.AFFECNET8: 'https://github.com/yakhyo/uniface/releases/download/weights/affecnet8.script',
DDAMFNWeights.AFFECNET7: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/affecnet7.script',
sha256='10535bf8b6afe8e9d6ae26cea6c3add9a93036e9addb6adebfd4a972171d015d'
),
DDAMFNWeights.AFFECNET8: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/affecnet8.script',
sha256='8c66963bc71db42796a14dfcbfcd181b268b65a3fc16e87147d6a3a3d7e0f487'
),
# AgeGender
AgeGenderWeights.DEFAULT: 'https://github.com/yakhyo/uniface/releases/download/weights/genderage.onnx',
AgeGenderWeights.DEFAULT: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/genderage.onnx',
sha256='4fde69b1c810857b88c64a335084f1c3fe8f01246c9a191b48c7bb756d6652fb'
),
# FairFace
FairFaceWeights.DEFAULT: 'https://github.com/yakhyo/fairface-onnx/releases/download/weights/fairface.onnx',
FairFaceWeights.DEFAULT: ModelInfo(
url='https://github.com/yakhyo/fairface-onnx/releases/download/weights/fairface.onnx',
sha256='9c8c47d437cd310538d233f2465f9ed0524cb7fb51882a37f74e8bc22437fdbf'
),
# Landmarks
LandmarkWeights.DEFAULT: 'https://github.com/yakhyo/uniface/releases/download/weights/2d106det.onnx',
LandmarkWeights.DEFAULT: ModelInfo(
url='https://github.com/yakhyo/uniface/releases/download/weights/2d106det.onnx',
sha256='f001b856447c413801ef5c42091ed0cd516fcd21f2d6b79635b1e733a7109dbf'
),
# Gaze (MobileGaze)
GazeWeights.RESNET18: 'https://github.com/yakhyo/gaze-estimation/releases/download/weights/resnet18_gaze.onnx',
GazeWeights.RESNET34: 'https://github.com/yakhyo/gaze-estimation/releases/download/weights/resnet34_gaze.onnx',
GazeWeights.RESNET50: 'https://github.com/yakhyo/gaze-estimation/releases/download/weights/resnet50_gaze.onnx',
GazeWeights.MOBILENET_V2: 'https://github.com/yakhyo/gaze-estimation/releases/download/weights/mobilenetv2_gaze.onnx',
GazeWeights.MOBILEONE_S0: 'https://github.com/yakhyo/gaze-estimation/releases/download/weights/mobileone_s0_gaze.onnx',
GazeWeights.RESNET18: ModelInfo(
url='https://github.com/yakhyo/gaze-estimation/releases/download/weights/resnet18_gaze.onnx',
sha256='404fec1efd07ff49f981e47f461c20c2627119e465ec441bbd1c067d3f16e657'
),
GazeWeights.RESNET34: ModelInfo(
url='https://github.com/yakhyo/gaze-estimation/releases/download/weights/resnet34_gaze.onnx',
sha256='c8e6b14f6095d2425241b9302aa663d9a23b7dfb9d43941352b718c91dc7f2cf'
),
GazeWeights.RESNET50: ModelInfo(
url='https://github.com/yakhyo/gaze-estimation/releases/download/weights/resnet50_gaze.onnx',
sha256='bb28d421565adc4dfb665742f8fc80bdef36dd8caa0c87e040e0937f9fdca9a6'
),
GazeWeights.MOBILENET_V2: ModelInfo(
url='https://github.com/yakhyo/gaze-estimation/releases/download/weights/mobilenetv2_gaze.onnx',
sha256='b81312df85c7ac1c1b5f78c573620d22c2719cb839650e15f12dc7eecb7744a4'
),
GazeWeights.MOBILEONE_S0: ModelInfo(
url='https://github.com/yakhyo/gaze-estimation/releases/download/weights/mobileone_s0_gaze.onnx',
sha256='8b4fdc4e3da44733c9a82e7776b411e4a39f94e8e285aee0fc85a548a55f7d9f'
),
# Parsing
ParsingWeights.RESNET18: 'https://github.com/yakhyo/face-parsing/releases/download/weights/resnet18.onnx',
ParsingWeights.RESNET34: 'https://github.com/yakhyo/face-parsing/releases/download/weights/resnet34.onnx',
ParsingWeights.RESNET18: ModelInfo(
url='https://github.com/yakhyo/face-parsing/releases/download/weights/resnet18.onnx',
sha256='0d9bd318e46987c3bdbfacae9e2c0f461cae1c6ac6ea6d43bbe541a91727e33f'
),
ParsingWeights.RESNET34: ModelInfo(
url='https://github.com/yakhyo/face-parsing/releases/download/weights/resnet34.onnx',
sha256='5b805bba7b5660ab7070b5a381dcf75e5b3e04199f1e9387232a77a00095102e'
),
# Anti-Spoofing (MiniFASNet)
MiniFASNetWeights.V1SE: 'https://github.com/yakhyo/face-anti-spoofing/releases/download/weights/MiniFASNetV1SE.onnx',
MiniFASNetWeights.V2: 'https://github.com/yakhyo/face-anti-spoofing/releases/download/weights/MiniFASNetV2.onnx',
MiniFASNetWeights.V1SE: ModelInfo(
url='https://github.com/yakhyo/face-anti-spoofing/releases/download/weights/MiniFASNetV1SE.onnx',
sha256='ebab7f90c7833fbccd46d3a555410e78d969db5438e169b6524be444862b3676'
),
MiniFASNetWeights.V2: ModelInfo(
url='https://github.com/yakhyo/face-anti-spoofing/releases/download/weights/MiniFASNetV2.onnx',
sha256='b32929adc2d9c34b9486f8c4c7bc97c1b69bc0ea9befefc380e4faae4e463907'
),
# XSeg
XSegWeights.DEFAULT: 'https://github.com/yakhyo/face-segmentation/releases/download/weights/xseg.onnx',
XSegWeights.DEFAULT: ModelInfo(
url='https://github.com/yakhyo/face-segmentation/releases/download/weights/xseg.onnx',
sha256='0b57328efcb839d85973164b617ceee9dfe6cfcb2c82e8a033bba9f4f09b27e5'
),
}
MODEL_SHA256: dict[Enum, str] = {
# RetinaFace
RetinaFaceWeights.MNET_025: 'b7a7acab55e104dce6f32cdfff929bd83946da5cd869b9e2e9bdffafd1b7e4a5',
RetinaFaceWeights.MNET_050: 'd8977186f6037999af5b4113d42ba77a84a6ab0c996b17c713cc3d53b88bfc37',
RetinaFaceWeights.MNET_V1: '75c961aaf0aff03d13c074e9ec656e5510e174454dd4964a161aab4fe5f04153',
RetinaFaceWeights.MNET_V2: '3ca44c045651cabeed1193a1fae8946ad1f3a55da8fa74b341feab5a8319f757',
RetinaFaceWeights.RESNET18: 'e8b5ddd7d2c3c8f7c942f9f10cec09d8e319f78f09725d3f709631de34fb649d',
RetinaFaceWeights.RESNET34: 'bd0263dc2a465d32859555cb1741f2d98991eb0053696e8ee33fec583d30e630',
# MobileFace
MobileFaceWeights.MNET_025: 'eeda7d23d9c2b40cf77fa8da8e895b5697465192648852216074679657f8ee8b',
MobileFaceWeights.MNET_V2: '38b148284dd48cc898d5d4453104252fbdcbacc105fe3f0b80e78954d9d20d89',
MobileFaceWeights.MNET_V3_SMALL: 'd4acafa1039a82957aa8a9a1dac278a401c353a749c39df43de0e29cc1c127c3',
MobileFaceWeights.MNET_V3_LARGE: '0e48f8e11f070211716d03e5c65a3db35a5e917cfb5bc30552358629775a142a',
# SphereFace
SphereFaceWeights.SPHERE20: 'c02878cf658eb1861f580b7e7144b0d27cc29c440bcaa6a99d466d2854f14c9d',
SphereFaceWeights.SPHERE36: '13b3890cd5d7dec2b63f7c36fd7ce07403e5a0bbb701d9647c0289e6cbe7bb20',
# ArcFace
ArcFaceWeights.MNET: '9cc6e4a75f0e2bf0b1aed94578f144d15175f357bdc05e815e5c4a02b319eb4f',
ArcFaceWeights.RESNET: '4c06341c33c2ca1f86781dab0e829f88ad5b64be9fba56e56bc9ebdefc619e43',
# AdaFace
AdaFaceWeights.IR_18: '6b6a35772fb636cdd4fa86520c1a259d0c41472a76f70f802b351837a00d9870',
AdaFaceWeights.IR_101: 'f2eb07d03de0af560a82e1214df799fec5e09375d43521e2868f9dc387e5a43e',
# SCRFD
SCRFDWeights.SCRFD_10G_KPS: '5838f7fe053675b1c7a08b633df49e7af5495cee0493c7dcf6697200b85b5b91',
SCRFDWeights.SCRFD_500M_KPS: '5e4447f50245bbd7966bd6c0fa52938c61474a04ec7def48753668a9d8b4ea3a',
# YOLOv5-Face
YOLOv5FaceWeights.YOLOV5N: 'eb244a06e36999db732b317c2b30fa113cd6cfc1a397eaf738f2d6f33c01f640',
YOLOv5FaceWeights.YOLOV5S: 'fc682801cd5880e1e296184a14aea0035486b5146ec1a1389d2e7149cb134bb2',
YOLOv5FaceWeights.YOLOV5M: '04302ce27a15bde3e20945691b688e2dd018a10e92dd8932146bede6a49207b2',
# YOLOv8-Face
YOLOv8FaceWeights.YOLOV8_LITE_S: '11bc496be01356d2d960085bfd8abb8f103199900a034f239a8a1705a1b31dba',
YOLOv8FaceWeights.YOLOV8N: '33f3951af7fc0c4d9b321b29cdcd8c9a59d0a29a8d4bdc01fcb5507d5c714809',
# DDAFM
DDAMFNWeights.AFFECNET7: '10535bf8b6afe8e9d6ae26cea6c3add9a93036e9addb6adebfd4a972171d015d',
DDAMFNWeights.AFFECNET8: '8c66963bc71db42796a14dfcbfcd181b268b65a3fc16e87147d6a3a3d7e0f487',
# AgeGender
AgeGenderWeights.DEFAULT: '4fde69b1c810857b88c64a335084f1c3fe8f01246c9a191b48c7bb756d6652fb',
# FairFace
FairFaceWeights.DEFAULT: '9c8c47d437cd310538d233f2465f9ed0524cb7fb51882a37f74e8bc22437fdbf',
# Landmark
LandmarkWeights.DEFAULT: 'f001b856447c413801ef5c42091ed0cd516fcd21f2d6b79635b1e733a7109dbf',
# MobileGaze (trained on Gaze360)
GazeWeights.RESNET18: '23d5d7e4f6f40dce8c35274ce9d08b45b9e22cbaaf5af73182f473229d713d31',
GazeWeights.RESNET34: '4457ee5f7acd1a5ab02da4b61f02fc3a0b17adbf3844dd0ba3cd4288f2b5e1de',
GazeWeights.RESNET50: 'e1eaf98f5ec7c89c6abe7cfe39f7be83e747163f98d1ff945c0603b3c521be22',
GazeWeights.MOBILENET_V2: 'fdcdb84e3e6421b5a79e8f95139f249fc258d7f387eed5ddac2b80a9a15ce076',
GazeWeights.MOBILEONE_S0: 'c0b5a4f4a0ffd24f76ab3c1452354bb2f60110899fd9a88b464c75bafec0fde8',
# Face Parsing
ParsingWeights.RESNET18: '0d9bd318e46987c3bdbfacae9e2c0f461cae1c6ac6ea6d43bbe541a91727e33f',
ParsingWeights.RESNET34: '5b805bba7b5660ab7070b5a381dcf75e5b3e04199f1e9387232a77a00095102e',
# Anti-Spoofing (MiniFASNet)
MiniFASNetWeights.V1SE: 'ebab7f90c7833fbccd46d3a555410e78d969db5438e169b6524be444862b3676',
MiniFASNetWeights.V2: 'b32929adc2d9c34b9486f8c4c7bc97c1b69bc0ea9befefc380e4faae4e463907',
# XSeg
XSegWeights.DEFAULT: '0b57328efcb839d85973164b617ceee9dfe6cfcb2c82e8a033bba9f4f09b27e5',
}
# Backward compatibility (optional, can be removed if all code uses MODEL_REGISTRY)
MODEL_URLS: dict[Enum, str] = {k: v.url for k, v in MODEL_REGISTRY.items()}
MODEL_SHA256: dict[Enum, str] = {k: v.sha256 for k, v in MODEL_REGISTRY.items()}
CHUNK_SIZE = 8192

View File

@@ -6,9 +6,12 @@ from __future__ import annotations
from typing import Any
import numpy as np
from uniface.types import Face
from uniface.constants import (
RetinaFaceWeights,
SCRFDWeights,
YOLOv5FaceWeights,
YOLOv8FaceWeights,
)
from .base import BaseDetector
from .retinaface import RetinaFace
@@ -16,48 +19,6 @@ from .scrfd import SCRFD
from .yolov5 import YOLOv5Face
from .yolov8 import YOLOv8Face
# Global cache for detector instances (keyed by method name + config hash)
_detector_cache: dict[str, BaseDetector] = {}
def detect_faces(image: np.ndarray, method: str = 'retinaface', **kwargs: Any) -> list[Face]:
"""High-level face detection function.
Detects faces in an image using the specified detection method.
Results are cached for repeated calls with the same configuration.
Args:
image: Input image as numpy array with shape (H, W, C) in BGR format.
method: Detection method to use. Options: 'retinaface', 'scrfd', 'yolov5face', 'yolov8face'.
**kwargs: Additional arguments passed to the detector.
Returns:
A list of Face objects, each containing:
- bbox: [x1, y1, x2, y2] bounding box coordinates.
- confidence: The confidence score of the detection.
- landmarks: 5-point facial landmarks with shape (5, 2).
Example:
>>> from uniface import detect_faces
>>> import cv2
>>> image = cv2.imread('your_image.jpg')
>>> faces = detect_faces(image, method='retinaface', confidence_threshold=0.8)
>>> for face in faces:
... print(f'Found face with confidence: {face.confidence}')
... print(f'BBox: {face.bbox}')
"""
method_name = method.lower()
sorted_kwargs = sorted(kwargs.items())
cache_key = f'{method_name}_{sorted_kwargs!s}'
if cache_key not in _detector_cache:
# Pass kwargs to create the correctly configured detector
_detector_cache[cache_key] = create_detector(method, **kwargs)
detector = _detector_cache[cache_key]
return detector.detect(image)
def create_detector(method: str = 'retinaface', **kwargs: Any) -> BaseDetector:
"""Factory function to create face detectors.
@@ -122,7 +83,7 @@ def list_available_detectors() -> dict[str, dict[str, Any]]:
'supports_landmarks': True,
'paper': 'https://arxiv.org/abs/1905.00641',
'default_params': {
'model_name': 'mnet_v2',
'model_name': RetinaFaceWeights.MNET_V2.value,
'confidence_threshold': 0.5,
'nms_threshold': 0.4,
'input_size': (640, 640),
@@ -133,7 +94,7 @@ def list_available_detectors() -> dict[str, dict[str, Any]]:
'supports_landmarks': True,
'paper': 'https://arxiv.org/abs/2105.04714',
'default_params': {
'model_name': 'scrfd_10g_kps',
'model_name': SCRFDWeights.SCRFD_10G_KPS.value,
'confidence_threshold': 0.5,
'nms_threshold': 0.4,
'input_size': (640, 640),
@@ -144,9 +105,9 @@ def list_available_detectors() -> dict[str, dict[str, Any]]:
'supports_landmarks': True,
'paper': 'https://arxiv.org/abs/2105.12931',
'default_params': {
'model_name': 'yolov5s_face',
'confidence_threshold': 0.25,
'nms_threshold': 0.45,
'model_name': YOLOv5FaceWeights.YOLOV5S.value,
'confidence_threshold': 0.6,
'nms_threshold': 0.5,
'input_size': 640,
},
},
@@ -155,7 +116,7 @@ def list_available_detectors() -> dict[str, dict[str, Any]]:
'supports_landmarks': True,
'paper': 'https://github.com/derronqi/yolov8-face',
'default_params': {
'model_name': 'yolov8n_face',
'model_name': YOLOv8FaceWeights.YOLOV8N.value,
'confidence_threshold': 0.5,
'nms_threshold': 0.45,
'input_size': 640,
@@ -171,6 +132,5 @@ __all__ = [
'YOLOv5Face',
'YOLOv8Face',
'create_detector',
'detect_faces',
'list_available_detectors',
]

475
uniface/draw.py Normal file
View File

@@ -0,0 +1,475 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
import colorsys
from typing import TYPE_CHECKING
import cv2
import numpy as np
if TYPE_CHECKING:
from uniface.types import Face
__all__ = [
'FACE_PARSING_COLORS',
'FACE_PARSING_LABELS',
'calculate_optimal_line_thickness',
'calculate_optimal_text_scale',
'draw_corner_bbox',
'draw_detections',
'draw_gaze',
'draw_text_label',
'draw_tracks',
'vis_parsing_maps',
]
# Face parsing component names (19 classes)
FACE_PARSING_LABELS = [
'background',
'skin',
'l_brow',
'r_brow',
'l_eye',
'r_eye',
'eye_g',
'l_ear',
'r_ear',
'ear_r',
'nose',
'mouth',
'u_lip',
'l_lip',
'neck',
'neck_l',
'cloth',
'hair',
'hat',
]
# Color palette for face parsing visualization
FACE_PARSING_COLORS = [
[0, 0, 0],
[255, 85, 0],
[255, 170, 0],
[255, 0, 85],
[255, 0, 170],
[0, 255, 0],
[85, 255, 0],
[170, 255, 0],
[0, 255, 85],
[0, 255, 170],
[0, 0, 255],
[85, 0, 255],
[170, 0, 255],
[0, 85, 255],
[0, 170, 255],
[255, 255, 0],
[255, 255, 85],
[255, 255, 170],
[255, 0, 255],
]
# Per-point colors for 5-point facial landmarks (BGR)
_LANDMARK_COLORS = (
(0, 0, 255),
(0, 255, 255),
(255, 0, 255),
(0, 255, 0),
(255, 0, 0),
)
def _get_color(idx: int) -> tuple[int, int, int]:
"""Get a visually distinct BGR color for a given index.
Uses golden-ratio hue stepping in HSV space to maximize perceptual
separation between consecutive indices. Works for any non-negative index.
Args:
idx: Non-negative integer index (e.g. track ID).
Returns:
BGR color tuple suitable for OpenCV drawing functions.
"""
golden_ratio = 0.618033988749895
hue = (idx * golden_ratio) % 1.0
# HSV -> RGB with fixed saturation=0.85 and value=0.95 for vivid colors
r, g, b = colorsys.hsv_to_rgb(hue, 0.85, 0.95)
return int(b * 255), int(g * 255), int(r * 255)
def calculate_optimal_line_thickness(resolution_wh: tuple[int, int]) -> int:
"""Calculate adaptive line thickness based on image resolution.
Args:
resolution_wh: Image resolution as ``(width, height)``.
Returns:
Recommended line thickness in pixels.
Example:
>>> calculate_optimal_line_thickness((1920, 1080))
4
>>> calculate_optimal_line_thickness((640, 480))
2
"""
return max(round(sum(resolution_wh) / 2 * 0.003), 2)
def calculate_optimal_text_scale(resolution_wh: tuple[int, int]) -> float:
"""Calculate adaptive font scale based on image resolution.
Args:
resolution_wh: Image resolution as ``(width, height)``.
Returns:
Recommended font scale factor.
Example:
>>> calculate_optimal_text_scale((1920, 1080))
1.08
>>> calculate_optimal_text_scale((640, 480))
0.48
"""
return min(resolution_wh) * 1e-3
def draw_corner_bbox(
image: np.ndarray,
bbox: np.ndarray,
color: tuple[int, int, int] = (0, 255, 0),
thickness: int = 3,
proportion: float = 0.2,
) -> None:
"""Draw a bounding box with corner brackets on an image.
Draws a thin full rectangle with thick corner accents, commonly used in
face-detection overlays for a clean look.
Args:
image: Input image to draw on (modified in-place).
bbox: Bounding box in xyxy format ``[x1, y1, x2, y2]``.
color: BGR color of the box. Defaults to green ``(0, 255, 0)``.
thickness: Thickness of corner bracket lines. Defaults to 3.
proportion: Corner length as a fraction of the shorter side.
Defaults to 0.2.
"""
x1, y1, x2, y2 = map(int, bbox)
corner_length = int(proportion * min(x2 - x1, y2 - y1))
# Thin full rectangle
cv2.rectangle(image, (x1, y1), (x2, y2), color, 1)
# Top-left corner
cv2.line(image, (x1, y1), (x1 + corner_length, y1), color, thickness)
cv2.line(image, (x1, y1), (x1, y1 + corner_length), color, thickness)
# Top-right corner
cv2.line(image, (x2, y1), (x2 - corner_length, y1), color, thickness)
cv2.line(image, (x2, y1), (x2, y1 + corner_length), color, thickness)
# Bottom-left corner
cv2.line(image, (x1, y2), (x1, y2 - corner_length), color, thickness)
cv2.line(image, (x1, y2), (x1 + corner_length, y2), color, thickness)
# Bottom-right corner
cv2.line(image, (x2, y2), (x2, y2 - corner_length), color, thickness)
cv2.line(image, (x2, y2), (x2 - corner_length, y2), color, thickness)
def draw_text_label(
image: np.ndarray,
text: str,
x: int,
y: int,
bg_color: tuple[int, int, int],
text_color: tuple[int, int, int] = (255, 255, 255),
font_scale: float = 0.5,
font_thickness: int = 2,
padding: int = 5,
) -> None:
"""Draw text with a filled background rectangle above a given position.
The label is placed so that its bottom edge sits at *y*, making it
suitable for positioning above a bounding box top-left corner.
Args:
image: Input image to draw on (modified in-place).
text: The text string to render.
x: Left x-coordinate for the label.
y: Bottom y-coordinate for the label (e.g. ``bbox[1]``).
bg_color: BGR background fill color.
text_color: BGR text color. Defaults to white.
font_scale: OpenCV font scale factor. Defaults to 0.5.
font_thickness: OpenCV font thickness. Defaults to 2.
padding: Pixel padding around the text. Defaults to 5.
"""
(tw, th), baseline = cv2.getTextSize(text, cv2.FONT_HERSHEY_SIMPLEX, font_scale, font_thickness)
cv2.rectangle(
image,
(x, y - th - baseline - padding * 2),
(x + tw + padding * 2, y),
bg_color,
-1,
)
cv2.putText(
image,
text,
(x + padding, y - padding),
cv2.FONT_HERSHEY_SIMPLEX,
font_scale,
text_color,
font_thickness,
)
def draw_detections(
*,
image: np.ndarray,
bboxes: list[np.ndarray] | list[list[float]],
scores: np.ndarray | list[float],
landmarks: list[np.ndarray] | list[list[list[float]]],
vis_threshold: float = 0.6,
draw_score: bool = False,
corner_bbox: bool = True,
) -> None:
"""Draw bounding boxes, landmarks, and optional scores on an image.
Modifies the image in-place.
Args:
image: Input image to draw on (modified in-place).
bboxes: List of bounding boxes in xyxy format ``[x1, y1, x2, y2]``.
scores: List of confidence scores.
landmarks: List of landmark sets with shape ``(5, 2)``.
vis_threshold: Confidence threshold for filtering. Defaults to 0.6.
draw_score: Whether to draw confidence scores. Defaults to False.
corner_bbox: Use corner-style bounding boxes. Defaults to True.
"""
# Adaptive line thickness
line_thickness = max(round(sum(image.shape[:2]) / 2 * 0.003), 2)
for i, score in enumerate(scores):
if score < vis_threshold:
continue
bbox = np.array(bboxes[i], dtype=np.int32)
# Draw bounding box
if corner_bbox:
draw_corner_bbox(image, bbox, color=(0, 255, 0), thickness=line_thickness, proportion=0.2)
else:
cv2.rectangle(image, tuple(bbox[:2]), tuple(bbox[2:]), (0, 255, 0), line_thickness)
# Draw confidence score label
if draw_score:
font_scale = max(0.4, min(0.7, (bbox[3] - bbox[1]) / 200))
draw_text_label(
image,
f'{score:.2f}',
bbox[0],
bbox[1],
bg_color=(0, 255, 0),
text_color=(0, 0, 0),
font_scale=font_scale,
)
# Draw landmarks
landmark_set = np.array(landmarks[i], dtype=np.int32)
for j, point in enumerate(landmark_set):
cv2.circle(image, tuple(point), line_thickness + 1, _LANDMARK_COLORS[j % len(_LANDMARK_COLORS)], -1)
def draw_gaze(
image: np.ndarray,
bbox: np.ndarray,
pitch: np.ndarray | float,
yaw: np.ndarray | float,
*,
draw_bbox: bool = True,
corner_bbox: bool = True,
draw_angles: bool = True,
) -> None:
"""Draw gaze direction with optional bounding box on an image.
Modifies the image in-place.
Args:
image: Input image to draw on (modified in-place).
bbox: Face bounding box in xyxy format ``[x1, y1, x2, y2]``.
pitch: Vertical gaze angle in radians.
yaw: Horizontal gaze angle in radians.
draw_bbox: Whether to draw the bounding box. Defaults to True.
corner_bbox: Use corner-style bounding box. Defaults to True.
draw_angles: Whether to display pitch/yaw values as text. Defaults to True.
"""
x_min, y_min, x_max, y_max = map(int, bbox[:4])
# Adaptive line thickness
line_thickness = max(round(sum(image.shape[:2]) / 2 * 0.003), 2)
# Draw bounding box if requested
if draw_bbox:
if corner_bbox:
draw_corner_bbox(image, bbox, color=(0, 255, 0), thickness=line_thickness)
else:
cv2.rectangle(image, (x_min, y_min), (x_max, y_max), (0, 255, 0), line_thickness)
# Calculate center of the bounding box
x_center = (x_min + x_max) // 2
y_center = (y_min + y_max) // 2
# Calculate the direction of the gaze
length = x_max - x_min
dx = int(-length * np.sin(yaw) * np.cos(pitch))
dy = int(-length * np.sin(pitch))
# Draw gaze arrow
center_radius = max(line_thickness + 1, 4)
cv2.circle(image, (x_center, y_center), radius=center_radius, color=(0, 0, 255), thickness=-1)
cv2.arrowedLine(
image,
(x_center, y_center),
(x_center + dx, y_center + dy),
color=(0, 0, 255),
thickness=line_thickness,
line_type=cv2.LINE_AA,
tipLength=0.25,
)
# Draw angle values
if draw_angles:
font_scale = max(0.4, min(0.7, (y_max - y_min) / 200))
draw_text_label(
image,
f'P:{np.degrees(pitch):.0f}deg Y:{np.degrees(yaw):.0f}deg',
x_min,
y_min,
bg_color=(0, 0, 255),
text_color=(255, 255, 255),
font_scale=font_scale,
)
def draw_tracks(
*,
image: np.ndarray,
faces: list[Face],
draw_landmarks: bool = True,
draw_id: bool = True,
corner_bbox: bool = True,
) -> None:
"""Draw tracked faces with color-coded track IDs on an image.
Each track ID is assigned a deterministic color for consistent visualization
across frames. Faces without a ``track_id`` are drawn in gray.
Modifies the image in-place.
Args:
image: Input image to draw on (modified in-place).
faces: List of Face objects (with ``track_id`` assigned by BYTETracker).
draw_landmarks: Whether to draw facial landmarks. Defaults to True.
draw_id: Whether to draw track ID labels. Defaults to True.
corner_bbox: Use corner-style bounding boxes. Defaults to True.
Example:
>>> from uniface import BYTETracker, RetinaFace
>>> from uniface.draw import draw_tracks
>>> detector = RetinaFace()
>>> tracker = BYTETracker()
>>> draw_tracks(image=frame, faces=faces)
"""
untracked_color = (128, 128, 128)
# Adaptive line thickness
line_thickness = max(round(sum(image.shape[:2]) / 2 * 0.003), 2)
for face in faces:
bbox = np.array(face.bbox, dtype=np.int32)
track_id = face.track_id
# Pick color based on track ID
color = _get_color(track_id) if track_id is not None else untracked_color
# Draw bounding box
if corner_bbox:
draw_corner_bbox(image, bbox, color=color, thickness=line_thickness, proportion=0.2)
else:
cv2.rectangle(image, tuple(bbox[:2]), tuple(bbox[2:]), color, line_thickness)
# Draw track ID label
if draw_id and track_id is not None:
font_scale = max(0.4, min(0.7, (bbox[3] - bbox[1]) / 200))
draw_text_label(
image,
f'ID:{track_id}',
bbox[0],
bbox[1],
bg_color=color,
font_scale=font_scale,
)
# Draw landmarks
if draw_landmarks and face.landmarks is not None:
landmark_set = np.array(face.landmarks, dtype=np.int32)
for j, point in enumerate(landmark_set):
cv2.circle(image, tuple(point), line_thickness + 1, _LANDMARK_COLORS[j % len(_LANDMARK_COLORS)], -1)
def vis_parsing_maps(
image: np.ndarray,
segmentation_mask: np.ndarray,
*,
save_image: bool = False,
save_path: str = 'result.png',
) -> np.ndarray:
"""Visualize face parsing segmentation mask by overlaying colored regions.
Args:
image: Input face image in RGB format with shape ``(H, W, 3)``.
segmentation_mask: Segmentation mask with shape ``(H, W)`` where each
pixel value represents a facial component class (0-18).
save_image: Whether to save the visualization to disk. Defaults to False.
save_path: Path to save the visualization if *save_image* is True.
Returns:
Blended image with segmentation overlay in BGR format.
Example:
>>> import cv2
>>> from uniface.parsing import BiSeNet
>>> from uniface.draw import vis_parsing_maps
>>> parser = BiSeNet()
>>> face_image = cv2.imread('face.jpg')
>>> mask = parser.parse(face_image)
>>> face_rgb = cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)
>>> result = vis_parsing_maps(face_rgb, mask)
>>> cv2.imwrite('parsed_face.jpg', result)
"""
image = np.array(image).copy().astype(np.uint8)
segmentation_mask = segmentation_mask.copy().astype(np.uint8)
# Create a color mask
segmentation_mask_color = np.zeros((segmentation_mask.shape[0], segmentation_mask.shape[1], 3))
num_classes = np.max(segmentation_mask)
for class_index in range(1, num_classes + 1):
class_pixels = np.where(segmentation_mask == class_index)
segmentation_mask_color[class_pixels[0], class_pixels[1], :] = FACE_PARSING_COLORS[class_index]
segmentation_mask_color = segmentation_mask_color.astype(np.uint8)
# Convert image to BGR format for blending
bgr_image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
blended_image = cv2.addWeighted(bgr_image, 0.6, segmentation_mask_color, 0.4, 0)
if save_image:
cv2.imwrite(save_path, blended_image, [int(cv2.IMWRITE_JPEG_QUALITY), 100])
return blended_image

View File

@@ -31,7 +31,7 @@ def create_gaze_estimator(method: str = 'mobilegaze', **kwargs) -> BaseGazeEstim
ValueError: If the specified `method` is not supported.
Examples:
>>> # Create the default MobileGaze estimator (ResNet18 backbone)
>>> # Create the default MobileGaze estimator (ResNet34 backbone)
>>> estimator = create_gaze_estimator()
>>> # Create with MobileNetV2 backbone

View File

@@ -106,7 +106,7 @@ class MobileGaze(BaseGazeEstimator):
self.output_names = [output.name for output in outputs]
if len(self.output_names) != 2:
raise ValueError(f'Expected 2 output nodes (pitch, yaw), got {len(self.output_names)}')
raise ValueError(f'Expected 2 output nodes (yaw, pitch), got {len(self.output_names)}')
Logger.info(f'MobileGaze initialized with input size {self.input_size}')
@@ -161,19 +161,19 @@ class MobileGaze(BaseGazeEstimator):
Returns:
GazeResult: Result containing pitch and yaw angles in radians.
"""
pitch_logits, yaw_logits = outputs
yaw_logits, pitch_logits = outputs
# Convert logits to probabilities
pitch_probs = self._softmax(pitch_logits)
yaw_probs = self._softmax(yaw_logits)
pitch_probs = self._softmax(pitch_logits)
# Compute expected bin index (soft-argmax)
pitch_deg = np.sum(pitch_probs * self._idx_tensor, axis=1) * self._binwidth - self._angle_offset
yaw_deg = np.sum(yaw_probs * self._idx_tensor, axis=1) * self._binwidth - self._angle_offset
pitch_deg = np.sum(pitch_probs * self._idx_tensor, axis=1) * self._binwidth - self._angle_offset
# Convert degrees to radians
pitch = float(np.radians(pitch_deg[0]))
yaw = float(np.radians(yaw_deg[0]))
pitch = float(np.radians(pitch_deg[0]))
return GazeResult(pitch=pitch, yaw=yaw)

View File

@@ -0,0 +1,9 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Vector indexing backends for fast similarity search."""
from uniface.indexing.faiss import FAISS
__all__ = ['FAISS']

197
uniface/indexing/faiss.py Normal file
View File

@@ -0,0 +1,197 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
import json
import os
from typing import Any
import numpy as np
from uniface.log import Logger
__all__ = ['FAISS']
Metadata = dict[str, Any]
def _import_faiss():
"""Lazily import faiss, raising a clear error if not installed."""
# Prevent OpenMP abort on macOS when multiple libraries (e.g. scipy,
# torch) each bundle their own libomp.
os.environ.setdefault('KMP_DUPLICATE_LIB_OK', 'TRUE')
try:
import faiss
except ImportError as exc:
raise ImportError(
'faiss is required for FAISS vector store. '
'Install it with: pip install faiss-cpu (CPU) '
'or: pip install faiss-gpu (CUDA)'
) from exc
return faiss
class FAISS:
"""FAISS vector store using IndexFlatIP (inner product).
Vectors must be L2-normalised **before** being added so that inner
product equals cosine similarity. The store does not normalise
internally -- that is the caller's responsibility.
Each vector is paired with a metadata dict that can carry any
JSON-serialisable payload (person ID, name, source image, etc.).
Args:
embedding_size: Dimension of embedding vectors.
db_path: Directory for persisting the index and metadata.
Example:
>>> from uniface.indexing import FAISS
>>> store = FAISS(embedding_size=512, db_path='./my_index')
>>> store.add(embedding, {'person_id': '001', 'name': 'Alice'})
>>> result, score = store.search(query_embedding)
>>> result['name']
'Alice'
"""
def __init__(
self,
embedding_size: int = 512,
db_path: str = './vector_index',
) -> None:
faiss = _import_faiss()
self.embedding_size = embedding_size
self.db_path = db_path
self._index_file = os.path.join(db_path, 'faiss_index.bin')
self._meta_file = os.path.join(db_path, 'metadata.json')
os.makedirs(db_path, exist_ok=True)
self.index = faiss.IndexFlatIP(embedding_size)
self.metadata: list[Metadata] = []
def add(self, embedding: np.ndarray, metadata: Metadata) -> None:
"""Add a single embedding with associated metadata.
Args:
embedding: Embedding vector (must be L2-normalised).
metadata: Arbitrary dict of JSON-serialisable key-value pairs.
"""
vec = self._prepare(embedding).reshape(1, -1)
self.index.add(vec)
self.metadata.append(metadata)
def search(
self,
embedding: np.ndarray,
threshold: float = 0.4,
) -> tuple[Metadata | None, float]:
"""Find the closest match for a query embedding.
Args:
embedding: Query embedding vector (must be L2-normalised).
threshold: Minimum cosine similarity to accept a match.
Returns:
``(metadata, similarity)`` for the best match, or
``(None, similarity)`` when below *threshold* or the
index is empty.
"""
if self.index.ntotal == 0:
return None, 0.0
vec = self._prepare(embedding).reshape(1, -1)
similarities, indices = self.index.search(vec, 1)
similarity = float(similarities[0][0])
idx = int(indices[0][0])
if similarity > threshold and 0 <= idx < len(self.metadata):
return self.metadata[idx], similarity
return None, similarity
def remove(self, key: str, value: Any) -> int:
"""Remove all entries where ``metadata[key] == value`` and rebuild.
Args:
key: Metadata key to match against.
value: Value to match.
Returns:
Number of entries removed.
"""
faiss = _import_faiss()
keep = [i for i, m in enumerate(self.metadata) if m.get(key) != value]
removed = len(self.metadata) - len(keep)
if removed == 0:
return 0
if keep:
vectors = np.empty((len(keep), self.embedding_size), dtype=np.float32)
for dst, src in enumerate(keep):
self.index.reconstruct(src, vectors[dst])
new_index = faiss.IndexFlatIP(self.embedding_size)
new_index.add(vectors)
else:
new_index = faiss.IndexFlatIP(self.embedding_size)
self.index = new_index
self.metadata = [self.metadata[i] for i in keep]
Logger.info('Removed %d entries where %s=%s (%d remaining)', removed, key, value, self.index.ntotal)
return removed
def save(self) -> None:
"""Persist the FAISS index and metadata to disk."""
faiss = _import_faiss()
faiss.write_index(self.index, self._index_file)
with open(self._meta_file, 'w', encoding='utf-8') as fh:
json.dump(self.metadata, fh, ensure_ascii=False, indent=2)
Logger.info('Saved FAISS index with %d vectors to %s', self.index.ntotal, self.db_path)
def load(self) -> bool:
"""Load a previously saved index and metadata from disk.
Returns:
``True`` if loaded successfully, ``False`` if files are missing.
Raises:
RuntimeError: If files exist but cannot be read.
"""
if not (os.path.exists(self._index_file) and os.path.exists(self._meta_file)):
return False
faiss = _import_faiss()
try:
loaded_index = faiss.read_index(self._index_file)
with open(self._meta_file, encoding='utf-8') as fh:
loaded_metadata: list[Metadata] = json.load(fh)
except Exception as exc:
raise RuntimeError(f'Failed to load FAISS index from {self.db_path}') from exc
self.index = loaded_index
self.metadata = loaded_metadata
Logger.info('Loaded FAISS index with %d vectors from %s', self.index.ntotal, self.db_path)
return True
@property
def size(self) -> int:
"""Number of vectors currently in the index."""
return self.index.ntotal
@staticmethod
def _prepare(vec: np.ndarray) -> np.ndarray:
"""Cast to contiguous float32 for FAISS compatibility."""
return np.ascontiguousarray(vec.ravel(), dtype=np.float32)
def __len__(self) -> int:
return self.index.ntotal
def __repr__(self) -> str:
return f'FAISS(embedding_size={self.embedding_size}, vectors={self.index.ntotal})'

View File

@@ -11,17 +11,18 @@ def create_landmarker(method: str = '2d106det', **kwargs) -> BaseLandmarker:
Factory function to create facial landmark predictors.
Args:
method (str): Landmark prediction method. Options: '106'.
method (str): Landmark prediction method.
Options: '2d106det' (default), 'landmark106', '106'.
**kwargs: Model-specific parameters.
Returns:
Initialized landmarker instance.
"""
method = method.lower()
if method == '2d106det':
if method in ('2d106det', 'landmark106', '106'):
return Landmark106(**kwargs)
else:
available = ['2d106det']
available = ['2d106det', 'landmark106', '106']
raise ValueError(f"Unsupported method: '{method}'. Available: {available}")

View File

@@ -10,9 +10,11 @@ using SHA-256 checksums for integrity validation.
from __future__ import annotations
from concurrent.futures import ThreadPoolExecutor, as_completed
from enum import Enum
import hashlib
import os
import time
import requests
from tqdm import tqdm
@@ -20,10 +22,55 @@ from tqdm import tqdm
import uniface.constants as const
from uniface.log import Logger
__all__ = ['verify_model_weights']
__all__ = ['download_models', 'get_cache_dir', 'set_cache_dir', 'verify_model_weights']
_DEFAULT_CACHE_DIR = '~/.uniface/models'
_ENV_KEY = 'UNIFACE_CACHE_DIR'
def verify_model_weights(model_name: Enum, root: str = '~/.uniface/models') -> str:
def get_cache_dir() -> str:
"""Get the current model cache directory path.
Resolution order:
1. ``UNIFACE_CACHE_DIR`` environment variable (set via :func:`set_cache_dir` or directly).
2. Default: ``~/.uniface/models``.
Returns:
Absolute, expanded path to the cache directory.
Example:
>>> from uniface import get_cache_dir
>>> print(get_cache_dir())
'/home/user/.uniface/models'
"""
return os.path.expanduser(os.environ.get(_ENV_KEY, _DEFAULT_CACHE_DIR))
def set_cache_dir(path: str) -> None:
"""Set the model cache directory.
This sets the ``UNIFACE_CACHE_DIR`` environment variable so that all
subsequent model downloads and lookups use the new path.
Args:
path: Directory path for storing model weights.
Example:
>>> from uniface import set_cache_dir, get_cache_dir
>>> set_cache_dir('/data/models')
>>> print(get_cache_dir())
'/data/models'
"""
os.environ[_ENV_KEY] = path
Logger.info(f'Cache directory set to: {path}')
def verify_model_weights(
model_name: Enum,
root: str | None = None,
timeout: int = 60,
max_retries: int = 3,
) -> str:
"""Ensure model weights are present, downloading and verifying them if necessary.
Given a model identifier from an Enum class (e.g., `RetinaFaceWeights.MNET_V2`),
@@ -34,7 +81,9 @@ def verify_model_weights(model_name: Enum, root: str = '~/.uniface/models') -> s
Args:
model_name: Model weight identifier enum (e.g., `RetinaFaceWeights.MNET_V2`).
root: Directory to store or locate the model weights.
Defaults to '~/.uniface/models'.
If None, uses the cache directory from :func:`get_cache_dir`.
timeout: Connection timeout in seconds. Defaults to 60.
max_retries: Maximum number of download attempts. Defaults to 3.
Returns:
Absolute path to the verified model weights file.
@@ -51,63 +100,78 @@ def verify_model_weights(model_name: Enum, root: str = '~/.uniface/models') -> s
'/home/user/.uniface/models/retinaface_mnet_v2.onnx'
"""
root = os.getenv('UNIFACE_CACHE_DIR', root)
root = os.path.expanduser(root)
root = os.path.expanduser(root) if root is not None else get_cache_dir()
os.makedirs(root, exist_ok=True)
# Keep model_name as enum for dictionary lookup
url = const.MODEL_URLS.get(model_name)
if not url:
Logger.error(f"No URL found for model '{model_name}'")
raise ValueError(f"No URL found for model '{model_name}'")
# Lookup model info from registry
model_info = const.MODEL_REGISTRY.get(model_name)
if not model_info:
Logger.error(f"No entry found in MODEL_REGISTRY for model '{model_name}'")
raise ValueError(f"Unknown model identifier: '{model_name}'")
url = model_info.url
expected_hash = model_info.sha256
file_ext = os.path.splitext(url)[1]
model_path = os.path.normpath(os.path.join(root, f'{model_name.value}{file_ext}'))
if not os.path.exists(model_path):
Logger.info(f"Downloading model '{model_name}' from {url}")
Logger.info(f"Downloading model '{model_name.value}' from {url}")
try:
download_file(url, model_path)
Logger.info(f"Successfully downloaded '{model_name}' to {model_path}")
download_file(url, model_path, timeout=timeout, max_retries=max_retries)
Logger.info(f"Successfully downloaded '{model_name.value}' to {model_path}")
except Exception as e:
Logger.error(f"Failed to download model '{model_name}': {e}")
raise ConnectionError(f"Download failed for '{model_name}'") from e
Logger.error(f"Failed to download model '{model_name.value}': {e}")
raise ConnectionError(f"Download failed for '{model_name.value}' after {max_retries} attempts") from e
expected_hash = const.MODEL_SHA256.get(model_name)
if expected_hash and not verify_file_hash(model_path, expected_hash):
os.remove(model_path) # Remove corrupted file
Logger.warning('Corrupted weight detected. Removing...')
raise ValueError(f"Hash mismatch for '{model_name}'. The file may be corrupted; please try downloading again.")
Logger.warning(f"Corrupted weights detected for '{model_name.value}'. Removing...")
raise ValueError(f"Hash mismatch for '{model_name.value}'. The file may be corrupted; please try again.")
return model_path
def download_file(url: str, dest_path: str, timeout: int = 30) -> None:
"""Download a file from a URL in chunks and save it to the destination path.
def download_file(url: str, dest_path: str, timeout: int = 60, max_retries: int = 3) -> None:
"""Download a file from a URL with retry logic.
Args:
url: URL to download from.
dest_path: Local file path to save to.
timeout: Connection timeout in seconds. Defaults to 30.
timeout: Connection timeout in seconds. Defaults to 60.
max_retries: Maximum number of attempts. Defaults to 3.
"""
try:
response = requests.get(url, stream=True, timeout=timeout)
response.raise_for_status()
with (
open(dest_path, 'wb') as file,
tqdm(
desc=f'Downloading {dest_path}',
unit='B',
unit_scale=True,
unit_divisor=1024,
) as progress,
):
for chunk in response.iter_content(chunk_size=const.CHUNK_SIZE):
if chunk:
file.write(chunk)
progress.update(len(chunk))
except requests.RequestException as e:
raise ConnectionError(f'Failed to download file from {url}. Error: {e}') from e
last_error = None
for attempt in range(max_retries):
try:
response = requests.get(url, stream=True, timeout=timeout)
response.raise_for_status()
total_size = int(response.headers.get('content-length', 0))
with (
open(dest_path, 'wb') as file,
tqdm(
total=total_size,
desc=f'Attempt {attempt + 1}/{max_retries}',
unit='B',
unit_scale=True,
unit_divisor=1024,
) as progress,
):
for chunk in response.iter_content(chunk_size=const.CHUNK_SIZE):
if chunk:
file.write(chunk)
progress.update(len(chunk))
return # Success
except (OSError, requests.RequestException) as e:
last_error = e
Logger.warning(f'Download attempt {attempt + 1} failed: {e}. Retrying...')
if os.path.exists(dest_path):
os.remove(dest_path)
time.sleep(2**attempt) # Exponential backoff
raise ConnectionError(f'Failed to download file from {url}. Error: {last_error}')
def verify_file_hash(file_path: str, expected_hash: str) -> bool:
@@ -122,9 +186,52 @@ def verify_file_hash(file_path: str, expected_hash: str) -> bool:
return actual_hash == expected_hash
if __name__ == '__main__':
model_names = [model.value for model in const.RetinaFaceWeights]
def download_models(
model_names: list[Enum], max_workers: int = 4, timeout: int = 60, max_retries: int = 3
) -> dict[Enum, str]:
"""Download and verify multiple models concurrently.
# Download each model in the list
for model_name in model_names:
model_path = verify_model_weights(model_name)
Uses a thread pool to download models in parallel, which is significantly
faster when initializing several models at once.
Args:
model_names: List of model weight enum identifiers to download.
max_workers: Maximum number of concurrent download threads. Defaults to 4.
timeout: Connection timeout in seconds. Defaults to 60.
max_retries: Maximum number of attempts per model. Defaults to 3.
Returns:
Mapping of each model enum to its local file path.
Raises:
RuntimeError: If any model download or verification fails.
Example:
>>> from uniface import download_models
>>> from uniface.constants import RetinaFaceWeights, ArcFaceWeights
>>> paths = download_models([RetinaFaceWeights.MNET_V2, ArcFaceWeights.RESNET])
"""
results: dict[Enum, str] = {}
errors: list[str] = []
with ThreadPoolExecutor(max_workers=max_workers) as executor:
future_to_model = {
executor.submit(verify_model_weights, name, timeout=timeout, max_retries=max_retries): name
for name in model_names
}
for future in as_completed(future_to_model):
model = future_to_model[future]
try:
path = future.result()
results[model] = path
Logger.info(f'Ready: {model.value} -> {path}')
except Exception as e:
errors.append(f'{model.value}: {e}')
Logger.error(f'Failed to download {model.value}: {e}')
if errors:
raise RuntimeError(f'Failed to download {len(errors)} model(s):\n' + '\n'.join(errors))
Logger.info(f'All {len(results)} model(s) downloaded and verified')
return results

View File

@@ -10,6 +10,8 @@ inference sessions with automatic hardware acceleration detection.
from __future__ import annotations
import functools
import onnxruntime as ort
from uniface.log import Logger
@@ -17,6 +19,7 @@ from uniface.log import Logger
__all__ = ['create_onnx_session', 'get_available_providers']
@functools.lru_cache(maxsize=1)
def get_available_providers() -> list[str]:
"""Get list of available ONNX Runtime execution providers.
@@ -30,7 +33,7 @@ def get_available_providers() -> list[str]:
Example:
>>> providers = get_available_providers()
>>> # On M4 Mac: ['CoreMLExecutionProvider', 'CPUExecutionProvider']
>>> # On macOS: ['CoreMLExecutionProvider', 'CPUExecutionProvider']
>>> # On Linux with CUDA: ['CUDAExecutionProvider', 'CPUExecutionProvider']
"""
available = ort.get_available_providers()
@@ -98,7 +101,7 @@ def create_onnx_session(
'CPUExecutionProvider': 'CPU',
}
provider_display = provider_names.get(active_provider, active_provider)
Logger.info(f'Model loaded ({provider_display})')
Logger.debug(f'Model loaded from {model_path} ({provider_display})')
return session
except Exception as e:

View File

@@ -2,6 +2,8 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
from abc import ABC, abstractmethod
import numpy as np
@@ -69,7 +71,7 @@ class BaseFaceParser(ABC):
raise NotImplementedError('Subclasses must implement the postprocess method.')
@abstractmethod
def parse(self, face_image: np.ndarray) -> np.ndarray:
def parse(self, image: np.ndarray, *, landmarks: np.ndarray | None = None) -> np.ndarray:
"""
Perform end-to-end face parsing on a face image.
@@ -77,9 +79,11 @@ class BaseFaceParser(ABC):
running inference, and postprocessing to return the segmentation mask.
Args:
face_image (np.ndarray): A face image in BGR format.
The face should be roughly centered and
well-framed within the image.
image (np.ndarray): A face image in BGR format.
The face should be roughly centered and well-framed within the image.
landmarks (np.ndarray | None): Optional 5-point facial landmarks with
shape (5, 2). Required by some parsers (e.g., XSeg) for face alignment.
Ignored by parsers that do not need landmarks (e.g., BiSeNet).
Returns:
np.ndarray: Segmentation mask with the same size as input image,
@@ -92,14 +96,15 @@ class BaseFaceParser(ABC):
"""
raise NotImplementedError('Subclasses must implement the parse method.')
def __call__(self, face_image: np.ndarray) -> np.ndarray:
def __call__(self, image: np.ndarray, *, landmarks: np.ndarray | None = None) -> np.ndarray:
"""
Provides a convenient, callable shortcut for the `parse` method.
Args:
face_image (np.ndarray): A face image in BGR format.
image (np.ndarray): A face image in BGR format.
landmarks (np.ndarray | None): Optional 5-point facial landmarks.
Returns:
np.ndarray: Segmentation mask with the same size as input image.
"""
return self.parse(face_image)
return self.parse(image, landmarks=landmarks)

View File

@@ -2,6 +2,7 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
import cv2
import numpy as np
@@ -149,21 +150,26 @@ class BiSeNet(BaseFaceParser):
return restored_mask
def parse(self, face_image: np.ndarray) -> np.ndarray:
def parse(self, image: np.ndarray, *, landmarks: np.ndarray | None = None) -> np.ndarray:
"""
Perform end-to-end face parsing on a face image.
This method orchestrates the full pipeline: preprocessing the input,
running inference, and postprocessing to return the segmentation mask.
BiSeNet operates on face crops and does not require landmarks.
The ``landmarks`` parameter is accepted for API compatibility but ignored.
Args:
face_image (np.ndarray): A face image in BGR format.
image (np.ndarray): A face image in BGR format.
landmarks (np.ndarray | None): Ignored. Accepted for interface
compatibility with :class:`BaseFaceParser`.
Returns:
np.ndarray: Segmentation mask with the same size as input image.
"""
original_size = (face_image.shape[1], face_image.shape[0]) # (width, height)
input_tensor = self.preprocess(face_image)
original_size = (image.shape[1], image.shape[0]) # (width, height)
input_tensor = self.preprocess(image)
outputs = self.session.run(self.output_names, {self.input_name: input_tensor})
return self.postprocess(outputs[0], original_size)

Some files were not shown because too many files have changed in this diff Show More