18 Commits

Author SHA1 Message Date
Yakhyokhuja Valikhujaev
02c77ce5db feat: Add head pose estimation model (#99)
* feat: Add Head Pose Estimation with 6 different models

* chore: Update jupyter notebook examples

* docs: Update head pose estimation related docs
2026-03-26 22:57:05 +09:00
Yakhyokhuja Valikhujaev
d70d6a254f ref: Unify attribute/detector base classes and fix tools reliability (#98)
* refactor: unify attribute API, deduplicate detectors, and fix embedding shape

* refactor: unify attribute API and deduplicate detector code

* chore: Build the docs page on tag pushes and validate frames before flipping
2026-03-25 23:43:56 +09:00
Yakhyokhuja Valikhujaev
7d37633b1a chore: drop Python 3.10 support, bump scikit-image to >=0.26.0 (#96) 2026-03-19 10:04:52 +09:00
Yakhyokhuja Valikhujaev
bc413df4a8 docs: Add release changelog markdown file (#92) 2026-03-19 09:46:16 +09:00
Marc-Antoine BERTIN
8db0577991 feat: Add Python 3.14 support (#95)
- Relax requires-python upper bound from <3.14 to <3.15
- Add Python 3.14 classifier to pyproject.toml
- Add Python 3.14 to CI test matrix (ubuntu-latest)
- Fix SimilarityTransform.estimate() deprecation warning (scikit-image >=0.26)
  by switching to SimilarityTransform.from_estimate() class constructor

All 147 tests pass on Python 3.14.3 with no warnings.

Co-authored-by: marc-antoine <marcantoine.bertin@storyzy.com>
2026-03-19 09:41:44 +09:00
Yakhyokhuja Valikhujaev
3682a2124f release: Release UniFace version v3.1.0 (#91)
* release: Release UniFace version v3.1.0

* docs: Change classifiers to stable from beta
2026-03-11 12:21:33 +09:00
Yakhyokhuja Valikhujaev
2ef6a1ebe8 refactor: Use dataclass-based model info in model management (#90)
- Refactor the model management section to use dataclasses for more robust model handling.
2026-03-11 12:05:43 +09:00
Yakhyokhuja Valikhujaev
78a2dba7c7 feat: Add FAISS vector database for fast face search (#88) 2026-03-05 22:46:03 +09:00
Yakhyokhuja Valikhujaev
87e496d1f5 feat: Add FAISS vector DB support for fast search (#86)
* feat: Add FAISS: VectorDB for face embedding search

* docs: Update Documentation
2026-03-03 12:12:05 +09:00
Yakhyokhuja Valikhujaev
5604ebf4f1 docs: Add datasets information in the docs (#85) 2026-02-18 16:02:37 +09:00
Yakhyokhuja Valikhujaev
971775b2e8 feat: Update API format and gaze estimation models (#82)
* docs: Update documentation

* fix: Update several missing docs and tests

* docs: Clean up and remove redundant content

* fix: Fix the gaze output formula and change the output order

* chore: Update model weights for gaze estimation

* release: Update release version to v3.0.0
2026-02-14 23:54:51 +09:00
Yakhyokhuja Valikhujaev
c520ea2df2 feat: Add ByteTrack - Multi-Object Tracking by Associating Every Detection Box (#81)
* feat: Add BYTETrack for face/person tracking

* docs: Update documentation

* ref: Update tools folder file naming and imports

* docs: Update jupyter notebook examples

* ref: Rename the file and remove duplicate codes

* docs: Update README.md

* chore: Update description in mkdocs, add keywords for face tracking

* docs: Add announcement section

* feat: Remove expand bbox for tracking and update docs
2026-02-12 00:20:23 +09:00
Yakhyokhuja Valikhujaev
2a8cb54d31 feat: Add get and set for cache dir (#80) 2026-02-09 23:32:02 +09:00
Yakhyokhuja Valikhujaev
331f46be7c release: Update release version and docs (#79) 2026-02-05 21:45:28 +09:00
Yakhyokhuja Valikhujaev
9991fae62a docs: Update UniFace library documentation and README.md (#78)
* docs: Update wrong/missing references

* docs: Update README.md
2026-02-04 20:45:02 +09:00
Yakhyokhuja Valikhujaev
b74ab95d39 docs: Update UniFace github image (#75) 2026-01-25 17:07:40 +09:00
Yakhyokhuja Valikhujaev
d2b0303bfe docs: Add additional badges to README.md (#74)
* Update badges in README.md
* Update ci.yml
2026-01-24 22:25:09 +09:00
Yakhyokhuja Valikhujaev
5f74487eb3 feat: Add XSeg for Face Segmentation (#72)
* feat: Add XSeg (from DeepFaceLab) for face segmentation

* docs: Update model inference related reference

* chore: Update jupyter notebook example for face segmentation
2026-01-22 22:33:31 +09:00
138 changed files with 8916 additions and 3295 deletions

Binary image diffs (previews not shown): five images with only a "Before" preview (716 KiB, 673 KiB, 826 KiB, 563 KiB, 33 KiB), and new logo assets added under .github/logos/: uniface_enhanced.webp, uniface_rounded.png, uniface_rounded_150px.png, uniface_rounded_q80.png, uniface_rounded_q80.webp, plus one additional "After" preview (1.7 MiB).


@@ -1,4 +1,4 @@
name: CI
name: Build
on:
push:
@@ -20,7 +20,7 @@ jobs:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: "3.10"
python-version: "3.11"
- uses: pre-commit/action@v3.0.1
test:
@@ -34,9 +34,11 @@ jobs:
include:
# Full Python range on Linux (fastest runner)
- os: ubuntu-latest
python-version: "3.10"
python-version: "3.11"
- os: ubuntu-latest
python-version: "3.13"
- os: ubuntu-latest
python-version: "3.14"
- os: macos-latest
python-version: "3.13"
- os: windows-latest


@@ -2,7 +2,8 @@ name: Deploy docs
on:
push:
branches: [main]
tags:
- "v*.*.*"
workflow_dispatch:
permissions:


@@ -54,7 +54,7 @@ jobs:
strategy:
fail-fast: false
matrix:
python-version: ["3.10", "3.13"]
python-version: ["3.11", "3.13"]
steps:
- name: Checkout code
@@ -92,7 +92,7 @@ jobs:
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.10"
python-version: "3.11"
cache: 'pip'
- name: Install build tools

.gitignore (vendored, 1 changed line)

@@ -1,4 +1,5 @@
tmp_*
.vscode/
# Byte-compiled / optimized / DLL files
__pycache__/


@@ -59,12 +59,12 @@ This project uses [Ruff](https://docs.astral.sh/ruff/) for linting and formattin
#### General Rules
- **Line length:** 120 characters maximum
- **Python version:** 3.10+ (use modern syntax)
- **Python version:** 3.11+ (use modern syntax)
- **Quote style:** Single quotes for strings, double quotes for docstrings
#### Type Hints
Use modern Python 3.10+ type hints (PEP 585 and PEP 604):
Use modern Python 3.11+ type hints (PEP 585 and PEP 604):
```python
# Preferred (modern)
@@ -82,23 +82,23 @@ def process(items: List[str], config: Optional[Dict[str, int]] = None) -> Tuple[
Use [Google-style docstrings](https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings) for all public APIs:
```python
def detect_faces(image: np.ndarray, threshold: float = 0.5) -> list[Face]:
"""Detect faces in an image.
def create_detector(method: str = 'retinaface', **kwargs: Any) -> BaseDetector:
"""Factory function to create face detectors.
Args:
image: Input image as a numpy array with shape (H, W, C) in BGR format.
threshold: Confidence threshold for filtering detections. Defaults to 0.5.
method: Detection method. Options: 'retinaface', 'scrfd', 'yolov5face', 'yolov8face'.
**kwargs: Detector-specific parameters.
Returns:
List of Face objects containing bounding boxes, confidence scores,
and facial landmarks.
Initialized detector instance.
Raises:
ValueError: If the input image has invalid dimensions.
ValueError: If method is not supported.
Example:
>>> from uniface import detect_faces
>>> faces = detect_faces(image, threshold=0.8)
>>> from uniface import create_detector
>>> detector = create_detector('retinaface', confidence_threshold=0.8)
>>> faces = detector.detect(image)
>>> print(f"Found {len(faces)} faces")
"""
```
@@ -174,16 +174,19 @@ When adding a new model or feature:
Example notebooks demonstrating library usage:
| Example | Notebook |
|---------|----------|
| Face Detection | [01_face_detection.ipynb](examples/01_face_detection.ipynb) |
| Face Alignment | [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) |
| Face Verification | [03_face_verification.ipynb](examples/03_face_verification.ipynb) |
| Face Search | [04_face_search.ipynb](examples/04_face_search.ipynb) |
| Face Analyzer | [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) |
| Face Parsing | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) |
| Example | Notebook |
| ------------------ | ------------------------------------------------------------------- |
| Face Detection | [01_face_detection.ipynb](examples/01_face_detection.ipynb) |
| Face Alignment | [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) |
| Face Verification | [03_face_verification.ipynb](examples/03_face_verification.ipynb) |
| Face Search | [04_face_search.ipynb](examples/04_face_search.ipynb) |
| Face Analyzer | [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) |
| Face Parsing | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) |
| Face Anonymization | [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) |
| Gaze Estimation | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) |
| Gaze Estimation | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) |
| Face Segmentation | [09_face_segmentation.ipynb](examples/09_face_segmentation.ipynb) |
| Face Vector Store | [10_face_vector_store.ipynb](examples/10_face_vector_store.ipynb) |
| Head Pose Estimation | [11_head_pose_estimation.ipynb](examples/11_head_pose_estimation.ipynb) |
## Questions?

README.md (184 changed lines)

@@ -1,34 +1,39 @@
# UniFace: All-in-One Face Analysis Library
<h1 align="center">UniFace: All-in-One Face Analysis Library</h1>
<div align="center">
[![PyPI](https://img.shields.io/pypi/v/uniface.svg?label=PyPI)](https://pypi.org/project/uniface/)
[![Python](https://img.shields.io/badge/Python-3.10%2B-blue)](https://www.python.org/)
[![PyPI Version](https://img.shields.io/pypi/v/uniface.svg?label=Version)](https://pypi.org/project/uniface/)
[![Python Version](https://img.shields.io/badge/Python-3.11%2B-blue)](https://www.python.org/)
[![License](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![CI](https://github.com/yakhyo/uniface/actions/workflows/ci.yml/badge.svg)](https://github.com/yakhyo/uniface/actions)
[![Github Build Status](https://github.com/yakhyo/uniface/actions/workflows/ci.yml/badge.svg)](https://github.com/yakhyo/uniface/actions)
[![PyPI Downloads](https://static.pepy.tech/personalized-badge/uniface?period=total&units=INTERNATIONAL_SYSTEM&left_color=GRAY&right_color=BLUE&left_text=Downloads)](https://pepy.tech/projects/uniface)
[![Docs](https://img.shields.io/badge/Docs-UniFace-blue.svg)](https://yakhyo.github.io/uniface/)
[![UniFace Documentation](https://img.shields.io/badge/Docs-UniFace-blue.svg)](https://yakhyo.github.io/uniface/)
[![Kaggle Badge](https://img.shields.io/badge/Notebooks-Kaggle?label=Kaggle&color=blue)](https://www.kaggle.com/yakhyokhuja/code)
[![Discord](https://img.shields.io/badge/Discord-Join%20Server-5865F2?logo=discord&logoColor=white)](https://discord.gg/wdzrjr7R5j)
</div>
<div align="center">
<img src=".github/logos/logo_web.webp" width=80%>
<img src="https://raw.githubusercontent.com/yakhyo/uniface/main/.github/logos/uniface_rounded_q80.webp" width="90%" alt="UniFace - All-in-One Open-Source Face Analysis Library">
</div>
---
**UniFace** is a lightweight, production-ready face analysis library built on ONNX Runtime. It provides high-performance face detection, recognition, landmark detection, face parsing, gaze estimation, and attribute analysis with hardware acceleration support across platforms.
> 💬 **Have questions?** [Chat with this codebase on DeepWiki](https://deepwiki.com/yakhyo/uniface) - AI-powered docs that let you ask anything about UniFace.
---
## Features
- **Face Detection** — RetinaFace, SCRFD, YOLOv5-Face, and YOLOv8-Face with 5-point landmarks
- **Face Recognition** — ArcFace, MobileFace, and SphereFace embeddings
- **Facial Landmarks** — 106-point landmark localization
- **Face Parsing** — BiSeNet semantic segmentation (19 classes)
- **Face Tracking** — Multi-object tracking with [BYTETracker](https://github.com/yakhyo/bytetrack-tracker) for persistent IDs across video frames
- **Facial Landmarks** — 106-point landmark localization module (separate from 5-point detector landmarks)
- **Face Parsing** — BiSeNet semantic segmentation (19 classes), XSeg face masking
- **Gaze Estimation** — Real-time gaze direction with MobileGaze
- **Head Pose Estimation** — 3D head orientation (pitch, yaw, roll) with 6D rotation representation
- **Attribute Analysis** — Age, gender, race (FairFace), and emotion
- **Vector Indexing** — FAISS-backed embedding store for fast multi-identity search
- **Anti-Spoofing** — Face liveness detection with MiniFASNet
- **Face Anonymization** — 5 blur methods for privacy protection
- **Hardware Acceleration** — ARM64 (Apple Silicon), CUDA (NVIDIA), CPU
@@ -37,31 +42,72 @@
## Installation
**Standard installation**
```bash
# Standard installation
pip install uniface
```
# GPU support (CUDA)
**GPU support (CUDA)**
```bash
pip install uniface[gpu]
```
# From source
**From source (latest version)**
```bash
git clone https://github.com/yakhyo/uniface.git
cd uniface && pip install -e .
```
**FAISS vector indexing**
```bash
pip install faiss-cpu # or faiss-gpu for CUDA
```
**Optional dependencies**
- Emotion model uses TorchScript and requires `torch`:
`pip install torch` (choose the correct build for your OS/CUDA)
- YOLOv5-Face and YOLOv8-Face support faster NMS with `torchvision`:
`pip install torch torchvision` then use `nms_mode='torchvision'`
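
A minimal sketch of the torchvision NMS path (the `nms_mode` keyword is taken from the note above; treat the exact constructor signature as an assumption):

```python
import cv2
from uniface.detection import YOLOv5Face

# Assumed keyword from the note above: switch NMS to the torchvision implementation
detector = YOLOv5Face(nms_mode='torchvision')

image = cv2.imread('photo.jpg')
if image is None:
    raise ValueError("Failed to load image. Check the path to 'photo.jpg'.")
faces = detector.detect(image)
```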
---
## Quick Example
## Model Downloads and Cache
Models are downloaded automatically on first use and verified via SHA-256.
Default cache location: `~/.uniface/models`
Override with the programmatic API or environment variable:
```python
from uniface.model_store import get_cache_dir, set_cache_dir
set_cache_dir('/data/models')
print(get_cache_dir()) # /data/models
```
```bash
export UNIFACE_CACHE_DIR=/data/models
```
---
## Quick Example (Detection)
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
# Initialize detector (models auto-download on first use)
detector = RetinaFace()
# Detect faces
image = cv2.imread("photo.jpg")
if image is None:
raise ValueError("Failed to load image. Check the path to 'photo.jpg'.")
faces = detector.detect(image)
for face in faces:
@@ -71,14 +117,54 @@ for face in faces:
```
<div align="center">
<img src="assets/test_result.png">
<img src="https://raw.githubusercontent.com/yakhyo/uniface/main/assets/test_result.png" width="90%">
<p>Face Detection Model Output</p>
</div>
---
## Example (Face Analyzer)
```python
import cv2
from uniface.analyzer import FaceAnalyzer
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
detector = RetinaFace()
recognizer = ArcFace()
analyzer = FaceAnalyzer(detector, recognizer=recognizer)
image = cv2.imread("photo.jpg")
if image is None:
raise ValueError("Failed to load image. Check the path to 'photo.jpg'.")
faces = analyzer.analyze(image)
for face in faces:
print(face.bbox, face.embedding.shape if face.embedding is not None else None)
```
---
## Execution Providers (ONNX Runtime)
```python
from uniface.detection import RetinaFace
# Force CPU-only inference
detector = RetinaFace(providers=["CPUExecutionProvider"])
```
See more in the docs:
https://yakhyo.github.io/uniface/concepts/execution-providers/
---
## Documentation
📚 **Full documentation**: [yakhyo.github.io/uniface](https://yakhyo.github.io/uniface/)
Full documentation: https://yakhyo.github.io/uniface/
| Resource | Description |
|----------|-------------|
@@ -87,8 +173,28 @@ for face in faces:
| [API Reference](https://yakhyo.github.io/uniface/modules/detection/) | Detailed module documentation |
| [Tutorials](https://yakhyo.github.io/uniface/recipes/image-pipeline/) | Step-by-step workflow examples |
| [Guides](https://yakhyo.github.io/uniface/concepts/overview/) | Architecture and design principles |
| [Datasets](https://yakhyo.github.io/uniface/datasets/) | Training data and evaluation benchmarks |
### Jupyter Notebooks
---
## Datasets
| Task | Training Dataset | Models |
|------|-----------------|--------|
| Detection | WIDER FACE | RetinaFace, SCRFD, YOLOv5-Face, YOLOv8-Face |
| Recognition | MS1MV2 | MobileFace, SphereFace |
| Recognition | WebFace600K | ArcFace |
| Recognition | WebFace4M / 12M | AdaFace |
| Gaze | Gaze360 | MobileGaze |
| Head Pose | 300W-LP | HeadPose (ResNet, MobileNet) |
| Parsing | CelebAMask-HQ | BiSeNet |
| Attributes | CelebA, FairFace, AffectNet | AgeGender, FairFace, Emotion |
> See [Datasets documentation](https://yakhyo.github.io/uniface/datasets/) for download links, benchmarks, and details.
---
## Jupyter Notebooks
| Example | Colab | Description |
|---------|:-----:|-------------|
@@ -100,6 +206,22 @@ for face in faces:
| [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/06_face_parsing.ipynb) | Semantic face segmentation |
| [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | Privacy-preserving blur |
| [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | Gaze direction estimation |
| [09_face_segmentation.ipynb](examples/09_face_segmentation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/09_face_segmentation.ipynb) | Face segmentation with XSeg |
| [10_face_vector_store.ipynb](examples/10_face_vector_store.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/10_face_vector_store.ipynb) | FAISS-backed face database |
| [11_head_pose_estimation.ipynb](examples/11_head_pose_estimation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/11_head_pose_estimation.ipynb) | Head pose estimation (pitch, yaw, roll) |
---
## Licensing and Model Usage
UniFace is MIT-licensed, but several pretrained models carry their own licenses.
Review: https://yakhyo.github.io/uniface/license-attribution/
Notable examples:
- YOLOv5-Face and YOLOv8-Face weights are GPL-3.0
- FairFace weights are CC BY 4.0
If you plan commercial use, verify model license compatibility.
---
@@ -107,12 +229,15 @@ for face in faces:
| Feature | Repository | Training | Description |
|---------|------------|:--------:|-------------|
| Detection | [retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) | [x] | RetinaFace PyTorch Training & Export |
| Detection | [retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) | | RetinaFace PyTorch Training & Export |
| Detection | [yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference) | - | YOLOv5-Face ONNX Inference |
| Detection | [yolov8-face-onnx-inference](https://github.com/yakhyo/yolov8-face-onnx-inference) | - | YOLOv8-Face ONNX Inference |
| Recognition | [face-recognition](https://github.com/yakhyo/face-recognition) | [x] | MobileFace, SphereFace Training |
| Parsing | [face-parsing](https://github.com/yakhyo/face-parsing) | [x] | BiSeNet Face Parsing |
| Gaze | [gaze-estimation](https://github.com/yakhyo/gaze-estimation) | [x] | MobileGaze Training |
| Tracking | [bytetrack-tracker](https://github.com/yakhyo/bytetrack-tracker) | - | BYTETracker Multi-Object Tracking |
| Recognition | [face-recognition](https://github.com/yakhyo/face-recognition) | ✓ | MobileFace, SphereFace Training |
| Parsing | [face-parsing](https://github.com/yakhyo/face-parsing) | ✓ | BiSeNet Face Parsing |
| Parsing | [face-segmentation](https://github.com/yakhyo/face-segmentation) | - | XSeg Face Segmentation |
| Gaze | [gaze-estimation](https://github.com/yakhyo/gaze-estimation) | ✓ | MobileGaze Training |
| Head Pose | [head-pose-estimation](https://github.com/yakhyo/head-pose-estimation) | ✓ | Head Pose Training (6DRepNet-style) |
| Anti-Spoofing | [face-anti-spoofing](https://github.com/yakhyo/face-anti-spoofing) | - | MiniFASNet Inference |
| Attributes | [fairface-onnx](https://github.com/yakhyo/fairface-onnx) | - | FairFace ONNX Inference |
@@ -122,7 +247,16 @@ for face in faces:
## Contributing
Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
Contributions are welcome. Please see [CONTRIBUTING.md](CONTRIBUTING.md).
## Support
If you find this project useful, consider giving it a ⭐ on GitHub — it helps others discover it!
Questions or feedback:
- Discord: https://discord.gg/wdzrjr7R5j
- GitHub Issues: https://github.com/yakhyo/uniface/issues
- DeepWiki Q&A: https://deepwiki.com/yakhyo/uniface
## License

assets/einstein/img_0.png (new binary file, 99 KiB; preview not shown)


@@ -93,7 +93,7 @@ landmarks = face.landmarks # Shape: (5, 2)
Returned by `Landmark106`:
```python
from uniface import Landmark106
from uniface.landmark import Landmark106
landmarker = Landmark106()
landmarks = landmarker.get_landmarks(image, face.bbox)
@@ -174,7 +174,7 @@ yaw = -90° ────┼──── yaw = +90°
Face alignment uses 5-point landmarks to normalize face orientation:
```python
from uniface import face_alignment
from uniface.face_utils import face_alignment
# Align face to standard template
aligned_face = face_alignment(image, face.landmarks)


@@ -9,7 +9,7 @@ UniFace uses ONNX Runtime for model inference, which supports multiple hardware
UniFace automatically selects the optimal execution provider based on available hardware:
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
# Automatically uses best available provider
detector = RetinaFace()
@@ -17,8 +17,8 @@ detector = RetinaFace()
**Priority order:**
1. **CUDAExecutionProvider** - NVIDIA GPU
2. **CoreMLExecutionProvider** - Apple Silicon
1. **CoreMLExecutionProvider** - Apple Silicon
2. **CUDAExecutionProvider** - NVIDIA GPU
3. **CPUExecutionProvider** - Fallback
---
@@ -28,7 +28,8 @@ detector = RetinaFace()
You can specify which execution provider to use by passing the `providers` parameter:
```python
from uniface import RetinaFace, ArcFace
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
# Force CPU execution (even if GPU is available)
detector = RetinaFace(providers=['CPUExecutionProvider'])
@@ -38,16 +39,20 @@ recognizer = ArcFace(providers=['CPUExecutionProvider'])
detector = RetinaFace(providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
```
All model classes accept the `providers` parameter:
All **ONNX-based** model classes accept the `providers` parameter:
- Detection: `RetinaFace`, `SCRFD`, `YOLOv5Face`, `YOLOv8Face`
- Recognition: `ArcFace`, `AdaFace`, `MobileFace`, `SphereFace`
- Landmarks: `Landmark106`
- Gaze: `MobileGaze`
- Parsing: `BiSeNet`
- Parsing: `BiSeNet`, `XSeg`
- Attributes: `AgeGender`, `FairFace`
- Anti-Spoofing: `MiniFASNet`
!!! note "Non-ONNX components"
- **Emotion** uses TorchScript and selects its device automatically (`mps` / `cuda` / `cpu`). It does **not** accept the `providers` parameter.
- **BlurFace** is a pure OpenCV utility and does not load any model.
---
## Check Available Providers
@@ -174,7 +179,7 @@ pip install uniface[gpu]
Smaller input sizes are faster but may reduce accuracy:
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
# Faster, lower accuracy
detector = RetinaFace(input_size=(320, 320))


@@ -53,6 +53,7 @@ class Face:
race: str | None = None # "East Asian", etc.
emotion: str | None = None # "Happy", etc.
emotion_confidence: float | None = None
track_id: int | None = None # Persistent ID from tracker
```
### Properties
@@ -105,6 +106,27 @@ print(f"Yaw: {np.degrees(result.yaw):.1f}°")
---
### HeadPoseResult
```python
@dataclass(frozen=True)
class HeadPoseResult:
pitch: float # Rotation around X-axis (degrees), + = looking down
yaw: float # Rotation around Y-axis (degrees), + = looking right
roll: float # Rotation around Z-axis (degrees), + = tilting clockwise
```
**Usage:**
```python
result = head_pose.estimate(face_crop)
print(f"Pitch: {result.pitch:.1f}°")
print(f"Yaw: {result.yaw:.1f}°")
print(f"Roll: {result.roll:.1f}°")
```
---
### SpoofingResult
```python
@@ -143,11 +165,11 @@ class AttributeResult:
```python
# AgeGender model
result = age_gender.predict(image, face.bbox)
result = age_gender.predict(image, face)
print(f"{result.sex}, {result.age} years old")
# FairFace model
result = fairface.predict(image, face.bbox)
result = fairface.predict(image, face)
print(f"{result.sex}, {result.age_group}, {result.race}")
```
@@ -170,14 +192,14 @@ Face recognition models return normalized 512-dimensional embeddings:
```python
embedding = recognizer.get_normalized_embedding(image, landmarks)
print(f"Shape: {embedding.shape}") # (1, 512)
print(f"Shape: {embedding.shape}") # (512,)
print(f"Norm: {np.linalg.norm(embedding):.4f}") # ~1.0
```
### Similarity Computation
```python
from uniface import compute_similarity
from uniface.face_utils import compute_similarity
similarity = compute_similarity(embedding1, embedding2)
# Returns: float between -1 and 1 (cosine similarity)
@@ -199,16 +221,16 @@ print(f"Classes: {np.unique(mask)}") # [0, 1, 2, ...]
| ID | Class | ID | Class |
|----|-------|----|-------|
| 0 | Background | 10 | Ear Ring |
| 1 | Skin | 11 | Nose |
| 2 | Left Eyebrow | 12 | Mouth |
| 3 | Right Eyebrow | 13 | Upper Lip |
| 4 | Left Eye | 14 | Lower Lip |
| 5 | Right Eye | 15 | Neck |
| 6 | Eye Glasses | 16 | Neck Lace |
| 7 | Left Ear | 17 | Cloth |
| 8 | Right Ear | 18 | Hair |
| 9 | Hat | | |
| 0 | Background | 10 | Nose |
| 1 | Skin | 11 | Mouth |
| 2 | Left Eyebrow | 12 | Upper Lip |
| 3 | Right Eyebrow | 13 | Lower Lip |
| 4 | Left Eye | 14 | Neck |
| 5 | Right Eye | 15 | Necklace |
| 6 | Eyeglasses | 16 | Cloth |
| 7 | Left Ear | 17 | Hair |
| 8 | Right Ear | 18 | Hat |
| 9 | Earring | | |
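
For reference, a small sketch of consuming the corrected class IDs above (the `mask` array of per-pixel class IDs is assumed to come from the parsing output shown in the hunk context; the zero placeholder only keeps the snippet runnable):

```python
import numpy as np

# 'mask' would normally come from the parsing model; a placeholder keeps this self-contained
mask = np.zeros((512, 512), dtype=np.uint8)

skin_pixels = (mask == 1).sum()   # class 1 = Skin per the table above
hair_pixels = (mask == 17).sum()  # class 17 = Hair per the corrected table
print(f'skin: {skin_pixels}, hair: {hair_pixels}')
```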
---


@@ -9,7 +9,7 @@ UniFace automatically downloads and caches models. This page explains how model
Models are downloaded on first use:
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
# First run: downloads model to cache
detector = RetinaFace() # ~3.5 MB download
@@ -32,9 +32,9 @@ Default cache directory:
```
~/.uniface/models/
├── retinaface_mv2.onnx
├── w600k_mbf.onnx
├── 2d106det.onnx
├── retinaface_mnet_v2.onnx
├── arcface_mnet.onnx
├── 2d_106.onnx
├── gaze_resnet34.onnx
├── parsing_resnet18.onnx
└── ...
@@ -44,44 +44,57 @@ Default cache directory:
## Custom Cache Directory
Specify a custom cache location:
Use the programmatic API to change the cache location at runtime:
```python
from uniface.model_store import verify_model_weights
from uniface.constants import RetinaFaceWeights
from uniface.model_store import get_cache_dir, set_cache_dir
# Download to custom directory
model_path = verify_model_weights(
RetinaFaceWeights.MNET_V2,
root='./my_models'
)
print(f"Model at: {model_path}")
# Set a custom cache directory
set_cache_dir('/data/models')
# Verify the current path
print(get_cache_dir()) # /data/models
# All subsequent model loads use the new directory
from uniface.detection import RetinaFace
detector = RetinaFace() # Downloads to /data/models/
```
Or set the `UNIFACE_CACHE_DIR` environment variable (see [Environment Variables](#environment-variables) below).
---
## Pre-Download Models
Download models before deployment:
Download models before deployment using the concurrent downloader:
```python
from uniface.model_store import verify_model_weights
from uniface.model_store import download_models
from uniface.constants import (
RetinaFaceWeights,
ArcFaceWeights,
AgeGenderWeights,
)
# Download all needed models
models = [
# Download multiple models concurrently (up to 4 threads by default)
paths = download_models([
RetinaFaceWeights.MNET_V2,
ArcFaceWeights.MNET,
AgeGenderWeights.DEFAULT,
]
])
for model in models:
path = verify_model_weights(model)
print(f"Downloaded: {path}")
for model, path in paths.items():
print(f"{model.value} -> {path}")
```
Or download one at a time:
```python
from uniface.model_store import verify_model_weights
from uniface.constants import RetinaFaceWeights
path = verify_model_weights(RetinaFaceWeights.MNET_V2)
print(f"Downloaded: {path}")
```
Or use the CLI tool:
@@ -115,11 +128,20 @@ print(f"Copy from: {path}")
scp -r ~/.uniface/models/ user@offline-machine:~/.uniface/models/
```
### 3. Use normally
### 3. Point to the cache (if non-default location)
```python
from uniface.model_store import set_cache_dir
# Only needed if the models are not at ~/.uniface/models/
set_cache_dir('/path/to/copied/models')
```
### 4. Use normally
```python
# Models load from local cache
from uniface import RetinaFace
from uniface.detection import RetinaFace
detector = RetinaFace() # No network required
```
@@ -182,7 +204,12 @@ If a model fails verification, it's re-downloaded automatically.
## Clear Cache
Remove cached models:
Find and remove cached models:
```python
from uniface.model_store import get_cache_dir
print(get_cache_dir()) # shows the active cache path
```
```bash
# Remove all cached models
@@ -198,20 +225,35 @@ Models will be re-downloaded on next use.
## Environment Variables
Set custom cache location via environment variable:
There are three equivalent ways to configure the cache directory:
```bash
export UNIFACE_CACHE_DIR=/path/to/custom/cache
**1. Programmatic API (recommended)**
```python
from uniface.model_store import get_cache_dir, set_cache_dir
set_cache_dir('/path/to/custom/cache')
print(get_cache_dir()) # /path/to/custom/cache
```
**2. Direct environment variable (Python)**
```python
import os
os.environ['UNIFACE_CACHE_DIR'] = '/path/to/custom/cache'
from uniface import RetinaFace
from uniface.detection import RetinaFace
detector = RetinaFace() # Uses custom cache
```
**3. Shell environment variable**
```bash
export UNIFACE_CACHE_DIR=/path/to/custom/cache
```
All three methods set the same `UNIFACE_CACHE_DIR` environment variable under the hood. `get_cache_dir()` always returns the resolved path.
---
## Next Steps


@@ -23,11 +23,20 @@ graph TB
LMK[Landmarks]
ATTR[Attributes]
GAZE[Gaze]
HPOSE[Head Pose]
PARSE[Parsing]
SPOOF[Anti-Spoofing]
PRIV[Privacy]
end
subgraph Tracking
TRK[BYTETracker]
end
subgraph Indexing
IDX[FAISS Vector Store]
end
subgraph Output
FACE[Face Objects]
end
@@ -37,12 +46,16 @@ graph TB
DET --> LMK
DET --> ATTR
DET --> GAZE
DET --> HPOSE
DET --> PARSE
DET --> SPOOF
DET --> PRIV
DET --> TRK
REC --> IDX
REC --> FACE
LMK --> FACE
ATTR --> FACE
TRK --> FACE
```
---
@@ -51,12 +64,14 @@ graph TB
### 1. ONNX-First
All models use ONNX Runtime for inference:
UniFace runs inference primarily via ONNX Runtime for core components:
- **Cross-platform**: Same models work on macOS, Linux, Windows
- **Hardware acceleration**: Automatic selection of optimal provider
- **Production-ready**: No Python-only dependencies for inference
Some optional components (e.g., emotion TorchScript, torchvision NMS) require PyTorch.
### 2. Minimal Dependencies
Core dependencies are kept minimal:
@@ -74,12 +89,14 @@ tqdm # Progress bars
Factory functions and direct instantiation:
```python
# Factory function
detector = create_detector('retinaface')
from uniface.detection import RetinaFace
# Direct instantiation (recommended)
from uniface import RetinaFace
detector = RetinaFace()
# Or via factory function
from uniface.detection import create_detector
detector = create_detector('retinaface')
```
### 4. Type Safety
@@ -99,17 +116,20 @@ def detect(self, image: np.ndarray) -> list[Face]:
uniface/
├── detection/ # Face detection (RetinaFace, SCRFD, YOLOv5Face, YOLOv8Face)
├── recognition/ # Face recognition (AdaFace, ArcFace, MobileFace, SphereFace)
├── tracking/ # Multi-object tracking (BYTETracker)
├── landmark/ # 106-point landmarks
├── attribute/ # Age, gender, emotion, race
├── parsing/ # Face semantic segmentation
├── gaze/ # Gaze estimation
├── headpose/ # Head pose estimation
├── spoofing/ # Anti-spoofing
├── privacy/ # Face anonymization
├── types.py # Dataclasses (Face, GazeResult, etc.)
├── indexing/ # Vector indexing (FAISS)
├── types.py # Dataclasses (Face, GazeResult, HeadPoseResult, etc.)
├── constants.py # Model weights and URLs
├── model_store.py # Model download and caching
├── onnx_utils.py # ONNX Runtime utilities
└── visualization.py # Drawing utilities
└── draw.py # Drawing utilities
```
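
A short import sketch mapping the tree above to actual imports (both module paths also appear elsewhere in this diff):

```python
from uniface.detection import RetinaFace   # detection/ subpackage
from uniface.draw import draw_detections   # draw.py drawing utilities

detector = RetinaFace()  # models auto-download on first use
```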
---
@@ -120,7 +140,9 @@ A typical face analysis workflow:
```python
import cv2
from uniface import RetinaFace, ArcFace, AgeGender
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
# 1. Initialize models
detector = RetinaFace()
@@ -139,7 +161,7 @@ for face in faces:
embedding = recognizer.get_normalized_embedding(image, face.landmarks)
# Attributes
attrs = age_gender.predict(image, face.bbox)
attrs = age_gender.predict(image, face)
print(f"Face: {attrs.sex}, {attrs.age} years")
```
@@ -151,12 +173,20 @@ for face in faces:
For convenience, `FaceAnalyzer` combines multiple modules:
```python
from uniface import FaceAnalyzer
from uniface.analyzer import FaceAnalyzer
from uniface.attribute import AgeGender, FairFace
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
detector = RetinaFace()
recognizer = ArcFace()
age_gender = AgeGender()
fairface = FairFace()
analyzer = FaceAnalyzer(
detect=True,
recognize=True,
attributes=True
detector,
recognizer=recognizer,
attributes=[age_gender, fairface],
)
faces = analyzer.analyze(image)
@@ -170,7 +200,7 @@ for face in faces:
## Model Lifecycle
1. **First use**: Model is downloaded from GitHub releases
2. **Cached**: Stored in `~/.uniface/models/`
2. **Cached**: Stored in `~/.uniface/models/` (configurable via `set_cache_dir()` or `UNIFACE_CACHE_DIR`)
3. **Verified**: SHA-256 checksum validation
4. **Loaded**: ONNX Runtime session created
5. **Inference**: Hardware-accelerated execution
@@ -179,6 +209,11 @@ for face in faces:
# Models auto-download on first use
detector = RetinaFace() # Downloads if not cached
# Optionally configure cache location
from uniface.model_store import get_cache_dir, set_cache_dir
set_cache_dir('/data/models')
print(get_cache_dir()) # /data/models
# Or manually pre-download
from uniface.model_store import verify_model_weights
from uniface.constants import RetinaFaceWeights


@@ -11,7 +11,7 @@ This page explains how to tune detection and recognition thresholds for your use
Controls minimum confidence for face detection:
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
# Default (balanced)
detector = RetinaFace(confidence_threshold=0.5)
@@ -81,7 +81,7 @@ For identity verification (same person check):
```python
import numpy as np
from uniface import compute_similarity
from uniface.face_utils import compute_similarity
similarity = compute_similarity(embedding1, embedding2)
@@ -199,7 +199,7 @@ else:
For drawing detections, filter by confidence:
```python
from uniface.visualization import draw_detections
from uniface.draw import draw_detections
# Only draw high-confidence detections
bboxes = [f.bbox for f in faces if f.confidence > 0.7]


@@ -32,7 +32,7 @@ ruff check . --fix
**Guidelines:**
- Line length: 120
- Python 3.10+ type hints
- Python 3.11+ type hints
- Google-style docstrings
---

docs/datasets.md (new file, 348 lines)

@@ -0,0 +1,348 @@
# Datasets
Overview of all training datasets and evaluation benchmarks used by UniFace models.
---
## Quick Reference
| Task | Dataset | Scale | Models |
| ----------- | ------------------------------------------------ | ---------------------- | ------------------------------------------- |
| Detection | [WIDER FACE](#wider-face) | 32K images | RetinaFace, SCRFD, YOLOv5-Face, YOLOv8-Face |
| Recognition | [MS1MV2](#ms1mv2) | 5.8M images, 85.7K IDs | MobileFace, SphereFace |
| Recognition | [WebFace600K](#webface600k) | 600K images | ArcFace |
| Recognition | [WebFace4M / WebFace12M](#webface4m--webface12m) | 4M / 12M images | AdaFace |
| Gaze | [Gaze360](#gaze360) | 238 subjects | MobileGaze |
| Parsing | [CelebAMask-HQ](#celebamask-hq) | 30K images | BiSeNet |
| Attributes | [CelebA](#celeba) | 200K images | AgeGender |
| Attributes | [FairFace](#fairface) | Balanced demographics | FairFace |
| Attributes | [AffectNet](#affectnet) | Emotion labels | Emotion |
---
## Training Datasets
### Face Detection
#### WIDER FACE
Large-scale face detection benchmark with images across 61 event categories. Contains faces with a high degree of variability in scale, pose, occlusion, expression, and illumination.
| Property | Value |
| -------- | ------------------------------------------- |
| Images | ~32,000 (train/val/test split) |
| Faces | ~394,000 annotated |
| Subsets | Easy, Medium, Hard |
| Used by | RetinaFace, SCRFD, YOLOv5-Face, YOLOv8-Face |
!!! info "Download & References"
**Paper**: [WIDER FACE: A Face Detection Benchmark](https://arxiv.org/abs/1511.06523)
**Download**: [http://shuoyang1213.me/WIDERFACE/](http://shuoyang1213.me/WIDERFACE/)
---
### Face Recognition
#### MS1MV2
Refined version of the MS-Celeb-1M dataset, cleaned by InsightFace. Widely used for training face recognition models.
| Property | Value |
| ---------- | ------------------------------ |
| Identities | 85.7K |
| Images | 5.8M |
| Format | Aligned and cropped to 112x112 |
| Used by | MobileFace, SphereFace |
!!! info "Download"
**Kaggle (aligned 112x112)**: [ms1m-arcface-dataset](https://www.kaggle.com/datasets/yakhyokhuja/ms1m-arcface-dataset) (from InsightFace)
**Training code**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition)
---
#### WebFace600K
Medium-scale face recognition dataset from the WebFace series.
| Property | Value |
| -------- | ------- |
| Images | ~600K |
| Used by | ArcFace |
!!! info "Source"
**Origin**: [InsightFace](https://github.com/deepinsight/insightface)
**Paper**: [ArcFace: Additive Angular Margin Loss for Deep Face Recognition](https://arxiv.org/abs/1801.07698)
---
#### WebFace4M / WebFace12M
Large-scale face recognition datasets from the WebFace260M collection. Used for training AdaFace models with adaptive quality-aware margin.
| Property | WebFace4M | WebFace12M |
| -------- | ------------- | -------------- |
| Images | ~4M | ~12M |
| Used by | AdaFace IR_18 | AdaFace IR_101 |
!!! info "Source"
**Paper**: [AdaFace: Quality Adaptive Margin for Face Recognition](https://arxiv.org/abs/2204.00964)
**Original code**: [mk-minchul/AdaFace](https://github.com/mk-minchul/AdaFace)
---
#### CASIA-WebFace
Smaller-scale face recognition dataset suitable for academic research and lighter training runs.
| Property | Value |
| ---------- | ------------------------------ |
| Identities | 10.6K |
| Images | 491K |
| Format | Aligned and cropped to 112x112 |
| Used by | Alternative training set |
!!! info "Download"
**Kaggle (aligned 112x112)**: [webface-112x112](https://www.kaggle.com/datasets/yakhyokhuja/webface-112x112) (from OpenSphere)
---
#### VGGFace2
Large-scale dataset with wide variations in pose, age, illumination, ethnicity, and profession.
| Property | Value |
| ---------- | ------------------------------ |
| Identities | 8.6K |
| Images | 3.1M |
| Format | Aligned and cropped to 112x112 |
| Used by | Alternative training set |
!!! info "Download"
**Kaggle (aligned 112x112)**: [vggface2-112x112](https://www.kaggle.com/datasets/yakhyokhuja/vggface2-112x112) (from OpenSphere)
---
### Gaze Estimation
#### Gaze360
Large-scale gaze estimation dataset collected in indoor and outdoor environments with diverse head poses and wide gaze ranges (up to 360 degrees).
| Property | Value |
| ----------- | --------------------- |
| Subjects | 238 |
| Environment | Indoor and outdoor |
| Used by | All MobileGaze models |
!!! info "Download & Preprocessing"
**Download**: [gaze360.csail.mit.edu/download.php](https://gaze360.csail.mit.edu/download.php)
**Preprocessing**: [GazeHub - Gaze360](https://phi-ai.buaa.edu.cn/Gazehub/3D-dataset/#gaze360)
!!! note "UniFace Models"
All MobileGaze models shipped with UniFace are trained exclusively on Gaze360 for 200 epochs.
**Dataset structure:**
```
data/
└── Gaze360/
├── Image/
└── Label/
```
---
#### MPIIFaceGaze
Dataset for appearance-based gaze estimation from laptop webcam images of participants during everyday laptop usage. Supported by the gaze estimation training code but not used for the UniFace pretrained weights.
| Property | Value |
| ----------- | ---------------------------------------- |
| Subjects | 15 |
| Environment | Everyday laptop usage |
| Used by | Supported (not used for UniFace weights) |
!!! info "Download & Preprocessing"
**Download**: [MPIIFaceGaze download page](https://www.mpi-inf.mpg.de/departments/computer-vision-and-machine-learning/research/gaze-based-human-computer-interaction/its-written-all-over-your-face-full-face-appearance-based-gaze-estimation)
**Preprocessing**: [GazeHub - MPIIFaceGaze](https://phi-ai.buaa.edu.cn/Gazehub/3D-dataset/#mpiifacegaze)
**Dataset structure:**
```
data/
└── MPIIFaceGaze/
├── Image/
└── Label/
```
---
### Head Pose Estimation
#### 300W-LP
Large-scale synthesized face dataset with large pose variations, generated from 300W by face profiling. Used for training head pose estimation models.
| Property | Value |
| ----------- | ----------------------------- |
| Images | ~122,000 (synthesized) |
| Source | 300W (profiled) |
| Pose range | ±90° yaw |
| Evaluation | AFLW2000 |
| Used by | All HeadPose models |
!!! info "Download & Reference"
**Paper**: [Face Alignment Across Large Poses: A 3D Solution](https://arxiv.org/abs/1511.07212)
**Training code**: [yakhyo/head-pose-estimation](https://github.com/yakhyo/head-pose-estimation)
!!! note "UniFace Models"
All HeadPose models shipped with UniFace are trained on 300W-LP and evaluated on AFLW2000.
---
### Face Parsing
#### CelebAMask-HQ
High-quality face parsing dataset with pixel-level annotations for 19 facial component classes.
| Property | Value |
| ---------- | ---------------------------- |
| Images | 30,000 |
| Classes | 19 facial components |
| Resolution | High quality |
| Used by | BiSeNet (ResNet18, ResNet34) |
!!! info "Source"
**GitHub**: [switchablenorms/CelebAMask-HQ](https://github.com/switchablenorms/CelebAMask-HQ)
**Training code**: [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing)
**Dataset structure:**
```
dataset/
├── images/ # Input face images
│ ├── image1.jpg
│ └── ...
└── labels/ # Segmentation masks
├── image1.png
└── ...
```
---
### Attribute Analysis
#### CelebA
Large-scale face attributes dataset widely used for training age and gender prediction models.
| Property | Value |
| ---------- | -------------------- |
| Images | ~200K |
| Attributes | 40 binary attributes |
| Used by | AgeGender |
!!! info "Reference"
**Paper**: [Deep Learning Face Attributes in the Wild](https://arxiv.org/abs/1411.7766)
---
#### FairFace
Face attribute dataset designed for balanced representation across race, gender, and age groups. Provides more equitable predictions compared to imbalanced datasets.
| Property | Value |
| ---------- | ----------------------------------- |
| Attributes | Race (7), Gender (2), Age Group (9) |
| Used by | FairFace |
| License | CC BY 4.0 |
!!! info "Reference"
**Paper**: [FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age](https://arxiv.org/abs/1908.04913)
**ONNX inference**: [yakhyo/fairface-onnx](https://github.com/yakhyo/fairface-onnx)
---
#### AffectNet
Large-scale facial expression dataset for emotion recognition training.
| Property | Value |
| -------- | ----------------------------------------------------------------------- |
| Classes | 7 or 8 (Neutral, Happy, Sad, Surprise, Fear, Disgust, Angry + Contempt) |
| Used by | Emotion (AFFECNET7, AFFECNET8) |
!!! info "Reference"
**Paper**: [AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild](https://ieeexplore.ieee.org/document/8013713)
---
## Evaluation Benchmarks
### Face Detection
#### WIDER FACE Validation Set
The standard benchmark for face detection models. Results are reported across three difficulty subsets.
| Subset | Criteria |
| ------ | --------------------------------------------- |
| Easy | Large, clear, unoccluded faces |
| Medium | Moderate scale and occlusion |
| Hard | Small, heavily occluded, or challenging faces |
See [Model Zoo - Detection](models.md#face-detection-models) for per-model accuracy on each subset.
---
### Face Recognition
Recognition models are evaluated across multiple benchmarks. Aligned 112x112 validation datasets are available as a single download.
!!! info "Download"
**Kaggle**: [agedb-30-calfw-cplfw-lfw-aligned-112x112](https://www.kaggle.com/datasets/yakhyokhuja/agedb-30-calfw-cplfw-lfw-aligned-112x112)
| Benchmark | Description | Used by |
| ------------ | ----------------------------------------------------------------- | ------------------------------- |
| **LFW** | Labeled Faces in the Wild - standard face verification benchmark | ArcFace, MobileFace, SphereFace |
| **CALFW** | Cross-Age LFW - face verification across age gaps | MobileFace, SphereFace |
| **CPLFW** | Cross-Pose LFW - face verification across pose variations | MobileFace, SphereFace |
| **AgeDB-30** | Age database with 30-year age gaps | ArcFace, MobileFace, SphereFace |
| **CFP-FP** | Celebrities in Frontal-Profile - frontal vs. profile verification | ArcFace |
| **IJB-B** | IARPA Janus Benchmark B - TAR@FAR=0.01% | AdaFace |
| **IJB-C** | IARPA Janus Benchmark C - TAR@FAR=1e-4 | AdaFace, ArcFace |
See [Model Zoo - Recognition](models.md#face-recognition-models) for per-model accuracy on each benchmark.
---
### Gaze Estimation
| Benchmark | Metric | Description |
| -------------------- | ------------- | -------------------------------------------- |
| **Gaze360 test set** | MAE (degrees) | Mean Absolute Error in gaze angle prediction |
See [Model Zoo - Gaze](models.md#gaze-estimation-models) for per-model MAE scores.
---
## Training Repositories
For training your own models or reproducing results, see the following repositories:
| Task | Repository | Datasets Supported |
| ----------- | ------------------------------------------------------------------------- | ------------------------------- |
| Detection | [yakhyo/retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) | WIDER FACE |
| Recognition | [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) | MS1MV2, CASIA-WebFace, VGGFace2 |
| Gaze | [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation) | Gaze360, MPIIFaceGaze |
| Parsing | [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing) | CelebAMask-HQ |


@@ -10,12 +10,17 @@ template: home.html
# UniFace { .hero-title }
<p class="hero-subtitle">A lightweight, production-ready face analysis library built on ONNX Runtime</p>
<p class="hero-subtitle">All-in-One Open-Source Face Analysis Library</p>
[![PyPI](https://img.shields.io/pypi/v/uniface.svg?label=PyPI)](https://pypi.org/project/uniface/)
[![Python](https://img.shields.io/badge/Python-3.10%2B-blue)](https://www.python.org/)
[![PyPI Version](https://img.shields.io/pypi/v/uniface.svg?label=Version)](https://pypi.org/project/uniface/)
[![Python Version](https://img.shields.io/badge/Python-3.11%2B-blue)](https://www.python.org/)
[![License](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![Github Build Status](https://github.com/yakhyo/uniface/actions/workflows/ci.yml/badge.svg)](https://github.com/yakhyo/uniface/actions)
[![PyPI Downloads](https://static.pepy.tech/personalized-badge/uniface?period=total&units=INTERNATIONAL_SYSTEM&left_color=GRAY&right_color=BLUE&left_text=Downloads)](https://pepy.tech/projects/uniface)
[![Kaggle Badge](https://img.shields.io/badge/Notebooks-Kaggle?label=Kaggle&color=blue)](https://www.kaggle.com/yakhyokhuja/code)
[![Discord](https://img.shields.io/badge/Discord-Join%20Server-5865F2?logo=discord&logoColor=white)](https://discord.gg/wdzrjr7R5j)
<!-- <img src="https://raw.githubusercontent.com/yakhyo/uniface/main/.github/logos/uniface_rounded_q80.webp" alt="UniFace - All-in-One Open-Source Face Analysis Library" style="max-width: 70%; margin: 1rem 0;"> -->
[Get Started](quickstart.md){ .md-button .md-button--primary }
[View on GitHub](https://github.com/yakhyo/uniface){ .md-button }
@@ -54,6 +59,16 @@ BiSeNet semantic segmentation with 19 facial component classes.
Real-time gaze direction prediction with MobileGaze models.
</div>
<div class="feature-card" markdown>
### :material-axis-arrow: Head Pose
3D head orientation (pitch, yaw, roll) estimation with 6D rotation models.
</div>
<div class="feature-card" markdown>
### :material-motion-play: Tracking
Multi-object tracking with BYTETracker for persistent face IDs across video frames.
</div>
<div class="feature-card" markdown>
### :material-shield-check: Anti-Spoofing
Face liveness detection with MiniFASNet to prevent fraud.
@@ -64,31 +79,35 @@ Face liveness detection with MiniFASNet to prevent fraud.
Face anonymization with 5 blur methods for privacy protection.
</div>
<div class="feature-card" markdown>
### :material-database-search: Vector Indexing
FAISS-backed embedding store for fast multi-identity face search.
</div>
</div>
---
## Installation
=== "Standard"
UniFace runs inference primarily via **ONNX Runtime**; some optional components (e.g., emotion TorchScript, torchvision NMS) require **PyTorch**.
```bash
pip install uniface
```
**Standard**
```bash
pip install uniface
```
=== "GPU (CUDA)"
**GPU (CUDA)**
```bash
pip install uniface[gpu]
```
```bash
pip install uniface[gpu]
```
=== "From Source"
```bash
git clone https://github.com/yakhyo/uniface.git
cd uniface
pip install -e .
```
**From Source**
```bash
git clone https://github.com/yakhyo/uniface.git
cd uniface
pip install -e .
```
---


@@ -6,7 +6,7 @@ This guide covers all installation options for UniFace.
## Requirements
- **Python**: 3.10 or higher
- **Python**: 3.11 or higher
- **Operating Systems**: macOS, Linux, Windows
---
@@ -55,11 +55,10 @@ pip install uniface[gpu]
**Requirements:**
- CUDA 11.x or 12.x
- cuDNN 8.x
- `uniface[gpu]` automatically installs `onnxruntime-gpu`. Requirements depend on the ORT version and execution provider.
!!! info "CUDA Compatibility"
See [ONNX Runtime GPU requirements](https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html) for detailed compatibility matrix.
See the [ONNX Runtime GPU compatibility matrix](https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html) for matching CUDA and cuDNN versions.
Verify GPU installation:
@@ -71,6 +70,19 @@ print("Available providers:", ort.get_available_providers())
---
### FAISS Vector Indexing
For fast multi-identity face search using a FAISS index:
```bash
pip install faiss-cpu # CPU
pip install faiss-gpu # NVIDIA GPU (CUDA)
```
See the [Indexing module](modules/indexing.md) for usage.
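
As an illustration of what such an index does under the hood, a minimal raw-faiss sketch (independent of UniFace's own indexing wrapper; the embeddings here are random placeholders):

```python
import faiss
import numpy as np

dim = 512
index = faiss.IndexFlatIP(dim)  # inner product == cosine similarity on unit vectors

# Placeholder gallery of L2-normalized embeddings
gallery = np.random.rand(100, dim).astype('float32')
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)
index.add(gallery)

query = gallery[:1]                     # placeholder query embedding
scores, ids = index.search(query, k=5)  # top-5 most similar gallery entries
```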
---
### CPU-Only (All Platforms)
```bash
@@ -108,9 +120,19 @@ UniFace has minimal dependencies:
| `numpy` | Array operations |
| `opencv-python` | Image processing |
| `onnxruntime` | Model inference |
| `scikit-image` | Geometric transforms |
| `requests` | Model download |
| `tqdm` | Progress bars |
**Optional:**
| Package | Install extra | Purpose |
|---------|---------------|---------|
| `faiss-cpu` / `faiss-gpu` | `pip install faiss-cpu` | FAISS vector indexing |
| `onnxruntime-gpu` | `uniface[gpu]` | CUDA acceleration |
| `torch` | `pip install torch` | Emotion model uses TorchScript |
| `torchvision` | `pip install torchvision` | Faster NMS for YOLO detectors |
---
## Verify Installation
@@ -126,7 +148,7 @@ import onnxruntime as ort
print(f"Available providers: {ort.get_available_providers()}")
# Quick test
from uniface import RetinaFace
from uniface.detection import RetinaFace
detector = RetinaFace()
print("Installation successful!")
```
@@ -137,11 +159,11 @@ print("Installation successful!")
### Import Errors
If you encounter import errors, ensure you're using Python 3.10+:
If you encounter import errors, ensure you're using Python 3.11+:
```bash
python --version
# Should show: Python 3.10.x or higher
# Should show: Python 3.11.x or higher
```
### Model Download Issues


@@ -8,7 +8,7 @@ Complete guide to all available models and their performance characteristics.
### RetinaFace Family
RetinaFace models are trained on the WIDER FACE dataset.
RetinaFace models are trained on the [WIDER FACE](datasets.md#wider-face) dataset.
| Model Name | Params | Size | Easy | Medium | Hard |
| -------------- | ------ | ----- | ------ | ------ | ------ |
@@ -22,29 +22,29 @@ RetinaFace models are trained on the WIDER FACE dataset.
!!! info "Accuracy & Benchmarks"
**Accuracy**: WIDER FACE validation set (Easy/Medium/Hard subsets) - from [RetinaFace paper](https://arxiv.org/abs/1905.00641)
**Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --iterations 100`
**Speed**: Benchmark on your own hardware using `python tools/detect.py --source <image>`
---
### SCRFD Family
SCRFD (Sample and Computation Redistribution for Efficient Face Detection) models trained on WIDER FACE dataset.
SCRFD (Sample and Computation Redistribution for Efficient Face Detection) models trained on [WIDER FACE](datasets.md#wider-face) dataset.
| Model Name | Params | Size | Easy | Medium | Hard |
| ---------------- | ------ | ----- | ------ | ------ | ------ |
| `SCRFD_500M` | 0.6M | 2.5MB | 90.57% | 88.12% | 68.51% |
| `SCRFD_10G` :material-check-circle: | 4.2M | 17MB | 95.16% | 93.87% | 83.05% |
| `SCRFD_500M_KPS` | 0.6M | 2.5MB | 90.57% | 88.12% | 68.51% |
| `SCRFD_10G_KPS` :material-check-circle: | 4.2M | 17MB | 95.16% | 93.87% | 83.05% |
!!! info "Accuracy & Benchmarks"
**Accuracy**: WIDER FACE validation set - from [SCRFD paper](https://arxiv.org/abs/2105.04714)
**Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --iterations 100`
**Speed**: Benchmark on your own hardware using `python tools/detect.py --source <image>`
---
### YOLOv5-Face Family
YOLOv5-Face models provide detection with 5-point facial landmarks, trained on WIDER FACE dataset.
YOLOv5-Face models provide detection with 5-point facial landmarks, trained on [WIDER FACE](datasets.md#wider-face) dataset.
| Model Name | Size | Easy | Medium | Hard |
| -------------- | ---- | ------ | ------ | ------ |
@@ -55,7 +55,7 @@ YOLOv5-Face models provide detection with 5-point facial landmarks, trained on W
!!! info "Accuracy & Benchmarks"
**Accuracy**: WIDER FACE validation set - from [YOLOv5-Face paper](https://arxiv.org/abs/2105.12931)
**Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --iterations 100`
**Speed**: Benchmark on your own hardware using `python tools/detect.py --source <image>`
!!! note "Fixed Input Size"
All YOLOv5-Face models use a fixed input size of 640×640.
@@ -74,7 +74,7 @@ YOLOv8-Face models use anchor-free design with DFL (Distribution Focal Loss) for
!!! info "Accuracy & Benchmarks"
**Accuracy**: WIDER FACE validation set (Easy/Medium/Hard subsets)
**Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --method yolov8face`
**Speed**: Benchmark on your own hardware using `python tools/detect.py --source <image> --method yolov8face`
!!! note "Fixed Input Size"
All YOLOv8-Face models use a fixed input size of 640×640.
@@ -93,7 +93,7 @@ Face recognition using adaptive margin based on image quality.
| `IR_101` | IR-101 | WebFace12M | 249 MB | - | 97.66% |
!!! info "Training Data & Accuracy"
**Dataset**: WebFace4M (4M images) / WebFace12M (12M images)
**Dataset**: [WebFace4M / WebFace12M](datasets.md#webface4m--webface12m) (4M / 12M images)
**Accuracy**: IJB-B and IJB-C benchmarks, TAR@FAR=0.01%
@@ -113,7 +113,7 @@ Face recognition using additive angular margin loss.
| `RESNET` | ResNet50 | 43.6M | 166MB | 99.83% | 99.33% | 98.23% | 97.25% |
!!! info "Training Data"
**Dataset**: Trained on WebFace600K (600K images)
**Dataset**: Trained on [WebFace600K](datasets.md#webface600k) (600K images)
**Accuracy**: IJB-C accuracy reported as TAR@FAR=1e-4
@@ -131,7 +131,7 @@ Lightweight face recognition models with MobileNet backbones.
| `MNET_V3_LARGE` | MobileNetV3-L | 3.52M | 10MB | 99.53% | 94.56% | 86.79% | 95.13% |
!!! info "Training Data"
**Dataset**: Trained on MS1M-V2 (5.8M images, 85K identities)
**Dataset**: Trained on [MS1MV2](datasets.md#ms1mv2) (5.8M images, 85K identities)
**Accuracy**: Evaluated on LFW, CALFW, CPLFW, and AgeDB-30 benchmarks
@@ -147,7 +147,7 @@ Face recognition using angular softmax loss.
| `SPHERE36` | Sphere36 | 34.6M | 92MB | 99.72% | 95.64% | 89.92% | 96.83% |
!!! info "Training Data"
**Dataset**: Trained on MS1M-V2 (5.8M images, 85K identities)
**Dataset**: Trained on [MS1MV2](datasets.md#ms1mv2) (5.8M images, 85K identities)
**Accuracy**: Evaluated on LFW, CALFW, CPLFW, and AgeDB-30 benchmarks
@@ -187,7 +187,7 @@ Facial landmark localization model.
| `AgeGender` | Age, Gender | 2.1M | 8MB |
!!! info "Training Data"
**Dataset**: Trained on CelebA
**Dataset**: Trained on [CelebA](datasets.md#celeba)
!!! warning "Accuracy Note"
Accuracy varies by demographic and image quality. Test on your specific use case.
@@ -201,7 +201,7 @@ Facial landmark localization model.
| `FairFace` | Race, Gender, Age Group | - | 44MB |
!!! info "Training Data"
**Dataset**: Trained on FairFace dataset with balanced demographics
**Dataset**: Trained on [FairFace](datasets.md#fairface) dataset with balanced demographics
!!! tip "Equitable Predictions"
FairFace provides more equitable predictions across different racial and gender groups.
@@ -219,12 +219,12 @@ Facial landmark localization model.
| `AFFECNET7` | 7 | 0.5M | 2MB |
| `AFFECNET8` | 8 | 0.5M | 2MB |
**Classes (7)**: Neutral, Happy, Sad, Surprise, Fear, Disgust, Anger
**Classes (7)**: Neutral, Happy, Sad, Surprise, Fear, Disgust, Angry
**Classes (8)**: Above + Contempt
!!! info "Training Data"
**Dataset**: Trained on AffectNet
**Dataset**: Trained on [AffectNet](datasets.md#affectnet)
!!! note "Accuracy Note"
Emotion detection accuracy depends heavily on facial expression clarity and cultural context.
@@ -235,7 +235,7 @@ Facial landmark localization model.
### MobileGaze Family
Gaze direction prediction models trained on Gaze360 dataset. Returns pitch (vertical) and yaw (horizontal) angles in radians.
Gaze direction prediction models trained on [Gaze360](datasets.md#gaze360) dataset. Returns pitch (vertical) and yaw (horizontal) angles in radians.
| Model Name | Params | Size | MAE* |
| -------------- | ------ | ------- | ----- |
@@ -248,7 +248,7 @@ Gaze direction prediction models trained on Gaze360 dataset. Returns pitch (vert
*MAE (Mean Absolute Error) in degrees on Gaze360 test set - lower is better
!!! info "Training Data"
**Dataset**: Trained on Gaze360 (indoor/outdoor scenes with diverse head poses)
**Dataset**: Trained on [Gaze360](datasets.md#gaze360) (indoor/outdoor scenes with diverse head poses)
**Training**: 200 epochs with classification-based approach (binned angles)
@@ -257,6 +257,33 @@ Gaze direction prediction models trained on Gaze360 dataset. Returns pitch (vert
---
## Head Pose Estimation Models
### HeadPose Family
Head pose estimation models using 6D rotation representation. Trained on [300W-LP](datasets.md#300w-lp) dataset, evaluated on AFLW2000. Returns pitch, yaw, and roll angles in degrees.
| Model Name | Backbone | Size | MAE* |
| -------------- | -------- | ------- | ----- |
| `RESNET18` :material-check-circle: | ResNet18 | 43 MB | 5.22° |
| `RESNET34` | ResNet34 | 82 MB | 5.07° |
| `RESNET50` | ResNet50 | 91 MB | 4.83° |
| `MOBILENET_V2` | MobileNetV2 | 9.6 MB | 5.72° |
| `MOBILENET_V3_SMALL` | MobileNetV3-Small | 4.8 MB | 6.31° |
| `MOBILENET_V3_LARGE` | MobileNetV3-Large | 16 MB | 5.58° |
*MAE (Mean Absolute Error) in degrees on AFLW2000 test set — lower is better
!!! info "Training Data"
**Dataset**: Trained on [300W-LP](datasets.md#300w-lp) (synthesized large-pose faces from 300W)
**Method**: 6D rotation representation (rotation matrix → Euler angles)
!!! note "Input Requirements"
Requires face crop as input. Use face detection first to obtain bounding boxes.
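For intuition, the 6D output is two 3-vectors that are Gram-Schmidt-orthogonalised into a rotation matrix, which is then decomposed into Euler angles. The sketch below is a generic illustration of that conversion, not the exact code inside `HeadPose`, and the axis convention may differ:
```python
import numpy as np

def rotation_matrix_from_6d(x: np.ndarray) -> np.ndarray:
    """Gram-Schmidt the two 3-vectors of a 6D prediction into a 3x3 rotation matrix."""
    a1, a2 = x[:3], x[3:6]
    b1 = a1 / np.linalg.norm(a1)
    b2 = a2 - np.dot(b1, a2) * b1
    b2 = b2 / np.linalg.norm(b2)
    b3 = np.cross(b1, b2)
    return np.stack([b1, b2, b3], axis=-1)  # b1, b2, b3 as columns

def euler_angles_deg(R: np.ndarray) -> tuple[float, float, float]:
    """ZYX decomposition into (pitch, yaw, roll) in degrees; conventions vary by implementation."""
    pitch = np.degrees(np.arctan2(R[2, 1], R[2, 2]))
    yaw = np.degrees(np.arctan2(-R[2, 0], np.hypot(R[2, 1], R[2, 2])))
    roll = np.degrees(np.arctan2(R[1, 0], R[0, 0]))
    return pitch, yaw, roll
```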
---
## Face Parsing Models
### BiSeNet Family
@@ -269,7 +296,7 @@ BiSeNet (Bilateral Segmentation Network) models for semantic face parsing. Segme
| `RESNET34` | 24.1M | 89.2 MB | 19 |
!!! info "Training Data"
**Dataset**: Trained on CelebAMask-HQ
**Dataset**: Trained on [CelebAMask-HQ](datasets.md#celebamask-hq)
**Architecture**: BiSeNet with ResNet backbone
@@ -279,13 +306,13 @@ BiSeNet (Bilateral Segmentation Network) models for semantic face parsing. Segme
| # | Class | # | Class | # | Class |
|---|-------|---|-------|---|-------|
| 1 | Background | 8 | Left Ear | 15 | Neck |
| 2 | Skin | 9 | Right Ear | 16 | Neck Lace |
| 3 | Left Eyebrow | 10 | Ear Ring | 17 | Cloth |
| 4 | Right Eyebrow | 11 | Nose | 18 | Hair |
| 5 | Left Eye | 12 | Mouth | 19 | Hat |
| 6 | Right Eye | 13 | Upper Lip | | |
| 7 | Eye Glasses | 14 | Lower Lip | | |
| 0 | Background | 7 | Left Ear | 14 | Neck |
| 1 | Skin | 8 | Right Ear | 15 | Necklace |
| 2 | Left Eyebrow | 9 | Earring | 16 | Cloth |
| 3 | Right Eyebrow | 10 | Nose | 17 | Hair |
| 4 | Left Eye | 11 | Mouth | 18 | Hat |
| 5 | Right Eye | 12 | Upper Lip | | |
| 6 | Eyeglasses | 13 | Lower Lip | | |
**Applications:**
@@ -300,6 +327,32 @@ BiSeNet (Bilateral Segmentation Network) models for semantic face parsing. Segme
---
### XSeg
XSeg from DeepFaceLab outputs masks for face regions. Requires 5-point landmarks for face alignment.
| Model Name | Size | Output |
|------------|--------|--------|
| `DEFAULT` | 67 MB | Mask [0, 1] |
!!! info "Model Details"
**Origin**: DeepFaceLab
**Input**: NHWC format, normalized to [0, 1]
**Alignment**: Requires 5-point landmarks (not bbox crops)
**Applications:**
- Face region extraction
- Face swapping pipelines
- Occlusion handling
!!! note "Input Requirements"
Requires 5-point facial landmarks. Use a face detector like RetinaFace to obtain landmarks first.
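A minimal sketch of the pipeline (see the Parsing module docs for the full example):
```python
import cv2
from uniface.detection import RetinaFace
from uniface.parsing import XSeg

image = cv2.imread("photo.jpg")
faces = RetinaFace().detect(image)
mask = XSeg().parse(image, landmarks=faces[0].landmarks)  # (H, W) mask with values in [0, 1]
```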
---
## Anti-Spoofing Models
### MiniFASNet Family
@@ -323,10 +376,14 @@ Face anti-spoofing models for liveness detection. Detect if a face is real (live
Models are automatically downloaded and cached on first use.
- **Cache location**: `~/.uniface/models/`
- **Cache location**: `~/.uniface/models/` (configurable via `set_cache_dir()` or `UNIFACE_CACHE_DIR` env var)
- **Inspect cache path**: `get_cache_dir()` returns the resolved active path
- **Verification**: Models are verified with SHA-256 checksums
- **Concurrent download**: `download_models([...])` fetches multiple models in parallel
- **Manual download**: Use `python tools/download_model.py` to pre-download models
See [Model Cache & Offline Use](concepts/model-cache-offline.md) for full details.
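A minimal sketch of redirecting the cache, assuming `set_cache_dir` and `get_cache_dir` are importable from the top-level `uniface` package as referenced above:
```python
from uniface import get_cache_dir, set_cache_dir

set_cache_dir("/data/uniface_models")  # or: export UNIFACE_CACHE_DIR=/data/uniface_models
print(get_cache_dir())                 # resolved active cache path
```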
---
## References
@@ -342,7 +399,9 @@ Models are automatically downloaded and cached on first use.
- **AdaFace ONNX**: [yakhyo/adaface-onnx](https://github.com/yakhyo/adaface-onnx) - ONNX export and inference
- **Face Recognition Training**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) - ArcFace, MobileFace, SphereFace training code
- **Gaze Estimation Training**: [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation) - MobileGaze training code and pretrained weights
- **Head Pose Estimation**: [yakhyo/head-pose-estimation](https://github.com/yakhyo/head-pose-estimation) - 6D rotation head pose estimation training and ONNX models
- **Face Parsing Training**: [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing) - BiSeNet training code and pretrained weights
- **Face Segmentation**: [yakhyo/face-segmentation](https://github.com/yakhyo/face-segmentation) - XSeg ONNX Inference
- **Face Anti-Spoofing**: [yakhyo/face-anti-spoofing](https://github.com/yakhyo/face-anti-spoofing) - MiniFASNet ONNX inference (weights from [minivision-ai/Silent-Face-Anti-Spoofing](https://github.com/minivision-ai/Silent-Face-Anti-Spoofing))
- **FairFace**: [yakhyo/fairface-onnx](https://github.com/yakhyo/fairface-onnx) - FairFace ONNX inference for race, gender, age prediction
- **InsightFace**: [deepinsight/insightface](https://github.com/deepinsight/insightface) - Model architectures and pretrained weights

View File

@@ -21,7 +21,8 @@ Predicts exact age and binary gender.
### Basic Usage
```python
from uniface import RetinaFace, AgeGender
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
detector = RetinaFace()
age_gender = AgeGender()
@@ -29,9 +30,10 @@ age_gender = AgeGender()
faces = detector.detect(image)
for face in faces:
result = age_gender.predict(image, face.bbox)
result = age_gender.predict(image, face)
print(f"Gender: {result.sex}") # "Female" or "Male"
print(f"Age: {result.age} years")
# face.gender and face.age are also set automatically
```
### Output
@@ -54,7 +56,8 @@ Predicts gender, age group, and race with balanced demographics.
### Basic Usage
```python
from uniface import RetinaFace, FairFace
from uniface.attribute import FairFace
from uniface.detection import RetinaFace
detector = RetinaFace()
fairface = FairFace()
@@ -62,10 +65,11 @@ fairface = FairFace()
faces = detector.detect(image)
for face in faces:
result = fairface.predict(image, face.bbox)
result = fairface.predict(image, face)
print(f"Gender: {result.sex}")
print(f"Age Group: {result.age_group}")
print(f"Race: {result.race}")
# face.gender, face.age_group, face.race are also set automatically
```
### Output
@@ -120,7 +124,7 @@ Predicts facial emotions. Requires PyTorch.
### Basic Usage
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.attribute import Emotion
from uniface.constants import DDAMFNWeights
@@ -130,7 +134,7 @@ emotion = Emotion(model_name=DDAMFNWeights.AFFECNET7)
faces = detector.detect(image)
for face in faces:
result = emotion.predict(image, face.landmarks)
result = emotion.predict(image, face)
print(f"Emotion: {result.emotion}")
print(f"Confidence: {result.confidence:.2%}")
```
@@ -147,7 +151,7 @@ for face in faces:
| Surprise |
| Fear |
| Disgust |
| Anger |
| Angry |
=== "8-Class (AFFECNET8)"
@@ -159,7 +163,7 @@ for face in faces:
| Surprise |
| Fear |
| Disgust |
| Anger |
| Angry |
| Contempt |
### Model Variants
@@ -177,12 +181,29 @@ emotion = Emotion(model_name=DDAMFNWeights.AFFECNET8)
---
## Factory Function
Use `create_attribute_predictor()` for dynamic model selection:
```python
from uniface import create_attribute_predictor
age_gender = create_attribute_predictor('age_gender')
fairface = create_attribute_predictor('fairface')
emotion = create_attribute_predictor('emotion')
```
Available model names: `'age_gender'`, `'fairface'`, `'emotion'`.
---
## Combining Models
### Full Attribute Analysis
```python
from uniface import RetinaFace, AgeGender, FairFace
from uniface.attribute import AgeGender, FairFace
from uniface.detection import RetinaFace
detector = RetinaFace()
age_gender = AgeGender()
@@ -192,10 +213,10 @@ faces = detector.detect(image)
for face in faces:
# Get exact age from AgeGender
ag_result = age_gender.predict(image, face.bbox)
ag_result = age_gender.predict(image, face)
# Get race from FairFace
ff_result = fairface.predict(image, face.bbox)
ff_result = fairface.predict(image, face)
print(f"Gender: {ag_result.sex}")
print(f"Exact Age: {ag_result.age}")
@@ -206,12 +227,13 @@ for face in faces:
### Using FaceAnalyzer
```python
from uniface import FaceAnalyzer
from uniface.analyzer import FaceAnalyzer
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
analyzer = FaceAnalyzer(
detect=True,
recognize=False,
attributes=True # Uses AgeGender
RetinaFace(),
attributes=[AgeGender()],
)
faces = analyzer.analyze(image)
@@ -253,7 +275,7 @@ def draw_attributes(image, face, result):
# Usage
for face in faces:
result = age_gender.predict(image, face.bbox)
result = age_gender.predict(image, face)
image = draw_attributes(image, face, result)
cv2.imwrite("attributes.jpg", image)

View File

@@ -24,7 +24,7 @@ Single-shot face detector with multi-scale feature pyramid.
### Basic Usage
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
detector = RetinaFace()
faces = detector.detect(image)
@@ -38,7 +38,7 @@ for face in faces:
### Model Variants
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.constants import RetinaFaceWeights
# Lightweight (mobile/edge)
@@ -82,7 +82,7 @@ State-of-the-art detection with excellent accuracy-speed tradeoff.
### Basic Usage
```python
from uniface import SCRFD
from uniface.detection import SCRFD
detector = SCRFD()
faces = detector.detect(image)
@@ -91,7 +91,7 @@ faces = detector.detect(image)
### Model Variants
```python
from uniface import SCRFD
from uniface.detection import SCRFD
from uniface.constants import SCRFDWeights
# Real-time (lightweight)
@@ -127,7 +127,7 @@ YOLO-based detection optimized for faces.
### Basic Usage
```python
from uniface import YOLOv5Face
from uniface.detection import YOLOv5Face
detector = YOLOv5Face()
faces = detector.detect(image)
@@ -136,7 +136,7 @@ faces = detector.detect(image)
### Model Variants
```python
from uniface import YOLOv5Face
from uniface.detection import YOLOv5Face
from uniface.constants import YOLOv5FaceWeights
# Lightweight
@@ -179,7 +179,7 @@ Anchor-free detection with DFL (Distribution Focal Loss) for accurate bbox regre
### Basic Usage
```python
from uniface import YOLOv8Face
from uniface.detection import YOLOv8Face
detector = YOLOv8Face()
faces = detector.detect(image)
@@ -188,7 +188,7 @@ faces = detector.detect(image)
### Model Variants
```python
from uniface import YOLOv8Face
from uniface.detection import YOLOv8Face
from uniface.constants import YOLOv8FaceWeights
# Lightweight
@@ -225,7 +225,7 @@ detector = YOLOv8Face(
Create detectors dynamically:
```python
from uniface import create_detector
from uniface.detection import create_detector
detector = create_detector('retinaface')
# or
@@ -238,22 +238,6 @@ detector = create_detector('yolov8face')
---
## High-Level API
One-line detection:
```python
from uniface import detect_faces
# Using RetinaFace (default)
faces = detect_faces(image, method='retinaface', confidence_threshold=0.5)
# Using YOLOv8-Face
faces = detect_faces(image, method='yolov8face', confidence_threshold=0.5)
```
---
## Output Format
All detectors return `list[Face]`:
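A minimal sketch of consuming the result (field names as used throughout these docs):
```python
faces = detector.detect(image)

for face in faces:
    x1, y1, x2, y2 = map(int, face.bbox)       # bounding box corners
    print(f"score={face.confidence:.2f} box=({x1}, {y1}, {x2}, {y2})")
    if face.landmarks is not None:             # 5-point landmarks when the model provides them
        print(face.landmarks.shape)
```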
@@ -276,7 +260,7 @@ for face in faces:
## Visualization
```python
from uniface.visualization import draw_detections
from uniface.draw import draw_detections
draw_detections(
image=image,
@@ -296,7 +280,7 @@ cv2.imwrite("result.jpg", image)
Benchmark on your hardware:
```bash
python tools/detection.py --source image.jpg --iterations 100
python tools/detect.py --source image.jpg
```
---

View File

@@ -23,7 +23,8 @@ Gaze estimation predicts where a person is looking (pitch and yaw angles).
```python
import cv2
import numpy as np
from uniface import RetinaFace, MobileGaze
from uniface.detection import RetinaFace
from uniface.gaze import MobileGaze
detector = RetinaFace()
gaze_estimator = MobileGaze()
@@ -52,7 +53,7 @@ for face in faces:
## Model Variants
```python
from uniface import MobileGaze
from uniface.gaze import MobileGaze
from uniface.constants import GazeWeights
# Default (ResNet34, recommended)
@@ -102,7 +103,7 @@ yaw = -90° ────┼──── yaw = +90°
## Visualization
```python
from uniface.visualization import draw_gaze
from uniface.draw import draw_gaze
# Detect faces
faces = detector.detect(image)
@@ -154,8 +155,9 @@ def draw_gaze_custom(image, bbox, pitch, yaw, length=100, color=(0, 255, 0)):
```python
import cv2
import numpy as np
from uniface import RetinaFace, MobileGaze
from uniface.visualization import draw_gaze
from uniface.detection import RetinaFace
from uniface.gaze import MobileGaze
from uniface.draw import draw_gaze
detector = RetinaFace()
gaze_estimator = MobileGaze()
@@ -256,7 +258,7 @@ print(f"Looking: {direction}")
## Factory Function
```python
from uniface import create_gaze_estimator
from uniface.gaze import create_gaze_estimator
gaze = create_gaze_estimator() # Returns MobileGaze
```
@@ -265,6 +267,7 @@ gaze = create_gaze_estimator() # Returns MobileGaze
## Next Steps
- [Head Pose Estimation](headpose.md) - 3D head orientation
- [Anti-Spoofing](spoofing.md) - Face liveness detection
- [Privacy](privacy.md) - Face anonymization
- [Video Recipe](../recipes/video-webcam.md) - Real-time processing

232
docs/modules/headpose.md Normal file
View File

@@ -0,0 +1,232 @@
# Head Pose Estimation
Head pose estimation predicts the 3D orientation of a person's head (pitch, yaw, and roll angles).
---
## Available Models
| Model | Backbone | Size | MAE* |
|-------|----------|------|------|
| **ResNet18** :material-check-circle: | ResNet18 | 43 MB | 5.22° |
| ResNet34 | ResNet34 | 82 MB | 5.07° |
| ResNet50 | ResNet50 | 91 MB | 4.83° |
| MobileNetV2 | MobileNetV2 | 9.6 MB | 5.72° |
| MobileNetV3-Small | MobileNetV3 | 4.8 MB | 6.31° |
| MobileNetV3-Large | MobileNetV3 | 16 MB | 5.58° |
*MAE = Mean Absolute Error on AFLW2000 test set (lower is better)
---
## Basic Usage
```python
import cv2
from uniface.detection import RetinaFace
from uniface.headpose import HeadPose
detector = RetinaFace()
head_pose = HeadPose()
image = cv2.imread("photo.jpg")
faces = detector.detect(image)
for face in faces:
# Crop face
x1, y1, x2, y2 = map(int, face.bbox)
face_crop = image[y1:y2, x1:x2]
if face_crop.size > 0:
# Estimate head pose
result = head_pose.estimate(face_crop)
print(f"Pitch: {result.pitch:.1f}°, Yaw: {result.yaw:.1f}°, Roll: {result.roll:.1f}°")
```
---
## Model Variants
```python
from uniface.headpose import HeadPose
from uniface.constants import HeadPoseWeights
# Default (ResNet18, recommended balance of speed and accuracy)
hp = HeadPose()
# Lightweight for mobile/edge
hp = HeadPose(model_name=HeadPoseWeights.MOBILENET_V3_SMALL)
# Higher accuracy
hp = HeadPose(model_name=HeadPoseWeights.RESNET50)
```
---
## Output Format
```python
result = head_pose.estimate(face_crop)
# HeadPoseResult dataclass
result.pitch # Rotation around X-axis in degrees
result.yaw # Rotation around Y-axis in degrees
result.roll # Rotation around Z-axis in degrees
```
### Angle Convention
```
          pitch > 0 (looking down)
yaw < 0 ─────────┼───────── yaw > 0
(looking left)   │   (looking right)
          pitch < 0 (looking up)

roll > 0 = clockwise tilt
roll < 0 = counter-clockwise tilt
```
- **Pitch**: Rotation around X-axis (positive = looking down)
- **Yaw**: Rotation around Y-axis (positive = looking right)
- **Roll**: Rotation around Z-axis (positive = tilting clockwise)
---
## Visualization
### 3D Cube (default)
The default visualization draws a wireframe cube oriented to match the head pose.
```python
from uniface.draw import draw_head_pose
faces = detector.detect(image)
for face in faces:
x1, y1, x2, y2 = map(int, face.bbox)
face_crop = image[y1:y2, x1:x2]
if face_crop.size > 0:
result = head_pose.estimate(face_crop)
# Draw cube on image (default)
draw_head_pose(image, face.bbox, result.pitch, result.yaw, result.roll)
cv2.imwrite("headpose_output.jpg", image)
```
### Axis Visualization
```python
from uniface.draw import draw_head_pose
# X/Y/Z coordinate axes
draw_head_pose(image, face.bbox, result.pitch, result.yaw, result.roll, draw_type='axis')
```
### Low-Level Drawing Functions
```python
from uniface.draw import draw_head_pose_cube, draw_head_pose_axis
# Draw cube directly
draw_head_pose_cube(image, yaw=10.0, pitch=-5.0, roll=2.0, bbox=[100, 100, 250, 280])
# Draw axes directly
draw_head_pose_axis(image, yaw=10.0, pitch=-5.0, roll=2.0, bbox=[100, 100, 250, 280])
```
---
## Real-Time Head Pose Tracking
```python
import cv2
from uniface.detection import RetinaFace
from uniface.headpose import HeadPose
from uniface.draw import draw_head_pose
detector = RetinaFace()
head_pose = HeadPose()
cap = cv2.VideoCapture(0)
while True:
ret, frame = cap.read()
if not ret:
break
faces = detector.detect(frame)
for face in faces:
x1, y1, x2, y2 = map(int, face.bbox)
face_crop = frame[y1:y2, x1:x2]
if face_crop.size > 0:
result = head_pose.estimate(face_crop)
draw_head_pose(frame, face.bbox, result.pitch, result.yaw, result.roll)
cv2.imshow("Head Pose Estimation", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
```
---
## Use Cases
### Driver Drowsiness Detection
```python
def is_head_drooping(result, pitch_threshold=-15):
"""Check if the head is drooping (looking down significantly)."""
return result.pitch < pitch_threshold
result = head_pose.estimate(face_crop)
if is_head_drooping(result):
print("Warning: Head drooping detected")
```
### Attention Monitoring
```python
def is_facing_forward(result, threshold=20):
"""Check if the person is facing roughly forward."""
return (
abs(result.pitch) < threshold
and abs(result.yaw) < threshold
and abs(result.roll) < threshold
)
result = head_pose.estimate(face_crop)
if is_facing_forward(result):
print("Facing forward")
else:
print("Looking away")
```
---
## Factory Function
```python
from uniface.headpose import create_head_pose_estimator
hp = create_head_pose_estimator() # Returns HeadPose
```
---
## Next Steps
- [Gaze Estimation](gaze.md) - Eye gaze direction
- [Anti-Spoofing](spoofing.md) - Face liveness detection
- [Video Recipe](../recipes/video-webcam.md) - Real-time processing

172
docs/modules/indexing.md Normal file
View File

@@ -0,0 +1,172 @@
# Indexing
FAISS-backed vector store for fast similarity search over embeddings.
!!! info "Optional dependency"
```bash
pip install faiss-cpu
```
---
## FAISS
```python
from uniface.indexing import FAISS
```
A thin wrapper around a FAISS `IndexFlatIP` (inner-product) index. Vectors
**must** be L2-normalised before adding so that inner product equals cosine
similarity. The store does not normalise internally.
Each vector is paired with a metadata `dict` that can carry any
JSON-serialisable payload (person ID, name, source path, etc.).
### Constructor
```python
store = FAISS(embedding_size=512, db_path="./vector_index")
```
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `embedding_size` | `int` | `512` | Dimension of embedding vectors |
| `db_path` | `str` | `"./vector_index"` | Directory for persisting index and metadata |
---
### Methods
#### `add(embedding, metadata)`
Add a single embedding with associated metadata.
```python
store.add(embedding, {"person_id": "alice", "source": "photo.jpg"})
```
| Parameter | Type | Description |
|-----------|------|-------------|
| `embedding` | `np.ndarray` | L2-normalised embedding vector |
| `metadata` | `dict[str, Any]` | Arbitrary JSON-serialisable key-value pairs |
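If an embedding is not already unit-length, normalise it before calling `add()` or `search()`; a minimal sketch:
```python
import numpy as np

def l2_normalize(v: np.ndarray) -> np.ndarray:
    # Unit-length vector so that inner product equals cosine similarity
    return v / (np.linalg.norm(v) + 1e-12)

store.add(l2_normalize(embedding), {"person_id": "alice", "source": "photo.jpg"})
```
Embeddings returned by `get_normalized_embedding()` are already unit-length and can be added directly.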
---
#### `search(embedding, threshold=0.4)`
Find the closest match for a query embedding.
```python
result, similarity = store.search(query_embedding, threshold=0.4)
if result:
print(result["person_id"], similarity)
```
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `embedding` | `np.ndarray` | — | L2-normalised query vector |
| `threshold` | `float` | `0.4` | Minimum cosine similarity to accept a match |
**Returns:** `(metadata, similarity)` if a match is found, or `(None, similarity)` when below threshold or the index is empty.
---
#### `remove(key, value)`
Remove all entries where `metadata[key] == value` and rebuild the index.
```python
removed = store.remove("person_id", "bob")
print(f"Removed {removed} entries")
```
| Parameter | Type | Description |
|-----------|------|-------------|
| `key` | `str` | Metadata key to match |
| `value` | `Any` | Value to match |
**Returns:** Number of entries removed.
---
#### `save()`
Persist the FAISS index and metadata to disk.
```python
store.save()
```
Writes two files to `db_path`:
- `faiss_index.bin` — binary FAISS index
- `metadata.json` — JSON array of metadata dicts
---
#### `load()`
Load a previously saved index and metadata.
```python
store = FAISS(db_path="./vector_index")
loaded = store.load() # True if files exist
```
**Returns:** `True` if loaded successfully, `False` if files are missing.
**Raises:** `RuntimeError` if files exist but cannot be read.
---
### Properties
| Property | Type | Description |
|----------|------|-------------|
| `size` | `int` | Number of vectors in the index |
| `len(store)` | `int` | Same as `size` |
---
## Example: End-to-End
```python
import cv2
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.indexing import FAISS
detector = RetinaFace()
recognizer = ArcFace()
# Build
store = FAISS(db_path="./my_index")
image = cv2.imread("alice.jpg")
faces = detector.detect(image)
embedding = recognizer.get_normalized_embedding(image, faces[0].landmarks)
store.add(embedding, {"person_id": "alice"})
store.save()
# Search
store2 = FAISS(db_path="./my_index")
store2.load()
query = cv2.imread("unknown.jpg")
faces = detector.detect(query)
emb = recognizer.get_normalized_embedding(query, faces[0].landmarks)
result, sim = store2.search(emb)
if result:
print(f"Matched: {result['person_id']} (similarity: {sim:.3f})")
else:
print(f"No match (similarity: {sim:.3f})")
```
---
## See Also
- [Face Search Recipe](../recipes/face-search.md) - Building and querying indexes
- [Recognition Module](recognition.md) - Embedding extraction
- [Thresholds Guide](../concepts/thresholds-calibration.md) - Tuning similarity thresholds

View File

@@ -20,7 +20,8 @@ Facial landmark detection provides precise localization of facial features.
### Basic Usage
```python
from uniface import RetinaFace, Landmark106
from uniface.detection import RetinaFace
from uniface.landmark import Landmark106
detector = RetinaFace()
landmarker = Landmark106()
@@ -78,7 +79,7 @@ mouth = landmarks[87:106]
All detection models provide 5-point landmarks:
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
detector = RetinaFace()
faces = detector.detect(image)
@@ -152,7 +153,7 @@ def draw_landmarks_with_connections(image, landmarks):
### Face Alignment
```python
from uniface import face_alignment
from uniface.face_utils import face_alignment
# Align face using 5-point landmarks
aligned = face_alignment(image, faces[0].landmarks)
@@ -236,7 +237,7 @@ def estimate_head_pose(landmarks, image_shape):
## Factory Function
```python
from uniface import create_landmarker
from uniface.landmark import create_landmarker
landmarker = create_landmarker() # Returns Landmark106
```

View File

@@ -1,15 +1,16 @@
# Parsing
Face parsing segments faces into semantic components (skin, eyes, nose, mouth, hair, etc.).
Face parsing segments faces into semantic components or face regions.
---
## Available Models
| Model | Backbone | Size | Classes |
|-------|----------|------|---------|
| **BiSeNet ResNet18** :material-check-circle: | ResNet18 | 51 MB | 19 |
| BiSeNet ResNet34 | ResNet34 | 89 MB | 19 |
| Model | Backbone | Size | Output |
|-------|----------|------|--------|
| **BiSeNet ResNet18** :material-check-circle: | ResNet18 | 51 MB | 19 classes |
| BiSeNet ResNet34 | ResNet34 | 89 MB | 19 classes |
| XSeg | - | 67 MB | Mask |
---
@@ -18,7 +19,7 @@ Face parsing segments faces into semantic components (skin, eyes, nose, mouth, h
```python
import cv2
from uniface.parsing import BiSeNet
from uniface.visualization import vis_parsing_maps
from uniface.draw import vis_parsing_maps
# Initialize parser
parser = BiSeNet()
@@ -45,16 +46,16 @@ cv2.imwrite("parsed.jpg", vis_bgr)
| ID | Class | ID | Class |
|----|-------|----|-------|
| 0 | Background | 10 | Ear Ring |
| 1 | Skin | 11 | Nose |
| 2 | Left Eyebrow | 12 | Mouth |
| 3 | Right Eyebrow | 13 | Upper Lip |
| 4 | Left Eye | 14 | Lower Lip |
| 5 | Right Eye | 15 | Neck |
| 6 | Eye Glasses | 16 | Neck Lace |
| 7 | Left Ear | 17 | Cloth |
| 8 | Right Ear | 18 | Hair |
| 9 | Hat | | |
| 0 | Background | 10 | Nose |
| 1 | Skin | 11 | Mouth |
| 2 | Left Eyebrow | 12 | Upper Lip |
| 3 | Right Eyebrow | 13 | Lower Lip |
| 4 | Left Eye | 14 | Neck |
| 5 | Right Eye | 15 | Necklace |
| 6 | Eyeglasses | 16 | Cloth |
| 7 | Left Ear | 17 | Hair |
| 8 | Right Ear | 18 | Hat |
| 9 | Earring | | |
---
@@ -84,9 +85,9 @@ parser = BiSeNet(model_name=ParsingWeights.RESNET34)
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.parsing import BiSeNet
from uniface.visualization import vis_parsing_maps
from uniface.draw import vis_parsing_maps
detector = RetinaFace()
parser = BiSeNet()
@@ -125,7 +126,7 @@ mask = parser.parse(face_image)
# Extract specific component
SKIN = 1
HAIR = 18
HAIR = 17
LEFT_EYE = 4
RIGHT_EYE = 5
@@ -148,10 +149,10 @@ mask = parser.parse(face_image)
component_names = {
0: 'Background', 1: 'Skin', 2: 'L-Eyebrow', 3: 'R-Eyebrow',
4: 'L-Eye', 5: 'R-Eye', 6: 'Glasses', 7: 'L-Ear', 8: 'R-Ear',
9: 'Hat', 10: 'Earring', 11: 'Nose', 12: 'Mouth',
13: 'U-Lip', 14: 'L-Lip', 15: 'Neck', 16: 'Necklace',
17: 'Cloth', 18: 'Hair'
4: 'L-Eye', 5: 'R-Eye', 6: 'Eyeglasses', 7: 'L-Ear', 8: 'R-Ear',
9: 'Earring', 10: 'Nose', 11: 'Mouth',
12: 'U-Lip', 13: 'L-Lip', 14: 'Neck', 15: 'Necklace',
16: 'Cloth', 17: 'Hair', 18: 'Hat'
}
for class_id in np.unique(mask):
@@ -176,23 +177,19 @@ def apply_lip_color(image, mask, color=(180, 50, 50)):
"""Apply lip color using parsing mask."""
result = image.copy()
# Get lip mask (upper + lower lip)
lip_mask = ((mask == 13) | (mask == 14)).astype(np.uint8)
# Get lip mask (upper lip=12, lower lip=13)
lip_mask = ((mask == 12) | (mask == 13)).astype(np.uint8)
# Create color overlay
overlay = np.zeros_like(image)
overlay[:] = color
# Blend with original
lip_region = cv2.bitwise_and(overlay, overlay, mask=lip_mask)
non_lip = cv2.bitwise_and(result, result, mask=1 - lip_mask)
# Combine with alpha blending
# Alpha blend lip region
alpha = 0.4
result = cv2.addWeighted(result, 1 - alpha * lip_mask[:,:,np.newaxis] / 255,
lip_region, alpha, 0)
mask_3ch = lip_mask[:, :, np.newaxis]
result = np.where(mask_3ch, (image * (1 - alpha) + overlay * alpha).astype(np.uint8), result)
return result.astype(np.uint8)
return result
```
### Background Replacement
@@ -218,7 +215,7 @@ def replace_background(image, mask, background):
```python
def get_hair_mask(mask):
"""Extract clean hair mask."""
hair_mask = (mask == 18).astype(np.uint8) * 255
hair_mask = (mask == 17).astype(np.uint8) * 255
# Clean up with morphological operations
kernel = np.ones((5, 5), np.uint8)
@@ -233,7 +230,7 @@ def get_hair_mask(mask):
## Visualization Options
```python
from uniface.visualization import vis_parsing_maps
from uniface.draw import vis_parsing_maps
# Default visualization
vis_result = vis_parsing_maps(face_rgb, mask)
@@ -248,12 +245,83 @@ vis_result = vis_parsing_maps(
---
## XSeg
XSeg outputs a mask for face regions. Unlike BiSeNet, which works on bbox crops, XSeg requires 5-point landmarks for face alignment.
### Basic Usage
```python
import cv2
from uniface.detection import RetinaFace
from uniface.parsing import XSeg
detector = RetinaFace()
parser = XSeg()
image = cv2.imread("photo.jpg")
faces = detector.detect(image)
for face in faces:
if face.landmarks is not None:
mask = parser.parse(image, landmarks=face.landmarks)
print(f"Mask shape: {mask.shape}") # (H, W), values in [0, 1]
```
### Parameters
```python
from uniface.parsing import XSeg
# Default settings
parser = XSeg()
# Custom settings
parser = XSeg(
align_size=256, # Face alignment size
blur_sigma=5, # Gaussian blur for smoothing (0 = raw)
)
```
| Parameter | Default | Description |
|-----------|---------|-------------|
| `align_size` | 256 | Face alignment output size |
| `blur_sigma` | 0 | Mask smoothing (0 = no blur) |
### Methods
```python
# Full pipeline: align -> segment -> warp back to original space
mask = parser.parse(image, landmarks=landmarks)
# For pre-aligned face crops
mask = parser.parse_aligned(face_crop)
# Get mask + crop + inverse matrix for custom warping
mask, face_crop, inverse_matrix = parser.parse_with_inverse(image, landmarks)
```
### BiSeNet vs XSeg
| Feature | BiSeNet | XSeg |
|---------|---------|------|
| Output | 19 class labels | Mask [0, 1] |
| Input | Bbox crop | Requires landmarks |
| Use case | Facial components | Face region extraction |
---
## Factory Function
```python
from uniface import create_face_parser
from uniface.parsing import create_face_parser
from uniface.constants import ParsingWeights, XSegWeights
parser = create_face_parser() # Returns BiSeNet
# BiSeNet (default)
parser = create_face_parser()
# XSeg
parser = create_face_parser(XSegWeights.DEFAULT)
```
---

View File

@@ -18,25 +18,8 @@ Face anonymization protects privacy by blurring or obscuring faces in images and
## Quick Start
### One-Line Anonymization
```python
from uniface.privacy import anonymize_faces
import cv2
image = cv2.imread("group_photo.jpg")
anonymized = anonymize_faces(image, method='pixelate')
cv2.imwrite("anonymized.jpg", anonymized)
```
---
## BlurFace Class
For more control, use the `BlurFace` class:
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
import cv2
@@ -59,12 +42,12 @@ cv2.imwrite("anonymized.jpg", anonymized)
Blocky pixelation effect (common in news media):
```python
blurrer = BlurFace(method='pixelate', pixel_blocks=10)
blurrer = BlurFace(method='pixelate', pixel_blocks=15)
```
| Parameter | Default | Description |
|-----------|---------|-------------|
| `pixel_blocks` | 10 | Number of blocks (lower = more pixelated) |
| `pixel_blocks` | 15 | Number of blocks (lower = more pixelated) |
### Gaussian
@@ -137,7 +120,7 @@ result = blurrer.anonymize(image, faces, inplace=True)
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
detector = RetinaFace()
@@ -166,7 +149,7 @@ cv2.destroyAllWindows()
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
detector = RetinaFace()
@@ -238,7 +221,7 @@ def anonymize_low_confidence(image, faces, blurrer, confidence_threshold=0.8):
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
detector = RetinaFace()
@@ -259,13 +242,13 @@ for method in methods:
```bash
# Anonymize image with pixelation
python tools/face_anonymize.py --source photo.jpg
python tools/anonymize.py --source photo.jpg
# Real-time webcam
python tools/face_anonymize.py --source 0 --method gaussian
python tools/anonymize.py --source 0 --method gaussian
# Custom blur strength
python tools/face_anonymize.py --source photo.jpg --method gaussian --blur-strength 5.0
python tools/anonymize.py --source photo.jpg --method gaussian --blur-strength 5.0
```
---

View File

@@ -22,7 +22,8 @@ Face recognition using adaptive margin based on image quality.
### Basic Usage
```python
from uniface import RetinaFace, AdaFace
from uniface.detection import RetinaFace
from uniface.recognition import AdaFace
detector = RetinaFace()
recognizer = AdaFace()
@@ -39,7 +40,7 @@ if faces:
### Model Variants
```python
from uniface import AdaFace
from uniface.recognition import AdaFace
from uniface.constants import AdaFaceWeights
# Lightweight (default)
@@ -69,7 +70,8 @@ Face recognition using additive angular margin loss.
### Basic Usage
```python
from uniface import RetinaFace, ArcFace
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
detector = RetinaFace()
recognizer = ArcFace()
@@ -86,7 +88,7 @@ if faces:
### Model Variants
```python
from uniface import ArcFace
from uniface.recognition import ArcFace
from uniface.constants import ArcFaceWeights
# Lightweight (default)
@@ -118,7 +120,7 @@ Lightweight face recognition models with MobileNet backbones.
### Basic Usage
```python
from uniface import MobileFace
from uniface.recognition import MobileFace
recognizer = MobileFace()
embedding = recognizer.get_normalized_embedding(image, landmarks)
@@ -127,7 +129,7 @@ embedding = recognizer.get_normalized_embedding(image, landmarks)
### Model Variants
```python
from uniface import MobileFace
from uniface.recognition import MobileFace
from uniface.constants import MobileFaceWeights
# Ultra-lightweight
@@ -156,7 +158,7 @@ Face recognition using angular softmax loss (A-Softmax).
### Basic Usage
```python
from uniface import SphereFace
from uniface.recognition import SphereFace
from uniface.constants import SphereFaceWeights
recognizer = SphereFace(model_name=SphereFaceWeights.SPHERE20)
@@ -175,7 +177,7 @@ embedding = recognizer.get_normalized_embedding(image, landmarks)
### Compute Similarity
```python
from uniface import compute_similarity
from uniface.face_utils import compute_similarity
import numpy as np
# Extract embeddings
@@ -211,7 +213,7 @@ Recognition models require aligned faces. UniFace handles this internally:
embedding = recognizer.get_normalized_embedding(image, landmarks)
# Or manually align
from uniface import face_alignment
from uniface.face_utils import face_alignment
aligned_face = face_alignment(image, landmarks)
# Returns: 112x112 aligned face image
@@ -223,7 +225,8 @@ aligned_face = face_alignment(image, landmarks)
```python
import numpy as np
from uniface import RetinaFace, ArcFace
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
detector = RetinaFace()
recognizer = ArcFace()
@@ -282,7 +285,7 @@ else:
## Factory Function
```python
from uniface import create_recognizer
from uniface.recognition import create_recognizer
# Available methods: 'arcface', 'adaface', 'mobileface', 'sphereface'
recognizer = create_recognizer('arcface')

View File

@@ -17,7 +17,7 @@ Face anti-spoofing detects whether a face is real (live) or fake (photo, video r
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.spoofing import MiniFASNet
detector = RetinaFace()
@@ -69,20 +69,21 @@ spoofer = MiniFASNet(model_name=MiniFASNetWeights.V1SE)
## Confidence Thresholds
The default threshold is 0.5. Adjust for your use case:
`result.is_real` is based on the model's top predicted class (argmax). If you want stricter behavior,
apply your own confidence threshold:
```python
result = spoofer.predict(image, face.bbox)
# High security (fewer false accepts)
HIGH_THRESHOLD = 0.7
if result.confidence > HIGH_THRESHOLD:
if result.is_real and result.confidence > HIGH_THRESHOLD:
print("Real (high confidence)")
else:
print("Suspicious")
# Balanced
if result.is_real: # Uses default 0.5 threshold
# Balanced (argmax decision)
if result.is_real:
print("Real")
else:
print("Fake")
@@ -127,7 +128,7 @@ cv2.imwrite("spoofing_result.jpg", image)
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.spoofing import MiniFASNet
detector = RetinaFace()
@@ -252,7 +253,7 @@ python tools/spoofing.py --source 0
## Factory Function
```python
from uniface import create_spoofer
from uniface.spoofing import create_spoofer
spoofer = create_spoofer() # Returns MiniFASNet
```

263
docs/modules/tracking.md Normal file
View File

@@ -0,0 +1,263 @@
# Tracking
Multi-object tracking using [BYTETracker](https://github.com/yakhyo/bytetrack-tracker) with Kalman filtering and IoU-based association. The tracker assigns persistent IDs to detected objects across video frames using a two-stage association strategy — first matching high-confidence detections, then low-confidence ones.
---
## How It Works
BYTETracker takes detection bounding boxes as input and returns tracked bounding boxes with persistent IDs. It does not depend on any specific detector — any source of `[x1, y1, x2, y2, score]` arrays will work.
Each frame, the tracker:
1. Splits detections into high-confidence and low-confidence groups
2. Matches high-confidence detections to existing tracks using IoU
3. Matches remaining tracks to low-confidence detections (second chance)
4. Starts new tracks for unmatched high-confidence detections
5. Removes tracks that have been lost for too long
The Kalman filter predicts where each track will be in the next frame, which helps maintain associations even when detections are noisy.
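The sketch below is a stripped-down illustration of the two-stage idea (greedy IoU matching, no Kalman prediction, no track creation or removal); it is not the `BYTETracker` implementation, only a way to see the high/low split in code:
```python
import numpy as np

def iou(box: np.ndarray, boxes: np.ndarray) -> np.ndarray:
    """IoU between one [x1, y1, x2, y2] box and an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(tracks: np.ndarray, dets: np.ndarray, track_thresh=0.5, match_thresh=0.5):
    """Greedy two-stage matching: high-score detections first, then low-score ones."""
    matches, unmatched = [], list(range(len(tracks)))
    for group in (dets[dets[:, 4] >= track_thresh], dets[dets[:, 4] < track_thresh]):
        for det in group:
            if not unmatched:
                break
            ious = iou(det[:4], tracks[unmatched, :4])
            best = int(np.argmax(ious))
            if ious[best] >= match_thresh:
                matches.append((unmatched.pop(best), det))
    return matches, unmatched  # matched (track index, detection) pairs and leftover track indices
```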
---
## Basic Usage
```python
import cv2
import numpy as np
from uniface.common import xyxy_to_cxcywh
from uniface.detection import SCRFD
from uniface.tracking import BYTETracker
from uniface.draw import draw_tracks
detector = SCRFD()
tracker = BYTETracker(track_thresh=0.5, track_buffer=30)
cap = cv2.VideoCapture("video.mp4")
while cap.isOpened():
ret, frame = cap.read()
if not ret:
break
# 1. Detect faces
faces = detector.detect(frame)
# 2. Build detections array: [x1, y1, x2, y2, score]
dets = np.array([[*f.bbox, f.confidence] for f in faces])
dets = dets if len(dets) > 0 else np.empty((0, 5))
# 3. Update tracker
tracks = tracker.update(dets)
# 4. Map track IDs back to face objects
if len(tracks) > 0 and len(faces) > 0:
face_bboxes = np.array([f.bbox for f in faces], dtype=np.float32)
track_ids = tracks[:, 4].astype(int)
face_centers = xyxy_to_cxcywh(face_bboxes)[:, :2]
track_centers = xyxy_to_cxcywh(tracks[:, :4])[:, :2]
for ti in range(len(tracks)):
dists = (track_centers[ti, 0] - face_centers[:, 0]) ** 2 + (track_centers[ti, 1] - face_centers[:, 1]) ** 2
faces[int(np.argmin(dists))].track_id = track_ids[ti]
# 5. Draw
tracked_faces = [f for f in faces if f.track_id is not None]
draw_tracks(image=frame, faces=tracked_faces)
cv2.imshow("Tracking", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
```
Each track ID gets a deterministic color via golden-ratio hue stepping, so the same person keeps the same color across the entire video.
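For illustration, such a deterministic ID-to-color mapping can be built by stepping the hue with the golden-ratio conjugate; the helper below is only a sketch of the idea, since `draw_tracks` already handles coloring internally:
```python
import colorsys

def track_color(track_id: int) -> tuple[int, int, int]:
    # Step the hue by the golden-ratio conjugate so consecutive IDs get well-separated colors
    hue = (track_id * 0.61803398875) % 1.0
    r, g, b = colorsys.hsv_to_rgb(hue, 0.85, 0.95)
    return int(b * 255), int(g * 255), int(r * 255)  # BGR tuple for OpenCV drawing
```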
---
## Webcam Tracking
```python
import cv2
import numpy as np
from uniface.common import xyxy_to_cxcywh
from uniface.detection import SCRFD
from uniface.tracking import BYTETracker
from uniface.draw import draw_tracks
detector = SCRFD()
tracker = BYTETracker(track_thresh=0.5, track_buffer=30)
cap = cv2.VideoCapture(0)
while True:
ret, frame = cap.read()
if not ret:
break
faces = detector.detect(frame)
dets = np.array([[*f.bbox, f.confidence] for f in faces])
dets = dets if len(dets) > 0 else np.empty((0, 5))
tracks = tracker.update(dets)
if len(tracks) > 0 and len(faces) > 0:
face_bboxes = np.array([f.bbox for f in faces], dtype=np.float32)
track_ids = tracks[:, 4].astype(int)
face_centers = xyxy_to_cxcywh(face_bboxes)[:, :2]
track_centers = xyxy_to_cxcywh(tracks[:, :4])[:, :2]
for ti in range(len(tracks)):
dists = (track_centers[ti, 0] - face_centers[:, 0]) ** 2 + (track_centers[ti, 1] - face_centers[:, 1]) ** 2
faces[int(np.argmin(dists))].track_id = track_ids[ti]
draw_tracks(image=frame, faces=[f for f in faces if f.track_id is not None])
cv2.imshow("Face Tracking - Press 'q' to quit", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
```
---
## Parameters
```python
from uniface.tracking import BYTETracker
tracker = BYTETracker(
track_thresh=0.5,
track_buffer=30,
match_thresh=0.8,
low_thresh=0.1,
)
```
| Parameter | Default | Description |
|-----------|---------|-------------|
| `track_thresh` | 0.5 | Detections above this score go through first-pass association |
| `track_buffer` | 30 | How many frames to keep a lost track before removing it |
| `match_thresh` | 0.8 | IoU threshold for matching tracks to detections |
| `low_thresh` | 0.1 | Detections below this score are discarded entirely |
---
## Input / Output
**Input:** `(N, 5)` numpy array with `[x1, y1, x2, y2, confidence]` per detection:
```python
detections = np.array([
[100, 50, 200, 160, 0.95],
[300, 80, 380, 200, 0.87],
])
```
**Output:** `(M, 5)` numpy array with `[x1, y1, x2, y2, track_id]` per active track:
```python
tracks = tracker.update(detections)
# array([[101.2, 51.3, 199.8, 159.8, 1.],
# [300.5, 80.2, 379.7, 200.1, 2.]])
```
The output bounding boxes come from the Kalman filter prediction, so they may differ slightly from the input. Track IDs are integers that persist across frames for the same object.
---
## Resetting the Tracker
When switching to a different video or scene, reset the tracker to clear all internal state:
```python
tracker.reset()
```
This clears all active, lost, and removed tracks, resets the frame counter, and resets the ID counter back to zero.
---
## Visualization
`draw_tracks` draws bounding boxes color-coded by track ID:
```python
from uniface.draw import draw_tracks
draw_tracks(
image=frame,
faces=tracked_faces,
draw_landmarks=True,
draw_id=True,
corner_bbox=True,
)
```
---
## Small Face Performance
!!! warning "Tracking performance with small faces"
The tracker relies on IoU (Intersection over Union) to match detections across
frames. When faces occupy a small portion of the image — for example in
surveillance footage or wide-angle cameras — even slight movement between frames
can cause a large drop in IoU. This makes it harder for the tracker to maintain
consistent IDs, and you may see IDs switching or resetting more often than expected.
This is not specific to BYTETracker; it applies to any IoU-based tracker. A few
things that can help:
- **Lower `match_thresh`** (e.g. `0.5` or `0.6`) so the tracker accepts lower
overlap as a valid match.
- **Increase `track_buffer`** (e.g. `60` or higher) to hold onto lost tracks
longer before discarding them.
- **Use a higher-resolution input** if possible, so face bounding boxes are
larger in pixel terms.
```python
tracker = BYTETracker(
track_thresh=0.4,
track_buffer=60,
match_thresh=0.6,
)
```
---
## CLI Tool
```bash
# Track faces in a video
python tools/track.py --source video.mp4
# Webcam
python tools/track.py --source 0
# Save output
python tools/track.py --source video.mp4 --output tracked.mp4
# Use RetinaFace instead of SCRFD
python tools/track.py --source video.mp4 --detector retinaface
# Keep lost tracks longer
python tools/track.py --source video.mp4 --track-buffer 60
```
---
## References
- [yakhyo/bytetrack-tracker](https://github.com/yakhyo/bytetrack-tracker) — standalone BYTETracker implementation used in UniFace
- [ByteTrack paper](https://arxiv.org/abs/2110.06864) — Zhang et al., "ByteTrack: Multi-Object Tracking by Associating Every Detection Box"
---
## See Also
- [Detection](detection.md) — face detection models
- [Video & Webcam](../recipes/video-webcam.md) — video processing patterns
- [Inputs & Outputs](../concepts/inputs-outputs.md) — data types and formats

View File

@@ -16,6 +16,9 @@ Run UniFace examples directly in your browser with Google Colab, or download and
| [Face Parsing](https://github.com/yakhyo/uniface/blob/main/examples/06_face_parsing.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/06_face_parsing.ipynb) | Semantic face segmentation |
| [Face Anonymization](https://github.com/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | Privacy-preserving blur |
| [Gaze Estimation](https://github.com/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | Gaze direction estimation |
| [Face Segmentation](https://github.com/yakhyo/uniface/blob/main/examples/09_face_segmentation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/09_face_segmentation.ipynb) | Face segmentation with XSeg |
| [Face Vector Store](https://github.com/yakhyo/uniface/blob/main/examples/10_face_vector_store.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/10_face_vector_store.ipynb) | FAISS-backed face database |
| [Head Pose Estimation](https://github.com/yakhyo/uniface/blob/main/examples/11_head_pose_estimation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/11_head_pose_estimation.ipynb) | 3D head orientation estimation |
---

7
docs/overrides/main.html Normal file
View File

@@ -0,0 +1,7 @@
{% extends "base.html" %}
{% block announce %}
<a href="https://github.com/yakhyo/uniface" target="_blank" rel="noopener">
Support our work &mdash; give UniFace a <span class="twemoji">{% include ".icons/octicons/star-fill-16.svg" %}</span> on <strong>GitHub</strong> and help us reach more developers!
</a>
{% endblock %}

View File

@@ -10,7 +10,7 @@ Detect faces in an image:
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
# Load image
image = cv2.imread("photo.jpg")
@@ -46,8 +46,8 @@ Draw bounding boxes and landmarks:
```python
import cv2
from uniface import RetinaFace
from uniface.visualization import draw_detections
from uniface.detection import RetinaFace
from uniface.draw import draw_detections
# Detect faces
detector = RetinaFace()
@@ -80,8 +80,8 @@ Compare two faces:
```python
import cv2
import numpy as np
from uniface import RetinaFace, ArcFace
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
# Initialize models
detector = RetinaFace()
@@ -96,12 +96,13 @@ faces1 = detector.detect(image1)
faces2 = detector.detect(image2)
if faces1 and faces2:
# Extract embeddings
# Extract embeddings (normalized 1-D vectors)
emb1 = recognizer.get_normalized_embedding(image1, faces1[0].landmarks)
emb2 = recognizer.get_normalized_embedding(image2, faces2[0].landmarks)
# Compute similarity (cosine similarity)
similarity = np.dot(emb1, emb2.T)[0][0]
# Compute cosine similarity
from uniface import compute_similarity
similarity = compute_similarity(emb1, emb2, normalized=True)
# Interpret result
if similarity > 0.6:
@@ -121,7 +122,8 @@ if faces1 and faces2:
```python
import cv2
from uniface import RetinaFace, AgeGender
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
# Initialize models
detector = RetinaFace()
@@ -133,7 +135,7 @@ faces = detector.detect(image)
# Predict attributes
for i, face in enumerate(faces):
result = age_gender.predict(image, face.bbox)
result = age_gender.predict(image, face)
print(f"Face {i+1}: {result.sex}, {result.age} years old")
```
@@ -152,7 +154,8 @@ Detect race, gender, and age group:
```python
import cv2
from uniface import RetinaFace, FairFace
from uniface.attribute import FairFace
from uniface.detection import RetinaFace
detector = RetinaFace()
fairface = FairFace()
@@ -161,7 +164,7 @@ image = cv2.imread("photo.jpg")
faces = detector.detect(image)
for i, face in enumerate(faces):
result = fairface.predict(image, face.bbox)
result = fairface.predict(image, face)
print(f"Face {i+1}: {result.sex}, {result.age_group}, {result.race}")
```
@@ -178,7 +181,8 @@ Face 2: Female, 20-29, White
```python
import cv2
from uniface import RetinaFace, Landmark106
from uniface.detection import RetinaFace
from uniface.landmark import Landmark106
detector = RetinaFace()
landmarker = Landmark106()
@@ -204,8 +208,9 @@ if faces:
```python
import cv2
import numpy as np
from uniface import RetinaFace, MobileGaze
from uniface.visualization import draw_gaze
from uniface.detection import RetinaFace
from uniface.gaze import MobileGaze
from uniface.draw import draw_gaze
detector = RetinaFace()
gaze_estimator = MobileGaze()
@@ -229,6 +234,36 @@ cv2.imwrite("gaze_output.jpg", image)
---
## Head Pose Estimation
```python
import cv2
from uniface.detection import RetinaFace
from uniface.headpose import HeadPose
from uniface.draw import draw_head_pose
detector = RetinaFace()
head_pose = HeadPose()
image = cv2.imread("photo.jpg")
faces = detector.detect(image)
for i, face in enumerate(faces):
x1, y1, x2, y2 = map(int, face.bbox[:4])
face_crop = image[y1:y2, x1:x2]
if face_crop.size > 0:
result = head_pose.estimate(face_crop)
print(f"Face {i+1}: pitch={result.pitch:.1f}°, yaw={result.yaw:.1f}°, roll={result.roll:.1f}°")
# Draw 3D cube visualization
draw_head_pose(image, face.bbox, result.pitch, result.yaw, result.roll)
cv2.imwrite("headpose_output.jpg", image)
```
---
## Face Parsing
Segment face into semantic components:
@@ -237,7 +272,7 @@ Segment face into semantic components:
import cv2
import numpy as np
from uniface.parsing import BiSeNet
from uniface.visualization import vis_parsing_maps
from uniface.draw import vis_parsing_maps
parser = BiSeNet()
@@ -261,26 +296,24 @@ print(f"Detected {len(np.unique(mask))} facial components")
Blur faces for privacy protection:
```python
from uniface.privacy import anonymize_faces
import cv2
# One-liner: automatic detection and blurring
image = cv2.imread("group_photo.jpg")
anonymized = anonymize_faces(image, method='pixelate')
cv2.imwrite("anonymized.jpg", anonymized)
```
**Manual control:**
```python
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
detector = RetinaFace()
blurrer = BlurFace(method='gaussian', blur_strength=5.0)
blurrer = BlurFace(method='pixelate')
image = cv2.imread("group_photo.jpg")
faces = detector.detect(image)
anonymized = blurrer.anonymize(image, faces)
cv2.imwrite("anonymized.jpg", anonymized)
```
**Custom blur settings:**
```python
blurrer = BlurFace(method='gaussian', blur_strength=5.0)
anonymized = blurrer.anonymize(image, faces)
```
**Available methods:**
@@ -301,7 +334,7 @@ Detect real vs. fake faces:
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.spoofing import MiniFASNet
detector = RetinaFace()
@@ -324,8 +357,8 @@ Real-time face detection:
```python
import cv2
from uniface import RetinaFace
from uniface.visualization import draw_detections
from uniface.detection import RetinaFace
from uniface.draw import draw_detections
detector = RetinaFace()
cap = cv2.VideoCapture(0)
@@ -355,6 +388,60 @@ cv2.destroyAllWindows()
---
## Face Tracking
Track faces across video frames with persistent IDs:
```python
import cv2
import numpy as np
from uniface.common import xyxy_to_cxcywh
from uniface.detection import SCRFD
from uniface.tracking import BYTETracker
from uniface.draw import draw_tracks
detector = SCRFD()
tracker = BYTETracker(track_thresh=0.5, track_buffer=30)
cap = cv2.VideoCapture("video.mp4")
while cap.isOpened():
ret, frame = cap.read()
if not ret:
break
faces = detector.detect(frame)
dets = np.array([[*f.bbox, f.confidence] for f in faces])
dets = dets if len(dets) > 0 else np.empty((0, 5))
tracks = tracker.update(dets)
# Assign track IDs to faces
if len(tracks) > 0 and len(faces) > 0:
face_bboxes = np.array([f.bbox for f in faces], dtype=np.float32)
track_ids = tracks[:, 4].astype(int)
face_centers = xyxy_to_cxcywh(face_bboxes)[:, :2]
track_centers = xyxy_to_cxcywh(tracks[:, :4])[:, :2]
for ti in range(len(tracks)):
dists = (track_centers[ti, 0] - face_centers[:, 0]) ** 2 + (track_centers[ti, 1] - face_centers[:, 1]) ** 2
faces[int(np.argmin(dists))].track_id = track_ids[ti]
tracked_faces = [f for f in faces if f.track_id is not None]
draw_tracks(image=frame, faces=tracked_faces)
cv2.imshow("Tracking", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
```
For more details, see the [Tracking module](modules/tracking.md).
---
## Model Selection
For detailed model comparisons and benchmarks, see the [Model Zoo](models.md).
@@ -365,7 +452,9 @@ For detailed model comparisons and benchmarks, see the [Model Zoo](models.md).
|------|------------------|
| Detection | `RetinaFace`, `SCRFD`, `YOLOv5Face`, `YOLOv8Face` |
| Recognition | `ArcFace`, `AdaFace`, `MobileFace`, `SphereFace` |
| Tracking | `BYTETracker` |
| Gaze | `MobileGaze` (ResNet18/34/50, MobileNetV2, MobileOneS0) |
| Head Pose | `HeadPose` (ResNet18/34/50, MobileNetV2/V3) |
| Parsing | `BiSeNet` (ResNet18/34) |
| Attributes | `AgeGender`, `FairFace`, `Emotion` |
| Anti-Spoofing | `MiniFASNet` (V1SE, V2) |
@@ -407,13 +496,19 @@ python -c "import platform; print(platform.machine())"
### Import Errors
```python
# Correct imports
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.detection import RetinaFace, SCRFD
from uniface.recognition import ArcFace, AdaFace
from uniface.attribute import AgeGender, FairFace
from uniface.landmark import Landmark106
# Also works (re-exported at package level)
from uniface import RetinaFace, ArcFace, Landmark106
from uniface.gaze import MobileGaze
from uniface.headpose import HeadPose
from uniface.parsing import BiSeNet, XSeg
from uniface.privacy import BlurFace
from uniface.spoofing import MiniFASNet
from uniface.tracking import BYTETracker
from uniface.analyzer import FaceAnalyzer
from uniface.indexing import FAISS # pip install faiss-cpu
from uniface.draw import draw_detections, draw_tracks
```
---

View File

@@ -11,7 +11,7 @@ Blur faces in real-time video streams for privacy protection.
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
detector = RetinaFace()
@@ -40,7 +40,7 @@ cv2.destroyAllWindows()
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
detector = RetinaFace()
@@ -67,14 +67,19 @@ out.release()
---
## One-Liner for Images
## Single Image
```python
from uniface.privacy import anonymize_faces
import cv2
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
detector = RetinaFace()
blurrer = BlurFace(method='pixelate')
image = cv2.imread("photo.jpg")
result = anonymize_faces(image, method='pixelate')
faces = detector.detect(image)
result = blurrer.anonymize(image, faces)
cv2.imwrite("anonymized.jpg", result)
```
@@ -84,7 +89,7 @@ cv2.imwrite("anonymized.jpg", result)
| Method | Usage |
|--------|-------|
| Pixelate | `BlurFace(method='pixelate', pixel_blocks=10)` |
| Pixelate | `BlurFace(method='pixelate', pixel_blocks=15)` |
| Gaussian | `BlurFace(method='gaussian', blur_strength=3.0)` |
| Blackout | `BlurFace(method='blackout', color=(0,0,0))` |
| Elliptical | `BlurFace(method='elliptical', margin=20)` |

View File

@@ -12,7 +12,7 @@ Process multiple images efficiently.
```python
import cv2
from pathlib import Path
from uniface import RetinaFace
from uniface.detection import RetinaFace
detector = RetinaFace()
@@ -54,7 +54,8 @@ for image_path in tqdm(image_files, desc="Processing"):
## Extract Embeddings
```python
from uniface import RetinaFace, ArcFace
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
import numpy as np
detector = RetinaFace()

View File

@@ -27,29 +27,29 @@ import numpy as np
class MyDetector(BaseDetector):
def __init__(self, model_path: str, confidence_threshold: float = 0.5):
super().__init__(confidence_threshold=confidence_threshold)
self.session = create_onnx_session(model_path)
self.threshold = confidence_threshold
def preprocess(self, image: np.ndarray) -> np.ndarray:
# Your preprocessing logic
# e.g., resize, normalize, transpose
raise NotImplementedError
def postprocess(self, outputs, shape) -> list[Face]:
# Your postprocessing logic
# e.g., decode boxes, apply NMS, create Face objects
raise NotImplementedError
def detect(self, image: np.ndarray) -> list[Face]:
# 1. Preprocess image
input_tensor = self._preprocess(image)
input_tensor = self.preprocess(image)
# 2. Run inference
outputs = self.session.run(None, {'input': input_tensor})
# 3. Postprocess outputs to Face objects
faces = self._postprocess(outputs, image.shape)
return faces
def _preprocess(self, image):
# Your preprocessing logic
# e.g., resize, normalize, transpose
pass
def _postprocess(self, outputs, shape):
# Your postprocessing logic
# e.g., decode boxes, apply NMS, create Face objects
pass
return self.postprocess(outputs, image.shape)
```
---
@@ -57,36 +57,14 @@ class MyDetector(BaseDetector):
## Add Custom Recognition Model
```python
from uniface.recognition.base import BaseRecognizer
from uniface.onnx_utils import create_onnx_session
from uniface import face_alignment
import numpy as np
from uniface.recognition.base import BaseRecognizer, PreprocessConfig
class MyRecognizer(BaseRecognizer):
def __init__(self, model_path: str):
self.session = create_onnx_session(model_path)
def __init__(self, model_path: str, providers=None):
preprocessing = PreprocessConfig(input_mean=127.5, input_std=127.5, input_size=(112, 112))
super().__init__(model_path, preprocessing, providers=providers)
def get_normalized_embedding(
self,
image: np.ndarray,
landmarks: np.ndarray
) -> np.ndarray:
# 1. Align face
aligned = face_alignment(image, landmarks)
# 2. Preprocess
input_tensor = self._preprocess(aligned)
# 3. Run inference
embedding = self.session.run(None, {'input': input_tensor})[0]
# 4. Normalize
embedding = embedding / np.linalg.norm(embedding)
return embedding
def _preprocess(self, image):
# Your preprocessing logic
pass
# Optional: override preprocess() if your model expects custom normalization.
```
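Usage should then mirror the built-in recognizers; a minimal sketch (the model path and image file are placeholders):
```python
import cv2
from uniface.detection import RetinaFace

detector = RetinaFace()
recognizer = MyRecognizer("my_model.onnx")  # placeholder path

image = cv2.imread("photo.jpg")
faces = detector.detect(image)
if faces:
    embedding = recognizer.get_normalized_embedding(image, faces[0].landmarks)
    print(embedding.shape)  # e.g. (512,) for a 512-d model
```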
---

View File

@@ -1,178 +1,166 @@
# Face Search
Build a face search system for finding people in images.
Find and identify people in images and video streams.
!!! note "Work in Progress"
This page contains example code patterns. Test thoroughly before using in production.
UniFace supports two search approaches:
| Approach | Use case | Tool |
| -------------------- | ------------------------------------------------ | ----------------------- |
| **Reference search** | "Is this specific person in the video?" | `tools/search.py` |
| **Vector search** | "Who is this?" against a database of known faces | `tools/faiss_search.py` |
---
## Basic Face Database
## Reference Search (single image)
Compare every detected face against a single reference photo:
```python
import cv2
import numpy as np
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.face_utils import compute_similarity
detector = RetinaFace()
recognizer = ArcFace()
ref_image = cv2.imread("reference.jpg")
ref_faces = detector.detect(ref_image)
ref_embedding = recognizer.get_normalized_embedding(ref_image, ref_faces[0].landmarks)
query_image = cv2.imread("group_photo.jpg")
faces = detector.detect(query_image)
for face in faces:
embedding = recognizer.get_normalized_embedding(query_image, face.landmarks)
sim = compute_similarity(ref_embedding, embedding)
label = f"Match ({sim:.2f})" if sim > 0.4 else f"Unknown ({sim:.2f})"
print(label)
```
**CLI tool:**
```bash
python tools/search.py --reference ref.jpg --source video.mp4
python tools/search.py --reference ref.jpg --source 0 # webcam
```
---
## Vector Search (FAISS index)
For identifying faces against a database of many known people, use the
[`FAISS`](../modules/indexing.md) vector store.
!!! info "Install extra"
    ```bash
    pip install faiss-cpu
    ```
### Build an index
Organise face images in person sub-folders:
```
dataset/
├── alice/
│ ├── 001.jpg
│ └── 002.jpg
├── bob/
│ └── 001.jpg
└── charlie/
├── 001.jpg
└── 002.jpg
```
```python
import cv2
from pathlib import Path
from uniface import RetinaFace, ArcFace
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.indexing import FAISS
class FaceDatabase:
def __init__(self):
self.detector = RetinaFace()
self.recognizer = ArcFace()
self.embeddings = {}
detector = RetinaFace()
recognizer = ArcFace()
store = FAISS(db_path="./my_index")
def add_face(self, person_id, image):
"""Add a face to the database."""
faces = self.detector.detect(image)
if not faces:
raise ValueError(f"No face found for {person_id}")
for person_dir in sorted(Path("dataset").iterdir()):
if not person_dir.is_dir():
continue
for img_path in person_dir.glob("*.jpg"):
image = cv2.imread(str(img_path))
faces = detector.detect(image)
if faces:
emb = recognizer.get_normalized_embedding(image, faces[0].landmarks)
store.add(emb, {"person_id": person_dir.name, "source": str(img_path)})
face = max(faces, key=lambda f: f.confidence)
embedding = self.recognizer.get_normalized_embedding(image, face.landmarks)
self.embeddings[person_id] = embedding
return True
def search(self, image, threshold=0.6):
"""Search for faces in an image."""
faces = self.detector.detect(image)
results = []
for face in faces:
embedding = self.recognizer.get_normalized_embedding(image, face.landmarks)
best_match = None
best_similarity = -1
for person_id, db_embedding in self.embeddings.items():
similarity = np.dot(embedding, db_embedding.T)[0][0]
if similarity > best_similarity:
best_similarity = similarity
best_match = person_id
results.append({
'bbox': face.bbox,
'match': best_match if best_similarity >= threshold else None,
'similarity': best_similarity
})
return results
def save(self, path):
"""Save database to file."""
np.savez(path, embeddings=dict(self.embeddings))
def load(self, path):
"""Load database from file."""
data = np.load(path, allow_pickle=True)
self.embeddings = data['embeddings'].item()
# Usage
db = FaceDatabase()
# Add faces
for image_path in Path("known_faces/").glob("*.jpg"):
person_id = image_path.stem
image = cv2.imread(str(image_path))
try:
db.add_face(person_id, image)
print(f"Added: {person_id}")
except ValueError as e:
print(f"Skipped: {e}")
# Save database
db.save("face_database.npz")
# Search
query_image = cv2.imread("group_photo.jpg")
results = db.search(query_image)
for r in results:
if r['match']:
print(f"Found: {r['match']} (similarity: {r['similarity']:.3f})")
store.save()
print(f"Index saved: {store}")
```
---
**CLI tool:**
## Visualization
```bash
python tools/faiss_search.py build --faces-dir dataset/ --db-path ./my_index
```
### Search against the index
```python
import cv2
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.indexing import FAISS
def visualize_search_results(image, results):
"""Draw search results on image."""
for r in results:
x1, y1, x2, y2 = map(int, r['bbox'])
detector = RetinaFace()
recognizer = ArcFace()
if r['match']:
color = (0, 255, 0) # Green for match
label = f"{r['match']} ({r['similarity']:.2f})"
else:
color = (0, 0, 255) # Red for unknown
label = f"Unknown ({r['similarity']:.2f})"
store = FAISS(db_path="./my_index")
store.load()
cv2.rectangle(image, (x1, y1), (x2, y2), color, 2)
cv2.putText(image, label, (x1, y1 - 10),
cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
image = cv2.imread("query.jpg")
faces = detector.detect(image)
return image
for face in faces:
embedding = recognizer.get_normalized_embedding(image, face.landmarks)
result, similarity = store.search(embedding, threshold=0.4)
# Usage
results = db.search(image)
annotated = visualize_search_results(image.copy(), results)
cv2.imwrite("search_result.jpg", annotated)
if result:
print(f"Matched: {result['person_id']} ({similarity:.2f})")
else:
print(f"Unknown ({similarity:.2f})")
```
---
**CLI tool:**
## Real-Time Search
```bash
python tools/faiss_search.py run --db-path ./my_index --source video.mp4
python tools/faiss_search.py run --db-path ./my_index --source 0 # webcam
```
### Manage the index
```python
import cv2
from uniface.indexing import FAISS
def realtime_search(db):
"""Real-time face search from webcam."""
cap = cv2.VideoCapture(0)
store = FAISS(db_path="./my_index")
store.load()
while True:
ret, frame = cap.read()
if not ret:
break
print(f"Total vectors: {len(store)}")
results = db.search(frame, threshold=0.5)
removed = store.remove("person_id", "bob")
print(f"Removed {removed} entries")
for r in results:
x1, y1, x2, y2 = map(int, r['bbox'])
if r['match']:
color = (0, 255, 0)
label = r['match']
else:
color = (0, 0, 255)
label = "Unknown"
cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)
cv2.putText(frame, label, (x1, y1 - 10),
cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
cv2.imshow("Face Search", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
# Usage
db = FaceDatabase()
db.load("face_database.npz")
realtime_search(db)
store.save()
```
---
## See Also
- [Indexing Module](../modules/indexing.md) - Full `FAISS` API reference
- [Recognition Module](../modules/recognition.md) - Face recognition details
- [Batch Processing](batch-processing.md) - Process multiple files
- [Video & Webcam](video-webcam.md) - Real-time processing
- [Concepts: Thresholds](../concepts/thresholds-calibration.md) - Tuning similarity thresholds

View File

@@ -8,8 +8,10 @@ A complete pipeline for processing images with detection, recognition, and attri
```python
import cv2
from uniface import RetinaFace, ArcFace, AgeGender
from uniface.visualization import draw_detections
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.draw import draw_detections
# Initialize models
detector = RetinaFace()
@@ -32,7 +34,7 @@ def process_image(image_path):
embedding = recognizer.get_normalized_embedding(image, face.landmarks)
# Step 3: Predict attributes
attrs = age_gender.predict(image, face.bbox)
attrs = age_gender.predict(image, face)
results.append({
'face_id': i,
@@ -67,14 +69,21 @@ cv2.imwrite("result.jpg", result_image)
For convenience, use the built-in `FaceAnalyzer`:
```python
from uniface import FaceAnalyzer
from uniface.analyzer import FaceAnalyzer
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
import cv2
# Initialize with desired modules
detector = RetinaFace()
recognizer = ArcFace()
age_gender = AgeGender()
analyzer = FaceAnalyzer(
detect=True,
recognize=True,
attributes=True
detector,
recognizer=recognizer,
attributes=[age_gender],
)
# Process image
@@ -97,13 +106,15 @@ Complete pipeline with all modules:
```python
import cv2
import numpy as np
from uniface import (
RetinaFace, ArcFace, AgeGender, FairFace,
Landmark106, MobileGaze
)
from uniface.attribute import AgeGender, FairFace
from uniface.detection import RetinaFace
from uniface.gaze import MobileGaze
from uniface.headpose import HeadPose
from uniface.landmark import Landmark106
from uniface.recognition import ArcFace
from uniface.parsing import BiSeNet
from uniface.spoofing import MiniFASNet
from uniface.visualization import draw_detections, draw_gaze
from uniface.draw import draw_detections, draw_gaze, draw_head_pose
class FaceAnalysisPipeline:
def __init__(self):
@@ -114,6 +125,7 @@ class FaceAnalysisPipeline:
self.fairface = FairFace()
self.landmarker = Landmark106()
self.gaze = MobileGaze()
self.head_pose = HeadPose()
self.parser = BiSeNet()
self.spoofer = MiniFASNet()
@@ -135,12 +147,12 @@ class FaceAnalysisPipeline:
)
# Attributes
ag_result = self.age_gender.predict(image, face.bbox)
ag_result = self.age_gender.predict(image, face)
result['age'] = ag_result.age
result['gender'] = ag_result.sex
# FairFace attributes
ff_result = self.fairface.predict(image, face.bbox)
ff_result = self.fairface.predict(image, face)
result['age_group'] = ff_result.age_group
result['race'] = ff_result.race
@@ -157,6 +169,13 @@ class FaceAnalysisPipeline:
result['gaze_pitch'] = gaze_result.pitch
result['gaze_yaw'] = gaze_result.yaw
# Head pose estimation
if face_crop.size > 0:
hp_result = self.head_pose.estimate(face_crop)
result['head_pitch'] = hp_result.pitch
result['head_yaw'] = hp_result.yaw
result['head_roll'] = hp_result.roll
# Face parsing
if face_crop.size > 0:
result['parsing_mask'] = self.parser.parse(face_crop)
@@ -179,6 +198,7 @@ for i, r in enumerate(results):
print(f" Gender: {r['gender']}, Age: {r['age']}")
print(f" Race: {r['race']}, Age Group: {r['age_group']}")
print(f" Gaze: pitch={np.degrees(r['gaze_pitch']):.1f}°")
print(f" Head Pose: P={r['head_pitch']:.1f}° Y={r['head_yaw']:.1f}° R={r['head_roll']:.1f}°")
print(f" Real: {r['is_real']} ({r['spoof_confidence']:.1%})")
```
@@ -189,8 +209,10 @@ for i, r in enumerate(results):
```python
import cv2
import numpy as np
from uniface import RetinaFace, AgeGender, MobileGaze
from uniface.visualization import draw_detections, draw_gaze
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
from uniface.gaze import MobileGaze
from uniface.draw import draw_detections, draw_gaze
def visualize_analysis(image_path, output_path):
"""Create annotated visualization of face analysis."""
@@ -208,7 +230,7 @@ def visualize_analysis(image_path, output_path):
cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
# Age and gender
attrs = age_gender.predict(image, face.bbox)
attrs = age_gender.predict(image, face)
label = f"{attrs.sex}, {attrs.age}y"
cv2.putText(image, label, (x1, y1 - 10),
cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
@@ -256,6 +278,11 @@ def results_to_json(results):
'gaze': {
'pitch_deg': float(np.degrees(r['gaze_pitch'])) if 'gaze_pitch' in r else None,
'yaw_deg': float(np.degrees(r['gaze_yaw'])) if 'gaze_yaw' in r else None
},
'head_pose': {
'pitch': float(r['head_pitch']) if 'head_pitch' in r else None,
'yaw': float(r['head_yaw']) if 'head_yaw' in r else None,
'roll': float(r['head_roll']) if 'head_roll' in r else None
}
}
output.append(item)
@@ -279,3 +306,4 @@ with open('results.json', 'w') as f:
- [Face Search](face-search.md) - Build a search system
- [Detection Module](../modules/detection.md) - Detection options
- [Recognition Module](../modules/recognition.md) - Recognition details
- [Head Pose Module](../modules/headpose.md) - Head orientation estimation

View File

@@ -11,8 +11,8 @@ Real-time face analysis for video streams.
```python
import cv2
from uniface import RetinaFace
from uniface.visualization import draw_detections
from uniface.detection import RetinaFace
from uniface.draw import draw_detections
detector = RetinaFace()
cap = cv2.VideoCapture(0)
@@ -48,7 +48,7 @@ cv2.destroyAllWindows()
```python
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
def process_video(input_path, output_path):
"""Process a video file."""
@@ -83,6 +83,57 @@ process_video("input.mp4", "output.mp4")
---
## Webcam Tracking
To track faces across frames with persistent IDs, pair a detector with `BYTETracker`:
```python
import cv2
import numpy as np
from uniface.common import xyxy_to_cxcywh
from uniface.detection import SCRFD
from uniface.tracking import BYTETracker
from uniface.draw import draw_tracks
detector = SCRFD()
tracker = BYTETracker(track_thresh=0.5, track_buffer=30)
cap = cv2.VideoCapture(0)
while True:
ret, frame = cap.read()
if not ret:
break
faces = detector.detect(frame)
dets = np.array([[*f.bbox, f.confidence] for f in faces])
dets = dets if len(dets) > 0 else np.empty((0, 5))
tracks = tracker.update(dets)
if len(tracks) > 0 and len(faces) > 0:
face_bboxes = np.array([f.bbox for f in faces], dtype=np.float32)
track_ids = tracks[:, 4].astype(int)
face_centers = xyxy_to_cxcywh(face_bboxes)[:, :2]
track_centers = xyxy_to_cxcywh(tracks[:, :4])[:, :2]
for ti in range(len(tracks)):
dists = (track_centers[ti, 0] - face_centers[:, 0]) ** 2 + (track_centers[ti, 1] - face_centers[:, 1]) ** 2
faces[int(np.argmin(dists))].track_id = track_ids[ti]
draw_tracks(image=frame, faces=[f for f in faces if f.track_id is not None])
cv2.imshow("Face Tracking", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
```
For more details on tracker parameters and tuning, see [Tracking](../modules/tracking.md).
---
## Performance Tips
### Skip Frames
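The idea is to run the detector only every Nth frame and reuse the most recent detections in between. A minimal illustrative sketch (the interval of 3 is arbitrary):
```python
import cv2
from uniface.detection import RetinaFace
from uniface.draw import draw_detections

detector = RetinaFace()
cap = cv2.VideoCapture(0)

skip, faces = 3, []   # re-detect on every 3rd frame
frame_idx = 0

while True:
    ret, frame = cap.read()
    if not ret:
        break
    if frame_idx % skip == 0:
        faces = detector.detect(frame)   # refresh detections
    frame_idx += 1

    draw_detections(
        image=frame,
        bboxes=[f.bbox for f in faces],
        scores=[f.confidence for f in faces],
        landmarks=[f.landmarks for f in faces],
        vis_threshold=0.6,
    )
    cv2.imshow("Skip Frames", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```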
@@ -119,7 +170,9 @@ while True:
## See Also
- [Tracking Module](../modules/tracking.md) - Face tracking with BYTETracker
- [Anonymize Stream](anonymize-stream.md) - Privacy protection in video
- [Batch Processing](batch-processing.md) - Process multiple files
- [Detection Module](../modules/detection.md) - Detection options
- [Gaze Module](../modules/gaze.md) - Gaze tracking
- [Gaze Module](../modules/gaze.md) - Gaze estimation
- [Head Pose Module](../modules/headpose.md) - Head orientation estimation

View File

@@ -51,7 +51,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"2.0.0\n"
"3.2.0\n"
]
}
],
@@ -62,7 +62,7 @@
"\n",
"import uniface\n",
"from uniface.detection import RetinaFace\n",
"from uniface.visualization import draw_detections\n",
"from uniface.draw import draw_detections\n",
"\n",
"print(uniface.__version__)"
]
@@ -162,7 +162,7 @@
"landmarks = [f.landmarks for f in faces]\n",
"\n",
"# Draw detections\n",
"draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.6, fancy_bbox=True)\n",
"draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.6, corner_bbox=True)\n",
"\n",
"# Display result\n",
"output_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n",
@@ -214,7 +214,7 @@
"scores = [f.confidence for f in faces]\n",
"landmarks = [f.landmarks for f in faces]\n",
"\n",
"draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.6, fancy_bbox=True)\n",
"draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.6, corner_bbox=True)\n",
"\n",
"output_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n",
"display.display(Image.fromarray(output_image))"
@@ -261,7 +261,7 @@
"scores = [f.confidence for f in faces]\n",
"landmarks = [f.landmarks for f in faces]\n",
"\n",
"draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.6, fancy_bbox=True)\n",
"draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.6, corner_bbox=True)\n",
"\n",
"output_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n",
"display.display(Image.fromarray(output_image))"

View File

@@ -55,7 +55,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"2.0.0\n"
"3.2.0\n"
]
}
],
@@ -67,7 +67,7 @@
"import uniface\n",
"from uniface.detection import RetinaFace\n",
"from uniface.face_utils import face_alignment\n",
"from uniface.visualization import draw_detections\n",
"from uniface.draw import draw_detections\n",
"\n",
"print(uniface.__version__)"
]
@@ -142,7 +142,7 @@
" bboxes = [f.bbox for f in faces]\n",
" scores = [f.confidence for f in faces]\n",
" landmarks = [f.landmarks for f in faces]\n",
" draw_detections(image=bbox_image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.6, fancy_bbox=True)\n",
" draw_detections(image=bbox_image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=0.6, corner_bbox=True)\n",
"\n",
" # Align first detected face (returns aligned image and inverse transform matrix)\n",
" first_landmarks = faces[0].landmarks\n",

View File

@@ -44,7 +44,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"2.0.0\n"
"3.2.0\n"
]
}
],
@@ -53,7 +53,7 @@
"import matplotlib.pyplot as plt\n",
"\n",
"import uniface\n",
"from uniface import FaceAnalyzer\n",
"from uniface.analyzer import FaceAnalyzer\n",
"from uniface.detection import RetinaFace\n",
"from uniface.recognition import ArcFace\n",
"\n",

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -15,7 +15,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 1,
"metadata": {},
"outputs": [
{
@@ -53,7 +53,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"UniFace version: 2.0.0\n"
"UniFace version: 3.2.0\n"
]
}
],
@@ -66,7 +66,7 @@
"import uniface\n",
"from uniface.parsing import BiSeNet\n",
"from uniface.constants import ParsingWeights\n",
"from uniface.visualization import vis_parsing_maps\n",
"from uniface.draw import vis_parsing_maps\n",
"\n",
"print(f\"UniFace version: {uniface.__version__}\")"
]
@@ -82,15 +82,7 @@
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"✓ Model loaded (CoreML (Apple Silicon))\n"
]
}
],
"outputs": [],
"source": [
"# Initialize face parser (uses ResNet18 by default)\n",
"parser = BiSeNet(model_name=ParsingWeights.RESNET34) # use resnet34 for better accuracy"

File diff suppressed because one or more lines are too long

View File

@@ -51,7 +51,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"UniFace version: 2.0.0\n"
"UniFace version: 3.2.0\n"
]
}
],
@@ -65,7 +65,7 @@
"import uniface\n",
"from uniface.detection import RetinaFace\n",
"from uniface.gaze import MobileGaze\n",
"from uniface.visualization import draw_gaze\n",
"from uniface.draw import draw_gaze\n",
"\n",
"print(f\"UniFace version: {uniface.__version__}\")"
]
@@ -110,19 +110,19 @@
"text": [
"Processing: image0.jpg\n",
" Detected 1 face(s)\n",
" Face 1: pitch=-0.0°, yaw=7.1°\n",
" Face 1: pitch=7.1°, yaw=-0.0°\n",
"Processing: image1.jpg\n",
" Detected 1 face(s)\n",
" Face 1: pitch=-3.3°, yaw=-5.6°\n",
" Face 1: pitch=-5.6°, yaw=-3.3°\n",
"Processing: image2.jpg\n",
" Detected 1 face(s)\n",
" Face 1: pitch=-3.9°, yaw=-0.3°\n",
" Face 1: pitch=-0.3°, yaw=-3.9°\n",
"Processing: image3.jpg\n",
" Detected 1 face(s)\n",
" Face 1: pitch=-22.1°, yaw=1.0°\n",
" Face 1: pitch=1.0°, yaw=-22.1°\n",
"Processing: image4.jpg\n",
" Detected 1 face(s)\n",
" Face 1: pitch=2.1°, yaw=5.0°\n",
" Face 1: pitch=5.0°, yaw=2.1°\n",
"\n",
"Processed 5 images\n"
]

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -48,6 +48,7 @@ theme:
- content.action.edit
- content.action.view
- content.tabs.link
- announce.dismiss
- toc.follow
icon:
@@ -134,6 +135,7 @@ nav:
- Quickstart: quickstart.md
- Notebooks: notebooks.md
- Model Zoo: models.md
- Datasets: datasets.md
- Tutorials:
- Image Pipeline: recipes/image-pipeline.md
- Video & Webcam: recipes/video-webcam.md
@@ -144,12 +146,15 @@ nav:
- API Reference:
- Detection: modules/detection.md
- Recognition: modules/recognition.md
- Tracking: modules/tracking.md
- Landmarks: modules/landmarks.md
- Attributes: modules/attributes.md
- Parsing: modules/parsing.md
- Gaze: modules/gaze.md
- Head Pose: modules/headpose.md
- Anti-Spoofing: modules/spoofing.md
- Privacy: modules/privacy.md
- Indexing: modules/indexing.md
- Guides:
- Overview: concepts/overview.md
- Inputs & Outputs: concepts/inputs-outputs.md

View File

@@ -1,7 +1,7 @@
[project]
name = "uniface"
version = "2.2.1"
description = "UniFace: A Comprehensive Library for Face Detection, Recognition, Landmark Analysis, Face Parsing, Gaze Estimation, Age, and Gender Detection"
version = "3.2.0"
description = "UniFace: A Comprehensive Library for Face Detection, Recognition, Tracking, Landmark Analysis, Face Parsing, Gaze Estimation, Age, and Gender Detection"
readme = "README.md"
license = "MIT"
authors = [{ name = "Yakhyokhuja Valikhujaev", email = "yakhyo9696@gmail.com" }]
@@ -9,10 +9,11 @@ maintainers = [
{ name = "Yakhyokhuja Valikhujaev", email = "yakhyo9696@gmail.com" },
]
requires-python = ">=3.10,<3.14"
requires-python = ">=3.11,<3.15"
keywords = [
"face-detection",
"face-recognition",
"face-tracking",
"facial-landmarks",
"face-parsing",
"face-segmentation",
@@ -28,23 +29,23 @@ keywords = [
]
classifiers = [
"Development Status :: 4 - Beta",
"Development Status :: 5 - Production/Stable",
"Intended Audience :: Developers",
"Intended Audience :: Science/Research",
"Operating System :: OS Independent",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Programming Language :: Python :: 3.14",
]
dependencies = [
"numpy>=1.21.0",
"opencv-python>=4.5.0",
"onnx>=1.12.0",
"onnxruntime>=1.16.0",
"scikit-image>=0.19.0",
"scikit-image>=0.26.0",
"scipy>=1.7.0",
"requests>=2.28.0",
"tqdm>=4.64.0",
]
@@ -56,9 +57,9 @@ gpu = ["onnxruntime-gpu>=1.16.0"]
[project.urls]
Homepage = "https://github.com/yakhyo/uniface"
Repository = "https://github.com/yakhyo/uniface"
Documentation = "https://github.com/yakhyo/uniface/blob/main/README.md"
"Quick Start" = "https://github.com/yakhyo/uniface/blob/main/QUICKSTART.md"
"Model Zoo" = "https://github.com/yakhyo/uniface/blob/main/MODELS.md"
Documentation = "https://yakhyo.github.io/uniface"
"Quick Start" = "https://yakhyo.github.io/uniface/quickstart/"
"Model Zoo" = "https://yakhyo.github.io/uniface/models/"
[build-system]
requires = ["setuptools>=64", "wheel"]
@@ -72,7 +73,7 @@ uniface = ["py.typed"]
[tool.ruff]
line-length = 120
target-version = "py310"
target-version = "py311"
exclude = [
".git",
".ruff_cache",

View File

@@ -1,8 +1,7 @@
numpy>=1.21.0
opencv-python>=4.5.0
onnx>=1.12.0
onnxruntime>=1.16.0
scikit-image>=0.19.0
scikit-image>=0.26.0
scipy>=1.7.0
requests>=2.28.0
pytest>=7.0.0
tqdm>=4.64.0

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for AgeGender attribute predictor."""
from __future__ import annotations
@@ -10,6 +9,14 @@ import numpy as np
import pytest
from uniface.attribute import AgeGender, AttributeResult
from uniface.types import Face
def _make_face(bbox: list[int] | np.ndarray) -> Face:
"""Helper: build a minimal Face from a bounding box."""
bbox = np.asarray(bbox)
landmarks = np.zeros((5, 2), dtype=np.float32)
return Face(bbox=bbox, confidence=0.99, landmarks=landmarks)
@pytest.fixture
@@ -23,30 +30,30 @@ def mock_image():
@pytest.fixture
def mock_bbox():
return [100, 100, 300, 300]
def mock_face():
return _make_face([100, 100, 300, 300])
def test_model_initialization(age_gender_model):
assert age_gender_model is not None, 'AgeGender model initialization failed.'
def test_prediction_output_format(age_gender_model, mock_image, mock_bbox):
result = age_gender_model.predict(mock_image, mock_bbox)
def test_prediction_output_format(age_gender_model, mock_image, mock_face):
result = age_gender_model.predict(mock_image, mock_face)
assert isinstance(result, AttributeResult), f'Result should be AttributeResult, got {type(result)}'
assert isinstance(result.gender, int), f'Gender should be int, got {type(result.gender)}'
assert isinstance(result.age, int), f'Age should be int, got {type(result.age)}'
assert isinstance(result.sex, str), f'Sex should be str, got {type(result.sex)}'
def test_gender_values(age_gender_model, mock_image, mock_bbox):
result = age_gender_model.predict(mock_image, mock_bbox)
def test_gender_values(age_gender_model, mock_image, mock_face):
result = age_gender_model.predict(mock_image, mock_face)
assert result.gender in [0, 1], f'Gender should be 0 (Female) or 1 (Male), got {result.gender}'
assert result.sex in ['Female', 'Male'], f'Sex should be Female or Male, got {result.sex}'
def test_age_range(age_gender_model, mock_image, mock_bbox):
result = age_gender_model.predict(mock_image, mock_bbox)
def test_age_range(age_gender_model, mock_image, mock_face):
result = age_gender_model.predict(mock_image, mock_face)
assert 0 <= result.age <= 120, f'Age should be between 0 and 120, got {result.age}'
@@ -58,39 +65,52 @@ def test_different_bbox_sizes(age_gender_model, mock_image):
]
for bbox in test_bboxes:
result = age_gender_model.predict(mock_image, bbox)
face = _make_face(bbox)
result = age_gender_model.predict(mock_image, face)
assert result.gender in [0, 1], f'Failed for bbox {bbox}'
assert 0 <= result.age <= 120, f'Age out of range for bbox {bbox}'
def test_different_image_sizes(age_gender_model, mock_bbox):
def test_different_image_sizes(age_gender_model):
test_sizes = [(480, 640, 3), (720, 1280, 3), (1080, 1920, 3)]
face = _make_face([100, 100, 300, 300])
for size in test_sizes:
mock_image = np.random.randint(0, 255, size, dtype=np.uint8)
result = age_gender_model.predict(mock_image, mock_bbox)
result = age_gender_model.predict(mock_image, face)
assert result.gender in [0, 1], f'Failed for image size {size}'
assert 0 <= result.age <= 120, f'Age out of range for image size {size}'
def test_consistency(age_gender_model, mock_image, mock_bbox):
result1 = age_gender_model.predict(mock_image, mock_bbox)
result2 = age_gender_model.predict(mock_image, mock_bbox)
def test_consistency(age_gender_model, mock_image, mock_face):
result1 = age_gender_model.predict(mock_image, mock_face)
result2 = age_gender_model.predict(mock_image, mock_face)
assert result1.gender == result2.gender, 'Same input should produce same gender prediction'
assert result1.age == result2.age, 'Same input should produce same age prediction'
def test_face_enrichment(age_gender_model, mock_image, mock_face):
"""predict() must write gender & age back to the Face object."""
assert mock_face.gender is None
assert mock_face.age is None
result = age_gender_model.predict(mock_image, mock_face)
assert mock_face.gender == result.gender
assert mock_face.age == result.age
def test_bbox_list_format(age_gender_model, mock_image):
bbox_list = [100, 100, 300, 300]
result = age_gender_model.predict(mock_image, bbox_list)
face = _make_face([100, 100, 300, 300])
result = age_gender_model.predict(mock_image, face)
assert result.gender in [0, 1], 'Should work with bbox as list'
assert 0 <= result.age <= 120, 'Age should be in valid range'
def test_bbox_array_format(age_gender_model, mock_image):
bbox_array = np.array([100, 100, 300, 300])
result = age_gender_model.predict(mock_image, bbox_array)
face = _make_face(np.array([100, 100, 300, 300]))
result = age_gender_model.predict(mock_image, face)
assert result.gender in [0, 1], 'Should work with bbox as numpy array'
assert 0 <= result.age <= 120, 'Age should be in valid range'
@@ -104,7 +124,8 @@ def test_multiple_predictions(age_gender_model, mock_image):
results = []
for bbox in bboxes:
result = age_gender_model.predict(mock_image, bbox)
face = _make_face(bbox)
result = age_gender_model.predict(mock_image, face)
results.append(result)
assert len(results) == 3, 'Should have 3 predictions'
@@ -113,28 +134,26 @@ def test_multiple_predictions(age_gender_model, mock_image):
assert 0 <= result.age <= 120
def test_age_is_positive(age_gender_model, mock_image, mock_bbox):
def test_age_is_positive(age_gender_model, mock_image, mock_face):
for _ in range(5):
result = age_gender_model.predict(mock_image, mock_bbox)
result = age_gender_model.predict(mock_image, mock_face)
assert result.age >= 0, f'Age should be non-negative, got {result.age}'
def test_output_format_for_visualization(age_gender_model, mock_image, mock_bbox):
result = age_gender_model.predict(mock_image, mock_bbox)
def test_output_format_for_visualization(age_gender_model, mock_image, mock_face):
result = age_gender_model.predict(mock_image, mock_face)
text = f'{result.sex}, {result.age}y'
assert isinstance(text, str), 'Should be able to format as string'
assert 'Male' in text or 'Female' in text, 'Text should contain gender'
assert 'y' in text, "Text should contain 'y' for years"
def test_attribute_result_fields(age_gender_model, mock_image, mock_bbox):
def test_attribute_result_fields(age_gender_model, mock_image, mock_face):
"""Test that AttributeResult has correct fields for AgeGender model."""
result = age_gender_model.predict(mock_image, mock_bbox)
result = age_gender_model.predict(mock_image, mock_face)
# AgeGender should set gender and age
assert result.gender is not None
assert result.age is not None
# AgeGender should NOT set race and age_group (FairFace only)
assert result.race is None
assert result.age_group is None

tests/test_draw.py (new file, 61 lines)
View File

@@ -0,0 +1,61 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
import numpy as np
from uniface.draw import draw_gaze
def _compute_gaze_delta(bbox: np.ndarray, pitch: float, yaw: float) -> tuple[int, int]:
"""Replicate draw_gaze dx/dy math for verification."""
x_min, _, x_max, _ = map(int, bbox[:4])
length = x_max - x_min
dx = int(-length * np.sin(yaw) * np.cos(pitch))
dy = int(-length * np.sin(pitch))
return dx, dy
def test_draw_gaze_yaw_only_moves_horizontally():
"""Yaw-only input (pitch=0) should produce horizontal displacement only."""
image = np.zeros((200, 200, 3), dtype=np.uint8)
bbox = np.array([50, 50, 150, 150], dtype=np.float32)
yaw = 0.5
pitch = 0.0
dx, dy = _compute_gaze_delta(bbox, pitch, yaw)
assert dx != 0, 'Yaw-only should produce horizontal displacement'
assert dy == 0, 'Yaw-only should produce zero vertical displacement'
# Should not raise
draw_gaze(image, bbox, pitch, yaw, draw_bbox=False, draw_angles=False)
def test_draw_gaze_pitch_only_moves_vertically():
"""Pitch-only input (yaw=0) should produce vertical displacement only."""
image = np.zeros((200, 200, 3), dtype=np.uint8)
bbox = np.array([50, 50, 150, 150], dtype=np.float32)
yaw = 0.0
pitch = 0.5
dx, dy = _compute_gaze_delta(bbox, pitch, yaw)
assert dx == 0, 'Pitch-only should produce zero horizontal displacement'
assert dy != 0, 'Pitch-only should produce vertical displacement'
# Should not raise
draw_gaze(image, bbox, pitch, yaw, draw_bbox=False, draw_angles=False)
def test_draw_gaze_modifies_image():
"""draw_gaze should modify the image in place."""
image = np.zeros((200, 200, 3), dtype=np.uint8)
bbox = np.array([50, 50, 150, 150], dtype=np.float32)
original = image.copy()
draw_gaze(image, bbox, 0.3, 0.3)
assert not np.array_equal(image, original), 'draw_gaze should modify the image'

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for factory functions (create_detector, create_recognizer, etc.)."""
from __future__ import annotations
@@ -10,13 +9,15 @@ import numpy as np
import pytest
from uniface import (
create_attribute_predictor,
create_detector,
create_landmarker,
create_recognizer,
detect_faces,
list_available_detectors,
)
from uniface.constants import RetinaFaceWeights, SCRFDWeights
from uniface.attribute import AgeGender, FairFace
from uniface.constants import AgeGenderWeights, FairFaceWeights, RetinaFaceWeights, SCRFDWeights
from uniface.spoofing import MiniFASNet, create_spoofer
# create_detector tests
@@ -123,62 +124,6 @@ def test_create_landmarker_invalid_method():
create_landmarker('invalid_method')
# detect_faces tests
def test_detect_faces_retinaface():
"""
Test high-level detect_faces function with RetinaFace.
"""
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = detect_faces(mock_image, method='retinaface')
assert isinstance(faces, list), 'detect_faces should return a list'
def test_detect_faces_scrfd():
"""
Test high-level detect_faces function with SCRFD.
"""
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = detect_faces(mock_image, method='scrfd')
assert isinstance(faces, list), 'detect_faces should return a list'
def test_detect_faces_with_threshold():
"""
Test detect_faces with custom confidence threshold.
"""
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = detect_faces(mock_image, method='retinaface', confidence_threshold=0.8)
assert isinstance(faces, list), 'detect_faces should return a list'
# All detections should respect threshold
for face in faces:
assert face.confidence >= 0.8, 'All detections should meet confidence threshold'
def test_detect_faces_default_method():
"""
Test detect_faces with default method (should use retinaface).
"""
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = detect_faces(mock_image) # No method specified
assert isinstance(faces, list), 'detect_faces should return a list with default method'
def test_detect_faces_empty_image():
"""
Test detect_faces on a blank image.
"""
empty_image = np.zeros((640, 640, 3), dtype=np.uint8)
faces = detect_faces(empty_image, method='retinaface')
assert isinstance(faces, list), 'Should return a list even for empty image'
assert len(faces) == 0, 'Should detect no faces in blank image'
# list_available_detectors tests
def test_list_available_detectors():
"""
@@ -222,7 +167,7 @@ def test_recognizer_inference_from_factory():
embedding = recognizer.get_embedding(mock_image)
assert embedding is not None, 'Recognizer should return embedding'
assert embedding.shape[1] == 512, 'Should return 512-dimensional embedding'
assert embedding.shape == (1, 512), 'get_embedding should return (1, 512) with batch dimension'
def test_landmarker_inference_from_factory():
@@ -280,3 +225,32 @@ def test_factory_returns_correct_types():
assert isinstance(detector, RetinaFace), 'Should return RetinaFace instance'
assert isinstance(recognizer, ArcFace), 'Should return ArcFace instance'
assert isinstance(landmarker, Landmark106), 'Should return Landmark106 instance'
# create_spoofer tests
def test_create_spoofer_default():
"""Test creating a spoofer with default parameters."""
spoofer = create_spoofer()
assert isinstance(spoofer, MiniFASNet), 'Should return MiniFASNet instance'
def test_create_spoofer_with_providers():
"""Test that create_spoofer forwards providers kwarg without TypeError."""
spoofer = create_spoofer(providers=['CPUExecutionProvider'])
assert isinstance(spoofer, MiniFASNet), 'Should return MiniFASNet instance'
# create_attribute_predictor tests
def test_create_attribute_predictor_age_gender():
predictor = create_attribute_predictor(AgeGenderWeights.DEFAULT)
assert isinstance(predictor, AgeGender), 'Should return AgeGender instance'
def test_create_attribute_predictor_fairface():
predictor = create_attribute_predictor(FairFaceWeights.DEFAULT)
assert isinstance(predictor, FairFace), 'Should return FairFace instance'
def test_create_attribute_predictor_invalid():
with pytest.raises(ValueError, match='Unsupported attribute model'):
create_attribute_predictor('invalid_model')

tests/test_headpose.py (new file, 115 lines)
View File

@@ -0,0 +1,115 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
import numpy as np
import pytest
from uniface import HeadPose, HeadPoseResult, create_head_pose_estimator
from uniface.headpose import BaseHeadPoseEstimator
from uniface.headpose.models import HeadPose as HeadPoseModel
def test_create_head_pose_estimator_default():
"""Test creating a head pose estimator with default parameters."""
estimator = create_head_pose_estimator()
assert isinstance(estimator, HeadPose), 'Should return HeadPose instance'
def test_create_head_pose_estimator_aliases():
"""Test that factory accepts all documented aliases."""
for alias in ('headpose', 'head_pose', '6drepnet'):
estimator = create_head_pose_estimator(alias)
assert isinstance(estimator, HeadPose), f"Alias '{alias}' should return HeadPose"
def test_create_head_pose_estimator_invalid():
"""Test that invalid method raises ValueError."""
with pytest.raises(ValueError, match='Unsupported head pose estimation method'):
create_head_pose_estimator('invalid_method')
def test_head_pose_inference():
"""Test that HeadPose can run inference on a mock image."""
estimator = HeadPose()
mock_image = np.random.randint(0, 255, (224, 224, 3), dtype=np.uint8)
result = estimator.estimate(mock_image)
assert isinstance(result, HeadPoseResult), 'Should return HeadPoseResult'
assert isinstance(result.pitch, float), 'pitch should be float'
assert isinstance(result.yaw, float), 'yaw should be float'
assert isinstance(result.roll, float), 'roll should be float'
def test_head_pose_callable():
"""Test that HeadPose is callable via __call__."""
estimator = HeadPose()
mock_image = np.random.randint(0, 255, (224, 224, 3), dtype=np.uint8)
result = estimator(mock_image)
assert isinstance(result, HeadPoseResult), '__call__ should return HeadPoseResult'
def test_head_pose_result_repr():
"""Test HeadPoseResult repr formatting."""
result = HeadPoseResult(pitch=10.5, yaw=-20.3, roll=5.1)
repr_str = repr(result)
assert 'HeadPoseResult' in repr_str
assert '10.5' in repr_str
assert '-20.3' in repr_str
assert '5.1' in repr_str
def test_head_pose_result_frozen():
"""Test that HeadPoseResult is immutable."""
result = HeadPoseResult(pitch=1.0, yaw=2.0, roll=3.0)
with pytest.raises(AttributeError):
result.pitch = 99.0 # type: ignore[misc]
def test_rotation_matrix_to_euler_identity():
"""Test that identity rotation matrix gives zero angles."""
identity = np.eye(3).reshape(1, 3, 3)
euler = HeadPoseModel.rotation_matrix_to_euler(identity)
assert euler.shape == (1, 3), 'Should return (1, 3) shaped array'
np.testing.assert_allclose(euler[0], [0.0, 0.0, 0.0], atol=1e-5)
def test_rotation_matrix_to_euler_90deg_yaw():
"""Test 90-degree yaw rotation."""
angle = np.radians(90)
R = np.array(
[
[np.cos(angle), 0, np.sin(angle)],
[0, 1, 0],
[-np.sin(angle), 0, np.cos(angle)],
]
).reshape(1, 3, 3)
euler = HeadPoseModel.rotation_matrix_to_euler(R)
np.testing.assert_allclose(euler[0, 1], 90.0, atol=1e-3)
def test_rotation_matrix_to_euler_batch():
"""Test batch processing of rotation matrices."""
batch = np.stack([np.eye(3), np.eye(3), np.eye(3)], axis=0)
euler = HeadPoseModel.rotation_matrix_to_euler(batch)
assert euler.shape == (3, 3), 'Batch of 3 should return (3, 3)'
np.testing.assert_allclose(euler, 0.0, atol=1e-5)
def test_factory_returns_correct_type():
"""Test that factory function returns BaseHeadPoseEstimator subclass."""
estimator = create_head_pose_estimator()
assert isinstance(estimator, BaseHeadPoseEstimator), 'Should be BaseHeadPoseEstimator subclass'
def test_head_pose_with_providers():
"""Test that HeadPose accepts providers kwarg."""
estimator = HeadPose(providers=['CPUExecutionProvider'])
assert isinstance(estimator, HeadPose), 'Should create with explicit providers'

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for 106-point facial landmark detector."""
from __future__ import annotations

View File

@@ -2,15 +2,14 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for BiSeNet face parsing model."""
from __future__ import annotations
import numpy as np
import pytest
from uniface.constants import ParsingWeights
from uniface.parsing import BiSeNet, create_face_parser
from uniface.constants import ParsingWeights, XSegWeights
from uniface.parsing import BiSeNet, XSeg, create_face_parser
def test_bisenet_initialization():
@@ -120,3 +119,151 @@ def test_bisenet_different_input_sizes():
assert mask.shape == (h, w), f'Failed for size {h}x{w}'
assert mask.dtype == np.uint8
# XSeg Tests
def test_xseg_initialization():
"""Test XSeg initialization."""
parser = XSeg()
assert parser is not None
assert parser.input_size == (256, 256)
assert parser.align_size == 256
assert parser.blur_sigma == 0
def test_xseg_with_custom_params():
"""Test XSeg with custom parameters."""
parser = XSeg(align_size=512, blur_sigma=5)
assert parser.align_size == 512
assert parser.blur_sigma == 5
def test_xseg_preprocess():
"""Test XSeg preprocessing."""
parser = XSeg()
# Create a dummy aligned face crop
face_crop = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
# Preprocess
preprocessed = parser.preprocess(face_crop)
assert preprocessed.shape == (1, 256, 256, 3) # NHWC format
assert preprocessed.dtype == np.float32
assert preprocessed.min() >= 0
assert preprocessed.max() <= 1
def test_xseg_postprocess():
"""Test XSeg postprocessing."""
parser = XSeg()
# Create dummy model output (NHWC format)
dummy_output = np.random.rand(1, 256, 256, 1).astype(np.float32)
# Postprocess
mask = parser.postprocess(dummy_output, crop_size=(256, 256))
assert mask.shape == (256, 256)
assert mask.dtype == np.float32
assert mask.min() >= 0
assert mask.max() <= 1
def test_xseg_parse_aligned():
"""Test XSeg parse_aligned method."""
parser = XSeg()
# Create a dummy aligned face crop
face_crop = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
# Parse
mask = parser.parse_aligned(face_crop)
assert mask.shape == (256, 256)
assert mask.dtype == np.float32
assert mask.min() >= 0
assert mask.max() <= 1
def test_xseg_parse_with_landmarks():
"""Test XSeg parse method with landmarks."""
parser = XSeg()
# Create a dummy image
image = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
# Create dummy 5-point landmarks
landmarks = np.array(
[
[250, 200], # left eye
[390, 200], # right eye
[320, 280], # nose
[260, 350], # left mouth
[380, 350], # right mouth
],
dtype=np.float32,
)
# Parse
mask = parser.parse(image, landmarks=landmarks)
assert mask.shape == (480, 640)
assert mask.dtype == np.float32
assert mask.min() >= 0
assert mask.max() <= 1
def test_xseg_parse_invalid_landmarks():
"""Test XSeg parse with invalid landmarks shape."""
parser = XSeg()
image = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
# Wrong shape
invalid_landmarks = np.array([[0, 0], [1, 1], [2, 2]])
with pytest.raises(ValueError, match='Landmarks must have shape'):
parser.parse(image, landmarks=invalid_landmarks)
def test_xseg_parse_with_inverse():
"""Test XSeg parse_with_inverse method."""
parser = XSeg()
# Create a dummy image
image = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
# Create dummy 5-point landmarks
landmarks = np.array(
[
[250, 200],
[390, 200],
[320, 280],
[260, 350],
[380, 350],
],
dtype=np.float32,
)
# Parse with inverse
mask, face_crop, inverse_matrix = parser.parse_with_inverse(image, landmarks)
assert mask.shape == (256, 256)
assert face_crop.shape == (256, 256, 3)
assert inverse_matrix.shape == (2, 3)
def test_create_face_parser_xseg_enum():
"""Test factory function with XSeg enum."""
parser = create_face_parser(XSegWeights.DEFAULT)
assert parser is not None
assert isinstance(parser, XSeg)
def test_create_face_parser_xseg_string():
"""Test factory function with XSeg string."""
parser = create_face_parser('xseg')
assert parser is not None
assert isinstance(parser, XSeg)

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for face recognition models (ArcFace, MobileFace, SphereFace)."""
from __future__ import annotations
@@ -75,7 +74,7 @@ def test_arcface_embedding_shape(arcface_model, mock_aligned_face):
"""
embedding = arcface_model.get_embedding(mock_aligned_face)
# ArcFace typically produces 512-dimensional embeddings
# ArcFace get_embedding returns raw ONNX output with batch dimension
assert embedding.shape[1] == 512, f'Expected 512-dim embedding, got {embedding.shape[1]}'
assert embedding.shape[0] == 1, 'Embedding should have batch dimension of 1'
@@ -89,7 +88,8 @@ def test_arcface_normalized_embedding(arcface_model, mock_landmarks):
embedding = arcface_model.get_normalized_embedding(mock_image, mock_landmarks)
# Check that embedding is normalized (L2 norm ≈ 1.0)
# Check shape and normalization
assert embedding.shape == (512,), f'Expected shape (512,), got {embedding.shape}'
norm = np.linalg.norm(embedding)
assert np.isclose(norm, 1.0, atol=1e-5), f'Normalized embedding should have norm 1.0, got {norm}'
@@ -126,7 +126,7 @@ def test_mobileface_embedding_shape(mobileface_model, mock_aligned_face):
"""
embedding = mobileface_model.get_embedding(mock_aligned_face)
# MobileFace typically produces 512-dimensional embeddings
# MobileFace get_embedding returns raw ONNX output with batch dimension
assert embedding.shape[1] == 512, f'Expected 512-dim embedding, got {embedding.shape[1]}'
assert embedding.shape[0] == 1, 'Embedding should have batch dimension of 1'
@@ -139,6 +139,7 @@ def test_mobileface_normalized_embedding(mobileface_model, mock_landmarks):
embedding = mobileface_model.get_normalized_embedding(mock_image, mock_landmarks)
assert embedding.shape == (512,), f'Expected shape (512,), got {embedding.shape}'
norm = np.linalg.norm(embedding)
assert np.isclose(norm, 1.0, atol=1e-5), f'Normalized embedding should have norm 1.0, got {norm}'
@@ -157,7 +158,7 @@ def test_sphereface_embedding_shape(sphereface_model, mock_aligned_face):
"""
embedding = sphereface_model.get_embedding(mock_aligned_face)
# SphereFace typically produces 512-dimensional embeddings
# SphereFace get_embedding returns raw ONNX output with batch dimension
assert embedding.shape[1] == 512, f'Expected 512-dim embedding, got {embedding.shape[1]}'
assert embedding.shape[0] == 1, 'Embedding should have batch dimension of 1'
@@ -170,6 +171,7 @@ def test_sphereface_normalized_embedding(sphereface_model, mock_landmarks):
embedding = sphereface_model.get_normalized_embedding(mock_image, mock_landmarks)
assert embedding.shape == (512,), f'Expected shape (512,), got {embedding.shape}'
norm = np.linalg.norm(embedding)
assert np.isclose(norm, 1.0, atol=1e-5), f'Normalized embedding should have norm 1.0, got {norm}'

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for RetinaFace detector."""
from __future__ import annotations

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for SCRFD detector."""
from __future__ import annotations

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for UniFace type definitions (dataclasses)."""
from __future__ import annotations

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for utility functions (compute_similarity, face_alignment, etc.)."""
from __future__ import annotations

View File

@@ -6,26 +6,29 @@ CLI utilities for testing and running UniFace features.
| Tool | Description |
|------|-------------|
| `detection.py` | Face detection on image, video, or webcam |
| `face_anonymize.py` | Face anonymization/blurring for privacy |
| `age_gender.py` | Age and gender prediction |
| `face_emotion.py` | Emotion detection (7 or 8 emotions) |
| `gaze_estimation.py` | Gaze direction estimation |
| `detect.py` | Face detection on image, video, or webcam |
| `track.py` | Face tracking on video with ByteTrack |
| `analyze.py` | Complete face analysis (detection + recognition + attributes) |
| `anonymize.py` | Face anonymization/blurring for privacy |
| `emotion.py` | Emotion detection (7 or 8 emotions) |
| `gaze.py` | Gaze direction estimation |
| `headpose.py` | Head pose estimation (pitch, yaw, roll) |
| `landmarks.py` | 106-point facial landmark detection |
| `recognition.py` | Face embedding extraction and comparison |
| `face_analyzer.py` | Complete face analysis (detection + recognition + attributes) |
| `face_search.py` | Real-time face matching against reference |
| `recognize.py` | Face embedding extraction and comparison |
| `search.py` | Real-time face matching against reference |
| `faiss_search.py` | FAISS index build and multi-identity face search |
| `fairface.py` | FairFace attribute prediction (race, gender, age) |
| `attribute.py` | Age and gender prediction |
| `spoofing.py` | Face anti-spoofing detection |
| `face_parsing.py` | Face semantic segmentation |
| `video_detection.py` | Face detection on video files with progress bar |
| `parse.py` | Face semantic segmentation (BiSeNet) |
| `xseg.py` | Face segmentation (XSeg) |
| `batch_process.py` | Batch process folder of images |
| `download_model.py` | Download model weights |
| `sha256_generate.py` | Generate SHA256 hash for model files |
## Unified `--source` Pattern
All tools use a unified `--source` argument that accepts:
Most tools use a unified `--source` argument that accepts:
- **Image path**: `--source photo.jpg`
- **Video path**: `--source video.mp4`
- **Camera ID**: `--source 0` (default webcam), `--source 1` (external camera)
@@ -34,26 +37,36 @@ All tools use a unified `--source` argument that accepts:
```bash
# Face detection
python tools/detection.py --source assets/test.jpg # image
python tools/detection.py --source video.mp4 # video
python tools/detection.py --source 0 # webcam
python tools/detect.py --source assets/test.jpg # image
python tools/detect.py --source video.mp4 # video
python tools/detect.py --source 0 # webcam
# Face tracking
python tools/track.py --source video.mp4
python tools/track.py --source video.mp4 --output tracked.mp4
python tools/track.py --source 0 # webcam
# Face anonymization
python tools/face_anonymize.py --source assets/test.jpg --method pixelate
python tools/face_anonymize.py --source video.mp4 --method gaussian
python tools/face_anonymize.py --source 0 --method pixelate
python tools/anonymize.py --source assets/test.jpg --method pixelate
python tools/anonymize.py --source video.mp4 --method gaussian
python tools/anonymize.py --source 0 --method pixelate
# Age and gender
python tools/age_gender.py --source assets/test.jpg
python tools/age_gender.py --source 0
python tools/attribute.py --source assets/test.jpg
python tools/attribute.py --source 0
# Emotion detection
python tools/face_emotion.py --source assets/test.jpg
python tools/face_emotion.py --source 0
python tools/emotion.py --source assets/test.jpg
python tools/emotion.py --source 0
# Gaze estimation
python tools/gaze_estimation.py --source assets/test.jpg
python tools/gaze_estimation.py --source 0
python tools/gaze.py --source assets/test.jpg
python tools/gaze.py --source 0
# Head pose estimation
python tools/headpose.py --source assets/test.jpg
python tools/headpose.py --source 0
python tools/headpose.py --source 0 --draw-type axis
# Landmarks
python tools/landmarks.py --source assets/test.jpg
@@ -63,31 +76,31 @@ python tools/landmarks.py --source 0
python tools/fairface.py --source assets/test.jpg
python tools/fairface.py --source 0
# Face parsing
python tools/face_parsing.py --source assets/test.jpg
python tools/face_parsing.py --source 0
# Face parsing (BiSeNet)
python tools/parse.py --source assets/test.jpg
python tools/parse.py --source 0
# Face segmentation (XSeg)
python tools/xseg.py --source assets/test.jpg
python tools/xseg.py --source 0
# Face anti-spoofing
python tools/spoofing.py --source assets/test.jpg
python tools/spoofing.py --source 0
# Face analyzer
python tools/face_analyzer.py --source assets/test.jpg
python tools/face_analyzer.py --source 0
python tools/analyze.py --source assets/test.jpg
python tools/analyze.py --source 0
# Face recognition (extract embedding)
python tools/recognition.py --image assets/test.jpg
python tools/recognize.py --image assets/test.jpg
# Face comparison
python tools/recognition.py --image1 face1.jpg --image2 face2.jpg
python tools/recognize.py --image1 face1.jpg --image2 face2.jpg
# Face search (match against reference)
python tools/face_search.py --reference person.jpg --source 0
python tools/face_search.py --reference person.jpg --source video.mp4
# Video processing with progress bar
python tools/video_detection.py --source video.mp4
python tools/video_detection.py --source video.mp4 --output output.mp4
python tools/search.py --reference person.jpg --source 0
python tools/search.py --reference person.jpg --source video.mp4
# Batch processing
python tools/batch_process.py --input images/ --output results/
@@ -102,7 +115,7 @@ python tools/download_model.py # downloads all
| Option | Description |
|--------|-------------|
| `--source` | Input source: image/video path or camera ID (0, 1, ...) |
| `--detector` | Choose detector: `retinaface`, `scrfd`, `yolov5face` |
| `--detector` | Choose detector: `retinaface`, `scrfd`, `yolov5face`, `yolov8face` |
| `--threshold` | Visualization confidence threshold (default: varies) |
| `--save-dir` | Output directory (default: `outputs`) |
@@ -117,5 +130,5 @@ python tools/download_model.py # downloads all
## Quick Test
```bash
python tools/detection.py --source assets/test.jpg
python tools/detect.py --source assets/test.jpg
```

tools/_common.py (new file, 29 lines)

@@ -0,0 +1,29 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
from pathlib import Path
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera.
Args:
source: File path or camera ID string (e.g. ``"0"``).
Returns:
One of ``"image"``, ``"video"``, ``"camera"``, or ``"unknown"``.
"""
if source.isdigit():
return 'camera'
suffix = Path(source).suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
if suffix in VIDEO_EXTENSIONS:
return 'video'
return 'unknown'
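The tools below import this helper and branch on its result. A minimal sketch of that dispatch pattern (the `dispatch` function and the sample arguments are illustrative, not part of the repository):

```python
# Illustrative sketch: how the tools branch on get_source_type().
# Assumes it is run from the tools/ directory so _common is importable.
from _common import get_source_type


def dispatch(source: str) -> None:  # hypothetical helper, for illustration only
    source_type = get_source_type(source)
    if source_type == 'camera':
        print(f'Opening camera {int(source)}')
    elif source_type == 'image':
        print(f'Processing image: {source}')
    elif source_type == 'video':
        print(f'Processing video: {source}')
    else:
        print(f"Error: Unknown source type for '{source}'")


dispatch('0')          # -> camera
dispatch('photo.jpg')  # -> image
dispatch('clip.mp4')   # -> video
```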


@@ -5,9 +5,9 @@
"""Face analysis using FaceAnalyzer.
Usage:
python tools/face_analyzer.py --source path/to/image.jpg
python tools/face_analyzer.py --source path/to/video.mp4
python tools/face_analyzer.py --source 0 # webcam
python tools/analyze.py --source path/to/image.jpg
python tools/analyze.py --source path/to/video.mp4
python tools/analyze.py --source 0 # webcam
"""
from __future__ import annotations
@@ -16,28 +16,15 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
import numpy as np
from uniface import AgeGender, ArcFace, FaceAnalyzer, RetinaFace
from uniface.visualization import draw_detections
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
from uniface.analyzer import FaceAnalyzer
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
from uniface.draw import draw_detections
from uniface.recognition import ArcFace
def draw_face_info(image, face, face_id):
@@ -111,7 +98,7 @@ def process_image(analyzer, image_path: str, save_dir: str = 'outputs', show_sim
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, fancy_bbox=True)
draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, corner_bbox=True)
for i, face in enumerate(faces, 1):
draw_face_info(image, face, i)
@@ -153,7 +140,7 @@ def process_video(analyzer, video_path: str, save_dir: str = 'outputs'):
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, fancy_bbox=True)
draw_detections(image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, corner_bbox=True)
for i, face in enumerate(faces, 1):
draw_face_info(frame, face, i)
@@ -180,16 +167,16 @@ def run_camera(analyzer, camera_id: int = 0):
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
frame = cv2.flip(frame, 1)
faces = analyzer.analyze(frame)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, fancy_bbox=True)
draw_detections(image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, corner_bbox=True)
for i, face in enumerate(faces, 1):
draw_face_info(frame, face, i)
@@ -214,7 +201,7 @@ def main():
detector = RetinaFace()
recognizer = ArcFace()
age_gender = AgeGender()
analyzer = FaceAnalyzer(detector, recognizer, age_gender)
analyzer = FaceAnalyzer(detector, recognizer=recognizer, attributes=[age_gender])
source_type = get_source_type(args.source)
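For reference, a minimal sketch of the updated `FaceAnalyzer` construction used above; the constructor arguments and the `bbox`/`confidence` fields come from this diff, while the input path is illustrative:

```python
# Sketch of the updated FaceAnalyzer API shown in this tool.
import cv2

from uniface.analyzer import FaceAnalyzer
from uniface.attribute import AgeGender
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace

analyzer = FaceAnalyzer(RetinaFace(), recognizer=ArcFace(), attributes=[AgeGender()])

image = cv2.imread('assets/test.jpg')  # illustrative path
faces = analyzer.analyze(image)
for i, face in enumerate(faces, 1):
    print(f'Face {i}: bbox={face.bbox}, confidence={face.confidence:.2f}')
```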


@@ -5,9 +5,9 @@
"""Face anonymization/blurring for privacy.
Usage:
python tools/face_anonymize.py --source path/to/image.jpg --method pixelate
python tools/face_anonymize.py --source path/to/video.mp4 --method gaussian
python tools/face_anonymize.py --source 0 --method pixelate # webcam
python tools/anonymize.py --source path/to/image.jpg --method pixelate
python tools/anonymize.py --source path/to/video.mp4 --method gaussian
python tools/anonymize.py --source 0 --method pixelate # webcam
"""
from __future__ import annotations
@@ -16,28 +16,12 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.privacy import BlurFace
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
def process_image(
detector,
@@ -56,7 +40,7 @@ def process_image(
print(f'Detected {len(faces)} face(s)')
if show_detections and faces:
from uniface.visualization import draw_detections
from uniface.draw import draw_detections
preview = image.copy()
bboxes = [face.bbox for face in faces]
@@ -137,9 +121,9 @@ def run_camera(detector, blurrer: BlurFace, camera_id: int = 0):
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)
if faces:
@@ -171,19 +155,19 @@ def main():
epilog="""
Examples:
# Anonymize image with pixelation (default)
python run_anonymization.py --source photo.jpg
python tools/anonymize.py --source photo.jpg
# Use Gaussian blur with custom strength
python run_anonymization.py --source photo.jpg --method gaussian --blur-strength 5.0
python tools/anonymize.py --source photo.jpg --method gaussian --blur-strength 5.0
# Real-time webcam anonymization
python run_anonymization.py --source 0 --method pixelate
python tools/anonymize.py --source 0 --method pixelate
# Black boxes for maximum privacy
python run_anonymization.py --source photo.jpg --method blackout
python tools/anonymize.py --source photo.jpg --method blackout
# Custom pixelation intensity
python run_anonymization.py --source photo.jpg --method pixelate --pixel-blocks 5
python tools/anonymize.py --source photo.jpg --method pixelate --pixel-blocks 5
""",
)


@@ -5,9 +5,9 @@
"""Age and gender prediction on detected faces.
Usage:
python tools/age_gender.py --source path/to/image.jpg
python tools/age_gender.py --source path/to/video.mp4
python tools/age_gender.py --source 0 # webcam
python tools/attribute.py --source path/to/image.jpg
python tools/attribute.py --source path/to/video.mp4
python tools/attribute.py --source 0 # webcam
"""
from __future__ import annotations
@@ -16,27 +16,12 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
from uniface import SCRFD, AgeGender, RetinaFace
from uniface.visualization import draw_detections
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
from uniface.attribute import AgeGender
from uniface.detection import SCRFD, RetinaFace
from uniface.draw import draw_detections
def draw_age_gender_label(image, bbox, sex: str, age: int):
@@ -71,11 +56,11 @@ def process_image(
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for i, face in enumerate(faces):
result = age_gender.predict(image, face.bbox)
result = age_gender.predict(image, face)
print(f' Face {i + 1}: {result.sex}, {result.age} years old')
draw_age_gender_label(image, face.bbox, result.sex, result.age)
@@ -123,11 +108,11 @@ def process_video(
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for face in faces:
result = age_gender.predict(frame, face.bbox)
result = age_gender.predict(frame, face)
draw_age_gender_label(frame, face.bbox, result.sex, result.age)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
@@ -152,9 +137,9 @@ def run_camera(detector, age_gender, camera_id: int = 0, threshold: float = 0.6)
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)
@@ -162,11 +147,11 @@ def run_camera(detector, age_gender, camera_id: int = 0, threshold: float = 0.6)
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for face in faces:
result = age_gender.predict(frame, face.bbox)
result = age_gender.predict(frame, face)
draw_age_gender_label(frame, face.bbox, result.sex, result.age)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
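The change above is the unified attribute API: `predict()` now takes the detected face object rather than a raw bbox. A minimal sketch (the image path is illustrative):

```python
# Sketch of the unified attribute API: predict(image, face)
# instead of predict(image, face.bbox).
import cv2

from uniface.attribute import AgeGender
from uniface.detection import RetinaFace

detector = RetinaFace()
age_gender = AgeGender()

image = cv2.imread('assets/test.jpg')  # illustrative path
for face in detector.detect(image):
    result = age_gender.predict(image, face)
    print(f'{result.sex}, {result.age} years old')
```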


@@ -14,8 +14,8 @@ from pathlib import Path
import cv2
from tqdm import tqdm
from uniface import SCRFD, RetinaFace
from uniface.visualization import draw_detections
from uniface.detection import SCRFD, RetinaFace
from uniface.draw import draw_detections
def get_image_files(input_dir: Path, extensions: tuple) -> list:
@@ -39,7 +39,7 @@ def process_image(detector, image_path: Path, output_path: Path, threshold: floa
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
cv2.putText(


@@ -5,9 +5,9 @@
"""Face detection on image, video, or webcam.
Usage:
python tools/detection.py --source path/to/image.jpg
python tools/detection.py --source path/to/video.mp4
python tools/detection.py --source 0 # webcam
python tools/detect.py --source path/to/image.jpg
python tools/detect.py --source path/to/video.mp4
python tools/detect.py --source 0 # webcam
"""
from __future__ import annotations
@@ -15,28 +15,14 @@ from __future__ import annotations
import argparse
import os
from pathlib import Path
import time
from _common import get_source_type
import cv2
from tqdm import tqdm
from uniface.detection import SCRFD, RetinaFace, YOLOv5Face, YOLOv8Face
from uniface.visualization import draw_detections
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
from uniface.draw import draw_detections
def process_image(detector, image_path: str, threshold: float = 0.6, save_dir: str = 'outputs'):
@@ -52,7 +38,7 @@ def process_image(detector, image_path: str, threshold: float = 0.6, save_dir: s
bboxes = [face.bbox for face in faces]
scores = [face.confidence for face in faces]
landmarks = [face.landmarks for face in faces]
draw_detections(image, bboxes, scores, landmarks, vis_threshold=threshold)
draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold)
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{os.path.splitext(os.path.basename(image_path))[0]}_out.jpg')
@@ -60,34 +46,48 @@ def process_image(detector, image_path: str, threshold: float = 0.6, save_dir: s
print(f'Detected {len(faces)} face(s). Output saved: {output_path}')
def process_video(detector, video_path: str, threshold: float = 0.6, save_dir: str = 'outputs'):
"""Process a video file."""
cap = cv2.VideoCapture(video_path)
def process_video(
detector,
input_path: str,
output_path: str,
threshold: float = 0.6,
show_preview: bool = False,
):
"""Process a video file with progress bar."""
cap = cv2.VideoCapture(input_path)
if not cap.isOpened():
print(f"Error: Cannot open video file '{video_path}'")
print(f"Error: Cannot open video file '{input_path}'")
return
# Get video properties
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(video_path).stem}_out.mp4')
print(f'Input: {input_path} ({width}x{height}, {fps:.1f} fps, {total_frames} frames)')
print(f'Output: {output_path}')
Path(output_path).parent.mkdir(parents=True, exist_ok=True)
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
print(f'Processing video: {video_path} ({total_frames} frames)')
frame_count = 0
if not out.isOpened():
print(f"Error: Cannot create output video '{output_path}'")
cap.release()
return
while True:
frame_count = 0
total_faces = 0
for _ in tqdm(range(total_frames), desc='Processing', unit='frames'):
ret, frame = cap.read()
if not ret:
break
t0 = time.perf_counter()
frame_count += 1
faces = detector.detect(frame)
total_faces += len(faces)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
@@ -99,19 +99,28 @@ def process_video(detector, video_path: str, threshold: float = 0.6, save_dir: s
landmarks=landmarks,
vis_threshold=threshold,
draw_score=True,
fancy_bbox=True,
corner_bbox=True,
)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
inference_fps = 1.0 / max(time.perf_counter() - t0, 1e-9)
cv2.putText(frame, f'FPS: {inference_fps:.1f}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 65), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
out.write(frame)
# Show progress
if frame_count % 100 == 0:
print(f' Processed {frame_count}/{total_frames} frames...')
if show_preview:
cv2.imshow("Processing - Press 'q' to cancel", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
print('\nCancelled by user')
break
cap.release()
out.release()
print(f'Done! Output saved: {output_path}')
if show_preview:
cv2.destroyAllWindows()
avg_faces = total_faces / frame_count if frame_count > 0 else 0
print(f'\nDone! {frame_count} frames, {total_faces} faces ({avg_faces:.1f} avg/frame)')
print(f'Saved: {output_path}')
def run_camera(detector, camera_id: int = 0, threshold: float = 0.6):
@@ -123,11 +132,12 @@ def run_camera(detector, camera_id: int = 0, threshold: float = 0.6):
print("Press 'q' to quit")
prev_time = time.perf_counter()
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1) # mirror for natural interaction
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)
@@ -141,10 +151,14 @@ def run_camera(detector, camera_id: int = 0, threshold: float = 0.6):
landmarks=landmarks,
vis_threshold=threshold,
draw_score=True,
fancy_bbox=True,
corner_bbox=True,
)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
curr_time = time.perf_counter()
fps = 1.0 / max(curr_time - prev_time, 1e-9)
prev_time = curr_time
cv2.putText(frame, f'FPS: {fps:.1f}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 65), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.imshow('Face Detection', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
@@ -158,18 +172,24 @@ def main():
parser = argparse.ArgumentParser(description='Run face detection')
parser.add_argument('--source', type=str, required=True, help='Image/video path or camera ID (0, 1, ...)')
parser.add_argument(
'--method', type=str, default='retinaface', choices=['retinaface', 'scrfd', 'yolov5face', 'yolov8face']
'--detector',
'--method',
type=str,
default='retinaface',
choices=['retinaface', 'scrfd', 'yolov5face', 'yolov8face'],
)
parser.add_argument('--threshold', type=float, default=0.25, help='Visualization threshold')
parser.add_argument('--preview', action='store_true', help='Show live preview during video processing')
parser.add_argument('--save-dir', type=str, default='outputs', help='Output directory')
parser.add_argument('--output', type=str, default=None, help='Output video path (auto-generated if not specified)')
args = parser.parse_args()
# Initialize detector
if args.method == 'retinaface':
if args.detector == 'retinaface':
detector = RetinaFace()
elif args.method == 'scrfd':
elif args.detector == 'scrfd':
detector = SCRFD()
elif args.method == 'yolov5face':
elif args.detector == 'yolov5face':
from uniface.constants import YOLOv5FaceWeights
detector = YOLOv5Face(model_name=YOLOv5FaceWeights.YOLOV5M)
@@ -178,7 +198,6 @@ def main():
detector = YOLOv8Face(model_name=YOLOv8FaceWeights.YOLOV8N)
# Determine source type and process
source_type = get_source_type(args.source)
if source_type == 'camera':
@@ -192,7 +211,12 @@ def main():
if not os.path.exists(args.source):
print(f'Error: Video not found: {args.source}')
return
process_video(detector, args.source, args.threshold, args.save_dir)
if args.output:
output_path = args.output
else:
os.makedirs(args.save_dir, exist_ok=True)
output_path = os.path.join(args.save_dir, f'{Path(args.source).stem}_detected.mp4')
process_video(detector, args.source, output_path, args.threshold, args.preview)
else:
print(f"Error: Unknown source type for '{args.source}'")
print('Supported formats: images (.jpg, .png, ...), videos (.mp4, .avi, ...), or camera ID (0, 1, ...)')

View File

@@ -4,6 +4,7 @@ from uniface.constants import (
AgeGenderWeights,
ArcFaceWeights,
DDAMFNWeights,
HeadPoseWeights,
LandmarkWeights,
MobileFaceWeights,
RetinaFaceWeights,
@@ -21,6 +22,7 @@ MODEL_TYPES = {
'ddamfn': DDAMFNWeights,
'agegender': AgeGenderWeights,
'landmark': LandmarkWeights,
'headpose': HeadPoseWeights,
}


@@ -5,9 +5,9 @@
"""Emotion detection on detected faces.
Usage:
python tools/face_emotion.py --source path/to/image.jpg
python tools/face_emotion.py --source path/to/video.mp4
python tools/face_emotion.py --source 0 # webcam
python tools/emotion.py --source path/to/image.jpg
python tools/emotion.py --source path/to/video.mp4
python tools/emotion.py --source 0 # webcam
"""
from __future__ import annotations
@@ -16,27 +16,12 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
from uniface import SCRFD, Emotion, RetinaFace
from uniface.visualization import draw_detections
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
from uniface.attribute import Emotion
from uniface.detection import SCRFD, RetinaFace
from uniface.draw import draw_detections
def draw_emotion_label(image, bbox, emotion: str, confidence: float):
@@ -71,11 +56,11 @@ def process_image(
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for i, face in enumerate(faces):
result = emotion_predictor.predict(image, face.landmarks)
result = emotion_predictor.predict(image, face)
print(f' Face {i + 1}: {result.emotion} (confidence: {result.confidence:.3f})')
draw_emotion_label(image, face.bbox, result.emotion, result.confidence)
@@ -123,11 +108,11 @@ def process_video(
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for face in faces:
result = emotion_predictor.predict(frame, face.landmarks)
result = emotion_predictor.predict(frame, face)
draw_emotion_label(frame, face.bbox, result.emotion, result.confidence)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
@@ -152,9 +137,9 @@ def run_camera(detector, emotion_predictor, camera_id: int = 0, threshold: float
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)
@@ -162,11 +147,11 @@ def run_camera(detector, emotion_predictor, camera_id: int = 0, threshold: float
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for face in faces:
result = emotion_predictor.predict(frame, face.landmarks)
result = emotion_predictor.predict(frame, face)
draw_emotion_label(frame, face.bbox, result.emotion, result.confidence)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)


@@ -16,28 +16,12 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
from uniface import SCRFD, RetinaFace
from uniface.attribute import FairFace
from uniface.visualization import draw_detections
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
from uniface.detection import SCRFD, RetinaFace
from uniface.draw import draw_detections
def draw_fairface_label(image, bbox, sex: str, age_group: str, race: str):
@@ -72,11 +56,11 @@ def process_image(
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for i, face in enumerate(faces):
result = fairface.predict(image, face.bbox)
result = fairface.predict(image, face)
print(f' Face {i + 1}: {result.sex}, {result.age_group}, {result.race}')
draw_fairface_label(image, face.bbox, result.sex, result.age_group, result.race)
@@ -124,11 +108,11 @@ def process_video(
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for face in faces:
result = fairface.predict(frame, face.bbox)
result = fairface.predict(frame, face)
draw_fairface_label(frame, face.bbox, result.sex, result.age_group, result.race)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
@@ -153,9 +137,9 @@ def run_camera(detector, fairface, camera_id: int = 0, threshold: float = 0.6):
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)
@@ -163,11 +147,11 @@ def run_camera(detector, fairface, camera_id: int = 0, threshold: float = 0.6):
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
for face in faces:
result = fairface.predict(frame, face.bbox)
result = fairface.predict(frame, face)
draw_fairface_label(frame, face.bbox, result.sex, result.age_group, result.race)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

tools/faiss_search.py (new file, 208 lines)

@@ -0,0 +1,208 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""FAISS index build and multi-identity face search.
Build a vector index from a directory of person sub-folders, then search
against it in a video or webcam stream.
Usage:
python tools/faiss_search.py build --faces-dir dataset/ --db-path ./vector_index
python tools/faiss_search.py run --db-path ./vector_index --source video.mp4
python tools/faiss_search.py run --db-path ./vector_index --source 0 # webcam
"""
from __future__ import annotations
import argparse
import os
from pathlib import Path
from _common import IMAGE_EXTENSIONS, get_source_type
import cv2
from uniface import create_detector, create_recognizer
from uniface.draw import draw_corner_bbox, draw_text_label
from uniface.indexing import FAISS
def _draw_face(image, bbox, text: str, color: tuple[int, int, int]) -> None:
x1, y1, x2, y2 = map(int, bbox[:4])
thickness = max(round(sum(image.shape[:2]) / 2 * 0.003), 2)
font_scale = max(0.4, min(0.7, (y2 - y1) / 200))
draw_corner_bbox(image, (x1, y1, x2, y2), color=color, thickness=thickness)
draw_text_label(image, text, x1, y1, bg_color=color, font_scale=font_scale)
def process_frame(frame, detector, recognizer, store: FAISS, threshold: float = 0.4):
faces = detector.detect(frame)
if not faces:
return frame
for face in faces:
embedding = recognizer.get_normalized_embedding(frame, face.landmarks)
result, sim = store.search(embedding, threshold=threshold)
text = f'{result["person_id"]} ({sim:.2f})' if result else f'Unknown ({sim:.2f})'
color = (0, 255, 0) if result else (0, 0, 255)
_draw_face(frame, face.bbox, text, color)
return frame
def process_video(detector, recognizer, store: FAISS, video_path: str, save_dir: str, threshold: float = 0.4):
cap = cv2.VideoCapture(video_path)
if not cap.isOpened():
print(f"Error: Cannot open video file '{video_path}'")
return
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(video_path).stem}_faiss_search.mp4')
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
print(f'Processing video: {video_path} ({total_frames} frames)')
frame_count = 0
while True:
ret, frame = cap.read()
if not ret:
break
frame_count += 1
frame = process_frame(frame, detector, recognizer, store, threshold)
out.write(frame)
if frame_count % 100 == 0:
print(f' Processed {frame_count}/{total_frames} frames...')
cap.release()
out.release()
print(f'Done! Output saved: {output_path}')
def run_camera(detector, recognizer, store: FAISS, camera_id: int = 0, threshold: float = 0.4):
cap = cv2.VideoCapture(camera_id)
if not cap.isOpened():
print(f'Cannot open camera {camera_id}')
return
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
if not ret:
break
frame = cv2.flip(frame, 1)
frame = process_frame(frame, detector, recognizer, store, threshold)
cv2.imshow('Vector Search', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
def build(args: argparse.Namespace) -> None:
faces_dir = Path(args.faces_dir)
if not faces_dir.is_dir():
print(f"Error: '{faces_dir}' is not a directory")
return
detector = create_detector()
recognizer = create_recognizer()
store = FAISS(db_path=args.db_path)
persons = sorted(p.name for p in faces_dir.iterdir() if p.is_dir())
if not persons:
print(f"Error: No sub-folders found in '{faces_dir}'")
return
print(f'Found {len(persons)} persons: {", ".join(persons)}')
total_added = 0
for person_id in persons:
person_dir = faces_dir / person_id
images = [f for f in person_dir.iterdir() if f.suffix.lower() in IMAGE_EXTENSIONS]
added = 0
for img_path in images:
image = cv2.imread(str(img_path))
if image is None:
print(f' Warning: Failed to read {img_path}, skipping')
continue
faces = detector.detect(image)
if not faces:
print(f' Warning: No face detected in {img_path}, skipping')
continue
embedding = recognizer.get_normalized_embedding(image, faces[0].landmarks)
store.add(embedding, {'person_id': person_id, 'source': str(img_path)})
added += 1
total_added += added
if added:
print(f' {person_id}: {added} embeddings added')
else:
print(f' {person_id}: no valid faces found')
store.save()
print(f'\nIndex saved to {args.db_path} ({total_added} vectors, {len(persons)} persons)')
def run(args: argparse.Namespace) -> None:
detector = create_detector()
recognizer = create_recognizer()
store = FAISS(db_path=args.db_path)
if not store.load():
print(f"Error: No index found at '{args.db_path}'")
return
print(f'Loaded FAISS index: {store}')
source_type = get_source_type(args.source)
if source_type == 'camera':
run_camera(detector, recognizer, store, int(args.source), args.threshold)
elif source_type == 'video':
if not os.path.exists(args.source):
print(f'Error: Video not found: {args.source}')
return
process_video(detector, recognizer, store, args.source, args.save_dir, args.threshold)
else:
print(f"Error: Source must be a video file or camera ID, not '{args.source}'")
def main():
parser = argparse.ArgumentParser(description='FAISS vector search')
sub = parser.add_subparsers(dest='command', required=True)
build_p = sub.add_parser('build', help='Build a FAISS index from person sub-folders')
build_p.add_argument('--faces-dir', type=str, required=True, help='Directory with person sub-folders')
build_p.add_argument('--db-path', type=str, default='./vector_index', help='Where to save the index')
run_p = sub.add_parser('run', help='Search faces against a FAISS index')
run_p.add_argument('--db-path', type=str, required=True, help='Path to saved FAISS index')
run_p.add_argument('--source', type=str, required=True, help='Video path or camera ID')
run_p.add_argument('--threshold', type=float, default=0.4, help='Similarity threshold')
run_p.add_argument('--save-dir', type=str, default='outputs', help='Output directory')
args = parser.parse_args()
if args.command == 'build':
build(args)
elif args.command == 'run':
run(args)
if __name__ == '__main__':
main()
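`build` expects `--faces-dir` to contain one sub-folder per person; the folder name becomes the stored `person_id`. A minimal sketch of the FAISS store calls this tool relies on (the paths and person id below are illustrative):

```python
# Sketch of the FAISS store calls used by faiss_search.py.
import cv2

from uniface import create_detector, create_recognizer
from uniface.indexing import FAISS

detector = create_detector()
recognizer = create_recognizer()

# Build: add one normalized embedding per image, then persist the index.
store = FAISS(db_path='./vector_index')
image = cv2.imread('dataset/person_a/img1.jpg')  # illustrative path
face = detector.detect(image)[0]
embedding = recognizer.get_normalized_embedding(image, face.landmarks)
store.add(embedding, {'person_id': 'person_a', 'source': 'dataset/person_a/img1.jpg'})
store.save()

# Run: load the index and search with a query embedding.
store = FAISS(db_path='./vector_index')
if store.load():
    result, sim = store.search(embedding, threshold=0.4)
    print(result['person_id'] if result else 'Unknown', f'({sim:.2f})')
```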


@@ -5,9 +5,9 @@
"""Gaze estimation on detected faces.
Usage:
python tools/gaze_estimation.py --source path/to/image.jpg
python tools/gaze_estimation.py --source path/to/video.mp4
python tools/gaze_estimation.py --source 0 # webcam
python tools/gaze.py --source path/to/image.jpg
python tools/gaze.py --source path/to/video.mp4
python tools/gaze.py --source 0 # webcam
"""
from __future__ import annotations
@@ -16,29 +16,13 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
import numpy as np
from uniface import RetinaFace
from uniface.detection import RetinaFace
from uniface.draw import draw_gaze
from uniface.gaze import MobileGaze
from uniface.visualization import draw_gaze
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
def process_image(detector, gaze_estimator, image_path: str, save_dir: str = 'outputs'):

tools/headpose.py (new file, 181 lines)

@@ -0,0 +1,181 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Head pose estimation on detected faces.
Usage:
python tools/headpose.py --source path/to/image.jpg
python tools/headpose.py --source path/to/video.mp4
python tools/headpose.py --source 0 # webcam
python tools/headpose.py --source path/to/image.jpg --draw-type axis
"""
from __future__ import annotations
import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
from uniface.detection import RetinaFace
from uniface.draw import draw_head_pose
from uniface.headpose import HeadPose
def process_image(detector, head_pose_estimator, image_path: str, save_dir: str = 'outputs', draw_type: str = 'cube'):
"""Process a single image."""
image = cv2.imread(image_path)
if image is None:
print(f"Error: Failed to load image from '{image_path}'")
return
faces = detector.detect(image)
print(f'Detected {len(faces)} face(s)')
for i, face in enumerate(faces):
bbox = face.bbox
x1, y1, x2, y2 = map(int, bbox[:4])
face_crop = image[y1:y2, x1:x2]
if face_crop.size == 0:
continue
result = head_pose_estimator.estimate(face_crop)
print(f' Face {i + 1}: pitch={result.pitch:.1f}°, yaw={result.yaw:.1f}°, roll={result.roll:.1f}°')
draw_head_pose(image, bbox, result.pitch, result.yaw, result.roll, draw_type=draw_type)
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(image_path).stem}_headpose.jpg')
cv2.imwrite(output_path, image)
print(f'Output saved: {output_path}')
def process_video(detector, head_pose_estimator, video_path: str, save_dir: str = 'outputs', draw_type: str = 'cube'):
"""Process a video file."""
cap = cv2.VideoCapture(video_path)
if not cap.isOpened():
print(f"Error: Cannot open video file '{video_path}'")
return
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(video_path).stem}_headpose.mp4')
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
print(f'Processing video: {video_path} ({total_frames} frames)')
frame_count = 0
while True:
ret, frame = cap.read()
if not ret:
break
frame_count += 1
faces = detector.detect(frame)
for face in faces:
bbox = face.bbox
x1, y1, x2, y2 = map(int, bbox[:4])
face_crop = frame[y1:y2, x1:x2]
if face_crop.size == 0:
continue
result = head_pose_estimator.estimate(face_crop)
draw_head_pose(frame, bbox, result.pitch, result.yaw, result.roll, draw_type=draw_type)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
out.write(frame)
if frame_count % 100 == 0:
print(f' Processed {frame_count}/{total_frames} frames...')
cap.release()
out.release()
print(f'Done! Output saved: {output_path}')
def run_camera(detector, head_pose_estimator, camera_id: int = 0, draw_type: str = 'cube'):
"""Run real-time detection on webcam."""
cap = cv2.VideoCapture(camera_id)
if not cap.isOpened():
print(f'Cannot open camera {camera_id}')
return
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)
for face in faces:
bbox = face.bbox
x1, y1, x2, y2 = map(int, bbox[:4])
face_crop = frame[y1:y2, x1:x2]
if face_crop.size == 0:
continue
result = head_pose_estimator.estimate(face_crop)
draw_head_pose(frame, bbox, result.pitch, result.yaw, result.roll, draw_type=draw_type)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.imshow('Head Pose Estimation', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
def main():
parser = argparse.ArgumentParser(description='Run head pose estimation')
parser.add_argument('--source', type=str, required=True, help='Image/video path or camera ID (0, 1, ...)')
parser.add_argument('--save-dir', type=str, default='outputs', help='Output directory')
parser.add_argument(
'--draw-type',
type=str,
default='cube',
choices=['cube', 'axis'],
help='Visualization type: cube (default) or axis',
)
args = parser.parse_args()
detector = RetinaFace()
head_pose_estimator = HeadPose()
source_type = get_source_type(args.source)
if source_type == 'camera':
run_camera(detector, head_pose_estimator, int(args.source), args.draw_type)
elif source_type == 'image':
if not os.path.exists(args.source):
print(f'Error: Image not found: {args.source}')
return
process_image(detector, head_pose_estimator, args.source, args.save_dir, args.draw_type)
elif source_type == 'video':
if not os.path.exists(args.source):
print(f'Error: Video not found: {args.source}')
return
process_video(detector, head_pose_estimator, args.source, args.save_dir, args.draw_type)
else:
print(f"Error: Unknown source type for '{args.source}'")
print('Supported formats: images (.jpg, .png, ...), videos (.mp4, .avi, ...), or camera ID (0, 1, ...)')
if __name__ == '__main__':
main()
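A minimal sketch of the `HeadPose` usage shown above: `estimate()` runs on a face crop and returns pitch, yaw, and roll in degrees (the image path is illustrative):

```python
# Sketch of the HeadPose API as used by this tool.
import cv2

from uniface.detection import RetinaFace
from uniface.draw import draw_head_pose
from uniface.headpose import HeadPose

detector = RetinaFace()
head_pose = HeadPose()

image = cv2.imread('assets/test.jpg')  # illustrative path
for face in detector.detect(image):
    x1, y1, x2, y2 = map(int, face.bbox[:4])
    crop = image[y1:y2, x1:x2]
    if crop.size == 0:
        continue
    result = head_pose.estimate(crop)
    print(f'pitch={result.pitch:.1f}, yaw={result.yaw:.1f}, roll={result.roll:.1f}')
    draw_head_pose(image, face.bbox, result.pitch, result.yaw, result.roll, draw_type='axis')
```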


@@ -16,26 +16,11 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
from uniface import SCRFD, Landmark106, RetinaFace
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
from uniface.detection import SCRFD, RetinaFace
from uniface.landmark import Landmark106
def process_image(detector, landmarker, image_path: str, save_dir: str = 'outputs'):
@@ -129,9 +114,9 @@ def run_camera(detector, landmarker, camera_id: int = 0):
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)


@@ -5,9 +5,9 @@
"""Face parsing on detected faces.
Usage:
python tools/face_parsing.py --source path/to/image.jpg
python tools/face_parsing.py --source path/to/video.mp4
python tools/face_parsing.py --source 0 # webcam
python tools/parse.py --source path/to/image.jpg
python tools/parse.py --source path/to/video.mp4
python tools/parse.py --source 0 # webcam
"""
from __future__ import annotations
@@ -16,30 +16,14 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
import numpy as np
from uniface import RetinaFace
from uniface.constants import ParsingWeights
from uniface.detection import RetinaFace
from uniface.draw import vis_parsing_maps
from uniface.parsing import BiSeNet
from uniface.visualization import vis_parsing_maps
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
def expand_bbox(
@@ -225,7 +209,7 @@ def main():
args = parser_arg.parse_args()
detector = RetinaFace()
parser = BiSeNet(model_name=ParsingWeights.RESNET34)
parser = BiSeNet(model_name=args.model)
source_type = get_source_type(args.source)


@@ -5,8 +5,8 @@
"""Face recognition: extract embeddings or compare two faces.
Usage:
python tools/recognition.py --image path/to/image.jpg
python tools/recognition.py --image1 face1.jpg --image2 face2.jpg
python tools/recognize.py --image path/to/image.jpg
python tools/recognize.py --image1 face1.jpg --image2 face2.jpg
"""
import argparse
@@ -41,12 +41,13 @@ def run_inference(detector, recognizer, image_path: str):
print(f'Detected {len(faces)} face(s). Extracting embedding for the first face...')
landmarks = faces[0]['landmarks'] # 5-point landmarks for alignment (already np.ndarray)
landmarks = faces[0].landmarks
embedding = recognizer.get_embedding(image, landmarks)
norm_embedding = recognizer.get_normalized_embedding(image, landmarks) # L2 normalized
raw_norm = np.linalg.norm(embedding)
norm_embedding = embedding.ravel() / raw_norm if raw_norm > 0 else embedding.ravel()
print(f' Embedding shape: {embedding.shape}')
print(f' L2 norm (raw): {np.linalg.norm(embedding):.4f}')
print(f' L2 norm (raw): {raw_norm:.4f}')
print(f' L2 norm (normalized): {np.linalg.norm(norm_embedding):.4f}')
@@ -65,8 +66,8 @@ def compare_faces(detector, recognizer, image1_path: str, image2_path: str, thre
print('Error: No faces detected in one or both images')
return
landmarks1 = faces1[0]['landmarks']
landmarks2 = faces2[0]['landmarks']
landmarks1 = faces1[0].landmarks
landmarks2 = faces2[0].landmarks
embedding1 = recognizer.get_normalized_embedding(img1, landmarks1)
embedding2 = recognizer.get_normalized_embedding(img2, landmarks2)


@@ -2,11 +2,14 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Real-time face search: match faces against a reference image.
"""Single-reference face search on video or webcam.
Given a reference face image, detects faces in the source and shows
whether each face matches the reference.
Usage:
python tools/face_search.py --reference person.jpg --source 0 # webcam
python tools/face_search.py --reference person.jpg --source video.mp4
python tools/search.py --reference ref.jpg --source video.mp4
python tools/search.py --reference ref.jpg --source 0 # webcam
"""
from __future__ import annotations
@@ -15,43 +18,16 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
import numpy as np
from uniface.detection import SCRFD, RetinaFace
from uniface import create_detector, create_recognizer
from uniface.draw import draw_corner_bbox, draw_text_label
from uniface.face_utils import compute_similarity
from uniface.recognition import ArcFace, MobileFace, SphereFace
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
def get_recognizer(name: str):
"""Get recognizer by name."""
if name == 'arcface':
return ArcFace()
elif name == 'mobileface':
return MobileFace()
else:
return SphereFace()
def extract_reference_embedding(detector, recognizer, image_path: str) -> np.ndarray:
"""Extract embedding from reference image."""
image = cv2.imread(image_path)
if image is None:
raise RuntimeError(f'Failed to load image: {image_path}')
@@ -60,33 +36,34 @@ def extract_reference_embedding(detector, recognizer, image_path: str) -> np.nda
if not faces:
raise RuntimeError('No faces found in reference image.')
landmarks = faces[0].landmarks
return recognizer.get_normalized_embedding(image, landmarks)
return recognizer.get_normalized_embedding(image, faces[0].landmarks)
def _draw_face(image, bbox, text: str, color: tuple[int, int, int]) -> None:
x1, y1, x2, y2 = map(int, bbox[:4])
thickness = max(round(sum(image.shape[:2]) / 2 * 0.003), 2)
font_scale = max(0.4, min(0.7, (y2 - y1) / 200))
draw_corner_bbox(image, (x1, y1, x2, y2), color=color, thickness=thickness)
draw_text_label(image, text, x1, y1, bg_color=color, font_scale=font_scale)
def process_frame(frame, detector, recognizer, ref_embedding: np.ndarray, threshold: float = 0.4):
"""Process a single frame and return annotated frame."""
faces = detector.detect(frame)
for face in faces:
bbox = face.bbox
landmarks = face.landmarks
x1, y1, x2, y2 = map(int, bbox)
embedding = recognizer.get_normalized_embedding(frame, landmarks)
embedding = recognizer.get_normalized_embedding(frame, face.landmarks)
sim = compute_similarity(ref_embedding, embedding)
label = f'Match ({sim:.2f})' if sim > threshold else f'Unknown ({sim:.2f})'
text = f'Match ({sim:.2f})' if sim > threshold else f'Unknown ({sim:.2f})'
color = (0, 255, 0) if sim > threshold else (0, 0, 255)
cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)
cv2.putText(frame, label, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
_draw_face(frame, face.bbox, text, color)
return frame
def process_video(detector, recognizer, ref_embedding: np.ndarray, video_path: str, save_dir: str, threshold: float):
"""Process a video file."""
def process_video(
detector, recognizer, video_path: str, save_dir: str, ref_embedding: np.ndarray, threshold: float = 0.4
):
cap = cv2.VideoCapture(video_path)
if not cap.isOpened():
print(f"Error: Cannot open video file '{video_path}'")
@@ -123,7 +100,6 @@ def process_video(detector, recognizer, ref_embedding: np.ndarray, video_path: s
def run_camera(detector, recognizer, ref_embedding: np.ndarray, camera_id: int = 0, threshold: float = 0.4):
"""Run real-time face search on webcam."""
cap = cv2.VideoCapture(camera_id)
if not cap.isOpened():
print(f'Cannot open camera {camera_id}')
@@ -133,13 +109,13 @@ def run_camera(detector, recognizer, ref_embedding: np.ndarray, camera_id: int =
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
frame = cv2.flip(frame, 1)
frame = process_frame(frame, detector, recognizer, ref_embedding, threshold)
cv2.imshow('Face Recognition', frame)
cv2.imshow('Face Search', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
@@ -148,17 +124,10 @@ def run_camera(detector, recognizer, ref_embedding: np.ndarray, camera_id: int =
def main():
parser = argparse.ArgumentParser(description='Face search using a reference image')
parser = argparse.ArgumentParser(description='Single-reference face search')
parser.add_argument('--reference', type=str, required=True, help='Reference face image')
parser.add_argument('--source', type=str, required=True, help='Video path or camera ID (0, 1, ...)')
parser.add_argument('--threshold', type=float, default=0.4, help='Match threshold')
parser.add_argument('--detector', type=str, default='scrfd', choices=['retinaface', 'scrfd'])
parser.add_argument(
'--recognizer',
type=str,
default='arcface',
choices=['arcface', 'mobileface', 'sphereface'],
)
parser.add_argument('--source', type=str, required=True, help='Video path or camera ID')
parser.add_argument('--threshold', type=float, default=0.4, help='Similarity threshold')
parser.add_argument('--save-dir', type=str, default='outputs', help='Output directory')
args = parser.parse_args()
@@ -166,8 +135,8 @@ def main():
print(f'Error: Reference image not found: {args.reference}')
return
detector = RetinaFace() if args.detector == 'retinaface' else SCRFD()
recognizer = get_recognizer(args.recognizer)
detector = create_detector()
recognizer = create_recognizer()
print(f'Loading reference: {args.reference}')
ref_embedding = extract_reference_embedding(detector, recognizer, args.reference)
@@ -180,10 +149,9 @@ def main():
if not os.path.exists(args.source):
print(f'Error: Video not found: {args.source}')
return
process_video(detector, recognizer, ref_embedding, args.source, args.save_dir, args.threshold)
process_video(detector, recognizer, args.source, args.save_dir, ref_embedding, args.threshold)
else:
print(f"Error: Source must be a video file or camera ID, not '{args.source}'")
print('Supported formats: videos (.mp4, .avi, ...) or camera ID (0, 1, ...)')
if __name__ == '__main__':


@@ -16,30 +16,14 @@ import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
import numpy as np
from uniface import RetinaFace
from uniface.constants import MiniFASNetWeights
from uniface.detection import RetinaFace
from uniface.spoofing import create_spoofer
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
def draw_spoofing_result(
image: np.ndarray,

tools/track.py (new file, 199 lines)

@@ -0,0 +1,199 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Face tracking on video files using ByteTrack.
Usage:
python tools/track.py --source video.mp4
python tools/track.py --source video.mp4 --output outputs/tracked.mp4
python tools/track.py --source 0 # webcam
"""
from __future__ import annotations
import argparse
import os
from pathlib import Path
from _common import VIDEO_EXTENSIONS
import cv2
import numpy as np
from tqdm import tqdm
from uniface.common import xyxy_to_cxcywh
from uniface.detection import SCRFD, RetinaFace
from uniface.draw import draw_tracks
from uniface.tracking import BYTETracker
def _assign_track_ids(faces, tracks) -> list:
"""Match tracker outputs back to Face objects by center distance."""
if len(tracks) == 0 or len(faces) == 0:
return []
face_bboxes = np.array([f.bbox for f in faces], dtype=np.float32)
track_ids = tracks[:, 4].astype(int)
face_centers = xyxy_to_cxcywh(face_bboxes)[:, :2] # (N, 2) -> [cx, cy]
track_centers = xyxy_to_cxcywh(tracks[:, :4])[:, :2] # (M, 2) -> [cx, cy]
for ti in range(len(tracks)):
dists = (track_centers[ti, 0] - face_centers[:, 0]) ** 2 + (track_centers[ti, 1] - face_centers[:, 1]) ** 2
faces[int(np.argmin(dists))].track_id = track_ids[ti]
return [f for f in faces if f.track_id is not None]
def process_video(
detector,
tracker: BYTETracker,
input_path: str,
output_path: str,
threshold: float = 0.5,
show_preview: bool = False,
):
"""Process a video file with face tracking."""
cap = cv2.VideoCapture(input_path)
if not cap.isOpened():
print(f"Error: Cannot open video file '{input_path}'")
return
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
print(f'Input: {input_path} ({width}x{height}, {fps:.1f} fps, {total_frames} frames)')
print(f'Output: {output_path}')
Path(output_path).parent.mkdir(parents=True, exist_ok=True)
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
if not out.isOpened():
print(f"Error: Cannot create output video '{output_path}'")
cap.release()
return
frame_count = 0
total_tracks = 0
for _ in tqdm(range(total_frames), desc='Tracking', unit='frames'):
ret, frame = cap.read()
if not ret:
break
frame_count += 1
# Detect faces
faces = detector.detect(frame)
dets = np.array([[*f.bbox, f.confidence] for f in faces if f.confidence >= threshold])
dets = dets if len(dets) > 0 else np.empty((0, 5))
# Update tracker
tracks = tracker.update(dets)
tracked_faces = _assign_track_ids(faces, tracks)
total_tracks += len(tracked_faces)
# Draw tracked faces
draw_tracks(image=frame, faces=tracked_faces)
cv2.putText(frame, f'Tracks: {len(tracked_faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
out.write(frame)
if show_preview:
cv2.imshow("Tracking - Press 'q' to cancel", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
print('\nCancelled by user')
break
cap.release()
out.release()
if show_preview:
cv2.destroyAllWindows()
avg_tracks = total_tracks / frame_count if frame_count > 0 else 0
print(f'\nDone! {frame_count} frames, {total_tracks} tracks ({avg_tracks:.1f} avg/frame)')
print(f'Saved: {output_path}')
def run_camera(
detector,
tracker: BYTETracker,
camera_id: int = 0,
threshold: float = 0.5,
):
"""Run real-time face tracking on webcam."""
cap = cv2.VideoCapture(camera_id)
if not cap.isOpened():
print(f'Cannot open camera {camera_id}')
return
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
if not ret:
break
frame = cv2.flip(frame, 1)
# Detect faces
faces = detector.detect(frame)
dets = np.array([[*f.bbox, f.confidence] for f in faces if f.confidence >= threshold])
dets = dets if len(dets) > 0 else np.empty((0, 5))
# Update tracker
tracks = tracker.update(dets)
tracked_faces = _assign_track_ids(faces, tracks)
# Draw tracked faces
draw_tracks(image=frame, faces=tracked_faces)
cv2.putText(frame, f'Tracks: {len(tracked_faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.imshow('Face Tracking', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
def main():
parser = argparse.ArgumentParser(description='Face tracking on video using ByteTrack')
parser.add_argument('--source', type=str, required=True, help='Video path or camera ID (0, 1, ...)')
parser.add_argument('--output', type=str, default=None, help='Output video path')
parser.add_argument('--detector', type=str, default='scrfd', choices=['retinaface', 'scrfd'])
parser.add_argument('--threshold', type=float, default=0.5, help='Detection confidence threshold')
parser.add_argument('--track-buffer', type=int, default=30, help='Max frames to keep lost tracks')
parser.add_argument('--preview', action='store_true', help='Show live preview')
parser.add_argument('--save-dir', type=str, default='outputs', help='Output directory')
args = parser.parse_args()
detector = RetinaFace() if args.detector == 'retinaface' else SCRFD()
tracker = BYTETracker(track_thresh=args.threshold, track_buffer=args.track_buffer)
if args.source.isdigit():
run_camera(detector, tracker, int(args.source), args.threshold)
else:
if not os.path.exists(args.source):
print(f'Error: Video not found: {args.source}')
return
ext = Path(args.source).suffix.lower()
if ext not in VIDEO_EXTENSIONS:
print(f"Error: Unsupported format '{ext}'. Supported: {VIDEO_EXTENSIONS}")
return
if args.output:
output_path = args.output
else:
os.makedirs(args.save_dir, exist_ok=True)
output_path = os.path.join(args.save_dir, f'{Path(args.source).stem}_tracked.mp4')
process_video(detector, tracker, args.source, output_path, args.threshold, args.preview)
if __name__ == '__main__':
main()

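For quick experimentation outside the CLI, the detect-filter-track pattern from the loop above can be condensed to a single frame. The sketch below uses an illustrative 'frame.jpg' input and omits the drawing and track-ID-assignment helpers defined earlier in this tool.

import cv2
import numpy as np
from uniface import SCRFD, BYTETracker

detector = SCRFD()
tracker = BYTETracker(track_thresh=0.5, track_buffer=30)

frame = cv2.imread('frame.jpg')  # illustrative input; any BGR frame works
faces = detector.detect(frame)

# Keep only confident detections as an (N, 5) array of [x1, y1, x2, y2, score]
dets = np.array([[*f.bbox, f.confidence] for f in faces if f.confidence >= 0.5])
dets = dets if len(dets) > 0 else np.empty((0, 5))

tracks = tracker.update(dets)
print(f'{len(faces)} detections -> {len(tracks)} active tracks')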

@@ -1,180 +0,0 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Face detection on video files with progress tracking.
Usage:
python tools/video_detection.py --source video.mp4
python tools/video_detection.py --source video.mp4 --output output.mp4
python tools/video_detection.py --source 0 # webcam
"""
from __future__ import annotations
import argparse
import os
from pathlib import Path
import cv2
from tqdm import tqdm
from uniface import SCRFD, RetinaFace
from uniface.visualization import draw_detections
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
def process_video(
detector,
input_path: str,
output_path: str,
threshold: float = 0.6,
show_preview: bool = False,
):
"""Process a video file with progress bar."""
cap = cv2.VideoCapture(input_path)
if not cap.isOpened():
print(f"Error: Cannot open video file '{input_path}'")
return
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
print(f'Input: {input_path} ({width}x{height}, {fps:.1f} fps, {total_frames} frames)')
print(f'Output: {output_path}')
Path(output_path).parent.mkdir(parents=True, exist_ok=True)
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
if not out.isOpened():
print(f"Error: Cannot create output video '{output_path}'")
cap.release()
return
frame_count = 0
total_faces = 0
for _ in tqdm(range(total_frames), desc='Processing', unit='frames'):
ret, frame = cap.read()
if not ret:
break
frame_count += 1
faces = detector.detect(frame)
total_faces += len(faces)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
out.write(frame)
if show_preview:
cv2.imshow("Processing - Press 'q' to cancel", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
print('\nCancelled by user')
break
cap.release()
out.release()
if show_preview:
cv2.destroyAllWindows()
avg_faces = total_faces / frame_count if frame_count > 0 else 0
print(f'\nDone! {frame_count} frames, {total_faces} faces ({avg_faces:.1f} avg/frame)')
print(f'Saved: {output_path}')
def run_camera(detector, camera_id: int = 0, threshold: float = 0.6):
"""Run real-time detection on webcam."""
cap = cv2.VideoCapture(camera_id)
if not cap.isOpened():
print(f'Cannot open camera {camera_id}')
return
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
faces = detector.detect(frame)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.imshow('Face Detection', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
def main():
parser = argparse.ArgumentParser(description='Process video with face detection')
parser.add_argument('--source', type=str, required=True, help='Video path or camera ID (0, 1, ...)')
parser.add_argument('--output', type=str, default=None, help='Output video path (auto-generated if not specified)')
parser.add_argument('--detector', type=str, default='retinaface', choices=['retinaface', 'scrfd'])
parser.add_argument('--threshold', type=float, default=0.6, help='Visualization threshold')
parser.add_argument('--preview', action='store_true', help='Show live preview')
parser.add_argument('--save-dir', type=str, default='outputs', help='Output directory (if --output not specified)')
args = parser.parse_args()
detector = RetinaFace() if args.detector == 'retinaface' else SCRFD()
source_type = get_source_type(args.source)
if source_type == 'camera':
run_camera(detector, int(args.source), args.threshold)
elif source_type == 'video':
if not os.path.exists(args.source):
print(f'Error: Video not found: {args.source}')
return
# Determine output path
if args.output:
output_path = args.output
else:
os.makedirs(args.save_dir, exist_ok=True)
output_path = os.path.join(args.save_dir, f'{Path(args.source).stem}_detected.mp4')
process_video(detector, args.source, output_path, args.threshold, args.preview)
else:
print(f"Error: Unknown source type for '{args.source}'")
print('Supported formats: videos (.mp4, .avi, ...) or camera ID (0, 1, ...)')
if __name__ == '__main__':
main()

tools/xseg.py Normal file

@@ -0,0 +1,235 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""XSeg face segmentation on detected faces.
Usage:
python tools/xseg.py --source path/to/image.jpg
python tools/xseg.py --source path/to/video.mp4
python tools/xseg.py --source 0 # webcam
"""
from __future__ import annotations
import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
import numpy as np
from uniface.detection import RetinaFace
from uniface.parsing import XSeg
def apply_mask_visualization(image: np.ndarray, mask: np.ndarray, alpha: float = 0.5) -> np.ndarray:
"""Apply colored mask overlay for visualization."""
overlay = image.copy().astype(np.float32)
mask_3ch = np.stack([mask * 0.3, mask * 0.7, mask * 0.3], axis=-1)
overlay = overlay * (1 - mask[..., None] * alpha) + mask_3ch * 255 * alpha
return overlay.clip(0, 255).astype(np.uint8)
def process_image(
detector: RetinaFace,
parser: XSeg,
image_path: str,
save_dir: str = 'outputs',
) -> None:
"""Process a single image."""
image = cv2.imread(image_path)
if image is None:
print(f"Error: Failed to load image from '{image_path}'")
return
faces = detector.detect(image)
print(f'Detected {len(faces)} face(s)')
if len(faces) == 0:
print('No faces detected.')
return
# Accumulate masks from all faces
full_mask = np.zeros((image.shape[0], image.shape[1]), dtype=np.float32)
for i, face in enumerate(faces):
if face.landmarks is None:
print(f' Face {i + 1}: skipped (no landmarks)')
continue
mask = parser.parse(image, landmarks=face.landmarks)
full_mask = np.maximum(full_mask, mask)
print(f' Face {i + 1}: done')
# Apply visualization
result_image = apply_mask_visualization(image, full_mask)
# Draw bounding boxes
for face in faces:
x1, y1, x2, y2 = map(int, face.bbox[:4])
cv2.rectangle(result_image, (x1, y1), (x2, y2), (0, 255, 0), 2)
# Save results
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(image_path).stem}_xseg.jpg')
cv2.imwrite(output_path, result_image)
print(f'Output saved: {output_path}')
mask_path = os.path.join(save_dir, f'{Path(image_path).stem}_xseg_mask.png')
mask_uint8 = (full_mask * 255).astype(np.uint8)
cv2.imwrite(mask_path, mask_uint8)
print(f'Mask saved: {mask_path}')
def process_video(
detector: RetinaFace,
parser: XSeg,
video_path: str,
save_dir: str = 'outputs',
) -> None:
"""Process a video file."""
cap = cv2.VideoCapture(video_path)
if not cap.isOpened():
print(f"Error: Cannot open video file '{video_path}'")
return
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(video_path).stem}_xseg.mp4')
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
print(f'Processing video: {video_path} ({total_frames} frames)')
frame_count = 0
while True:
ret, frame = cap.read()
if not ret:
break
frame_count += 1
faces = detector.detect(frame)
# Accumulate masks from all faces
full_mask = np.zeros((frame.shape[0], frame.shape[1]), dtype=np.float32)
for face in faces:
if face.landmarks is None:
continue
mask = parser.parse(frame, landmarks=face.landmarks)
full_mask = np.maximum(full_mask, mask)
# Apply visualization
result_frame = apply_mask_visualization(frame, full_mask)
# Draw bounding boxes
for face in faces:
x1, y1, x2, y2 = map(int, face.bbox[:4])
cv2.rectangle(result_frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
cv2.putText(result_frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
out.write(result_frame)
if frame_count % 100 == 0:
print(f' Processed {frame_count}/{total_frames} frames...')
cap.release()
out.release()
print(f'Done! Output saved: {output_path}')
def run_camera(
detector: RetinaFace,
parser: XSeg,
camera_id: int = 0,
) -> None:
"""Run real-time detection on webcam."""
cap = cv2.VideoCapture(camera_id)
if not cap.isOpened():
print(f'Cannot open camera {camera_id}')
return
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)
# Accumulate masks from all faces
full_mask = np.zeros((frame.shape[0], frame.shape[1]), dtype=np.float32)
for face in faces:
if face.landmarks is None:
continue
mask = parser.parse(frame, landmarks=face.landmarks)
full_mask = np.maximum(full_mask, mask)
# Apply visualization
result_frame = apply_mask_visualization(frame, full_mask)
# Draw bounding boxes
for face in faces:
x1, y1, x2, y2 = map(int, face.bbox[:4])
cv2.rectangle(result_frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
cv2.putText(result_frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.imshow('XSeg Face Segmentation', result_frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
def main() -> None:
arg_parser = argparse.ArgumentParser(description='XSeg face segmentation')
arg_parser.add_argument('--source', type=str, required=True, help='Image/video path or camera ID (0, 1, ...)')
arg_parser.add_argument('--save-dir', type=str, default='outputs', help='Output directory')
arg_parser.add_argument(
'--blur',
type=float,
default=0,
help='Gaussian blur sigma for mask smoothing (default: 0 = raw)',
)
arg_parser.add_argument(
'--align-size',
type=int,
default=256,
help='Face alignment size (default: 256)',
)
args = arg_parser.parse_args()
# Initialize models
detector = RetinaFace()
parser = XSeg(blur_sigma=args.blur, align_size=args.align_size)
source_type = get_source_type(args.source)
if source_type == 'camera':
run_camera(detector, parser, int(args.source))
elif source_type == 'image':
if not os.path.exists(args.source):
print(f'Error: Image not found: {args.source}')
return
process_image(detector, parser, args.source, args.save_dir)
elif source_type == 'video':
if not os.path.exists(args.source):
print(f'Error: Video not found: {args.source}')
return
process_video(detector, parser, args.source, args.save_dir)
else:
print(f"Error: Unknown source type for '{args.source}'")
print('Supported formats: images (.jpg, .png, ...), videos (.mp4, .avi, ...), or camera ID (0, 1, ...)')
if __name__ == '__main__':
main()

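The core XSeg calls used by this tool can also be driven directly. Below is a minimal single-image sketch using the same constructor arguments and parse() call as above; the 'face.jpg' path is illustrative, and the mask is assumed to be a float array in the 0-1 range, as implied by the *255 scaling used when saving.

import cv2
import numpy as np
from uniface.detection import RetinaFace
from uniface.parsing import XSeg

detector = RetinaFace()
parser = XSeg(blur_sigma=0.0, align_size=256)

image = cv2.imread('face.jpg')  # illustrative path
full_mask = np.zeros(image.shape[:2], dtype=np.float32)
for face in detector.detect(image):
    if face.landmarks is None:
        continue  # XSeg needs landmarks for alignment
    full_mask = np.maximum(full_mask, parser.parse(image, landmarks=face.landmarks))

cv2.imwrite('face_mask.png', (full_mask * 255).astype(np.uint8))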

@@ -16,9 +16,11 @@
This library provides unified APIs for:
- Face detection (RetinaFace, SCRFD, YOLOv5Face, YOLOv8Face)
- Face recognition (AdaFace, ArcFace, MobileFace, SphereFace)
- Face tracking (ByteTrack with Kalman filtering)
- Facial landmarks (106-point detection)
- Face parsing (semantic segmentation)
- Gaze estimation
- Head pose estimation
- Age, gender, and emotion prediction
- Face anti-spoofing
- Privacy/anonymization
@@ -28,38 +30,37 @@ from __future__ import annotations
__license__ = 'MIT'
__author__ = 'Yakhyokhuja Valikhujaev'
__version__ = '2.2.1'
__version__ = '3.2.0'
import contextlib
from uniface.face_utils import compute_similarity, face_alignment
from uniface.log import Logger, enable_logging
from uniface.model_store import verify_model_weights
from uniface.visualization import draw_detections, vis_parsing_maps
from uniface.model_store import download_models, get_cache_dir, set_cache_dir, verify_model_weights
from .analyzer import FaceAnalyzer
from .attribute import AgeGender, FairFace
from .attribute import AgeGender, Emotion, FairFace, create_attribute_predictor
from .detection import (
SCRFD,
RetinaFace,
YOLOv5Face,
YOLOv8Face,
create_detector,
detect_faces,
list_available_detectors,
)
from .gaze import MobileGaze, create_gaze_estimator
from .headpose import HeadPose, create_head_pose_estimator
from .landmark import Landmark106, create_landmarker
from .parsing import BiSeNet, create_face_parser
from .privacy import BlurFace, anonymize_faces
from .parsing import BiSeNet, XSeg, create_face_parser
from .privacy import BlurFace
from .recognition import AdaFace, ArcFace, MobileFace, SphereFace, create_recognizer
from .spoofing import MiniFASNet, create_spoofer
from .types import AttributeResult, EmotionResult, Face, GazeResult, SpoofingResult
from .tracking import BYTETracker
from .types import AttributeResult, EmotionResult, Face, GazeResult, HeadPoseResult, SpoofingResult
# Optional: Emotion requires PyTorch
Emotion: type | None
try:
from .attribute import Emotion
except ImportError:
Emotion = None
# Optional: FAISS vector store (requires `pip install faiss-cpu`)
with contextlib.suppress(ImportError):
from .indexing import FAISS
__all__ = [
# Metadata
@@ -73,10 +74,10 @@ __all__ = [
'create_detector',
'create_face_parser',
'create_gaze_estimator',
'create_head_pose_estimator',
'create_landmarker',
'create_recognizer',
'create_spoofer',
'detect_faces',
'list_available_detectors',
# Detection models
'RetinaFace',
@@ -93,26 +94,35 @@ __all__ = [
# Gaze models
'GazeResult',
'MobileGaze',
# Head pose models
'HeadPose',
'HeadPoseResult',
# Parsing models
'BiSeNet',
'XSeg',
# Attribute models
'AgeGender',
'AttributeResult',
'create_attribute_predictor',
'Emotion',
'EmotionResult',
'FairFace',
# Spoofing models
'MiniFASNet',
'SpoofingResult',
# Tracking
'BYTETracker',
# Privacy
'BlurFace',
'anonymize_faces',
# Indexing (optional)
'FAISS',
# Utilities
'Logger',
'compute_similarity',
'draw_detections',
'download_models',
'enable_logging',
'face_alignment',
'get_cache_dir',
'set_cache_dir',
'verify_model_weights',
'vis_parsing_maps',
]

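A quick sketch exercising the expanded top-level API after this change. The cache-dir path is illustrative, and the set_cache_dir/get_cache_dir signatures are assumed from their names rather than shown in this hunk.

import uniface
from uniface import BYTETracker, HeadPose, XSeg, get_cache_dir, set_cache_dir

print(uniface.__version__)            # '3.2.0' after this release

# Assumed signatures: set_cache_dir(path), get_cache_dir() -> path
set_cache_dir('/tmp/uniface-cache')   # illustrative location
print(get_cache_dir())

# FAISS is only exported when faiss is installed, so guard the import.
try:
    from uniface import FAISS
except ImportError:
    FAISS = None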

@@ -6,8 +6,7 @@ from __future__ import annotations
import numpy as np
from uniface.attribute.age_gender import AgeGender
from uniface.attribute.fairface import FairFace
from uniface.attribute.base import Attribute
from uniface.detection.base import BaseDetector
from uniface.log import Logger
from uniface.recognition.base import BaseRecognizer
@@ -21,19 +20,24 @@ class FaceAnalyzer:
This class provides a high-level interface for face analysis by combining
multiple components: face detection, recognition (embedding extraction),
and attribute prediction (age, gender, race).
and an extensible list of attribute predictors (age, gender, race,
emotion, etc.).
Any :class:`~uniface.attribute.base.Attribute` subclass can be passed
via the ``attributes`` list. Each predictor's ``predict(image, face)``
is called once per detected face, enriching the :class:`Face` in-place.
Args:
detector: Face detector instance for detecting faces in images.
recognizer: Optional face recognizer for extracting embeddings.
age_gender: Optional age/gender predictor.
fairface: Optional FairFace predictor for demographics.
attributes: Optional list of ``Attribute`` predictors to run on
each detected face (e.g. ``[AgeGender(), FairFace(), Emotion()]``).
Example:
>>> from uniface import RetinaFace, ArcFace, FaceAnalyzer
>>> from uniface import RetinaFace, ArcFace, AgeGender, FaceAnalyzer
>>> detector = RetinaFace()
>>> recognizer = ArcFace()
>>> analyzer = FaceAnalyzer(detector, recognizer=recognizer)
>>> analyzer = FaceAnalyzer(detector, recognizer=recognizer, attributes=[AgeGender()])
>>> faces = analyzer.analyze(image)
"""
@@ -41,27 +45,23 @@ class FaceAnalyzer:
self,
detector: BaseDetector,
recognizer: BaseRecognizer | None = None,
age_gender: AgeGender | None = None,
fairface: FairFace | None = None,
attributes: list[Attribute] | None = None,
) -> None:
self.detector = detector
self.recognizer = recognizer
self.age_gender = age_gender
self.fairface = fairface
self.attributes: list[Attribute] = attributes or []
Logger.info(f'Initialized FaceAnalyzer with detector={detector.__class__.__name__}')
if recognizer:
Logger.info(f' - Recognition enabled: {recognizer.__class__.__name__}')
if age_gender:
Logger.info(f' - Age/Gender enabled: {age_gender.__class__.__name__}')
if fairface:
Logger.info(f' - FairFace enabled: {fairface.__class__.__name__}')
for attr in self.attributes:
Logger.info(f' - Attribute enabled: {attr.__class__.__name__}')
def analyze(self, image: np.ndarray) -> list[Face]:
"""Analyze faces in an image.
Performs face detection and optionally extracts embeddings and
predicts attributes for each detected face.
Performs face detection, optionally extracts embeddings, and runs
every registered attribute predictor on each detected face.
Args:
image: Input image as numpy array with shape (H, W, C) in BGR format.
@@ -80,24 +80,13 @@ class FaceAnalyzer:
except Exception as e:
Logger.warning(f' Face {idx + 1}: Failed to extract embedding: {e}')
if self.age_gender is not None:
for attr in self.attributes:
attr_name = attr.__class__.__name__
try:
result = self.age_gender.predict(image, face.bbox)
face.gender = result.gender
face.age = result.age
Logger.debug(f' Face {idx + 1}: Age={face.age}, Gender={face.sex}')
attr.predict(image, face)
Logger.debug(f' Face {idx + 1}: {attr_name} prediction succeeded')
except Exception as e:
Logger.warning(f' Face {idx + 1}: Failed to predict age/gender: {e}')
if self.fairface is not None:
try:
result = self.fairface.predict(image, face.bbox)
face.gender = result.gender
face.age_group = result.age_group
face.race = result.race
Logger.debug(f' Face {idx + 1}: AgeGroup={face.age_group}, Gender={face.sex}, Race={face.race}')
except Exception as e:
Logger.warning(f' Face {idx + 1}: Failed to predict FairFace attributes: {e}')
Logger.warning(f' Face {idx + 1}: {attr_name} prediction failed: {e}')
Logger.info(f'Analysis complete: {len(faces)} face(s) processed')
return faces
@@ -106,8 +95,6 @@ class FaceAnalyzer:
parts = [f'FaceAnalyzer(detector={self.detector.__class__.__name__}']
if self.recognizer:
parts.append(f'recognizer={self.recognizer.__class__.__name__}')
if self.age_gender:
parts.append(f'age_gender={self.age_gender.__class__.__name__}')
if self.fairface:
parts.append(f'fairface={self.fairface.__class__.__name__}')
for attr in self.attributes:
parts.append(f'{attr.__class__.__name__}')
return ', '.join(parts) + ')'

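Expanding slightly on the docstring example, here is a hedged end-to-end sketch of the new attributes-list workflow; the image path is illustrative, and FairFace() or Emotion() could be appended to the list in the same way.

import cv2
from uniface import AgeGender, ArcFace, FaceAnalyzer, RetinaFace

analyzer = FaceAnalyzer(RetinaFace(), recognizer=ArcFace(), attributes=[AgeGender()])
image = cv2.imread('group.jpg')  # illustrative path

for face in analyzer.analyze(image):
    # Each predictor in `attributes` enriches the Face in-place.
    print(face.bbox, face.age, face.gender)  # gender: 0 = female, 1 = male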

@@ -14,16 +14,25 @@ from uniface.attribute.fairface import FairFace
from uniface.constants import AgeGenderWeights, DDAMFNWeights, FairFaceWeights
from uniface.types import AttributeResult, EmotionResult, Face
# Emotion requires PyTorch - make it optional
try:
from uniface.attribute.emotion import Emotion
_EMOTION_AVAILABLE = True
except ImportError:
Emotion = None
_EMOTION_AVAILABLE = False
# Public API for the attribute module
class Emotion(Attribute): # type: ignore[no-redef]
"""Stub for Emotion when PyTorch is not installed."""
def __init__(self, *args: Any, **kwargs: Any) -> None:
raise ImportError("Emotion requires optional dependency 'torch'. Install with: pip install torch")
def _initialize_model(self) -> None: ...
def preprocess(self, image: np.ndarray, *args: Any) -> Any: ...
def postprocess(self, prediction: Any) -> Any: ...
def predict(self, image: np.ndarray, face: Face) -> Any: ...
__all__ = [
'AgeGender',
'AttributeResult',
@@ -31,16 +40,13 @@ __all__ = [
'EmotionResult',
'FairFace',
'create_attribute_predictor',
'predict_attributes',
]
# A mapping from model enums to their corresponding attribute classes
_ATTRIBUTE_MODELS = {
**dict.fromkeys(AgeGenderWeights, AgeGender),
**dict.fromkeys(FairFaceWeights, FairFace),
}
# Add Emotion models only if PyTorch is available
if _EMOTION_AVAILABLE:
_ATTRIBUTE_MODELS.update(dict.fromkeys(DDAMFNWeights, Emotion))
@@ -48,21 +54,16 @@ if _EMOTION_AVAILABLE:
def create_attribute_predictor(
model_name: AgeGenderWeights | DDAMFNWeights | FairFaceWeights, **kwargs: Any
) -> Attribute:
"""
Factory function to create an attribute predictor instance.
This high-level API simplifies the creation of attribute models by
dynamically selecting the correct class based on the provided model enum.
"""Factory function to create an attribute predictor instance.
Args:
model_name: The enum corresponding to the desired attribute model
(e.g., AgeGenderWeights.DEFAULT, DDAMFNWeights.AFFECNET7,
or FairFaceWeights.DEFAULT).
**kwargs: Additional keyword arguments to pass to the model's constructor.
(e.g., AgeGenderWeights.DEFAULT, DDAMFNWeights.AFFECNET7,
or FairFaceWeights.DEFAULT).
**kwargs: Additional keyword arguments passed to the model constructor.
Returns:
An initialized instance of an Attribute predictor class
(e.g., AgeGender, FairFace, or Emotion).
An initialized Attribute predictor (AgeGender, FairFace, or Emotion).
Raises:
ValueError: If the provided model_name is not a supported enum.
@@ -75,40 +76,4 @@ def create_attribute_predictor(
f'Please choose from AgeGenderWeights, FairFaceWeights, or DDAMFNWeights.'
)
# Pass model_name to the constructor, as some classes might need it
return model_class(model_name=model_name, **kwargs)
def predict_attributes(image: np.ndarray, faces: list[Face], predictor: Attribute) -> list[Face]:
"""
High-level API to predict attributes for multiple detected faces.
This function iterates through a list of Face objects, runs the
specified attribute predictor on each one, and updates the Face
objects with the predicted attributes.
Args:
image (np.ndarray): The full input image in BGR format.
faces (List[Face]): A list of Face objects from face detection.
predictor (Attribute): An initialized attribute predictor instance,
created by `create_attribute_predictor`.
Returns:
List[Face]: The list of Face objects with updated attribute fields.
"""
for face in faces:
if isinstance(predictor, AgeGender):
result = predictor(image, face.bbox)
face.gender = result.gender
face.age = result.age
elif isinstance(predictor, FairFace):
result = predictor(image, face.bbox)
face.gender = result.gender
face.age_group = result.age_group
face.race = result.race
elif isinstance(predictor, Emotion):
result = predictor(image, face.landmarks)
face.emotion = result.emotion
face.emotion_confidence = result.confidence
return faces

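With predict_attributes removed, the per-face loop now calls the predictor directly. A sketch of the factory-based replacement, assuming an illustrative image path:

import cv2
from uniface import RetinaFace
from uniface.attribute import create_attribute_predictor
from uniface.constants import AgeGenderWeights

detector = RetinaFace()
predictor = create_attribute_predictor(AgeGenderWeights.DEFAULT)

image = cv2.imread('portrait.jpg')  # illustrative path
for face in detector.detect(image):
    result = predictor.predict(image, face)  # also writes face.age / face.gender
    print(result.age, result.gender)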

@@ -12,7 +12,7 @@ from uniface.face_utils import bbox_center_alignment
from uniface.log import Logger
from uniface.model_store import verify_model_weights
from uniface.onnx_utils import create_onnx_session
from uniface.types import AttributeResult
from uniface.types import AttributeResult, Face
__all__ = ['AgeGender']
@@ -133,17 +133,20 @@ class AgeGender(Attribute):
age = int(np.round(prediction[2] * 100))
return AttributeResult(gender=gender, age=age)
def predict(self, image: np.ndarray, bbox: list | np.ndarray) -> AttributeResult:
"""
Predicts age and gender for a single face specified by a bounding box.
def predict(self, image: np.ndarray, face: Face) -> AttributeResult:
"""Predict age and gender and enrich the Face in-place.
Args:
image (np.ndarray): The full input image in BGR format.
bbox (Union[List, np.ndarray]): The face bounding box coordinates [x1, y1, x2, y2].
image: The full input image in BGR format.
face: Detected face; ``face.bbox`` is used for alignment.
Returns:
AttributeResult: Result containing gender (0=Female, 1=Male) and age (in years).
``AttributeResult`` with gender (0=Female, 1=Male) and age (years).
"""
face_blob = self.preprocess(image, bbox)
face_blob = self.preprocess(image, face.bbox)
prediction = self.session.run(self.output_names, {self.input_name: face_blob})[0][0]
return self.postprocess(prediction)
result = self.postprocess(prediction)
face.gender = result.gender
face.age = result.age
return result


@@ -2,95 +2,78 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
from abc import ABC, abstractmethod
from typing import Any
import numpy as np
from uniface.types import AttributeResult, EmotionResult
from uniface.types import AttributeResult, EmotionResult, Face
__all__ = ['Attribute', 'AttributeResult', 'EmotionResult']
class Attribute(ABC):
"""
Abstract base class for face attribute models.
"""Abstract base class for face attribute models.
This class defines the common interface that all attribute models
(e.g., age-gender, emotion) must implement. It ensures a consistent API
across different attribute prediction modules in the library, making them
interchangeable and easy to use.
All attribute models (age-gender, emotion, FairFace, etc.) implement this
interface so they can be used interchangeably inside ``FaceAnalyzer``.
The ``predict`` method accepts an image and a :class:`Face` object. Each
subclass extracts what it needs (bbox, landmarks) from the Face, runs
inference, writes the results back to the Face **and** returns a typed
result dataclass.
"""
@abstractmethod
def _initialize_model(self) -> None:
"""
Initializes the underlying model for inference.
This method should handle loading model weights, creating the
inference session (e.g., ONNX Runtime, PyTorch), and any necessary
warm-up procedures to prepare the model for prediction.
"""
"""Load model weights and create the inference session."""
raise NotImplementedError('Subclasses must implement the _initialize_model method.')
@abstractmethod
def preprocess(self, image: np.ndarray, *args: Any) -> Any:
"""
Preprocesses the input data for the model.
This method should take a raw image and any other necessary data
(like bounding boxes or landmarks) and convert it into the format
expected by the model's inference engine (e.g., a blob or tensor).
"""Preprocess the input data for the model.
Args:
image (np.ndarray): The input image containing the face, typically
in BGR format.
*args: Additional arguments required for preprocessing, such as
bounding boxes or facial landmarks.
image: The input image in BGR format.
*args: Subclass-specific data (bbox, landmarks, etc.).
Returns:
The preprocessed data ready for model inference.
Preprocessed data ready for model inference.
"""
raise NotImplementedError('Subclasses must implement the preprocess method.')
@abstractmethod
def postprocess(self, prediction: Any) -> Any:
"""
Postprocesses the raw model output into a human-readable format.
This method takes the raw output from the model's inference and
converts it into a meaningful result, such as an age value, a gender
label, or an emotion category.
"""Convert raw model output into a typed result dataclass.
Args:
prediction (Any): The raw output from the model's inference.
prediction: Raw output from the model.
Returns:
The final, processed attributes.
An ``AttributeResult`` or ``EmotionResult``.
"""
raise NotImplementedError('Subclasses must implement the postprocess method.')
@abstractmethod
def predict(self, image: np.ndarray, *args: Any) -> Any:
"""
Performs end-to-end attribute prediction on a given image.
def predict(self, image: np.ndarray, face: Face) -> AttributeResult | EmotionResult:
"""Run end-to-end prediction and enrich the Face in-place.
This method orchestrates the full pipeline: it calls the preprocess,
inference, and postprocess steps to return the final, user-friendly
attribute prediction.
Each subclass extracts what it needs from *face* (e.g. ``face.bbox``
or ``face.landmarks``), runs the full preprocess-infer-postprocess
pipeline, writes relevant fields back to *face*, and returns the
result dataclass.
Args:
image (np.ndarray): The input image containing the face.
*args: Additional data required for prediction, such as a bounding
box or landmarks.
image: The full input image in BGR format.
face: Detected face whose attribute fields will be populated.
Returns:
The final predicted attributes.
The prediction result (``AttributeResult`` or ``EmotionResult``).
"""
raise NotImplementedError('Subclasses must implement the predict method.')
def __call__(self, *args, **kwargs) -> Any:
"""
Provides a convenient, callable shortcut for the `predict` method.
"""
return self.predict(*args, **kwargs)
def __call__(self, image: np.ndarray, face: Face) -> AttributeResult | EmotionResult:
"""Callable shortcut for :meth:`predict`."""
return self.predict(image, face)

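To make the contract concrete, here is a toy subclass sketch. The constant prediction is a placeholder, and whether the base __init__ performs extra setup before _initialize_model is called is not shown in this hunk.

from typing import Any

import numpy as np

from uniface.attribute.base import Attribute
from uniface.types import AttributeResult, Face


class ConstantAgeGender(Attribute):
    """Toy predictor that always reports the same age and gender."""

    def _initialize_model(self) -> None:
        pass  # nothing to load for this placeholder

    def preprocess(self, image: np.ndarray, *args: Any) -> np.ndarray:
        return image  # a real model would crop/align using face.bbox or face.landmarks

    def postprocess(self, prediction: Any) -> AttributeResult:
        return AttributeResult(gender=1, age=30)

    def predict(self, image: np.ndarray, face: Face) -> AttributeResult:
        result = self.postprocess(self.preprocess(image))
        face.gender, face.age = result.gender, result.age  # enrich the Face in-place
        return result

An instance of such a subclass can then be passed to FaceAnalyzer via attributes=[...] alongside the built-in predictors.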
Some files were not shown because too many files have changed in this diff.