refactor: Standardize naming conventions (#47)

* refactor: Standardize naming conventions

* chore: Update the version and re-run experiments

* chore: Improve code quality tooling and documentation

- Add pre-commit job to CI workflow for automated linting on PRs
- Update uniface/__init__.py with copyright header, module docstring,
  and logically grouped exports
- Revise CONTRIBUTING.md to reflect pre-commit handles all formatting
- Remove redundant ruff check from CI (now handled by pre-commit)
- Update build job Python version to 3.11 (matches requires-python)
Yakhyokhuja Valikhujaev
2025-12-30 00:20:34 +09:00, committed by GitHub
parent 64ad0d2f53 · commit 50226041c9
72 changed files with 1200 additions and 774 deletions

View File

@@ -15,9 +15,20 @@ concurrency:
  cancel-in-progress: true

jobs:
  lint:
    runs-on: ubuntu-latest
    timeout-minutes: 5
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - uses: pre-commit/action@v3.0.1

  test:
    runs-on: ${{ matrix.os }}
    timeout-minutes: 15
    needs: lint
    strategy:
      fail-fast: false
@@ -44,9 +55,6 @@ jobs:
        run: |
          python -c "import onnxruntime as ort; print('Available providers:', ort.get_available_providers())"
      - name: Lint with ruff
        run: ruff check .
      - name: Run tests
        run: pytest -v --tb=short
@@ -65,7 +73,7 @@ jobs:
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.10"
          python-version: "3.11"
          cache: "pip"
      - name: Install build tools

View File

@@ -117,4 +117,3 @@ jobs:
        with:
          files: dist/*
          generate_release_notes: true

.pre-commit-config.yaml (new file, 40 lines)
View File

@@ -0,0 +1,40 @@
# Pre-commit configuration for UniFace
# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
  # General file checks
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-toml
      - id: check-added-large-files
        args: ['--maxkb=1000']
      - id: check-merge-conflict
      - id: debug-statements
      - id: check-ast

  # Ruff - Fast Python linter and formatter
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.8.4
    hooks:
      - id: ruff
        args: [--fix, --unsafe-fixes, --exit-non-zero-on-fix]
      - id: ruff-format

  # Security checks
  - repo: https://github.com/PyCQA/bandit
    rev: 1.7.10
    hooks:
      - id: bandit
        args: [-c, pyproject.toml]
        additional_dependencies: ['bandit[toml]']
        exclude: ^tests/

# Configuration
ci:
  autofix_commit_msg: 'style: auto-fix by pre-commit hooks'
  autoupdate_commit_msg: 'chore: update pre-commit hooks'

View File

@@ -16,33 +16,9 @@ Thank you for considering contributing to UniFace! We welcome contributions of a
2. Create a new branch for your feature
3. Write clear, documented code with type hints
4. Add tests for new functionality
5. Ensure all tests pass
5. Ensure all tests pass and pre-commit hooks are satisfied
6. Submit a pull request with a clear description
### Code Style
This project uses [Ruff](https://docs.astral.sh/ruff/) for linting and formatting.
```bash
# Check for linting errors
ruff check .

# Auto-fix linting errors
ruff check . --fix

# Format code
ruff format .
```
**Guidelines:**
- Follow PEP8 guidelines
- Use type hints (Python 3.10+)
- Write docstrings for public APIs
- Line length: 120 characters
- Keep code simple and readable
All PRs must pass `ruff check .` before merging.
## Development Setup
```bash
@@ -51,31 +27,164 @@ cd uniface
pip install -e ".[dev]"
```
### Setting Up Pre-commit Hooks
We use [pre-commit](https://pre-commit.com/) to ensure code quality and consistency. Install and configure it:
```bash
# Install pre-commit
pip install pre-commit

# Install the git hooks
pre-commit install

# (Optional) Run against all files
pre-commit run --all-files
```
Once installed, pre-commit will automatically run on every commit to check:
- Code formatting and linting (Ruff)
- Security issues (Bandit)
- General file hygiene (trailing whitespace, YAML/TOML validity, etc.)
**Note:** All PRs are automatically checked by CI. The merge button will only be available after all checks pass.
## Code Style
This project uses [Ruff](https://docs.astral.sh/ruff/) for linting and formatting, following modern Python best practices. Pre-commit handles all formatting automatically.
### Style Guidelines
#### General Rules
- **Line length:** 120 characters maximum
- **Python version:** 3.11+ (use modern syntax)
- **Quote style:** Single quotes for strings, double quotes for docstrings (illustrated in the snippet below)
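A minimal snippet illustrating these rules (a hypothetical function, not part of the library):

```python
# Hypothetical example; only the quote and length conventions matter here.
def greet(name: str, excited: bool = False) -> str:
    """Return a short greeting for ``name``."""  # docstrings: double quotes
    suffix = '!' if excited else '.'  # strings: single quotes
    return f'Hello, {name}{suffix}'
```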
#### Type Hints
Use modern Python 3.11+ type hints (PEP 585 and PEP 604):
```python
# Preferred (modern)
def process(items: list[str], config: dict[str, int] | None = None) -> tuple[int, str]:
    ...


# Avoid (legacy)
from typing import List, Dict, Optional, Tuple


def process(items: List[str], config: Optional[Dict[str, int]] = None) -> Tuple[int, str]:
    ...
```
#### Docstrings
Use [Google-style docstrings](https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings) for all public APIs:
```python
def detect_faces(image: np.ndarray, threshold: float = 0.5) -> list[Face]:
    """Detect faces in an image.

    Args:
        image: Input image as a numpy array with shape (H, W, C) in BGR format.
        threshold: Confidence threshold for filtering detections. Defaults to 0.5.

    Returns:
        List of Face objects containing bounding boxes, confidence scores,
        and facial landmarks.

    Raises:
        ValueError: If the input image has invalid dimensions.

    Example:
        >>> from uniface import detect_faces
        >>> faces = detect_faces(image, threshold=0.8)
        >>> print(f"Found {len(faces)} faces")
    """
```
#### Import Order
Imports are automatically sorted by Ruff with the following order:
1. **Future** imports (`from __future__ import annotations`)
2. **Standard library** (`os`, `sys`, `typing`, etc.)
3. **Third-party** (`numpy`, `cv2`, `onnxruntime`, etc.)
4. **First-party** (`uniface.*`)
5. **Local** (relative imports like `.base`, `.models`)
```python
from __future__ import annotations

import os
from typing import Any

import cv2
import numpy as np

from uniface.constants import RetinaFaceWeights
from uniface.log import Logger

from .base import BaseDetector
```
#### Code Comments
- Add comments for complex logic, magic numbers, and non-obvious behavior
- Avoid comments that merely restate the code
- Use `# TODO:` with issue links for planned improvements
```python
# RetinaFace FPN strides and corresponding anchor sizes per level
steps = [8, 16, 32]
min_sizes = [[16, 32], [64, 128], [256, 512]]

# Add small epsilon to prevent division by zero
similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-5)
```
## Running Tests
```bash
# Run all tests
pytest tests/

# Run with verbose output
pytest tests/ -v

# Run specific test file
pytest tests/test_factory.py

# Run with coverage
pytest tests/ --cov=uniface --cov-report=html
```
## Adding New Features
When adding a new model or feature:
1. **Create the model class** in the appropriate submodule (e.g., `uniface/detection/`)
2. **Add weight constants** to `uniface/constants.py` with URLs and SHA256 hashes (see the sketch after this list)
3. **Export in `__init__.py`** files at both module and package levels
4. **Write tests** in `tests/` directory
5. **Add example usage** in `scripts/` or update existing notebooks
6. **Update documentation** if needed
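To make step 2 concrete, here is a minimal, hypothetical sketch of registering weights; the enum name, URL, and hash are placeholders, and the actual layout of `uniface/constants.py` may differ:

```python
# Hypothetical sketch only: name, URL, and SHA256 below are placeholders.
from enum import Enum


class MyDetectorWeights(str, Enum):
    """Weight identifiers for a hypothetical new detector."""

    SMALL = 'mydetector_small'


# Each identifier maps to a download URL and its SHA256 hash so that
# downloaded weights can be integrity-checked before use.
MODEL_METADATA = {
    MyDetectorWeights.SMALL: {
        'url': 'https://example.com/weights/mydetector_small.onnx',  # placeholder
        'sha256': '0' * 64,  # placeholder
    },
}
```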
## Examples
Example notebooks demonstrating library usage:
| Example | Notebook |
|---------|----------|
| Face Detection | [face_detection.ipynb](examples/face_detection.ipynb) |
| Face Alignment | [face_alignment.ipynb](examples/face_alignment.ipynb) |
| Face Recognition | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
| Face Verification | [face_verification.ipynb](examples/face_verification.ipynb) |
| Face Search | [face_search.ipynb](examples/face_search.ipynb) |
| Face Anonymization | [face_anonymization.ipynb](examples/face_anonymization.ipynb) |
| Face Detection | [01_face_detection.ipynb](examples/01_face_detection.ipynb) |
| Face Alignment | [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) |
| Face Verification | [03_face_verification.ipynb](examples/03_face_verification.ipynb) |
| Face Search | [04_face_search.ipynb](examples/04_face_search.ipynb) |
| Face Analyzer | [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) |
| Face Parsing | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) |
| Face Anonymization | [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) |
| Gaze Estimation | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) |
## Questions?
Open an issue or start a discussion on GitHub.

View File

@@ -34,7 +34,7 @@ detector = RetinaFace() # Uses MNET_V2
# Specific model
detector = RetinaFace(
model_name=RetinaFaceWeights.MNET_025, # Fastest
conf_thresh=0.5,
confidence_threshold=0.5,
nms_thresh=0.4,
input_size=(640, 640)
)
@@ -63,14 +63,14 @@ from uniface.constants import SCRFDWeights
# Fast real-time detection
detector = SCRFD(
model_name=SCRFDWeights.SCRFD_500M_KPS,
conf_thresh=0.5,
confidence_threshold=0.5,
input_size=(640, 640)
)
# High accuracy
detector = SCRFD(
model_name=SCRFDWeights.SCRFD_10G_KPS,
conf_thresh=0.5
confidence_threshold=0.5
)
```
@@ -99,29 +99,29 @@ from uniface.constants import YOLOv5FaceWeights
# Lightweight/Mobile
detector = YOLOv5Face(
model_name=YOLOv5FaceWeights.YOLOV5N,
conf_thresh=0.6,
confidence_threshold=0.6,
nms_thresh=0.5
)
# Real-time detection (recommended)
detector = YOLOv5Face(
model_name=YOLOv5FaceWeights.YOLOV5S,
conf_thresh=0.6,
confidence_threshold=0.6,
nms_thresh=0.5
)
# High accuracy
detector = YOLOv5Face(
model_name=YOLOv5FaceWeights.YOLOV5M,
conf_thresh=0.6
confidence_threshold=0.6
)
# Detect faces with landmarks
faces = detector.detect(image)
for face in faces:
bbox = face['bbox'] # [x1, y1, x2, y2]
confidence = face['confidence']
landmarks = face['landmarks'] # 5-point landmarks (5, 2)
bbox = face.bbox # [x1, y1, x2, y2]
confidence = face.confidence
landmarks = face.landmarks # 5-point landmarks (5, 2)
```
---
@@ -466,7 +466,7 @@ spoofer = MiniFASNet(model_name=MiniFASNetWeights.V1SE)
# Detect and check liveness
faces = detector.detect(image)
for face in faces:
label_idx, score = spoofer.predict(image, face['bbox'])
label_idx, score = spoofer.predict(image, face.bbox)
# label_idx: 0 = Fake, 1 = Real
label = 'Real' if label_idx == 1 else 'Fake'
print(f"{label}: {score:.1%}")

View File

@@ -545,7 +545,7 @@ from uniface.constants import RetinaFaceWeights, SCRFDWeights, YOLOv5FaceWeights
# Fast detection (mobile/edge devices)
detector = RetinaFace(
model_name=RetinaFaceWeights.MNET_025,
conf_thresh=0.7
confidence_threshold=0.7
)
# Balanced (recommended)
@@ -556,14 +556,14 @@ detector = RetinaFace(
# Real-time with high accuracy
detector = YOLOv5Face(
model_name=YOLOv5FaceWeights.YOLOV5S,
conf_thresh=0.6,
confidence_threshold=0.6,
nms_thresh=0.5
)
# High accuracy (server/GPU)
detector = SCRFD(
model_name=SCRFDWeights.SCRFD_10G_KPS,
conf_thresh=0.5
confidence_threshold=0.5
)
```
@@ -668,14 +668,14 @@ Explore interactive examples for common tasks:
| Example | Description | Notebook |
|---------|-------------|----------|
| **Face Detection** | Detect faces and facial landmarks | [face_detection.ipynb](examples/face_detection.ipynb) |
| **Face Alignment** | Align and crop faces for recognition | [face_alignment.ipynb](examples/face_alignment.ipynb) |
| **Face Recognition** | Extract face embeddings and compare faces | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
| **Face Verification** | Compare two faces to verify identity | [face_verification.ipynb](examples/face_verification.ipynb) |
| **Face Search** | Find a person in a group photo | [face_search.ipynb](examples/face_search.ipynb) |
| **Face Parsing** | Segment face into semantic components | [face_parsing.ipynb](examples/face_parsing.ipynb) |
| **Face Anonymization** | Blur or pixelate faces for privacy protection | [face_anonymization.ipynb](examples/face_anonymization.ipynb) |
| **Gaze Estimation** | Estimate gaze direction | [gaze_estimation.ipynb](examples/gaze_estimation.ipynb) |
| **Face Detection** | Detect faces and facial landmarks | [01_face_detection.ipynb](examples/01_face_detection.ipynb) |
| **Face Alignment** | Align and crop faces for recognition | [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) |
| **Face Verification** | Compare two faces to verify identity | [03_face_verification.ipynb](examples/03_face_verification.ipynb) |
| **Face Search** | Find a person in a group photo | [04_face_search.ipynb](examples/04_face_search.ipynb) |
| **Face Analyzer** | All-in-one detection, recognition & attributes | [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) |
| **Face Parsing** | Segment face into semantic components | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) |
| **Face Anonymization** | Blur or pixelate faces for privacy protection | [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) |
| **Gaze Estimation** | Estimate gaze direction | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) |
### Additional Resources

View File

@@ -321,7 +321,7 @@ detector = RetinaFace()
# Create with custom config
detector = SCRFD(
model_name=SCRFDWeights.SCRFD_10G_KPS, # SCRFDWeights.SCRFD_500M_KPS
conf_thresh=0.4,
confidence_threshold=0.4,
input_size=(640, 640)
)
# Or with default settings: detector = SCRFD()
@@ -340,16 +340,16 @@ from uniface.constants import RetinaFaceWeights, YOLOv5FaceWeights
# Detection
detector = RetinaFace(
model_name=RetinaFaceWeights.MNET_V2,
conf_thresh=0.5,
nms_thresh=0.4
confidence_threshold=0.5,
nms_threshold=0.4
)
# Or detector = RetinaFace()
# YOLOv5-Face detection
detector = YOLOv5Face(
model_name=YOLOv5FaceWeights.YOLOV5S,
conf_thresh=0.6,
nms_thresh=0.5
confidence_threshold=0.6,
nms_threshold=0.5
)
# Or detector = YOLOv5Face()
@@ -365,7 +365,7 @@ recognizer = SphereFace() # Angular softmax alternative
from uniface import detect_faces
# One-line face detection
faces = detect_faces(image, method='retinaface', conf_thresh=0.8) # methods: retinaface, scrfd, yolov5face
faces = detect_faces(image, method='retinaface', confidence_threshold=0.8) # methods: retinaface, scrfd, yolov5face
```
### Key Parameters (quick reference)
@@ -374,9 +374,9 @@ faces = detect_faces(image, method='retinaface', conf_thresh=0.8) # methods: re
| Class | Key params (defaults) | Notes |
| -------------- | ------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------- |
| `RetinaFace` | `model_name=RetinaFaceWeights.MNET_V2`, `conf_thresh=0.5`, `nms_thresh=0.4`, `input_size=(640, 640)`, `dynamic_size=False` | Supports 5-point landmarks |
| `SCRFD` | `model_name=SCRFDWeights.SCRFD_10G_KPS`, `conf_thresh=0.5`, `nms_thresh=0.4`, `input_size=(640, 640)` | Supports 5-point landmarks |
| `YOLOv5Face` | `model_name=YOLOv5FaceWeights.YOLOV5S`, `conf_thresh=0.6`, `nms_thresh=0.5`, `input_size=640` (fixed) | Supports 5-point landmarks; models: YOLOV5N/S/M; `input_size` must be 640 |
| `RetinaFace` | `model_name=RetinaFaceWeights.MNET_V2`, `confidence_threshold=0.5`, `nms_threshold=0.4`, `input_size=(640, 640)`, `dynamic_size=False` | Supports 5-point landmarks |
| `SCRFD` | `model_name=SCRFDWeights.SCRFD_10G_KPS`, `confidence_threshold=0.5`, `nms_threshold=0.4`, `input_size=(640, 640)` | Supports 5-point landmarks |
| `YOLOv5Face` | `model_name=YOLOv5FaceWeights.YOLOV5S`, `confidence_threshold=0.6`, `nms_threshold=0.5`, `input_size=640` (fixed) | Supports 5-point landmarks; models: YOLOV5N/S/M; `input_size` must be 640 |
**Recognition**
@@ -454,14 +454,14 @@ Interactive examples covering common face analysis tasks:
| Example | Description | Notebook |
|---------|-------------|----------|
| **Face Detection** | Detect faces and facial landmarks | [face_detection.ipynb](examples/face_detection.ipynb) |
| **Face Alignment** | Align and crop faces for recognition | [face_alignment.ipynb](examples/face_alignment.ipynb) |
| **Face Recognition** | Extract face embeddings and compare faces | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
| **Face Verification** | Compare two faces to verify identity | [face_verification.ipynb](examples/face_verification.ipynb) |
| **Face Search** | Find a person in a group photo | [face_search.ipynb](examples/face_search.ipynb) |
| **Face Parsing** | Segment face into semantic components | [face_parsing.ipynb](examples/face_parsing.ipynb) |
| **Face Anonymization** | Blur or pixelate faces for privacy protection | [face_anonymization.ipynb](examples/face_anonymization.ipynb) |
| **Gaze Estimation** | Estimate gaze direction from face images | [gaze_estimation.ipynb](examples/gaze_estimation.ipynb) |
| **Face Detection** | Detect faces and facial landmarks | [01_face_detection.ipynb](examples/01_face_detection.ipynb) |
| **Face Alignment** | Align and crop faces for recognition | [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) |
| **Face Verification** | Compare two faces to verify identity | [03_face_verification.ipynb](examples/03_face_verification.ipynb) |
| **Face Search** | Find a person in a group photo | [04_face_search.ipynb](examples/04_face_search.ipynb) |
| **Face Analyzer** | All-in-one detection, recognition & attributes | [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) |
| **Face Parsing** | Segment face into semantic components | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) |
| **Face Anonymization** | Blur or pixelate faces for privacy protection | [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) |
| **Gaze Estimation** | Estimate gaze direction from face images | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) |
### Webcam Face Detection

View File

View File

@@ -44,7 +44,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"1.6.0\n"
"2.0.0\n"
]
}
],
@@ -82,8 +82,8 @@
],
"source": [
"detector = RetinaFace(\n",
" conf_thresh=0.5,\n",
" nms_thresh=0.4,\n",
" confidence_threshold=0.5,\n",
" nms_threshold=0.4,\n",
")"
]
},

View File

@@ -48,7 +48,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"1.6.0\n"
"2.0.0\n"
]
}
],
@@ -87,8 +87,8 @@
],
"source": [
"detector = RetinaFace(\n",
" conf_thresh=0.5,\n",
" nms_thresh=0.4,\n",
" confidence_threshold=0.5,\n",
" nms_threshold=0.4,\n",
")"
]
},

View File

@@ -37,7 +37,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"1.6.0\n"
"2.0.0\n"
]
}
],
@@ -78,7 +78,7 @@
],
"source": [
"analyzer = FaceAnalyzer(\n",
" detector=RetinaFace(conf_thresh=0.5),\n",
" detector=RetinaFace(confidence_threshold=0.5),\n",
" recognizer=ArcFace()\n",
")"
]

View File

@@ -42,7 +42,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"1.6.0\n"
"2.0.0\n"
]
}
],
@@ -74,7 +74,7 @@
],
"source": [
"analyzer = FaceAnalyzer(\n",
" detector=RetinaFace(conf_thresh=0.5),\n",
" detector=RetinaFace(confidence_threshold=0.5),\n",
" recognizer=ArcFace()\n",
")"
]

View File

@@ -44,7 +44,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"1.6.0\n"
"2.0.0\n"
]
}
],
@@ -88,7 +88,7 @@
],
"source": [
"analyzer = FaceAnalyzer(\n",
" detector=RetinaFace(conf_thresh=0.5),\n",
" detector=RetinaFace(confidence_threshold=0.5),\n",
" recognizer=ArcFace(),\n",
" age_gender=AgeGender()\n",
")"

View File

@@ -46,7 +46,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"UniFace version: 1.6.0\n"
"UniFace version: 2.0.0\n"
]
}
],

File diff suppressed because one or more lines are too long

View File

@@ -44,7 +44,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"UniFace version: 1.6.0\n"
"UniFace version: 2.0.0\n"
]
}
],
@@ -86,7 +86,7 @@
],
"source": [
"# Initialize face detector\n",
"detector = RetinaFace(conf_thresh=0.5)\n",
"detector = RetinaFace(confidence_threshold=0.5)\n",
"\n",
"# Initialize gaze estimator (uses ResNet34 by default)\n",
"gaze_estimator = MobileGaze()"

View File

@@ -1,6 +1,6 @@
[project]
name = "uniface"
version = "1.6.0"
version = "2.0.0"
description = "UniFace: A Comprehensive Library for Face Detection, Recognition, Landmark Analysis, Face Parsing, Gaze Estimation, Age, and Gender Detection"
readme = "README.md"
license = { text = "MIT" }
@@ -89,13 +89,60 @@ exclude = [
[tool.ruff.format]
quote-style = "single"
docstring-code-format = true

[tool.ruff.lint]
select = ["E", "F", "I", "W"]
select = [
    "E",   # pycodestyle errors
    "F",   # pyflakes
    "I",   # isort
    "W",   # pycodestyle warnings
    "UP",  # pyupgrade (modern Python syntax)
    "B",   # flake8-bugbear
    "C4",  # flake8-comprehensions
    "SIM", # flake8-simplify
    "RUF", # Ruff-specific rules
]
ignore = [
    "E501",   # Line too long (handled by formatter)
    "B008",   # Function call in default argument (common in FastAPI/Click)
    "SIM108", # Use ternary operator (can reduce readability)
    "RUF022", # Allow logical grouping in __all__ instead of alphabetical sorting
]

[tool.ruff.lint.flake8-quotes]
docstring-quotes = "double"

[tool.ruff.lint.isort]
force-single-line = false
force-sort-within-sections = true
known-first-party = ["uniface"]
section-order = [
    "future",
    "standard-library",
    "third-party",
    "first-party",
    "local-folder",
]

[tool.ruff.lint.pydocstyle]
convention = "google"

[tool.mypy]
python_version = "3.11"
warn_return_any = false
warn_unused_ignores = true
ignore_missing_imports = true
exclude = ["tests/", "scripts/", "examples/"]
# Disable strict return type checking for numpy operations
disable_error_code = ["no-any-return"]

[tool.bandit]
exclude_dirs = ["tests", "scripts", "examples"]
skips = ["B101", "B614"]  # B101: assert, B614: torch.jit.load (models are SHA256 verified)

[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = ["test_*.py"]
python_functions = ["test_*"]
addopts = "-v --tb=short"

View File

@@ -28,9 +28,9 @@ def process_image(detector, image_path: Path, output_path: Path, threshold: floa
faces = detector.detect(image)
# unpack face data for visualization
bboxes = [f['bbox'] for f in faces]
scores = [f['confidence'] for f in faces]
landmarks = [f['landmarks'] for f in faces]
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
)

View File

@@ -39,17 +39,17 @@ def process_image(
if not faces:
return
bboxes = [f['bbox'] for f in faces]
scores = [f['confidence'] for f in faces]
landmarks = [f['landmarks'] for f in faces]
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
)
for i, face in enumerate(faces):
result = age_gender.predict(image, face['bbox'])
result = age_gender.predict(image, face.bbox)
print(f' Face {i + 1}: {result.sex}, {result.age} years old')
draw_age_gender_label(image, face['bbox'], result.sex, result.age)
draw_age_gender_label(image, face.bbox, result.sex, result.age)
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(image_path).stem}_age_gender.jpg')
@@ -74,16 +74,16 @@ def run_webcam(detector, age_gender, threshold: float = 0.6):
faces = detector.detect(frame)
# unpack face data for visualization
bboxes = [f['bbox'] for f in faces]
scores = [f['confidence'] for f in faces]
landmarks = [f['landmarks'] for f in faces]
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
)
for face in faces:
result = age_gender.predict(frame, face['bbox'])
draw_age_gender_label(frame, face['bbox'], result.sex, result.age)
result = age_gender.predict(frame, face.bbox)
draw_age_gender_label(frame, face.bbox, result.sex, result.age)
cv2.putText(
frame,

View File

@@ -33,9 +33,9 @@ def process_image(
from uniface.visualization import draw_detections
preview = image.copy()
bboxes = [face['bbox'] for face in faces]
scores = [face['confidence'] for face in faces]
landmarks = [face['landmarks'] for face in faces]
bboxes = [face.bbox for face in faces]
scores = [face.confidence for face in faces]
landmarks = [face.landmarks for face in faces]
draw_detections(preview, bboxes, scores, landmarks)
# Show preview
@@ -157,7 +157,7 @@ Examples:
# Detection
parser.add_argument(
'--conf-thresh',
'--confidence-threshold',
type=float,
default=0.5,
help='Detection confidence threshold (default: 0.5)',
@@ -183,8 +183,8 @@ Examples:
color = tuple(color_values)
# Initialize detector
print(f'Initializing face detector (conf_thresh={args.conf_thresh})...')
detector = RetinaFace(conf_thresh=args.conf_thresh)
print(f'Initializing face detector (confidence_threshold={args.confidence_threshold})...')
detector = RetinaFace(confidence_threshold=args.confidence_threshold)
# Initialize blurrer
print(f'Initializing blur method: {args.method}')

View File

@@ -1,6 +1,15 @@
# Face detection on image or webcam
# Usage: python run_detection.py --image path/to/image.jpg
# python run_detection.py --webcam
# Copyright 2025 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Face detection on image or webcam.
Usage:
python run_detection.py --image path/to/image.jpg
python run_detection.py --webcam
"""
from __future__ import annotations
import argparse
import os
@@ -20,9 +29,9 @@ def process_image(detector, image_path: str, threshold: float = 0.6, save_dir: s
faces = detector.detect(image)
if faces:
bboxes = [face['bbox'] for face in faces]
scores = [face['confidence'] for face in faces]
landmarks = [face['landmarks'] for face in faces]
bboxes = [face.bbox for face in faces]
scores = [face.confidence for face in faces]
landmarks = [face.landmarks for face in faces]
draw_detections(image, bboxes, scores, landmarks, vis_threshold=threshold)
os.makedirs(save_dir, exist_ok=True)
@@ -48,9 +57,9 @@ def run_webcam(detector, threshold: float = 0.6):
faces = detector.detect(frame)
# unpack face data for visualization
bboxes = [f['bbox'] for f in faces]
scores = [f['confidence'] for f in faces]
landmarks = [f['landmarks'] for f in faces]
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame,
bboxes=bboxes,

View File

@@ -39,17 +39,17 @@ def process_image(
if not faces:
return
bboxes = [f['bbox'] for f in faces]
scores = [f['confidence'] for f in faces]
landmarks = [f['landmarks'] for f in faces]
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
)
for i, face in enumerate(faces):
emotion, confidence = emotion_predictor.predict(image, face['landmarks'])
emotion, confidence = emotion_predictor.predict(image, face.landmarks)
print(f' Face {i + 1}: {emotion} (confidence: {confidence:.3f})')
draw_emotion_label(image, face['bbox'], emotion, confidence)
draw_emotion_label(image, face.bbox, emotion, confidence)
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(image_path).stem}_emotion.jpg')
@@ -74,14 +74,16 @@ def run_webcam(detector, emotion_predictor, threshold: float = 0.6):
faces = detector.detect(frame)
# unpack face data for visualization
bboxes = [f['bbox'] for f in faces]
scores = [f['confidence'] for f in faces]
landmarks = [f['landmarks'] for f in faces]
draw_detections(frame, bboxes, scores, landmarks, vis_threshold=threshold)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
)
for face in faces:
emotion, confidence = emotion_predictor.predict(frame, face['landmarks'])
draw_emotion_label(frame, face['bbox'], emotion, confidence)
emotion, confidence = emotion_predictor.predict(frame, face.landmarks)
draw_emotion_label(frame, face.bbox, emotion, confidence)
cv2.putText(
frame,

View File

@@ -7,6 +7,7 @@ import os
from pathlib import Path
import cv2
import numpy as np
from uniface import RetinaFace
from uniface.constants import ParsingWeights
@@ -14,7 +15,49 @@ from uniface.parsing import BiSeNet
from uniface.visualization import vis_parsing_maps
def process_image(detector, parser, image_path: str, save_dir: str = 'outputs'):
def expand_bbox(
    bbox: np.ndarray,
    image_shape: tuple[int, int],
    expand_ratio: float = 0.2,
    expand_top_ratio: float = 0.4,
) -> tuple[int, int, int, int]:
    """
    Expand bounding box to include full head region for face parsing.

    Face detection typically returns tight face boxes, but face parsing
    requires the full head including hair, ears, and neck.

    Args:
        bbox: Original bounding box [x1, y1, x2, y2].
        image_shape: Image dimensions as (height, width).
        expand_ratio: Expansion ratio for left, right, and bottom (default: 0.2 = 20%).
        expand_top_ratio: Expansion ratio for top to capture hair/forehead (default: 0.4 = 40%).

    Returns:
        Tuple[int, int, int, int]: Expanded bbox (x1, y1, x2, y2) clamped to image bounds.
    """
    x1, y1, x2, y2 = map(int, bbox[:4])
    height, width = image_shape[:2]

    # Calculate face dimensions
    face_width = x2 - x1
    face_height = y2 - y1

    # Calculate expansion amounts
    expand_x = int(face_width * expand_ratio)
    expand_y_bottom = int(face_height * expand_ratio)
    expand_y_top = int(face_height * expand_top_ratio)

    # Expand and clamp to image boundaries
    new_x1 = max(0, x1 - expand_x)
    new_y1 = max(0, y1 - expand_y_top)
    new_x2 = min(width, x2 + expand_x)
    new_y2 = min(height, y2 + expand_y_bottom)

    return new_x1, new_y1, new_x2, new_y2
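As a quick check of the expansion math above (a worked example that assumes `expand_bbox` from this script is in scope; not part of the diff):

```python
# bbox [100, 100, 200, 220] in a 480x640 image (height, width):
# face_width = 100, face_height = 120
# expand_x = int(100 * 0.2) = 20; bottom = int(120 * 0.2) = 24; top = int(120 * 0.4) = 48
# -> (80, 52, 220, 244), all inside the image bounds
import numpy as np

assert expand_bbox(np.array([100, 100, 200, 220]), (480, 640)) == (80, 52, 220, 244)
```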
def process_image(detector, parser, image_path: str, save_dir: str = 'outputs', expand_ratio: float = 0.2):
image = cv2.imread(image_path)
if image is None:
print(f"Error: Failed to load image from '{image_path}'")
@@ -26,8 +69,8 @@ def process_image(detector, parser, image_path: str, save_dir: str = 'outputs'):
result_image = image.copy()
for i, face in enumerate(faces):
bbox = face['bbox']
x1, y1, x2, y2 = map(int, bbox[:4])
# Expand bbox to include full head for parsing
x1, y1, x2, y2 = expand_bbox(face.bbox, image.shape, expand_ratio=expand_ratio)
face_crop = image[y1:y2, x1:x2]
if face_crop.size == 0:
@@ -44,7 +87,7 @@ def process_image(detector, parser, image_path: str, save_dir: str = 'outputs'):
# Place the visualization back on the original image
result_image[y1:y2, x1:x2] = vis_result
# Draw bounding box
# Draw expanded bounding box
cv2.rectangle(result_image, (x1, y1), (x2, y2), (0, 255, 0), 2)
os.makedirs(save_dir, exist_ok=True)
@@ -53,7 +96,7 @@ def process_image(detector, parser, image_path: str, save_dir: str = 'outputs'):
print(f'Output saved: {output_path}')
def run_webcam(detector, parser):
def run_webcam(detector, parser, expand_ratio: float = 0.2):
cap = cv2.VideoCapture(0)
if not cap.isOpened():
print('Cannot open webcam')
@@ -70,8 +113,8 @@ def run_webcam(detector, parser):
faces = detector.detect(frame)
for face in faces:
bbox = face['bbox']
x1, y1, x2, y2 = map(int, bbox[:4])
# Expand bbox to include full head for parsing
x1, y1, x2, y2 = expand_bbox(face.bbox, frame.shape, expand_ratio=expand_ratio)
face_crop = frame[y1:y2, x1:x2]
if face_crop.size == 0:
@@ -87,7 +130,7 @@ def run_webcam(detector, parser):
# Place the visualization back on the frame
frame[y1:y2, x1:x2] = vis_result
# Draw bounding box
# Draw expanded bounding box
cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
@@ -108,6 +151,12 @@ def main():
parser_arg.add_argument(
'--model', type=str, default=ParsingWeights.RESNET18, choices=[ParsingWeights.RESNET18, ParsingWeights.RESNET34]
)
parser_arg.add_argument(
'--expand-ratio',
type=float,
default=0.2,
help='Bbox expansion ratio for full head coverage (default: 0.2 = 20%%)',
)
args = parser_arg.parse_args()
if not args.image and not args.webcam:
@@ -117,9 +166,9 @@ def main():
parser = BiSeNet(model_name=ParsingWeights.RESNET34)
if args.webcam:
run_webcam(detector, parser)
run_webcam(detector, parser, expand_ratio=args.expand_ratio)
else:
process_image(detector, parser, args.image, args.save_dir)
process_image(detector, parser, args.image, args.save_dir, expand_ratio=args.expand_ratio)
if __name__ == '__main__':

View File

@@ -29,7 +29,7 @@ def extract_reference_embedding(detector, recognizer, image_path: str) -> np.nda
if not faces:
raise RuntimeError('No faces found in reference image.')
landmarks = faces[0]['landmarks']
landmarks = faces[0].landmarks
return recognizer.get_normalized_embedding(image, landmarks)
@@ -49,8 +49,8 @@ def run_webcam(detector, recognizer, ref_embedding: np.ndarray, threshold: float
faces = detector.detect(frame)
for face in faces:
bbox = face['bbox']
landmarks = face['landmarks']
bbox = face.bbox
landmarks = face.landmarks
x1, y1, x2, y2 = map(int, bbox)
embedding = recognizer.get_normalized_embedding(frame, landmarks)

View File

@@ -24,7 +24,7 @@ def process_image(detector, gaze_estimator, image_path: str, save_dir: str = 'ou
print(f'Detected {len(faces)} face(s)')
for i, face in enumerate(faces):
bbox = face['bbox']
bbox = face.bbox
x1, y1, x2, y2 = map(int, bbox[:4])
face_crop = image[y1:y2, x1:x2]
@@ -60,7 +60,7 @@ def run_webcam(detector, gaze_estimator):
faces = detector.detect(frame)
for face in faces:
bbox = face['bbox']
bbox = face.bbox
x1, y1, x2, y2 = map(int, bbox[:4])
face_crop = frame[y1:y2, x1:x2]

View File

@@ -24,7 +24,7 @@ def process_image(detector, landmarker, image_path: str, save_dir: str = 'output
return
for i, face in enumerate(faces):
bbox = face['bbox']
bbox = face.bbox
x1, y1, x2, y2 = map(int, bbox)
cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
@@ -67,7 +67,7 @@ def run_webcam(detector, landmarker):
faces = detector.detect(frame)
for face in faces:
bbox = face['bbox']
bbox = face.bbox
x1, y1, x2, y2 = map(int, bbox)
cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)

View File

@@ -70,13 +70,13 @@ def process_image(detector, spoofer, image_path: str, save_dir: str = 'outputs')
# Run anti-spoofing on each face
for i, face in enumerate(faces, 1):
label_idx, score = spoofer.predict(image, face['bbox'])
label_idx, score = spoofer.predict(image, face.bbox)
# label_idx: 0 = Fake, 1 = Real
label = 'Real' if label_idx == 1 else 'Fake'
print(f' Face {i}: {label} ({score:.1%})')
# Draw result on image
draw_spoofing_result(image, face['bbox'], label_idx, score)
draw_spoofing_result(image, face.bbox, label_idx, score)
# Save output
os.makedirs(save_dir, exist_ok=True)
@@ -128,8 +128,8 @@ def process_video(detector, spoofer, source, save_dir: str = 'outputs') -> None:
# Run anti-spoofing on each face
for face in faces:
label_idx, score = spoofer.predict(frame, face['bbox'])
draw_spoofing_result(frame, face['bbox'], label_idx, score)
label_idx, score = spoofer.predict(frame, face.bbox)
draw_spoofing_result(frame, face.bbox, label_idx, score)
# Write frame
writer.write(frame)

View File

@@ -52,9 +52,9 @@ def process_video(
faces = detector.detect(frame)
total_faces += len(faces)
bboxes = [f['bbox'] for f in faces]
scores = [f['confidence'] for f in faces]
landmarks = [f['landmarks'] for f in faces]
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
)

View File

@@ -1,3 +1,11 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for AgeGender attribute predictor."""
from __future__ import annotations
import numpy as np
import pytest

View File

@@ -1,3 +1,11 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for factory functions (create_detector, create_recognizer, etc.)."""
from __future__ import annotations
import numpy as np
import pytest
@@ -35,8 +43,8 @@ def test_create_detector_with_config():
detector = create_detector(
'retinaface',
model_name=RetinaFaceWeights.MNET_V2,
conf_thresh=0.8,
nms_thresh=0.3,
confidence_threshold=0.8,
nms_threshold=0.3,
)
assert detector is not None, 'Failed to create detector with custom config'
@@ -53,7 +61,7 @@ def test_create_detector_scrfd_with_model():
"""
Test creating SCRFD detector with specific model.
"""
detector = create_detector('scrfd', model_name=SCRFDWeights.SCRFD_10G_KPS, conf_thresh=0.5)
detector = create_detector('scrfd', model_name=SCRFDWeights.SCRFD_10G_KPS, confidence_threshold=0.5)
assert detector is not None, 'Failed to create SCRFD with specific model'
@@ -141,13 +149,13 @@ def test_detect_faces_with_threshold():
Test detect_faces with custom confidence threshold.
"""
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = detect_faces(mock_image, method='retinaface', conf_thresh=0.8)
faces = detect_faces(mock_image, method='retinaface', confidence_threshold=0.8)
assert isinstance(faces, list), 'detect_faces should return a list'
# All detections should respect threshold
for face in faces:
assert face['confidence'] >= 0.8, 'All detections should meet confidence threshold'
assert face.confidence >= 0.8, 'All detections should meet confidence threshold'
def test_detect_faces_default_method():
@@ -246,8 +254,8 @@ def test_detector_with_different_configs():
"""
Test creating multiple detectors with different configurations.
"""
detector_high_thresh = create_detector('retinaface', conf_thresh=0.9)
detector_low_thresh = create_detector('retinaface', conf_thresh=0.3)
detector_high_thresh = create_detector('retinaface', confidence_threshold=0.9)
detector_low_thresh = create_detector('retinaface', confidence_threshold=0.3)
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)

View File

@@ -1,3 +1,11 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for 106-point facial landmark detector."""
from __future__ import annotations
import numpy as np
import pytest

View File

@@ -2,6 +2,10 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for BiSeNet face parsing model."""
from __future__ import annotations
import numpy as np
import pytest

View File

@@ -1,3 +1,11 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for face recognition models (ArcFace, MobileFace, SphereFace)."""
from __future__ import annotations
import numpy as np
import pytest

View File

@@ -1,3 +1,11 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for RetinaFace detector."""
from __future__ import annotations
import numpy as np
import pytest
@@ -9,9 +17,9 @@ from uniface.detection import RetinaFace
def retinaface_model():
return RetinaFace(
model_name=RetinaFaceWeights.MNET_V2,
conf_thresh=0.5,
confidence_threshold=0.5,
pre_nms_topk=5000,
nms_thresh=0.4,
nms_threshold=0.4,
post_nms_topk=750,
)
@@ -27,15 +35,15 @@ def test_inference_on_640x640_image(retinaface_model):
assert isinstance(faces, list), 'Detections should be a list.'
for face in faces:
assert isinstance(face, dict), 'Each detection should be a dictionary.'
assert 'bbox' in face, "Each detection should have a 'bbox' key."
assert 'confidence' in face, "Each detection should have a 'confidence' key."
assert 'landmarks' in face, "Each detection should have a 'landmarks' key."
# Face is a dataclass, check attributes exist
assert hasattr(face, 'bbox'), "Each detection should have a 'bbox' attribute."
assert hasattr(face, 'confidence'), "Each detection should have a 'confidence' attribute."
assert hasattr(face, 'landmarks'), "Each detection should have a 'landmarks' attribute."
bbox = face['bbox']
bbox = face.bbox
assert len(bbox) == 4, 'BBox should have 4 values (x1, y1, x2, y2).'
landmarks = face['landmarks']
landmarks = face.landmarks
assert len(landmarks) == 5, 'Should have 5 landmark points.'
assert all(len(pt) == 2 for pt in landmarks), 'Each landmark should be (x, y).'
@@ -45,7 +53,7 @@ def test_confidence_threshold(retinaface_model):
faces = retinaface_model.detect(mock_image)
for face in faces:
confidence = face['confidence']
confidence = face.confidence
assert confidence >= 0.5, f'Detection has confidence {confidence} below threshold 0.5'

View File

@@ -1,3 +1,11 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for SCRFD detector."""
from __future__ import annotations
import numpy as np
import pytest
@@ -9,8 +17,8 @@ from uniface.detection import SCRFD
def scrfd_model():
return SCRFD(
model_name=SCRFDWeights.SCRFD_500M_KPS,
conf_thresh=0.5,
nms_thresh=0.4,
confidence_threshold=0.5,
nms_threshold=0.4,
)
@@ -25,15 +33,15 @@ def test_inference_on_640x640_image(scrfd_model):
assert isinstance(faces, list), 'Detections should be a list.'
for face in faces:
assert isinstance(face, dict), 'Each detection should be a dictionary.'
assert 'bbox' in face, "Each detection should have a 'bbox' key."
assert 'confidence' in face, "Each detection should have a 'confidence' key."
assert 'landmarks' in face, "Each detection should have a 'landmarks' key."
# Face is a dataclass, check attributes exist
assert hasattr(face, 'bbox'), "Each detection should have a 'bbox' attribute."
assert hasattr(face, 'confidence'), "Each detection should have a 'confidence' attribute."
assert hasattr(face, 'landmarks'), "Each detection should have a 'landmarks' attribute."
bbox = face['bbox']
bbox = face.bbox
assert len(bbox) == 4, 'BBox should have 4 values (x1, y1, x2, y2).'
landmarks = face['landmarks']
landmarks = face.landmarks
assert len(landmarks) == 5, 'Should have 5 landmark points.'
assert all(len(pt) == 2 for pt in landmarks), 'Each landmark should be (x, y).'
@@ -43,7 +51,7 @@ def test_confidence_threshold(scrfd_model):
faces = scrfd_model.detect(mock_image)
for face in faces:
confidence = face['confidence']
confidence = face.confidence
assert confidence >= 0.5, f'Detection has confidence {confidence} below threshold 0.5'
@@ -63,7 +71,7 @@ def test_different_input_sizes(scrfd_model):
def test_scrfd_10g_model():
model = SCRFD(model_name=SCRFDWeights.SCRFD_10G_KPS, conf_thresh=0.5)
model = SCRFD(model_name=SCRFDWeights.SCRFD_10G_KPS, confidence_threshold=0.5)
assert model is not None, 'SCRFD 10G model initialization failed.'
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)

View File

@@ -1,3 +1,11 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for utility functions (compute_similarity, face_alignment, etc.)."""
from __future__ import annotations
import numpy as np
import pytest
@@ -116,7 +124,7 @@ def test_compute_similarity_dtype():
emb2 = emb2 / np.linalg.norm(emb2)
similarity = compute_similarity(emb1, emb2)
assert isinstance(similarity, (float, np.floating)), f'Similarity should be float, got {type(similarity)}'
assert isinstance(similarity, float | np.floating), f'Similarity should be float, got {type(similarity)}'
# face_alignment tests
@@ -259,4 +267,4 @@ def test_compute_similarity_with_recognition_embeddings():
# Should be a valid similarity score
assert -1.0 <= similarity <= 1.0
assert isinstance(similarity, (float, np.floating))
assert isinstance(similarity, float | np.floating)
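A small usage sketch matching the updated assertion; the 512-dim embedding size is an assumption for illustration, and `compute_similarity` is exported from `uniface` per the `__init__.py` diff below:

```python
import numpy as np

from uniface import compute_similarity

# Two random unit-normalized embeddings (512-dim assumed for illustration)
emb1 = np.random.rand(512).astype(np.float32)
emb2 = np.random.rand(512).astype(np.float32)
emb1 /= np.linalg.norm(emb1)
emb2 /= np.linalg.norm(emb2)

similarity = compute_similarity(emb1, emb2)
assert isinstance(similarity, float | np.floating)  # the PEP 604 form used in the updated tests
assert -1.0 <= similarity <= 1.0
```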

View File

@@ -11,10 +11,24 @@
# See the License for the specific language governing permissions and
# limitations under the License.
"""UniFace: A comprehensive library for face analysis.
This library provides unified APIs for:
- Face detection (RetinaFace, SCRFD, YOLOv5Face)
- Face recognition (ArcFace, MobileFace, SphereFace)
- Facial landmarks (106-point detection)
- Face parsing (semantic segmentation)
- Gaze estimation
- Age, gender, and emotion prediction
- Face anti-spoofing
- Privacy/anonymization
"""
from __future__ import annotations
__license__ = 'MIT'
__author__ = 'Yakhyokhuja Valikhujaev'
__version__ = '1.6.0'
__version__ = '2.0.0'
from uniface.face_utils import compute_similarity, face_alignment
from uniface.log import Logger, enable_logging
@@ -23,12 +37,6 @@ from uniface.visualization import draw_detections, vis_parsing_maps
from .analyzer import FaceAnalyzer
from .attribute import AgeGender, AttributeResult, FairFace
from .face import Face
try:
from .attribute import Emotion
except ImportError:
Emotion = None # PyTorch not installed
from .detection import (
SCRFD,
RetinaFace,
@@ -37,6 +45,7 @@ from .detection import (
detect_faces,
list_available_detectors,
)
from .face import Face
from .gaze import MobileGaze, create_gaze_estimator
from .landmark import Landmark106, create_landmarker
from .parsing import BiSeNet, create_face_parser
@@ -44,7 +53,15 @@ from .privacy import BlurFace, anonymize_faces
from .recognition import ArcFace, MobileFace, SphereFace, create_recognizer
from .spoofing import MiniFASNet, create_spoofer
# Optional: Emotion requires PyTorch
Emotion: type | None
try:
from .attribute import Emotion
except ImportError:
Emotion = None
__all__ = [
# Metadata
'__author__',
'__license__',
'__version__',
@@ -85,11 +102,11 @@ __all__ = [
'BlurFace',
'anonymize_faces',
# Utilities
'Logger',
'compute_similarity',
'draw_detections',
'vis_parsing_maps',
'enable_logging',
'face_alignment',
'verify_model_weights',
'Logger',
'enable_logging',
'vis_parsing_maps',
]

View File

@@ -2,7 +2,7 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import List, Optional
from __future__ import annotations
import numpy as np
@@ -17,14 +17,32 @@ __all__ = ['FaceAnalyzer']
class FaceAnalyzer:
"""Unified face analyzer combining detection, recognition, and attributes."""
"""Unified face analyzer combining detection, recognition, and attributes.
This class provides a high-level interface for face analysis by combining
multiple components: face detection, recognition (embedding extraction),
and attribute prediction (age, gender, race).
Args:
detector: Face detector instance for detecting faces in images.
recognizer: Optional face recognizer for extracting embeddings.
age_gender: Optional age/gender predictor.
fairface: Optional FairFace predictor for demographics.
Example:
>>> from uniface import RetinaFace, ArcFace, FaceAnalyzer
>>> detector = RetinaFace()
>>> recognizer = ArcFace()
>>> analyzer = FaceAnalyzer(detector, recognizer=recognizer)
>>> faces = analyzer.analyze(image)
"""
def __init__(
self,
detector: BaseDetector,
recognizer: Optional[BaseRecognizer] = None,
age_gender: Optional[AgeGender] = None,
fairface: Optional[FairFace] = None,
recognizer: BaseRecognizer | None = None,
age_gender: AgeGender | None = None,
fairface: FairFace | None = None,
) -> None:
self.detector = detector
self.recognizer = recognizer
@@ -39,8 +57,18 @@ class FaceAnalyzer:
if fairface:
Logger.info(f' - FairFace enabled: {fairface.__class__.__name__}')
def analyze(self, image: np.ndarray) -> List[Face]:
"""Analyze faces in an image."""
def analyze(self, image: np.ndarray) -> list[Face]:
"""Analyze faces in an image.
Performs face detection and optionally extracts embeddings and
predicts attributes for each detected face.
Args:
image: Input image as numpy array with shape (H, W, C) in BGR format.
Returns:
List of Face objects with detection results and any predicted attributes.
"""
faces = self.detector.detect(image)
Logger.debug(f'Detected {len(faces)} face(s)')

View File

@@ -2,7 +2,9 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Any, Dict, List, Union
from __future__ import annotations
from typing import Any
import numpy as np
@@ -10,6 +12,7 @@ from uniface.attribute.age_gender import AgeGender
from uniface.attribute.base import Attribute, AttributeResult
from uniface.attribute.fairface import FairFace
from uniface.constants import AgeGenderWeights, DDAMFNWeights, FairFaceWeights
from uniface.face import Face
# Emotion requires PyTorch - make it optional
try:
@@ -32,17 +35,17 @@ __all__ = [
# A mapping from model enums to their corresponding attribute classes
_ATTRIBUTE_MODELS = {
**{model: AgeGender for model in AgeGenderWeights},
**{model: FairFace for model in FairFaceWeights},
**dict.fromkeys(AgeGenderWeights, AgeGender),
**dict.fromkeys(FairFaceWeights, FairFace),
}
# Add Emotion models only if PyTorch is available
if _EMOTION_AVAILABLE:
_ATTRIBUTE_MODELS.update({model: Emotion for model in DDAMFNWeights})
_ATTRIBUTE_MODELS.update(dict.fromkeys(DDAMFNWeights, Emotion))
def create_attribute_predictor(
model_name: Union[AgeGenderWeights, DDAMFNWeights, FairFaceWeights], **kwargs: Any
model_name: AgeGenderWeights | DDAMFNWeights | FairFaceWeights, **kwargs: Any
) -> Attribute:
"""
Factory function to create an attribute predictor instance.
@@ -75,46 +78,36 @@ def create_attribute_predictor(
return model_class(model_name=model_name, **kwargs)
def predict_attributes(
image: np.ndarray, detections: List[Dict[str, np.ndarray]], predictor: Attribute
) -> List[Dict[str, Any]]:
def predict_attributes(image: np.ndarray, faces: list[Face], predictor: Attribute) -> list[Face]:
"""
High-level API to predict attributes for multiple detected faces.
This function iterates through a list of face detections, runs the
specified attribute predictor on each one, and appends the results back
into the detection dictionary.
This function iterates through a list of Face objects, runs the
specified attribute predictor on each one, and updates the Face
objects with the predicted attributes.
Args:
image (np.ndarray): The full input image in BGR format.
detections (List[Dict]): A list of detection results, where each dict
must contain a 'bbox' and optionally 'landmark'.
faces (List[Face]): A list of Face objects from face detection.
predictor (Attribute): An initialized attribute predictor instance,
created by `create_attribute_predictor`.
Returns:
The list of detections, where each dictionary is updated with a new
'attributes' key containing the prediction result.
List[Face]: The list of Face objects with updated attribute fields.
"""
for face in detections:
# Initialize attributes dict if it doesn't exist
if 'attributes' not in face:
face['attributes'] = {}
for face in faces:
if isinstance(predictor, AgeGender):
result = predictor(image, face['bbox'])
face['attributes']['gender'] = result.gender
face['attributes']['sex'] = result.sex
face['attributes']['age'] = result.age
result = predictor(image, face.bbox)
face.gender = result.gender
face.age = result.age
elif isinstance(predictor, FairFace):
result = predictor(image, face['bbox'])
face['attributes']['gender'] = result.gender
face['attributes']['sex'] = result.sex
face['attributes']['age_group'] = result.age_group
face['attributes']['race'] = result.race
result = predictor(image, face.bbox)
face.gender = result.gender
face.age_group = result.age_group
face.race = result.race
elif isinstance(predictor, Emotion):
emotion, confidence = predictor(image, face['landmark'])
face['attributes']['emotion'] = emotion
face['attributes']['confidence'] = confidence
emotion, confidence = predictor(image, face.landmarks)
face.emotion = emotion
face.emotion_confidence = confidence
return detections
return faces
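For orientation, the `Face` container that replaces the detection dictionaries throughout this diff behaves roughly like the sketch below; this is inferred from the attribute accesses above, and the real definition in `uniface/face.py` may carry more fields and helpers:

```python
from __future__ import annotations

from dataclasses import dataclass

import numpy as np


@dataclass
class Face:
    """Inferred sketch of the detection result container (not the actual definition)."""

    bbox: np.ndarray       # [x1, y1, x2, y2]
    confidence: float
    landmarks: np.ndarray  # 5-point landmarks, shape (5, 2)

    # Optional fields filled in by attribute predictors (AgeGender, FairFace, Emotion)
    gender: int | None = None
    age: int | None = None
    age_group: str | None = None
    race: str | None = None
    emotion: str | None = None
    emotion_confidence: float | None = None
```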

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import List, Optional, Tuple, Union
import cv2
import numpy as np
@@ -35,7 +34,7 @@ class AgeGender(Attribute):
def __init__(
self,
model_name: AgeGenderWeights = AgeGenderWeights.DEFAULT,
input_size: Optional[Tuple[int, int]] = None,
input_size: tuple[int, int] | None = None,
) -> None:
"""
Initializes the AgeGender prediction model.
@@ -81,7 +80,7 @@ class AgeGender(Attribute):
)
raise RuntimeError(f'Failed to initialize AgeGender model: {e}') from e
def preprocess(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> np.ndarray:
def preprocess(self, image: np.ndarray, bbox: list | np.ndarray) -> np.ndarray:
"""
Aligns the face based on the bounding box and preprocesses it for inference.
@@ -127,7 +126,7 @@ class AgeGender(Attribute):
age = int(np.round(prediction[2] * 100))
return AttributeResult(gender=gender, age=age)
def predict(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> AttributeResult:
def predict(self, image: np.ndarray, bbox: list | np.ndarray) -> AttributeResult:
"""
Predicts age and gender for a single face specified by a bounding box.

View File

@@ -4,7 +4,7 @@
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Optional
from typing import Any
import numpy as np
@@ -38,7 +38,7 @@ class AttributeResult:
25
>>> # FairFace result
>>> result = AttributeResult(gender=0, age_group="20-29", race="East Asian")
>>> result = AttributeResult(gender=0, age_group='20-29', race='East Asian')
>>> result.sex
'Female'
>>> result.race
@@ -46,9 +46,9 @@ class AttributeResult:
"""
gender: int
age: Optional[int] = None
age_group: Optional[str] = None
race: Optional[str] = None
age: int | None = None
age_group: str | None = None
race: str | None = None
@property
def sex(self) -> str:

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import List, Tuple, Union
import cv2
import numpy as np
@@ -29,7 +28,7 @@ class Emotion(Attribute):
def __init__(
self,
model_weights: DDAMFNWeights = DDAMFNWeights.AFFECNET7,
input_size: Tuple[int, int] = (112, 112),
input_size: tuple[int, int] = (112, 112),
) -> None:
"""
Initializes the emotion recognition model.
@@ -81,7 +80,7 @@ class Emotion(Attribute):
Logger.error(f"Failed to load Emotion model from '{self.model_path}'", exc_info=True)
raise RuntimeError(f'Failed to initialize Emotion model: {e}') from e
def preprocess(self, image: np.ndarray, landmark: Union[List, np.ndarray]) -> torch.Tensor:
def preprocess(self, image: np.ndarray, landmark: list | np.ndarray) -> torch.Tensor:
"""
Aligns the face using landmarks and preprocesses it into a tensor.
@@ -106,7 +105,7 @@ class Emotion(Attribute):
return torch.from_numpy(transposed_image).unsqueeze(0).to(self.device)
def postprocess(self, prediction: torch.Tensor) -> Tuple[str, float]:
def postprocess(self, prediction: torch.Tensor) -> tuple[str, float]:
"""
Processes the raw model output to get the emotion label and confidence score.
"""
@@ -116,7 +115,7 @@ class Emotion(Attribute):
confidence = float(probabilities[pred_index])
return emotion_label, confidence
def predict(self, image: np.ndarray, landmark: Union[List, np.ndarray]) -> Tuple[str, float]:
def predict(self, image: np.ndarray, landmark: list | np.ndarray) -> tuple[str, float]:
"""
Predicts the emotion from a single face specified by its landmarks.
"""

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import List, Optional, Tuple, Union
import cv2
import numpy as np
@@ -13,7 +12,7 @@ from uniface.log import Logger
from uniface.model_store import verify_model_weights
from uniface.onnx_utils import create_onnx_session
__all__ = ['FairFace', 'RACE_LABELS', 'AGE_LABELS']
__all__ = ['AGE_LABELS', 'RACE_LABELS', 'FairFace']
# Label definitions
RACE_LABELS = [
@@ -49,7 +48,7 @@ class FairFace(Attribute):
def __init__(
self,
model_name: FairFaceWeights = FairFaceWeights.DEFAULT,
input_size: Optional[Tuple[int, int]] = None,
input_size: tuple[int, int] | None = None,
) -> None:
"""
Initializes the FairFace prediction model.
@@ -82,7 +81,7 @@ class FairFace(Attribute):
)
raise RuntimeError(f'Failed to initialize FairFace model: {e}') from e
def preprocess(self, image: np.ndarray, bbox: Optional[Union[List, np.ndarray]] = None) -> np.ndarray:
def preprocess(self, image: np.ndarray, bbox: list | np.ndarray | None = None) -> np.ndarray:
"""
Preprocesses the face image for inference.
@@ -130,7 +129,7 @@ class FairFace(Attribute):
return image
def postprocess(self, prediction: Tuple[np.ndarray, np.ndarray, np.ndarray]) -> AttributeResult:
def postprocess(self, prediction: tuple[np.ndarray, np.ndarray, np.ndarray]) -> AttributeResult:
"""
Processes the raw model output to extract race, gender, and age.
@@ -162,7 +161,7 @@ class FairFace(Attribute):
race=RACE_LABELS[race_idx],
)
def predict(self, image: np.ndarray, bbox: Optional[Union[List, np.ndarray]] = None) -> AttributeResult:
def predict(self, image: np.ndarray, bbox: list | np.ndarray | None = None) -> AttributeResult:
"""
Predicts race, gender, and age for a face.

View File

@@ -2,34 +2,42 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
import itertools
import math
from typing import List, Optional, Tuple
import cv2
import numpy as np
__all__ = [
'resize_image',
'generate_anchors',
'non_max_suppression',
'decode_boxes',
'decode_landmarks',
'distance2bbox',
'distance2kps',
'generate_anchors',
'non_max_suppression',
'resize_image',
]
def resize_image(frame, target_shape: Tuple[int, int] = (640, 640)) -> Tuple[np.ndarray, float]:
"""
Resize an image to fit within a target shape while keeping its aspect ratio.
def resize_image(
frame: np.ndarray,
target_shape: tuple[int, int] = (640, 640),
) -> tuple[np.ndarray, float]:
"""Resize an image to fit within a target shape while keeping its aspect ratio.
The image is resized to fit within the target dimensions and placed on a
blank canvas (zero-padded to target size).
Args:
frame (np.ndarray): Input image.
target_shape (Tuple[int, int]): Target size (width, height). Defaults to (640, 640).
frame: Input image with shape (H, W, C).
target_shape: Target size as (width, height). Defaults to (640, 640).
Returns:
Tuple[np.ndarray, float]: Resized image on a blank canvas and the resize factor.
A tuple containing:
- Resized image on a blank canvas with shape (height, width, 3).
- The resize factor as a float.
"""
width, height = target_shape
@@ -53,16 +61,16 @@ def resize_image(frame, target_shape: Tuple[int, int] = (640, 640)) -> Tuple[np.
return image, resize_factor
def generate_anchors(image_size: Tuple[int, int] = (640, 640)) -> np.ndarray:
"""
Generate anchor boxes for a given image size (RetinaFace specific).
def generate_anchors(image_size: tuple[int, int] = (640, 640)) -> np.ndarray:
"""Generate anchor boxes for a given image size (RetinaFace specific).
Args:
image_size (Tuple[int, int]): Input image size (width, height). Defaults to (640, 640).
image_size: Input image size as (width, height). Defaults to (640, 640).
Returns:
np.ndarray: Anchor box coordinates as a NumPy array with shape (num_anchors, 4).
Anchor box coordinates as a numpy array with shape (num_anchors, 4).
"""
# RetinaFace FPN strides and corresponding anchor sizes per level
steps = [8, 16, 32]
min_sizes = [[16, 32], [64, 128], [256, 512]]
@@ -85,16 +93,15 @@ def generate_anchors(image_size: Tuple[int, int] = (640, 640)) -> np.ndarray:
return output
def non_max_suppression(dets: np.ndarray, threshold: float) -> List[int]:
"""
Apply Non-Maximum Suppression (NMS) to reduce overlapping bounding boxes based on a threshold.
def non_max_suppression(dets: np.ndarray, threshold: float) -> list[int]:
"""Apply Non-Maximum Suppression (NMS) to reduce overlapping bounding boxes.
Args:
dets (np.ndarray): Array of detections with each row as [x1, y1, x2, y2, score].
threshold (float): IoU threshold for suppression.
dets: Array of detections with each row as [x1, y1, x2, y2, score].
threshold: IoU threshold for suppression.
Returns:
List[int]: Indices of bounding boxes retained after suppression.
Indices of bounding boxes retained after suppression.
"""
x1 = dets[:, 0]
y1 = dets[:, 1]
@@ -125,18 +132,22 @@ def non_max_suppression(dets: np.ndarray, threshold: float) -> List[int]:
return keep
def decode_boxes(loc: np.ndarray, priors: np.ndarray, variances: Optional[List[float]] = None) -> np.ndarray:
"""
Decode locations from predictions using priors to undo
the encoding done for offset regression at train time (RetinaFace specific).
def decode_boxes(
loc: np.ndarray,
priors: np.ndarray,
variances: list[float] | None = None,
) -> np.ndarray:
"""Decode locations from predictions using priors (RetinaFace specific).
Undoes the encoding done for offset regression at train time.
Args:
loc (np.ndarray): Location predictions for loc layers, shape: [num_priors, 4]
priors (np.ndarray): Prior boxes in center-offset form, shape: [num_priors, 4]
variances (Optional[List[float]]): Variances of prior boxes. Defaults to [0.1, 0.2].
loc: Location predictions for loc layers, shape: [num_priors, 4].
priors: Prior boxes in center-offset form, shape: [num_priors, 4].
variances: Variances of prior boxes. Defaults to [0.1, 0.2].
Returns:
np.ndarray: Decoded bounding box predictions with shape [num_priors, 4]
Decoded bounding box predictions with shape [num_priors, 4].
"""
if variances is None:
variances = [0.1, 0.2]
@@ -155,18 +166,19 @@ def decode_boxes(loc: np.ndarray, priors: np.ndarray, variances: Optional[List[f
def decode_landmarks(
predictions: np.ndarray, priors: np.ndarray, variances: Optional[List[float]] = None
predictions: np.ndarray,
priors: np.ndarray,
variances: list[float] | None = None,
) -> np.ndarray:
"""
Decode landmark predictions using prior boxes (RetinaFace specific).
"""Decode landmark predictions using prior boxes (RetinaFace specific).
Args:
predictions (np.ndarray): Landmark predictions, shape: [num_priors, 10]
priors (np.ndarray): Prior boxes, shape: [num_priors, 4]
variances (Optional[List[float]]): Scaling factors for landmark offsets. Defaults to [0.1, 0.2].
predictions: Landmark predictions, shape: [num_priors, 10].
priors: Prior boxes, shape: [num_priors, 4].
variances: Scaling factors for landmark offsets. Defaults to [0.1, 0.2].
Returns:
np.ndarray: Decoded landmarks, shape: [num_priors, 10]
Decoded landmarks, shape: [num_priors, 10].
"""
if variances is None:
variances = [0.1, 0.2]
@@ -187,18 +199,21 @@ def decode_landmarks(
return landmarks
def distance2bbox(points: np.ndarray, distance: np.ndarray, max_shape: Optional[Tuple[int, int]] = None) -> np.ndarray:
"""
Decode distance prediction to bounding box (SCRFD specific).
def distance2bbox(
points: np.ndarray,
distance: np.ndarray,
max_shape: tuple[int, int] | None = None,
) -> np.ndarray:
"""Decode distance prediction to bounding box (SCRFD specific).
Args:
points (np.ndarray): Anchor points with shape (n, 2), [x, y].
distance (np.ndarray): Distance from the given point to 4
boundaries (left, top, right, bottom) with shape (n, 4).
max_shape (Optional[Tuple[int, int]]): Shape of the image (height, width) for clipping.
points: Anchor points with shape (n, 2), [x, y].
distance: Distance from the given point to 4 boundaries
(left, top, right, bottom) with shape (n, 4).
max_shape: Shape of the image (height, width) for clipping.
Returns:
np.ndarray: Decoded bounding boxes with shape (n, 4) as [x1, y1, x2, y2].
Decoded bounding boxes with shape (n, 4) as [x1, y1, x2, y2].
"""
x1 = points[:, 0] - distance[:, 0]
y1 = points[:, 1] - distance[:, 1]
@@ -219,17 +234,20 @@ def distance2bbox(points: np.ndarray, distance: np.ndarray, max_shape: Optional[
return np.stack([x1, y1, x2, y2], axis=-1)
def distance2kps(points: np.ndarray, distance: np.ndarray, max_shape: Optional[Tuple[int, int]] = None) -> np.ndarray:
"""
Decode distance prediction to keypoints (SCRFD specific).
def distance2kps(
points: np.ndarray,
distance: np.ndarray,
max_shape: tuple[int, int] | None = None,
) -> np.ndarray:
"""Decode distance prediction to keypoints (SCRFD specific).
Args:
points (np.ndarray): Anchor points with shape (n, 2), [x, y].
distance (np.ndarray): Distance from the given point to keypoints with shape (n, 2k).
max_shape (Optional[Tuple[int, int]]): Shape of the image (height, width) for clipping.
points: Anchor points with shape (n, 2), [x, y].
distance: Distance from the given point to keypoints with shape (n, 2k).
max_shape: Shape of the image (height, width) for clipping.
Returns:
np.ndarray: Decoded keypoints with shape (n, 2k).
Decoded keypoints with shape (n, 2k).
"""
preds = []
for i in range(0, distance.shape[1], 2):
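
These helpers compose: anchor points plus per-anchor distances become boxes, which NMS then filters. A small self-contained sketch of that flow with toy values, following the shapes in the docstrings above (the import path is assumed):

import numpy as np

from uniface.detection.utils import distance2bbox, non_max_suppression  # import path assumed

points = np.array([[100.0, 100.0], [104.0, 102.0]])  # anchor centers, shape (n, 2)
distance = np.array([[10.0, 10.0, 20.0, 20.0],       # left, top, right, bottom
                     [14.0, 12.0, 16.0, 18.0]])      # shape (n, 4)

boxes = distance2bbox(points, distance)              # both decode to [90, 90, 120, 120]
scores = np.array([[0.9], [0.8]])
detections = np.hstack([boxes, scores]).astype(np.float32)

keep = non_max_suppression(detections, threshold=0.4)
print(keep)  # [0] -- the duplicate box with the lower score is suppressed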

View File

@@ -3,7 +3,6 @@
# GitHub: https://github.com/yakhyo
from enum import Enum
from typing import Dict
# fmt: off
@@ -142,7 +141,7 @@ class MiniFASNetWeights(str, Enum):
V2 = "minifasnet_v2"
MODEL_URLS: Dict[Enum, str] = {
MODEL_URLS: dict[Enum, str] = {
# RetinaFace
RetinaFaceWeights.MNET_025: 'https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv1_0.25.onnx',
RetinaFaceWeights.MNET_050: 'https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv1_0.50.onnx',
@@ -191,7 +190,7 @@ MODEL_URLS: Dict[Enum, str] = {
MiniFASNetWeights.V2: 'https://github.com/yakhyo/face-anti-spoofing/releases/download/weights/MiniFASNetV2.onnx',
}
MODEL_SHA256: Dict[Enum, str] = {
MODEL_SHA256: dict[Enum, str] = {
# RetinaFace
RetinaFaceWeights.MNET_025: 'b7a7acab55e104dce6f32cdfff929bd83946da5cd869b9e2e9bdffafd1b7e4a5',
RetinaFaceWeights.MNET_050: 'd8977186f6037999af5b4113d42ba77a84a6ab0c996b17c713cc3d53b88bfc37',
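
Both tables are keyed by the same enum members, so a weight's download URL and its expected checksum can be looked up side by side; a quick sketch (import path assumed):

from uniface.constants import MODEL_SHA256, MODEL_URLS, RetinaFaceWeights  # import path assumed

weight = RetinaFaceWeights.MNET_025
print(MODEL_URLS[weight])    # release URL of the .onnx file
print(MODEL_SHA256[weight])  # expected SHA-256 digest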

View File

@@ -2,8 +2,9 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
from typing import Any, Dict, List
from typing import Any
import numpy as np
@@ -14,37 +15,40 @@ from .retinaface import RetinaFace
from .scrfd import SCRFD
from .yolov5 import YOLOv5Face
# Global cache for detector instances
_detector_cache: Dict[str, BaseDetector] = {}
# Global cache for detector instances (keyed by method name + config hash)
_detector_cache: dict[str, BaseDetector] = {}
def detect_faces(image: np.ndarray, method: str = 'retinaface', **kwargs) -> List[Face]:
"""
High-level face detection function.
def detect_faces(image: np.ndarray, method: str = 'retinaface', **kwargs: Any) -> list[Face]:
"""High-level face detection function.
Detects faces in an image using the specified detection method.
Detector instances are cached for repeated calls with the same configuration.
Args:
image (np.ndarray): Input image as numpy array.
method (str): Detection method to use. Options: 'retinaface', 'scrfd', 'yolov5face'.
image: Input image as numpy array with shape (H, W, C) in BGR format.
method: Detection method to use. Options: 'retinaface', 'scrfd', 'yolov5face'.
**kwargs: Additional arguments passed to the detector.
Returns:
List[Face]: A list of Face objects, each containing:
- bbox (np.ndarray): [x1, y1, x2, y2] bounding box coordinates.
- confidence (float): The confidence score of the detection.
- landmarks (np.ndarray): 5-point facial landmarks with shape (5, 2).
A list of Face objects, each containing:
- bbox: [x1, y1, x2, y2] bounding box coordinates.
- confidence: The confidence score of the detection.
- landmarks: 5-point facial landmarks with shape (5, 2).
Example:
>>> from uniface import detect_faces
>>> image = cv2.imread("your_image.jpg")
>>> faces = detect_faces(image, method='retinaface', conf_thresh=0.8)
>>> import cv2
>>> image = cv2.imread('your_image.jpg')
>>> faces = detect_faces(image, method='retinaface', confidence_threshold=0.8)
>>> for face in faces:
... print(f"Found face with confidence: {face.confidence}")
... print(f"BBox: {face.bbox}")
... print(f'Found face with confidence: {face.confidence}')
... print(f'BBox: {face.bbox}')
"""
method_name = method.lower()
sorted_kwargs = sorted(kwargs.items())
cache_key = f'{method_name}_{str(sorted_kwargs)}'
cache_key = f'{method_name}_{sorted_kwargs!s}'
if cache_key not in _detector_cache:
# Pass kwargs to create the correctly configured detector
@@ -54,49 +58,36 @@ def detect_faces(image: np.ndarray, method: str = 'retinaface', **kwargs) -> Lis
return detector.detect(image)
def create_detector(method: str = 'retinaface', **kwargs) -> BaseDetector:
"""
Factory function to create face detectors.
def create_detector(method: str = 'retinaface', **kwargs: Any) -> BaseDetector:
"""Factory function to create face detectors.
Args:
method (str): Detection method. Options:
method: Detection method. Options:
- 'retinaface': RetinaFace detector (default)
- 'scrfd': SCRFD detector (fast and accurate)
- 'yolov5face': YOLOv5-Face detector (accurate with landmarks)
**kwargs: Detector-specific parameters
**kwargs: Detector-specific parameters.
Returns:
BaseDetector: Initialized detector instance
Initialized detector instance.
Raises:
ValueError: If method is not supported
ValueError: If method is not supported.
Examples:
Example:
>>> # Basic usage
>>> detector = create_detector('retinaface')
>>> # SCRFD detector with custom parameters
>>> from uniface.constants import SCRFDWeights
>>> detector = create_detector(
... 'scrfd',
... model_name=SCRFDWeights.SCRFD_10G_KPS,
... conf_thresh=0.8,
... input_size=(640, 640)
... 'scrfd', model_name=SCRFDWeights.SCRFD_10G_KPS, confidence_threshold=0.8, input_size=(640, 640)
... )
>>> # RetinaFace detector
>>> from uniface.constants import RetinaFaceWeights
>>> detector = create_detector(
... 'retinaface',
... model_name=RetinaFaceWeights.MNET_V2,
... conf_thresh=0.8,
... nms_thresh=0.4
... )
>>> # YOLOv5-Face detector
>>> detector = create_detector(
... 'yolov5face',
... model_name=YOLOv5FaceWeights.YOLOV5S,
... conf_thresh=0.25,
... nms_thresh=0.45
... 'retinaface', model_name=RetinaFaceWeights.MNET_V2, confidence_threshold=0.8, nms_threshold=0.4
... )
"""
method = method.lower()
@@ -115,12 +106,12 @@ def create_detector(method: str = 'retinaface', **kwargs) -> BaseDetector:
raise ValueError(f"Unsupported detection method: '{method}'. Available methods: {available_methods}")
def list_available_detectors() -> Dict[str, Dict[str, Any]]:
"""
List all available detection methods with their descriptions and parameters.
def list_available_detectors() -> dict[str, dict[str, Any]]:
"""List all available detection methods with their descriptions and parameters.
Returns:
Dict[str, Dict[str, Any]]: Dictionary of detector information
Dictionary mapping detector names to their information including
description, landmark support, paper reference, and default parameters.
"""
return {
'retinaface': {
@@ -129,8 +120,8 @@ def list_available_detectors() -> Dict[str, Dict[str, Any]]:
'paper': 'https://arxiv.org/abs/1905.00641',
'default_params': {
'model_name': 'mnet_v2',
'conf_thresh': 0.5,
'nms_thresh': 0.4,
'confidence_threshold': 0.5,
'nms_threshold': 0.4,
'input_size': (640, 640),
},
},
@@ -140,8 +131,8 @@ def list_available_detectors() -> Dict[str, Dict[str, Any]]:
'paper': 'https://arxiv.org/abs/2105.04714',
'default_params': {
'model_name': 'scrfd_10g_kps',
'conf_thresh': 0.5,
'nms_thresh': 0.4,
'confidence_threshold': 0.5,
'nms_threshold': 0.4,
'input_size': (640, 640),
},
},
@@ -151,8 +142,8 @@ def list_available_detectors() -> Dict[str, Dict[str, Any]]:
'paper': 'https://arxiv.org/abs/2105.12931',
'default_params': {
'model_name': 'yolov5s_face',
'conf_thresh': 0.25,
'nms_thresh': 0.45,
'confidence_threshold': 0.25,
'nms_threshold': 0.45,
'input_size': 640,
},
},
@@ -160,11 +151,11 @@ def list_available_detectors() -> Dict[str, Dict[str, Any]]:
__all__ = [
'detect_faces',
'create_detector',
'list_available_detectors',
'SCRFD',
'BaseDetector',
'RetinaFace',
'YOLOv5Face',
'BaseDetector',
'create_detector',
'detect_faces',
'list_available_detectors',
]
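
Because the cache key is the method name plus the sorted kwargs, repeated high-level calls with an identical configuration reuse one detector instance; a short sketch (top-level re-export of `detect_faces` assumed, as in the docstring example):

import cv2

from uniface import detect_faces

image = cv2.imread('your_image.jpg')

# First call builds and caches a RetinaFace detector for this exact config.
faces = detect_faces(image, method='retinaface', confidence_threshold=0.8)

# Identical method + kwargs -> cache hit, no second model load.
faces = detect_faces(image, method='retinaface', confidence_threshold=0.8)

for face in faces:
    print(face.bbox, face.confidence)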

View File

@@ -2,40 +2,51 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
from abc import ABC, abstractmethod
from typing import Any, Dict, List
from typing import Any
import numpy as np
from uniface.face import Face
__all__ = ['BaseDetector']
class BaseDetector(ABC):
"""
Abstract base class for all face detectors.
"""Abstract base class for all face detectors.
This class defines the interface that all face detectors must implement,
ensuring consistency across different detection methods.
Attributes:
config: Dictionary containing detector configuration parameters.
_supports_landmarks: Flag indicating if detector supports landmark detection.
"""
def __init__(self, **kwargs):
"""Initialize the detector with configuration parameters."""
self.config = kwargs
@abstractmethod
def detect(self, image: np.ndarray, **kwargs) -> List[Face]:
"""
Detect faces in an image.
def __init__(self, **kwargs: Any) -> None:
"""Initialize the detector with configuration parameters.
Args:
image (np.ndarray): Input image as numpy array with shape (H, W, C)
**kwargs: Additional detection parameters
**kwargs: Detector-specific configuration parameters.
"""
self.config: dict[str, Any] = kwargs
self._supports_landmarks: bool = False
@abstractmethod
def detect(self, image: np.ndarray, **kwargs: Any) -> list[Face]:
"""Detect faces in an image.
Args:
image: Input image as numpy array with shape (H, W, C) in BGR format.
**kwargs: Additional detection parameters.
Returns:
List[Face]: List of detected Face objects, each containing:
- bbox (np.ndarray): Bounding box coordinates with shape (4,) as [x1, y1, x2, y2]
- confidence (float): Detection confidence score (0.0 to 1.0)
- landmarks (np.ndarray): Facial landmarks with shape (5, 2) for 5-point landmarks
List of detected Face objects, each containing:
- bbox: Bounding box coordinates with shape (4,) as [x1, y1, x2, y2].
- confidence: Detection confidence score (0.0 to 1.0).
- landmarks: Facial landmarks with shape (5, 2) for 5-point landmarks.
Example:
>>> faces = detector.detect(image)
@@ -44,34 +55,29 @@ class BaseDetector(ABC):
... confidence = face.confidence # float
... landmarks = face.landmarks # np.ndarray with shape (5, 2)
"""
pass
@abstractmethod
def preprocess(self, image: np.ndarray) -> np.ndarray:
"""
Preprocess input image for detection.
"""Preprocess input image for detection.
Args:
image (np.ndarray): Input image
image: Input image with shape (H, W, C).
Returns:
np.ndarray: Preprocessed image tensor
Preprocessed image tensor ready for inference.
"""
pass
@abstractmethod
def postprocess(self, outputs, **kwargs) -> Any:
"""
Postprocess model outputs to get final detections.
def postprocess(self, outputs: Any, **kwargs: Any) -> Any:
"""Postprocess model outputs to get final detections.
Args:
outputs: Raw model outputs
**kwargs: Additional postprocessing parameters
outputs: Raw model outputs.
**kwargs: Additional postprocessing parameters.
Returns:
Any: Processed outputs (implementation-specific format, typically tuple of arrays)
Processed outputs (implementation-specific format, typically tuple of arrays).
"""
pass
def __str__(self) -> str:
"""String representation of the detector."""
@@ -83,20 +89,18 @@ class BaseDetector(ABC):
@property
def supports_landmarks(self) -> bool:
"""
Whether this detector supports landmark detection.
"""Whether this detector supports landmark detection.
Returns:
bool: True if landmarks are supported, False otherwise
True if landmarks are supported, False otherwise.
"""
return hasattr(self, '_supports_landmarks') and self._supports_landmarks
def get_info(self) -> Dict[str, Any]:
"""
Get detector information and configuration.
def get_info(self) -> dict[str, Any]:
"""Get detector information and configuration.
Returns:
Dict[str, Any]: Detector information
Dictionary containing detector name, landmark support, and config.
"""
return {
'name': self.__class__.__name__,
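
The contract is small enough to sketch a compliant stub. A hypothetical subclass, shown only to illustrate the three abstract methods and the `_supports_landmarks` flag (import paths and the `Face` field names follow the docstrings above and are assumptions):

from typing import Any

import numpy as np

from uniface.detection.base import BaseDetector  # import path assumed
from uniface.face import Face


class DummyDetector(BaseDetector):
    """Toy detector that reports a single full-frame 'face'."""

    def __init__(self, **kwargs: Any) -> None:
        super().__init__(**kwargs)
        self._supports_landmarks = False

    def preprocess(self, image: np.ndarray) -> np.ndarray:
        return image.astype(np.float32)[None]  # add batch dim -> (1, H, W, C)

    def postprocess(self, outputs: Any, **kwargs: Any) -> Any:
        return outputs  # nothing to decode in this stub

    def detect(self, image: np.ndarray, **kwargs: Any) -> list[Face]:
        h, w = image.shape[:2]
        bbox = np.array([0.0, 0.0, w, h], dtype=np.float32)
        return [Face(bbox=bbox, confidence=1.0, landmarks=np.zeros((5, 2), dtype=np.float32))]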

View File

@@ -2,7 +2,9 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Any, List, Literal, Tuple
from __future__ import annotations
from typing import Any, Literal
import numpy as np
@@ -32,8 +34,8 @@ class RetinaFace(BaseDetector):
Args:
model_name (RetinaFaceWeights): Model weights to use. Defaults to `RetinaFaceWeights.MNET_V2`.
conf_thresh (float): Confidence threshold for filtering detections. Defaults to 0.5.
nms_thresh (float): Non-maximum suppression (NMS) IoU threshold. Defaults to 0.4.
confidence_threshold (float): Confidence threshold for filtering detections. Defaults to 0.5.
nms_threshold (float): Non-maximum suppression (NMS) IoU threshold. Defaults to 0.4.
input_size (Tuple[int, int]): Fixed input size (width, height) if `dynamic_size=False`.
Defaults to (640, 640).
Note: Non-default sizes may cause slower inference and CoreML compatibility issues.
@@ -44,8 +46,8 @@ class RetinaFace(BaseDetector):
Attributes:
model_name (RetinaFaceWeights): Selected model variant.
conf_thresh (float): Threshold for confidence-based filtering.
nms_thresh (float): IoU threshold used for NMS.
confidence_threshold (float): Threshold for confidence-based filtering.
nms_threshold (float): IoU threshold used for NMS.
pre_nms_topk (int): Limit on proposals before applying NMS.
post_nms_topk (int): Limit on retained detections after NMS.
dynamic_size (bool): Flag indicating dynamic or static input sizing.
@@ -63,23 +65,23 @@ class RetinaFace(BaseDetector):
self,
*,
model_name: RetinaFaceWeights = RetinaFaceWeights.MNET_V2,
conf_thresh: float = 0.5,
nms_thresh: float = 0.4,
input_size: Tuple[int, int] = (640, 640),
confidence_threshold: float = 0.5,
nms_threshold: float = 0.4,
input_size: tuple[int, int] = (640, 640),
**kwargs: Any,
) -> None:
super().__init__(
model_name=model_name,
conf_thresh=conf_thresh,
nms_thresh=nms_thresh,
confidence_threshold=confidence_threshold,
nms_threshold=nms_threshold,
input_size=input_size,
**kwargs,
)
self._supports_landmarks = True # RetinaFace supports landmarks
self.model_name = model_name
self.conf_thresh = conf_thresh
self.nms_thresh = nms_thresh
self.confidence_threshold = confidence_threshold
self.nms_threshold = nms_threshold
self.input_size = input_size
# Advanced options from kwargs
@@ -88,8 +90,8 @@ class RetinaFace(BaseDetector):
self.dynamic_size = kwargs.get('dynamic_size', False)
Logger.info(
f'Initializing RetinaFace with model={self.model_name}, conf_thresh={self.conf_thresh}, '
f'nms_thresh={self.nms_thresh}, input_size={self.input_size}'
f'Initializing RetinaFace with model={self.model_name}, confidence_threshold={self.confidence_threshold}, '
f'nms_threshold={self.nms_threshold}, input_size={self.input_size}'
)
# Get path to model weights
@@ -105,14 +107,13 @@ class RetinaFace(BaseDetector):
self._initialize_model(self._model_path)
def _initialize_model(self, model_path: str) -> None:
"""
Initializes an ONNX model session from the given path.
"""Initialize an ONNX model session from the given path.
Args:
model_path (str): The file path to the ONNX model.
model_path: The file path to the ONNX model.
Raises:
RuntimeError: If the model fails to load, logs an error and raises an exception.
RuntimeError: If the model fails to load.
"""
try:
self.session = create_onnx_session(model_path)
@@ -137,14 +138,14 @@ class RetinaFace(BaseDetector):
image = np.expand_dims(image, axis=0) # Add batch dimension (1, C, H, W)
return image
def inference(self, input_tensor: np.ndarray) -> List[np.ndarray]:
def inference(self, input_tensor: np.ndarray) -> list[np.ndarray]:
"""Perform model inference on the preprocessed image tensor.
Args:
input_tensor (np.ndarray): Preprocessed input tensor.
input_tensor: Preprocessed input tensor with shape (1, C, H, W).
Returns:
Tuple[np.ndarray, np.ndarray]: Raw model outputs.
List of raw model outputs (location, confidence, landmarks).
"""
return self.session.run(self.output_names, {self.input_names: input_tensor})
@@ -155,7 +156,7 @@ class RetinaFace(BaseDetector):
max_num: int = 0,
metric: Literal['default', 'max'] = 'max',
center_weight: float = 2.0,
) -> List[Face]:
) -> list[Face]:
"""
Perform face detection on an input image and return bounding boxes and facial landmarks.
@@ -240,41 +241,43 @@ class RetinaFace(BaseDetector):
return faces
def postprocess(
self, outputs: List[np.ndarray], resize_factor: float, shape: Tuple[int, int]
) -> Tuple[np.ndarray, np.ndarray]:
"""
Process the model outputs into final detection results.
self,
outputs: list[np.ndarray],
resize_factor: float,
shape: tuple[int, int],
) -> tuple[np.ndarray, np.ndarray]:
"""Process the model outputs into final detection results.
Args:
outputs (List[np.ndarray]): Raw outputs from the detection model.
outputs: Raw outputs from the detection model containing:
- outputs[0]: Location predictions (bounding box coordinates).
- outputs[1]: Class confidence scores.
- outputs[2]: Landmark predictions.
resize_factor (float): Factor used to resize the input image during preprocessing.
shape (Tuple[int, int]): Original shape of the image as (height, width).
resize_factor: Factor used to resize the input image during preprocessing.
shape: Original shape of the image as (width, height).
Returns:
Tuple[np.ndarray, np.ndarray]: Processed results containing:
- detections (np.ndarray): Array of detected bounding boxes with confidence scores.
Shape: (num_detections, 5), where each row is [x_min, y_min, x_max, y_max, score].
- landmarks (np.ndarray): Array of detected facial landmarks.
Shape: (num_detections, 5, 2), where each row contains 5 landmark points (x, y).
A tuple containing:
- detections: Array of detected bounding boxes with confidence scores,
shape (num_detections, 5), each row is [x1, y1, x2, y2, score].
- landmarks: Array of detected facial landmarks,
shape (num_detections, 5, 2), each row contains 5 landmark points (x, y).
"""
loc, conf, landmarks = (
location_predictions, confidence_scores, landmark_predictions = (
outputs[0].squeeze(0),
outputs[1].squeeze(0),
outputs[2].squeeze(0),
)
# Decode boxes and landmarks
boxes = decode_boxes(loc, self._priors)
landmarks = decode_landmarks(landmarks, self._priors)
boxes = decode_boxes(location_predictions, self._priors)
landmarks = decode_landmarks(landmark_predictions, self._priors)
boxes, landmarks = self._scale_detections(boxes, landmarks, resize_factor, shape=(shape[0], shape[1]))
# Extract confidence scores for the face class
scores = conf[:, 1]
mask = scores > self.conf_thresh
scores = confidence_scores[:, 1]
mask = scores > self.confidence_threshold
# Filter by confidence threshold
boxes, landmarks, scores = boxes[mask], landmarks[mask], scores[mask]
@@ -285,7 +288,7 @@ class RetinaFace(BaseDetector):
# Apply NMS
detections = np.hstack((boxes, scores[:, np.newaxis])).astype(np.float32, copy=False)
keep = non_max_suppression(detections, self.nms_thresh)
keep = non_max_suppression(detections, self.nms_threshold)
detections, landmarks = detections[keep], landmarks[keep]
# Keep top-k detections
@@ -303,9 +306,9 @@ class RetinaFace(BaseDetector):
boxes: np.ndarray,
landmarks: np.ndarray,
resize_factor: float,
shape: Tuple[int, int],
) -> Tuple[np.ndarray, np.ndarray]:
# Scale bounding boxes and landmarks to the original image size.
shape: tuple[int, int],
) -> tuple[np.ndarray, np.ndarray]:
"""Scale bounding boxes and landmarks to the original image size."""
bbox_scale = np.array([shape[0], shape[1]] * 2)
boxes = boxes * bbox_scale / resize_factor
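
After the rename, construction spells the thresholds out in full; a short sketch mirroring the defaults in this diff (the detector import path is assumed):

import cv2

from uniface.constants import RetinaFaceWeights
from uniface.detection import RetinaFace  # import path assumed

detector = RetinaFace(
    model_name=RetinaFaceWeights.MNET_V2,
    confidence_threshold=0.5,  # was conf_thresh
    nms_threshold=0.4,         # was nms_thresh
    input_size=(640, 640),
)
image = cv2.imread('your_image.jpg')
faces = detector.detect(image)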

View File

@@ -2,7 +2,9 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Any, List, Literal, Tuple
from __future__ import annotations
from typing import Any, Literal
import numpy as np
@@ -29,8 +31,8 @@ class SCRFD(BaseDetector):
Args:
model_name (SCRFDWeights): Predefined model enum (e.g., `SCRFD_10G_KPS`).
Specifies the SCRFD variant to load. Defaults to SCRFD_10G_KPS.
conf_thresh (float): Confidence threshold for filtering detections. Defaults to 0.5.
nms_thresh (float): Non-Maximum Suppression threshold. Defaults to 0.4.
confidence_threshold (float): Confidence threshold for filtering detections. Defaults to 0.5.
nms_threshold (float): Non-Maximum Suppression threshold. Defaults to 0.4.
input_size (Tuple[int, int]): Input image size (width, height).
Defaults to (640, 640).
Note: Non-default sizes may cause slower inference and CoreML compatibility issues.
@@ -38,10 +40,10 @@ class SCRFD(BaseDetector):
Attributes:
model_name (SCRFDWeights): Selected model variant.
conf_thresh (float): Threshold used to filter low-confidence detections.
nms_thresh (float): Threshold used during NMS to suppress overlapping boxes.
confidence_threshold (float): Threshold used to filter low-confidence detections.
nms_threshold (float): Threshold used during NMS to suppress overlapping boxes.
input_size (Tuple[int, int]): Image size to which inputs are resized before inference.
_fmc (int): Number of feature map levels used in the model.
_num_feature_maps (int): Number of feature map levels used in the model.
_feat_stride_fpn (List[int]): Feature map strides corresponding to each detection level.
_num_anchors (int): Number of anchors per feature location.
_center_cache (Dict): Cached anchor centers for efficient forward passes.
@@ -56,35 +58,35 @@ class SCRFD(BaseDetector):
self,
*,
model_name: SCRFDWeights = SCRFDWeights.SCRFD_10G_KPS,
conf_thresh: float = 0.5,
nms_thresh: float = 0.4,
input_size: Tuple[int, int] = (640, 640),
confidence_threshold: float = 0.5,
nms_threshold: float = 0.4,
input_size: tuple[int, int] = (640, 640),
**kwargs: Any,
) -> None:
super().__init__(
model_name=model_name,
conf_thresh=conf_thresh,
nms_thresh=nms_thresh,
confidence_threshold=confidence_threshold,
nms_threshold=nms_threshold,
input_size=input_size,
**kwargs,
)
self._supports_landmarks = True # SCRFD supports landmarks
self.model_name = model_name
self.conf_thresh = conf_thresh
self.nms_thresh = nms_thresh
self.confidence_threshold = confidence_threshold
self.nms_threshold = nms_threshold
self.input_size = input_size
# ------- SCRFD model params ------
self._fmc = 3
self._num_feature_maps = 3
self._feat_stride_fpn = [8, 16, 32]
self._num_anchors = 2
self._center_cache = {}
# ---------------------------------
Logger.info(
f'Initializing SCRFD with model={self.model_name}, conf_thresh={self.conf_thresh}, '
f'nms_thresh={self.nms_thresh}, input_size={self.input_size}'
f'Initializing SCRFD with model={self.model_name}, confidence_threshold={self.confidence_threshold}, '
f'nms_threshold={self.nms_threshold}, input_size={self.input_size}'
)
# Get path to model weights
@@ -95,14 +97,13 @@ class SCRFD(BaseDetector):
self._initialize_model(self._model_path)
def _initialize_model(self, model_path: str) -> None:
"""
Initializes an ONNX model session from the given path.
"""Initialize an ONNX model session from the given path.
Args:
model_path (str): The file path to the ONNX model.
model_path: The file path to the ONNX model.
Raises:
RuntimeError: If the model fails to load, logs an error and raises an exception.
RuntimeError: If the model fails to load.
"""
try:
self.session = create_onnx_session(model_path)
@@ -113,14 +114,14 @@ class SCRFD(BaseDetector):
Logger.error(f"Failed to load model from '{model_path}': {e}", exc_info=True)
raise RuntimeError(f"Failed to initialize model session for '{model_path}'") from e
def preprocess(self, image: np.ndarray) -> Tuple[np.ndarray, Tuple[int, int]]:
def preprocess(self, image: np.ndarray) -> np.ndarray:
"""Preprocess image for inference.
Args:
image (np.ndarray): Input image
image: Input image with shape (H, W, C).
Returns:
Tuple[np.ndarray, Tuple[int, int]]: Preprocessed blob and input size
Preprocessed image tensor with shape (1, C, H, W).
"""
image = image.astype(np.float32)
image = (image - 127.5) / 127.5
@@ -129,29 +130,42 @@ class SCRFD(BaseDetector):
return image
def inference(self, input_tensor: np.ndarray) -> List[np.ndarray]:
def inference(self, input_tensor: np.ndarray) -> list[np.ndarray]:
"""Perform model inference on the preprocessed image tensor.
Args:
input_tensor (np.ndarray): Preprocessed input tensor.
input_tensor: Preprocessed input tensor with shape (1, C, H, W).
Returns:
Tuple[np.ndarray, np.ndarray]: Raw model outputs.
List of raw model outputs.
"""
return self.session.run(self.output_names, {self.input_names: input_tensor})
def postprocess(self, outputs: List[np.ndarray], image_size: Tuple[int, int]):
scores_list = []
def postprocess(
self,
outputs: list[np.ndarray],
image_size: tuple[int, int],
) -> tuple[list[np.ndarray], list[np.ndarray], list[np.ndarray]]:
"""Process model outputs into detection results.
Args:
outputs: Raw outputs from the detection model.
image_size: Size of the input image as (height, width).
Returns:
Tuple of (scores_list, bboxes_list, landmarks_list).
"""
scores_list: list[np.ndarray] = []
bboxes_list = []
kpss_list = []
image_size = image_size
fmc = self._fmc
num_feature_maps = self._num_feature_maps
for idx, stride in enumerate(self._feat_stride_fpn):
scores = outputs[idx]
bbox_preds = outputs[fmc + idx] * stride
kps_preds = outputs[2 * fmc + idx] * stride
bbox_preds = outputs[num_feature_maps + idx] * stride
kps_preds = outputs[2 * num_feature_maps + idx] * stride
# Generate anchors
fm_height = image_size[0] // stride
@@ -171,7 +185,7 @@ class SCRFD(BaseDetector):
if len(self._center_cache) < 100:
self._center_cache[cache_key] = anchor_centers
pos_indices = np.where(scores >= self.conf_thresh)[0]
pos_indices = np.where(scores >= self.confidence_threshold)[0]
if len(pos_indices) == 0:
continue
@@ -193,7 +207,7 @@ class SCRFD(BaseDetector):
max_num: int = 0,
metric: Literal['default', 'max'] = 'max',
center_weight: float = 2.0,
) -> List[Face]:
) -> list[Face]:
"""
Perform face detection on an input image and return bounding boxes and facial landmarks.
@@ -247,7 +261,7 @@ class SCRFD(BaseDetector):
pre_det = np.hstack((bboxes, scores)).astype(np.float32, copy=False)
pre_det = pre_det[order, :]
keep = non_max_suppression(pre_det, threshold=self.nms_thresh)
keep = non_max_suppression(pre_det, threshold=self.nms_threshold)
detections = pre_det[keep, :]
landmarks = landmarks[order, :, :]
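
The per-stride loop in `postprocess` builds a grid of anchor centers for each FPN level. A minimal sketch of how SCRFD-style centers are typically laid out, with `_num_anchors = 2` as above (a generic reconstruction for illustration, not the library's exact code):

import numpy as np

def anchor_centers(height: int, width: int, stride: int, num_anchors: int = 2) -> np.ndarray:
    """Grid of (x, y) centers for one FPN level, repeated per anchor."""
    ys, xs = np.mgrid[: height // stride, : width // stride]
    centers = np.stack([xs, ys], axis=-1).reshape(-1, 2).astype(np.float32) * stride
    return np.repeat(centers, num_anchors, axis=0)  # one row per anchor per location

print(anchor_centers(640, 640, stride=32).shape)  # (800, 2): 20 * 20 locations x 2 anchors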

View File

@@ -2,7 +2,7 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Any, List, Literal, Tuple
from typing import Any, Literal
import cv2
import numpy as np
@@ -30,8 +30,8 @@ class YOLOv5Face(BaseDetector):
Args:
model_name (YOLOv5FaceWeights): Predefined model enum (e.g., `YOLOV5S`).
Specifies the YOLOv5-Face variant to load. Defaults to YOLOV5S.
conf_thresh (float): Confidence threshold for filtering detections. Defaults to 0.6.
nms_thresh (float): Non-Maximum Suppression threshold. Defaults to 0.5.
confidence_threshold (float): Confidence threshold for filtering detections. Defaults to 0.6.
nms_threshold (float): Non-Maximum Suppression threshold. Defaults to 0.5.
input_size (int): Input image size. Defaults to 640.
Note: ONNX model is fixed at 640. Changing this will cause inference errors.
**kwargs: Advanced options:
@@ -39,8 +39,8 @@ class YOLOv5Face(BaseDetector):
Attributes:
model_name (YOLOv5FaceWeights): Selected model variant.
conf_thresh (float): Threshold used to filter low-confidence detections.
nms_thresh (float): Threshold used during NMS to suppress overlapping boxes.
confidence_threshold (float): Threshold used to filter low-confidence detections.
nms_threshold (float): Threshold used during NMS to suppress overlapping boxes.
input_size (int): Image size to which inputs are resized before inference.
max_det (int): Maximum number of detections to return.
_model_path (str): Absolute path to the downloaded/verified model weights.
@@ -54,15 +54,15 @@ class YOLOv5Face(BaseDetector):
self,
*,
model_name: YOLOv5FaceWeights = YOLOv5FaceWeights.YOLOV5S,
conf_thresh: float = 0.6,
nms_thresh: float = 0.5,
confidence_threshold: float = 0.6,
nms_threshold: float = 0.5,
input_size: int = 640,
**kwargs: Any,
) -> None:
super().__init__(
model_name=model_name,
conf_thresh=conf_thresh,
nms_thresh=nms_thresh,
confidence_threshold=confidence_threshold,
nms_threshold=nms_threshold,
input_size=input_size,
**kwargs,
)
@@ -75,16 +75,16 @@ class YOLOv5Face(BaseDetector):
)
self.model_name = model_name
self.conf_thresh = conf_thresh
self.nms_thresh = nms_thresh
self.confidence_threshold = confidence_threshold
self.nms_threshold = nms_threshold
self.input_size = input_size
# Advanced options from kwargs
self.max_det = kwargs.get('max_det', 750)
Logger.info(
f'Initializing YOLOv5Face with model={self.model_name}, conf_thresh={self.conf_thresh}, '
f'nms_thresh={self.nms_thresh}, input_size={self.input_size}'
f'Initializing YOLOv5Face with model={self.model_name}, confidence_threshold={self.confidence_threshold}, '
f'nms_threshold={self.nms_threshold}, input_size={self.input_size}'
)
# Get path to model weights
@@ -113,7 +113,7 @@ class YOLOv5Face(BaseDetector):
Logger.error(f"Failed to load model from '{model_path}': {e}", exc_info=True)
raise RuntimeError(f"Failed to initialize model session for '{model_path}'") from e
def preprocess(self, image: np.ndarray) -> Tuple[np.ndarray, float, Tuple[int, int]]:
def preprocess(self, image: np.ndarray) -> tuple[np.ndarray, float, tuple[int, int]]:
"""
Preprocess image for inference.
@@ -154,7 +154,7 @@ class YOLOv5Face(BaseDetector):
return img_batch, scale, (pad_w, pad_h)
def inference(self, input_tensor: np.ndarray) -> List[np.ndarray]:
def inference(self, input_tensor: np.ndarray) -> list[np.ndarray]:
"""Perform model inference on the preprocessed image tensor.
Args:
@@ -169,8 +169,8 @@ class YOLOv5Face(BaseDetector):
self,
predictions: np.ndarray,
scale: float,
padding: Tuple[int, int],
) -> Tuple[np.ndarray, np.ndarray]:
padding: tuple[int, int],
) -> tuple[np.ndarray, np.ndarray]:
"""
Postprocess model predictions.
@@ -190,7 +190,7 @@ class YOLOv5Face(BaseDetector):
predictions = predictions[0] # Remove batch dimension
# Filter by confidence
mask = predictions[:, 4] >= self.conf_thresh
mask = predictions[:, 4] >= self.confidence_threshold
predictions = predictions[mask]
if len(predictions) == 0:
@@ -207,7 +207,7 @@ class YOLOv5Face(BaseDetector):
# Apply NMS
detections_for_nms = np.hstack((boxes, scores[:, None])).astype(np.float32, copy=False)
keep = non_max_suppression(detections_for_nms, self.nms_thresh)
keep = non_max_suppression(detections_for_nms, self.nms_threshold)
if len(keep) == 0:
return np.array([]), np.array([])
@@ -260,7 +260,7 @@ class YOLOv5Face(BaseDetector):
max_num: int = 0,
metric: Literal['default', 'max'] = 'max',
center_weight: float = 2.0,
) -> List[Face]:
) -> list[Face]:
"""
Perform face detection on an input image and return bounding boxes and facial landmarks.
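
Since `preprocess` letterboxes the input, `postprocess` must undo the padding and the resize before NMS. A generic sketch of that inversion (the helper name and values are illustrative, not the library's exact code):

import numpy as np

def unletterbox_boxes(boxes: np.ndarray, scale: float, padding: tuple[int, int]) -> np.ndarray:
    """Map [x1, y1, x2, y2] boxes from letterboxed space back to the original image."""
    pad_w, pad_h = padding
    boxes = boxes.copy()
    boxes[:, [0, 2]] -= pad_w  # remove horizontal padding
    boxes[:, [1, 3]] -= pad_h  # remove vertical padding
    return boxes / scale       # undo the resize

boxes = np.array([[120.0, 80.0, 220.0, 200.0]])
print(unletterbox_boxes(boxes, scale=0.5, padding=(20, 0)))  # [[200. 160. 400. 400.]]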

View File

@@ -2,8 +2,9 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
from dataclasses import dataclass, fields
from typing import Optional
import numpy as np
@@ -29,6 +30,8 @@ class Face:
age: Predicted exact age in years (optional, from AgeGender model).
age_group: Predicted age range like "20-29" (optional, from FairFace).
race: Predicted race/ethnicity (optional, from FairFace).
emotion: Predicted emotion label (optional, from Emotion model).
emotion_confidence: Confidence score for emotion prediction (optional).
Properties:
sex: Gender as a human-readable string ("Female" or "Male").
@@ -42,13 +45,15 @@ class Face:
landmarks: np.ndarray
# Optional attributes
embedding: Optional[np.ndarray] = None
gender: Optional[int] = None
age: Optional[int] = None
age_group: Optional[str] = None
race: Optional[str] = None
embedding: np.ndarray | None = None
gender: int | None = None
age: int | None = None
age_group: str | None = None
race: str | None = None
emotion: str | None = None
emotion_confidence: float | None = None
def compute_similarity(self, other: 'Face') -> float:
def compute_similarity(self, other: Face) -> float:
"""Compute cosine similarity with another face."""
if self.embedding is None or other.embedding is None:
raise ValueError('Both faces must have embeddings for similarity computation')
@@ -59,7 +64,7 @@ class Face:
return {f.name: getattr(self, f.name) for f in fields(self)}
@property
def sex(self) -> Optional[str]:
def sex(self) -> str | None:
"""Get gender as a string label (Female or Male)."""
if self.gender is None:
return None
@@ -85,6 +90,8 @@ class Face:
parts.append(f'sex={self.sex}')
if self.race is not None:
parts.append(f'race={self.race}')
if self.emotion is not None:
parts.append(f'emotion={self.emotion}')
if self.embedding is not None:
parts.append(f'embedding_dim={self.embedding.shape[0]}')
return ', '.join(parts) + ')'
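
With the new optional fields, a populated `Face` carries everything downstream code needs, and two embedded faces compare directly; a small sketch (constructor fields follow the docstrings above; the embeddings are random stand-ins):

import numpy as np

from uniface.face import Face  # import path assumed

rng = np.random.default_rng(0)
a = Face(bbox=np.array([0, 0, 100, 100]), confidence=0.99, landmarks=np.zeros((5, 2)))
b = Face(bbox=np.array([10, 10, 110, 110]), confidence=0.97, landmarks=np.zeros((5, 2)))
a.embedding = rng.normal(size=512).astype(np.float32)
b.embedding = rng.normal(size=512).astype(np.float32)

print(a.compute_similarity(b))  # cosine similarity in [-1, 1]
print(a)  # repr now includes emotion/embedding_dim when those fields are set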

View File

@@ -2,21 +2,21 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Tuple, Union
from __future__ import annotations
import cv2
import numpy as np
from skimage.transform import SimilarityTransform
__all__ = [
'face_alignment',
'compute_similarity',
'bbox_center_alignment',
'compute_similarity',
'face_alignment',
'transform_points_2d',
]
# Reference alignment for facial landmarks (ArcFace)
# Standard 5-point facial landmark reference for ArcFace alignment (112x112)
reference_alignment: np.ndarray = np.array(
[
[38.2946, 51.6963],
@@ -29,22 +29,25 @@ reference_alignment: np.ndarray = np.array(
)
def estimate_norm(landmark: np.ndarray, image_size: Union[int, Tuple[int, int]] = 112) -> Tuple[np.ndarray, np.ndarray]:
"""
Estimate the normalization transformation matrix for facial landmarks.
def estimate_norm(
landmark: np.ndarray,
image_size: int | tuple[int, int] = 112,
) -> tuple[np.ndarray, np.ndarray]:
"""Estimate the normalization transformation matrix for facial landmarks.
Args:
landmark (np.ndarray): Array of shape (5, 2) representing the coordinates of the facial landmarks.
image_size (Union[int, Tuple[int, int]], optional): The size of the output image.
Can be an integer (for square images) or a tuple (width, height). Default is 112.
landmark: Array of shape (5, 2) representing the coordinates of the facial landmarks.
image_size: The size of the output image. Can be an integer (for square images)
or a tuple (width, height). Default is 112.
Returns:
np.ndarray: The 2x3 transformation matrix for aligning the landmarks.
np.ndarray: The 2x3 inverse transformation matrix for aligning the landmarks.
A tuple containing:
- The 2x3 transformation matrix for aligning the landmarks.
- The 2x3 inverse transformation matrix.
Raises:
AssertionError: If the input landmark array does not have the shape (5, 2)
or if image_size is not a multiple of 112 or 128.
or if image_size is not a multiple of 112 or 128.
"""
assert landmark.shape == (5, 2), 'Landmark array must have shape (5, 2).'
@@ -80,23 +83,23 @@ def estimate_norm(landmark: np.ndarray, image_size: Union[int, Tuple[int, int]]
def face_alignment(
image: np.ndarray,
landmark: np.ndarray,
image_size: Union[int, Tuple[int, int]] = 112,
) -> Tuple[np.ndarray, np.ndarray]:
"""
Align the face in the input image based on the given facial landmarks.
image_size: int | tuple[int, int] = 112,
) -> tuple[np.ndarray, np.ndarray]:
"""Align the face in the input image based on the given facial landmarks.
Args:
image (np.ndarray): Input image as a NumPy array.
landmark (np.ndarray): Array of shape (5, 2) representing the coordinates of the facial landmarks.
image_size (Union[int, Tuple[int, int]], optional): The size of the aligned output image.
Can be an integer (for square images) or a tuple (width, height). Default is 112.
image: Input image as a NumPy array with shape (H, W, C).
landmark: Array of shape (5, 2) representing the facial landmark coordinates.
image_size: The size of the aligned output image. Can be an integer
(for square images) or a tuple (width, height). Default is 112.
Returns:
np.ndarray: The aligned face as a NumPy array.
np.ndarray: The 2x3 transformation matrix used for alignment.
A tuple containing:
- The aligned face as a NumPy array.
- The 2x3 inverse transformation matrix used for alignment.
"""
# Get the transformation matrix
M, M_inv = estimate_norm(landmark, image_size)
transform_matrix, inverse_transform = estimate_norm(landmark, image_size)
# Handle both int and tuple for warpAffine output size
if isinstance(image_size, int):
@@ -105,44 +108,50 @@ def face_alignment(
output_size = image_size
# Warp the input image to align the face
warped = cv2.warpAffine(image, M, output_size, borderValue=0.0)
warped = cv2.warpAffine(image, transform_matrix, output_size, borderValue=0.0)
return warped, M_inv
return warped, inverse_transform
def compute_similarity(feat1: np.ndarray, feat2: np.ndarray, normalized: bool = False) -> np.float32:
"""Computing Similarity between two faces.
"""Compute cosine similarity between two face embeddings.
Args:
feat1 (np.ndarray): First embedding.
feat2 (np.ndarray): Second embedding.
normalized (bool): Set True if the embeddings are already L2 normalized.
feat1: First embedding vector.
feat2: Second embedding vector.
normalized: Set True if the embeddings are already L2 normalized.
Returns:
np.float32: Cosine similarity.
Cosine similarity score in range [-1, 1].
"""
feat1 = feat1.ravel()
feat2 = feat2.ravel()
if normalized:
return np.dot(feat1, feat2)
else:
return np.dot(feat1, feat2) / (np.linalg.norm(feat1) * np.linalg.norm(feat2) + 1e-5)
# Add small epsilon to prevent division by zero
return np.dot(feat1, feat2) / (np.linalg.norm(feat1) * np.linalg.norm(feat2) + 1e-5)
def bbox_center_alignment(image, center, output_size, scale, rotation):
"""
Apply center-based alignment, scaling, and rotation to an image.
def bbox_center_alignment(
image: np.ndarray,
center: tuple[float, float],
output_size: int,
scale: float,
rotation: float,
) -> tuple[np.ndarray, np.ndarray]:
"""Apply center-based alignment, scaling, and rotation to an image.
Args:
image (np.ndarray): Input image.
center (Tuple[float, float]): Center point (e.g., face center from bbox).
output_size (int): Desired output image size (square).
scale (float): Scaling factor to zoom in/out.
rotation (float): Rotation angle in degrees (clockwise).
image: Input image with shape (H, W, C).
center: Center point (x, y), e.g., face center from bbox.
output_size: Desired output image size (square).
scale: Scaling factor to zoom in/out.
rotation: Rotation angle in degrees (clockwise).
Returns:
cropped (np.ndarray): Aligned and cropped image.
M (np.ndarray): 2x3 affine transform matrix used.
A tuple containing:
- Aligned and cropped image with shape (output_size, output_size, C).
- 2x3 affine transform matrix used.
"""
# Convert rotation from degrees to radians
@@ -175,15 +184,14 @@ def bbox_center_alignment(image, center, output_size, scale, rotation):
def transform_points_2d(points: np.ndarray, transform: np.ndarray) -> np.ndarray:
"""
Apply a 2D affine transformation to an array of 2D points.
"""Apply a 2D affine transformation to an array of 2D points.
Args:
points (np.ndarray): An (N, 2) array of 2D points.
transform (np.ndarray): A (2, 3) affine transformation matrix.
points: An (N, 2) array of 2D points.
transform: A (2, 3) affine transformation matrix.
Returns:
np.ndarray: Transformed (N, 2) array of points.
Transformed (N, 2) array of points.
"""
transformed = np.zeros_like(points, dtype=np.float32)
for i in range(points.shape[0]):
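
`transform_points_2d` applies the affine map point by point; the same result falls out of a single matrix product in homogeneous coordinates, which makes a handy equivalence check. A sketch under the same (N, 2) points / (2, 3) matrix shapes:

import numpy as np

def transform_points_2d_vectorized(points: np.ndarray, transform: np.ndarray) -> np.ndarray:
    """Apply a (2, 3) affine matrix to (N, 2) points via homogeneous coordinates."""
    ones = np.ones((points.shape[0], 1), dtype=np.float32)
    homogeneous = np.hstack([points.astype(np.float32), ones])  # (N, 3)
    return homogeneous @ transform.T                            # (N, 2)

points = np.array([[0.0, 0.0], [10.0, 5.0]])
M = np.array([[1.0, 0.0, 2.0],   # identity rotation/scale,
              [0.0, 1.0, 3.0]])  # translation by (2, 3)
print(transform_points_2d_vectorized(points, M))  # [[ 2.  3.] [12.  8.]]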

View File

@@ -34,10 +34,7 @@ def create_gaze_estimator(method: str = 'mobilegaze', **kwargs) -> BaseGazeEstim
>>> # Create with MobileNetV2 backbone
>>> from uniface.constants import GazeWeights
>>> estimator = create_gaze_estimator(
... 'mobilegaze',
... model_name=GazeWeights.MOBILENET_V2
... )
>>> estimator = create_gaze_estimator('mobilegaze', model_name=GazeWeights.MOBILENET_V2)
>>> # Use the estimator
>>> pitch, yaw = estimator.estimate(face_crop)
@@ -51,4 +48,4 @@ def create_gaze_estimator(method: str = 'mobilegaze', **kwargs) -> BaseGazeEstim
raise ValueError(f"Unsupported gaze estimation method: '{method}'. Available: {available}")
__all__ = ['create_gaze_estimator', 'MobileGaze', 'BaseGazeEstimator']
__all__ = ['BaseGazeEstimator', 'MobileGaze', 'create_gaze_estimator']

View File

@@ -3,7 +3,6 @@
# GitHub: https://github.com/yakhyo
from abc import ABC, abstractmethod
from typing import Tuple
import numpy as np
@@ -54,7 +53,7 @@ class BaseGazeEstimator(ABC):
raise NotImplementedError('Subclasses must implement the preprocess method.')
@abstractmethod
def postprocess(self, outputs: Tuple[np.ndarray, np.ndarray]) -> Tuple[float, float]:
def postprocess(self, outputs: tuple[np.ndarray, np.ndarray]) -> tuple[float, float]:
"""
Postprocess raw model outputs into gaze angles.
@@ -71,7 +70,7 @@ class BaseGazeEstimator(ABC):
raise NotImplementedError('Subclasses must implement the postprocess method.')
@abstractmethod
def estimate(self, face_image: np.ndarray) -> Tuple[float, float]:
def estimate(self, face_image: np.ndarray) -> tuple[float, float]:
"""
Perform end-to-end gaze estimation on a face image.
@@ -91,11 +90,11 @@ class BaseGazeEstimator(ABC):
Example:
>>> estimator = create_gaze_estimator()
>>> pitch, yaw = estimator.estimate(face_crop)
>>> print(f"Looking: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°")
>>> print(f'Looking: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°')
"""
raise NotImplementedError('Subclasses must implement the estimate method.')
def __call__(self, face_image: np.ndarray) -> Tuple[float, float]:
def __call__(self, face_image: np.ndarray) -> tuple[float, float]:
"""
Provides a convenient, callable shortcut for the `estimate` method.

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Tuple
import cv2
import numpy as np
@@ -54,17 +53,17 @@ class MobileGaze(BaseGazeEstimator):
>>> # Detect faces and estimate gaze for each
>>> faces = detector.detect(image)
>>> for face in faces:
... bbox = face['bbox']
... bbox = face.bbox
... x1, y1, x2, y2 = map(int, bbox[:4])
... face_crop = image[y1:y2, x1:x2]
... pitch, yaw = gaze_estimator.estimate(face_crop)
... print(f"Gaze: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°")
... print(f'Gaze: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°')
"""
def __init__(
self,
model_name: GazeWeights = GazeWeights.RESNET34,
input_size: Tuple[int, int] = (448, 448),
input_size: tuple[int, int] = (448, 448),
) -> None:
Logger.info(f'Initializing MobileGaze with model={model_name}, input_size={input_size}')
@@ -143,7 +142,7 @@ class MobileGaze(BaseGazeEstimator):
e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
return e_x / e_x.sum(axis=1, keepdims=True)
def postprocess(self, outputs: Tuple[np.ndarray, np.ndarray]) -> Tuple[np.ndarray, np.ndarray]:
def postprocess(self, outputs: tuple[np.ndarray, np.ndarray]) -> tuple[np.ndarray, np.ndarray]:
"""
Postprocess raw model outputs into gaze angles.
@@ -173,7 +172,7 @@ class MobileGaze(BaseGazeEstimator):
return pitch, yaw
def estimate(self, face_image: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
def estimate(self, face_image: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
"""
Perform end-to-end gaze estimation on a face image.
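
The `_softmax` above feeds a classification-as-regression decode: per-bin probabilities are collapsed into a continuous angle by taking their expectation. A generic sketch of that idea (the bin count and angle range are illustrative assumptions, not values read from this diff):

import numpy as np

def decode_angle(logits: np.ndarray, num_bins: int = 90, bin_width: float = 4.0) -> np.ndarray:
    """Expected angle in degrees from per-bin logits with shape (batch, num_bins)."""
    e_x = np.exp(logits - np.max(logits, axis=1, keepdims=True))
    probs = e_x / e_x.sum(axis=1, keepdims=True)
    bin_centers = np.arange(num_bins) * bin_width - (num_bins * bin_width) / 2
    return probs @ bin_centers  # probability-weighted sum over bins

logits = np.zeros((1, 90))
logits[0, 60] = 10.0           # sharp peak at bin 60
print(decode_angle(logits))    # ~[60.] degrees (60 * 4 - 180)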

View File

@@ -25,4 +25,4 @@ def create_landmarker(method: str = '2d106det', **kwargs) -> BaseLandmarker:
raise ValueError(f"Unsupported method: '{method}'. Available: {available}")
__all__ = ['create_landmarker', 'Landmark106', 'BaseLandmarker']
__all__ = ['BaseLandmarker', 'Landmark106', 'create_landmarker']

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Tuple
import cv2
import numpy as np
@@ -46,7 +45,7 @@ class Landmark106(BaseLandmarker):
def __init__(
self,
model_name: LandmarkWeights = LandmarkWeights.DEFAULT,
input_size: Tuple[int, int] = (192, 192),
input_size: tuple[int, int] = (192, 192),
) -> None:
Logger.info(f'Initializing Facial Landmark with model={model_name}, input_size={input_size}')
self.input_size = input_size
@@ -85,7 +84,7 @@ class Landmark106(BaseLandmarker):
Logger.error(f"Failed to load landmark model from '{self.model_path}'", exc_info=True)
raise RuntimeError(f'Failed to initialize landmark model: {e}') from e
def preprocess(self, image: np.ndarray, bbox: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
def preprocess(self, image: np.ndarray, bbox: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
"""Prepares a face crop for inference.
This method takes a face bounding box, performs a center alignment to

View File

@@ -1,21 +1,41 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Logging utilities for UniFace.
This module provides a centralized logger for the UniFace library,
allowing users to enable verbose logging when debugging or developing.
"""
from __future__ import annotations
import logging
__all__ = ['Logger', 'enable_logging']
# Create logger for uniface
Logger = logging.getLogger('uniface')
Logger.setLevel(logging.WARNING) # Only show warnings/errors by default
Logger.addHandler(logging.NullHandler())
def enable_logging(level=logging.INFO):
"""
Enable verbose logging for uniface.
def enable_logging(level: int = logging.INFO) -> None:
"""Enable verbose logging for uniface.
Configures the logger to output messages to stdout with timestamps.
Call this function to see informational messages during model loading
and inference.
Args:
level: Logging level (logging.DEBUG, logging.INFO, etc.)
level: Logging level. Defaults to logging.INFO.
Common values: logging.DEBUG, logging.INFO, logging.WARNING.
Example:
>>> from uniface import enable_logging
>>> import logging
>>> enable_logging() # Show INFO logs
>>> enable_logging(level=logging.DEBUG) # Show DEBUG logs
"""
Logger.handlers.clear()
handler = logging.StreamHandler()

View File

@@ -2,6 +2,15 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Model weight management for UniFace.
This module handles downloading, caching, and verifying model weights
using SHA-256 checksums for integrity validation.
"""
from __future__ import annotations
from enum import Enum
import hashlib
import os
@@ -14,33 +23,32 @@ from uniface.log import Logger
__all__ = ['verify_model_weights']
def verify_model_weights(model_name: str, root: str = '~/.uniface/models') -> str:
"""
Ensure model weights are present, downloading and verifying them using SHA-256 if necessary.
def verify_model_weights(model_name: Enum, root: str = '~/.uniface/models') -> str:
"""Ensure model weights are present, downloading and verifying them if necessary.
Given a model identifier from an Enum class (e.g., `RetinaFaceWeights.MNET_V2`), this function checks if
the corresponding `.onnx` weight file exists locally. If not, it downloads the file from a predefined URL.
After download, the file's integrity is verified using a SHA-256 hash. If verification fails, the file is deleted
and an error is raised.
Given a model identifier from an Enum class (e.g., `RetinaFaceWeights.MNET_V2`),
this function checks if the corresponding weight file exists locally. If not,
it downloads the file from a predefined URL and verifies its integrity using
a SHA-256 hash.
Args:
model_name (Enum): Model weight identifier (e.g., `RetinaFaceWeights.MNET_V2`, `ArcFaceWeights.RESNET`, etc.).
root (str, optional): Directory to store or locate the model weights. Defaults to '~/.uniface/models'.
model_name: Model weight identifier enum (e.g., `RetinaFaceWeights.MNET_V2`).
root: Directory to store or locate the model weights.
Defaults to '~/.uniface/models'.
Returns:
str: Absolute path to the verified model weights file.
Absolute path to the verified model weights file.
Raises:
ValueError: If the model is unknown or SHA-256 verification fails.
ConnectionError: If downloading the file fails.
Examples:
>>> from uniface.models import RetinaFaceWeights, verify_model_weights
>>> verify_model_weights(RetinaFaceWeights.MNET_V2)
Example:
>>> from uniface.constants import RetinaFaceWeights
>>> from uniface.model_store import verify_model_weights
>>> path = verify_model_weights(RetinaFaceWeights.MNET_V2)
>>> print(path)
'/home/user/.uniface/models/retinaface_mnet_v2.onnx'
>>> verify_model_weights(RetinaFaceWeights.RESNET34, root='/custom/dir')
'/custom/dir/retinaface_r34.onnx'
"""
root = os.path.expanduser(root)
@@ -73,10 +81,16 @@ def verify_model_weights(model_name: str, root: str = '~/.uniface/models') -> st
return model_path
def download_file(url: str, dest_path: str) -> None:
"""Download a file from a URL in chunks and save it to the destination path."""
def download_file(url: str, dest_path: str, timeout: int = 30) -> None:
"""Download a file from a URL in chunks and save it to the destination path.
Args:
url: URL to download from.
dest_path: Local file path to save to.
timeout: Connection timeout in seconds. Defaults to 30.
"""
try:
response = requests.get(url, stream=True)
response = requests.get(url, stream=True, timeout=timeout)
response.raise_for_status()
with (
open(dest_path, 'wb') as file,
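
Verification pairs with the download: hash the file in chunks and compare against the `MODEL_SHA256` table. A minimal sketch of the hashing step (the helper name is hypothetical):

import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 without loading it all into memory."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

# if sha256_of_file(model_path) != MODEL_SHA256[model_name]:
#     raise ValueError('SHA-256 verification failed')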

View File

@@ -2,16 +2,23 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import List
"""ONNX Runtime utilities for UniFace.
This module provides helper functions for creating and managing ONNX Runtime
inference sessions with automatic hardware acceleration detection.
"""
from __future__ import annotations
import onnxruntime as ort
from uniface.log import Logger
__all__ = ['create_onnx_session', 'get_available_providers']
def get_available_providers() -> List[str]:
"""
Get list of available ONNX Runtime execution providers for the current platform.
def get_available_providers() -> list[str]:
"""Get list of available ONNX Runtime execution providers.
Automatically detects and prioritizes hardware acceleration:
- CoreML on Apple Silicon (M1/M2/M3/M4)
@@ -19,13 +26,12 @@ def get_available_providers() -> List[str]:
- CPU as fallback (always available)
Returns:
List[str]: Ordered list of execution providers to use
Ordered list of execution providers to use.
Examples:
Example:
>>> providers = get_available_providers()
>>> # On M4 Mac: ['CoreMLExecutionProvider', 'CPUExecutionProvider']
>>> # On Linux with CUDA: ['CUDAExecutionProvider', 'CPUExecutionProvider']
>>> # On CPU-only: ['CPUExecutionProvider']
"""
available = ort.get_available_providers()
providers = []
@@ -48,26 +54,28 @@ def get_available_providers() -> List[str]:
return providers
def create_onnx_session(model_path: str, providers: List[str] = None) -> ort.InferenceSession:
"""
Create an ONNX Runtime inference session with optimal provider selection.
def create_onnx_session(
model_path: str,
providers: list[str] | None = None,
) -> ort.InferenceSession:
"""Create an ONNX Runtime inference session with optimal provider selection.
Args:
model_path (str): Path to the ONNX model file
providers (List[str], optional): List of providers to use.
If None, automatically detects best available providers.
model_path: Path to the ONNX model file.
providers: List of execution providers to use. If None, automatically
detects best available providers.
Returns:
ort.InferenceSession: Configured ONNX Runtime session
Configured ONNX Runtime session.
Raises:
RuntimeError: If session creation fails
RuntimeError: If session creation fails.
Examples:
>>> session = create_onnx_session("model.onnx")
Example:
>>> session = create_onnx_session('model.onnx')
>>> # Automatically uses best available providers
>>> session = create_onnx_session("model.onnx", providers=["CPUExecutionProvider"])
>>> session = create_onnx_session('model.onnx', providers=['CPUExecutionProvider'])
>>> # Force CPU-only execution
"""
if providers is None:
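Provider selection follows a simple priority list: take whichever accelerators are available, then always append CPU as the fallback. A minimal sketch of the idea, independent of uniface's internal helper (the model path is a placeholder):

import onnxruntime as ort

def pick_providers() -> list[str]:
    available = ort.get_available_providers()
    # Prefer hardware acceleration when present; CPU is always a valid fallback.
    preferred = ['CoreMLExecutionProvider', 'CUDAExecutionProvider']
    providers = [p for p in preferred if p in available]
    providers.append('CPUExecutionProvider')
    return providers

session = ort.InferenceSession('model.onnx', providers=pick_providers())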

View File

@@ -2,7 +2,7 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Union
from __future__ import annotations
from uniface.constants import ParsingWeights
@@ -13,38 +13,29 @@ __all__ = ['BaseFaceParser', 'BiSeNet', 'create_face_parser']
def create_face_parser(
model_name: Union[str, ParsingWeights] = ParsingWeights.RESNET18,
model_name: str | ParsingWeights = ParsingWeights.RESNET18,
) -> BaseFaceParser:
"""
Factory function to create a face parsing model instance.
"""Factory function to create a face parsing model instance.
This function provides a convenient way to instantiate face parsing models
without directly importing the specific model classes. It supports both
string-based and enum-based model selection.
without directly importing the specific model classes.
Args:
model_name (Union[str, ParsingWeights]): The face parsing model to create.
Can be either a string or a ParsingWeights enum value.
Available options:
model_name: The face parsing model to create. Can be either a string
or a ParsingWeights enum value. Available options:
- 'parsing_resnet18' or ParsingWeights.RESNET18 (default)
- 'parsing_resnet34' or ParsingWeights.RESNET34
Returns:
BaseFaceParser: An instance of the requested face parsing model.
An instance of the requested face parsing model.
Raises:
ValueError: If the model_name is not recognized.
Examples:
>>> # Using enum
Example:
>>> from uniface.parsing import create_face_parser
>>> from uniface.constants import ParsingWeights
>>> parser = create_face_parser(ParsingWeights.RESNET18)
>>>
>>> # Using string
>>> parser = create_face_parser('parsing_resnet18')
>>>
>>> # Parse a face image
>>> mask = parser.parse(face_crop)
"""
# Convert string to enum if necessary
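That conversion is the usual value-lookup on the enum. A sketch of the pattern (not the library's exact code; the enum here is a stand-in for uniface.constants.ParsingWeights):

from enum import Enum

class ParsingWeights(Enum):  # stand-in for uniface.constants.ParsingWeights
    RESNET18 = 'parsing_resnet18'
    RESNET34 = 'parsing_resnet34'

def to_enum(model_name: str | ParsingWeights) -> ParsingWeights:
    if isinstance(model_name, str):
        try:
            return ParsingWeights(model_name)  # look up enum member by value
        except ValueError:
            raise ValueError(f'Unknown model: {model_name!r}') from None
    return model_name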

View File

@@ -3,7 +3,6 @@
# GitHub: https://github.com/yakhyo
from abc import ABC, abstractmethod
from typing import Tuple
import numpy as np
@@ -53,7 +52,7 @@ class BaseFaceParser(ABC):
raise NotImplementedError('Subclasses must implement the preprocess method.')
@abstractmethod
def postprocess(self, outputs: np.ndarray, original_size: Tuple[int, int]) -> np.ndarray:
def postprocess(self, outputs: np.ndarray, original_size: tuple[int, int]) -> np.ndarray:
"""
Postprocess raw model outputs into a segmentation mask.
@@ -89,7 +88,7 @@ class BaseFaceParser(ABC):
Example:
>>> parser = create_face_parser()
>>> mask = parser.parse(face_crop)
>>> print(f"Mask shape: {mask.shape}, unique classes: {np.unique(mask)}")
>>> print(f'Mask shape: {mask.shape}, unique classes: {np.unique(mask)}')
"""
raise NotImplementedError('Subclasses must implement the parse method.')
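The abstract interface above implies a template-method flow: preprocess, run the model, postprocess. A toy subclass illustrating the contract — the zero "logits" stand in for real model output, so this is not a working parser:

import numpy as np

class DummyParser:
    """Toy subclass showing the preprocess -> run -> postprocess contract."""

    def preprocess(self, image: np.ndarray) -> np.ndarray:
        # Scale to [0, 1]; real parsers also resize and normalize.
        return image.astype(np.float32) / 255.0

    def postprocess(self, outputs: np.ndarray, original_size: tuple[int, int]) -> np.ndarray:
        # Per-pixel argmax over the class axis yields the label mask.
        return outputs.argmax(axis=0).astype(np.uint8)

    def parse(self, face_crop: np.ndarray) -> np.ndarray:
        _ = self.preprocess(face_crop)
        logits = np.zeros((19, *face_crop.shape[:2]), dtype=np.float32)  # fake model output
        return self.postprocess(logits, face_crop.shape[:2])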

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Tuple
import cv2
import numpy as np
@@ -54,17 +53,17 @@ class BiSeNet(BaseFaceParser):
>>> # Detect faces and parse each face
>>> faces = detector.detect(image)
>>> for face in faces:
... bbox = face['bbox']
... bbox = face.bbox
... x1, y1, x2, y2 = map(int, bbox[:4])
... face_crop = image[y1:y2, x1:x2]
... mask = parser.parse(face_crop)
... print(f"Mask shape: {mask.shape}, unique classes: {np.unique(mask)}")
... print(f'Mask shape: {mask.shape}, unique classes: {np.unique(mask)}')
"""
def __init__(
self,
model_name: ParsingWeights = ParsingWeights.RESNET18,
input_size: Tuple[int, int] = (512, 512),
input_size: tuple[int, int] = (512, 512),
) -> None:
Logger.info(f'Initializing BiSeNet with model={model_name}, input_size={input_size}')
@@ -127,7 +126,7 @@ class BiSeNet(BaseFaceParser):
return image
def postprocess(self, outputs: np.ndarray, original_size: Tuple[int, int]) -> np.ndarray:
def postprocess(self, outputs: np.ndarray, original_size: tuple[int, int]) -> np.ndarray:
"""
Postprocess model output to segmentation mask.

View File

@@ -2,7 +2,7 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Optional
from __future__ import annotations
import numpy as np
@@ -11,11 +11,11 @@ from .blur import BlurFace
def anonymize_faces(
image: np.ndarray,
detector: Optional[object] = None,
detector: object | None = None,
method: str = 'pixelate',
blur_strength: float = 3.0,
pixel_blocks: int = 10,
conf_thresh: float = 0.5,
confidence_threshold: float = 0.5,
**kwargs,
) -> np.ndarray:
"""One-line face anonymization with automatic detection.
@@ -26,7 +26,7 @@ def anonymize_faces(
method (str): Blur method name. Defaults to 'pixelate'.
blur_strength (float): Blur intensity. Defaults to 3.0.
pixel_blocks (int): Block count for pixelate. Defaults to 10.
conf_thresh (float): Detection confidence threshold. Defaults to 0.5.
confidence_threshold (float): Detection confidence threshold. Defaults to 0.5.
**kwargs: Additional detector arguments.
Returns:
@@ -40,7 +40,7 @@ def anonymize_faces(
try:
from uniface import RetinaFace
detector = RetinaFace(conf_thresh=conf_thresh, **kwargs)
detector = RetinaFace(confidence_threshold=confidence_threshold, **kwargs)
except ImportError as err:
raise ImportError('Could not import RetinaFace. Please ensure UniFace is properly installed.') from err
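End to end, the renamed parameter is the only call-site change. A usage sketch — the import path and image filename are assumptions, not confirmed by this diff:

import cv2
from uniface.privacy import anonymize_faces  # module path assumed

image = cv2.imread('group_photo.jpg')  # placeholder path
result = anonymize_faces(image, method='pixelate', confidence_threshold=0.5)
cv2.imwrite('anonymized.jpg', result)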

View File

@@ -2,12 +2,17 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Dict, List, Tuple, Union
from __future__ import annotations
from typing import TYPE_CHECKING, ClassVar
import cv2
import numpy as np
__all__ = ['BlurFace']
if TYPE_CHECKING:
pass
__all__ = ['BlurFace', 'EllipticalBlur']
def _gaussian_blur(region: np.ndarray, strength: float = 3.0) -> np.ndarray:
@@ -32,7 +37,7 @@ def _pixelate_blur(region: np.ndarray, blocks: int = 10) -> np.ndarray:
return cv2.resize(temp, (w, h), interpolation=cv2.INTER_NEAREST)
def _blackout_blur(region: np.ndarray, color: Tuple[int, int, int] = (0, 0, 0)) -> np.ndarray:
def _blackout_blur(region: np.ndarray, color: tuple[int, int, int] = (0, 0, 0)) -> np.ndarray:
"""Replace region with solid color."""
return np.full_like(region, color)
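Pixelation is a resize round-trip: shrink the region to a coarse grid, then blow it back up with nearest-neighbor so the blocks stay sharp. A standalone sketch of the same idea:

import cv2
import numpy as np

def pixelate(region: np.ndarray, blocks: int = 10) -> np.ndarray:
    h, w = region.shape[:2]
    # Downscale to a blocks x blocks grid, then upscale without interpolation.
    temp = cv2.resize(region, (blocks, blocks), interpolation=cv2.INTER_LINEAR)
    return cv2.resize(temp, (w, h), interpolation=cv2.INTER_NEAREST)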
@@ -55,7 +60,7 @@ class EllipticalBlur:
def __call__(
self,
image: np.ndarray,
bboxes: List[Union[Tuple, List]],
bboxes: list[tuple | list],
inplace: bool = False,
) -> np.ndarray:
if not inplace:
@@ -98,14 +103,14 @@ class BlurFace:
>>> anonymized = blurrer.anonymize(image, faces)
"""
VALID_METHODS = {'gaussian', 'pixelate', 'blackout', 'elliptical', 'median'}
VALID_METHODS: ClassVar[set[str]] = {'gaussian', 'pixelate', 'blackout', 'elliptical', 'median'}
def __init__(
self,
method: str = 'pixelate',
blur_strength: float = 3.0,
pixel_blocks: int = 15,
color: Tuple[int, int, int] = (0, 0, 0),
color: tuple[int, int, int] = (0, 0, 0),
margin: int = 20,
):
self.method = method.lower()
@@ -121,6 +126,7 @@ class BlurFace:
self._elliptical = EllipticalBlur(blur_strength, margin)
def _blur_region(self, region: np.ndarray) -> np.ndarray:
"""Apply blur to a single region based on the configured method."""
if self.method == 'gaussian':
return _gaussian_blur(region, self._blur_strength)
elif self.method == 'median':
@@ -129,11 +135,12 @@ class BlurFace:
return _pixelate_blur(region, self._pixel_blocks)
elif self.method == 'blackout':
return _blackout_blur(region, self._color)
return region # Fallback (should not reach here)
def anonymize(
self,
image: np.ndarray,
faces: List[Dict],
faces: list,
inplace: bool = False,
) -> np.ndarray:
"""Anonymize faces in an image.
@@ -149,13 +156,13 @@ class BlurFace:
if not faces:
return image if inplace else image.copy()
bboxes = [face['bbox'] for face in faces]
bboxes = [face.bbox for face in faces]
return self.blur_regions(image, bboxes, inplace)
def blur_regions(
self,
image: np.ndarray,
bboxes: List[Union[Tuple, List]],
bboxes: list[tuple | list],
inplace: bool = False,
) -> np.ndarray:
"""Blur specific rectangular regions in an image.

View File

@@ -34,10 +34,7 @@ def create_recognizer(method: str = 'arcface', **kwargs) -> BaseRecognizer:
>>> # Create a specific MobileFace recognizer
>>> from uniface.constants import MobileFaceWeights
>>> recognizer = create_recognizer(
... 'mobileface',
... model_name=MobileFaceWeights.MNET_V2
... )
>>> recognizer = create_recognizer('mobileface', model_name=MobileFaceWeights.MNET_V2)
>>> # Create a SphereFace recognizer
>>> recognizer = create_recognizer('sphereface')
@@ -55,4 +52,4 @@ def create_recognizer(method: str = 'arcface', **kwargs) -> BaseRecognizer:
raise ValueError(f"Unsupported method: '{method}'. Available: {available}")
__all__ = ['create_recognizer', 'BaseRecognizer', 'ArcFace', 'MobileFace', 'SphereFace']
__all__ = ['ArcFace', 'BaseRecognizer', 'MobileFace', 'SphereFace', 'create_recognizer']
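A typical use of the factory is comparing two aligned face crops by cosine similarity. A sketch — the import path and image filenames are assumptions, and the ravel() guards against a possible (1, 512) output shape:

import cv2
import numpy as np
from uniface.recognition import create_recognizer  # module path assumed

recognizer = create_recognizer('arcface')
face_a = cv2.imread('face_a.jpg')  # pre-aligned 112x112 BGR crops (placeholder paths)
face_b = cv2.imread('face_b.jpg')
emb_a = recognizer.get_embedding(face_a).ravel()
emb_b = recognizer.get_embedding(face_b).ravel()
similarity = float(np.dot(emb_a, emb_b) / (np.linalg.norm(emb_a) * np.linalg.norm(emb_b)))
print(f'cosine similarity: {similarity:.3f}')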

View File

@@ -2,9 +2,10 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Tuple, Union
import cv2
import numpy as np
@@ -13,16 +14,22 @@ from uniface.face_utils import face_alignment
from uniface.log import Logger
from uniface.onnx_utils import create_onnx_session
__all__ = ['BaseRecognizer', 'PreprocessConfig']
@dataclass
class PreprocessConfig:
"""
Configuration for preprocessing images before feeding them into the model.
"""Configuration for preprocessing images before feeding them into the model.
Attributes:
input_mean: Mean value(s) for normalization.
input_std: Standard deviation value(s) for normalization.
input_size: Target image size as (height, width).
"""
input_mean: Union[float, List[float]] = 127.5
input_std: Union[float, List[float]] = 127.5
input_size: Tuple[int, int] = (112, 112)
input_mean: float | list[float] = 127.5
input_std: float | list[float] = 127.5
input_size: tuple[int, int] = (112, 112)
class BaseRecognizer(ABC):
@@ -94,7 +101,7 @@ class BaseRecognizer(ABC):
"""
resized_img = cv2.resize(face_img, self.input_size)
if isinstance(self.input_std, (list, tuple)):
if isinstance(self.input_std, list | tuple):
# Per-channel normalization
rgb_img = cv2.cvtColor(resized_img, cv2.COLOR_BGR2RGB).astype(np.float32)
normalized_img = (rgb_img - np.array(self.input_mean, dtype=np.float32)) / np.array(
@@ -116,13 +123,14 @@ class BaseRecognizer(ABC):
return blob
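The per-channel branch above is ordinary NumPy broadcasting. A minimal sketch of the same normalization outside the class:

import numpy as np

def normalize(rgb_img: np.ndarray, mean, std) -> np.ndarray:
    # np.array(...) broadcasts over the channel axis for scalar or per-channel values.
    return (rgb_img.astype(np.float32) - np.array(mean, dtype=np.float32)) / np.array(std, dtype=np.float32)

pixel = np.full((112, 112, 3), 255.0)
print(normalize(pixel, mean=127.5, std=127.5)[0, 0])  # [1. 1. 1.]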
def get_embedding(self, image: np.ndarray, landmarks: np.ndarray = None) -> np.ndarray:
"""
Extracts face embedding from an image.
def get_embedding(self, image: np.ndarray, landmarks: np.ndarray | None = None) -> np.ndarray:
"""Extract face embedding from an image.
Args:
image: Input face image (BGR format). If already aligned (112x112), landmarks can be None.
landmarks: Facial landmarks (5 points for alignment). Optional if image is already aligned.
image: Input face image in BGR format. If already aligned (112x112),
landmarks can be None.
landmarks: Facial landmarks (5 points for alignment). Optional if
image is already aligned.
Returns:
Face embedding vector (typically 512-dimensional).
@@ -141,15 +149,14 @@ class BaseRecognizer(ABC):
return embedding
def get_normalized_embedding(self, image: np.ndarray, landmarks: np.ndarray) -> np.ndarray:
"""
Extracts a l2 normalized face embedding vector from an image.
"""Extract an L2-normalized face embedding vector from an image.
Args:
image: Input face image (BGR format).
image: Input face image in BGR format.
landmarks: Facial landmarks (5 points for alignment).
Returns:
Normalized face embedding vector (typically 512-dimensional).
L2-normalized face embedding vector (typically 512-dimensional).
"""
embedding = self.get_embedding(image, landmarks)
norm = np.linalg.norm(embedding)
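Once embeddings are L2-normalized, cosine similarity reduces to a plain dot product. A small worked sketch:

import numpy as np

def l2_normalize(v: np.ndarray) -> np.ndarray:
    return v / np.linalg.norm(v)

a = l2_normalize(np.array([1.0, 2.0, 2.0]))  # norm 3 -> [0.333, 0.667, 0.667]
b = l2_normalize(np.array([2.0, 1.0, 2.0]))
cosine = float(np.dot(a, b))  # 8/9, roughly 0.889, for these vectors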

View File

@@ -2,7 +2,7 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Optional
from __future__ import annotations
from uniface.constants import ArcFaceWeights, MobileFaceWeights, SphereFaceWeights
from uniface.model_store import verify_model_weights
@@ -34,7 +34,7 @@ class ArcFace(BaseRecognizer):
def __init__(
self,
model_name: ArcFaceWeights = ArcFaceWeights.MNET,
preprocessing: Optional[PreprocessConfig] = None,
preprocessing: PreprocessConfig | None = None,
) -> None:
if preprocessing is None:
preprocessing = PreprocessConfig(input_mean=127.5, input_std=127.5, input_size=(112, 112))
@@ -64,7 +64,7 @@ class MobileFace(BaseRecognizer):
def __init__(
self,
model_name: MobileFaceWeights = MobileFaceWeights.MNET_V2,
preprocessing: Optional[PreprocessConfig] = None,
preprocessing: PreprocessConfig | None = None,
) -> None:
if preprocessing is None:
preprocessing = PreprocessConfig(input_mean=127.5, input_std=127.5, input_size=(112, 112))
@@ -94,7 +94,7 @@ class SphereFace(BaseRecognizer):
def __init__(
self,
model_name: SphereFaceWeights = SphereFaceWeights.SPHERE20,
preprocessing: Optional[PreprocessConfig] = None,
preprocessing: PreprocessConfig | None = None,
) -> None:
if preprocessing is None:
preprocessing = PreprocessConfig(input_mean=127.5, input_std=127.5, input_size=(112, 112))

View File

@@ -2,7 +2,7 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Optional
from __future__ import annotations
from uniface.constants import MiniFASNetWeights
@@ -19,46 +19,27 @@ __all__ = [
def create_spoofer(
model_name: MiniFASNetWeights = MiniFASNetWeights.V2,
scale: Optional[float] = None,
scale: float | None = None,
) -> MiniFASNet:
"""
Factory function to create a face anti-spoofing model.
"""Factory function to create a face anti-spoofing model.
This is a convenience function that creates a MiniFASNet instance
with the specified model variant and optional custom scale.
Args:
model_name (MiniFASNetWeights): The model variant to use.
Options:
- MiniFASNetWeights.V2: Improved version (default), uses scale=2.7
- MiniFASNetWeights.V1SE: Squeeze-and-excitation version, uses scale=4.0
Defaults to MiniFASNetWeights.V2.
scale (Optional[float]): Custom crop scale factor for face region.
If None, uses the default scale for the selected model variant.
model_name: The model variant to use. Options:
- MiniFASNetWeights.V2: Improved version (default), uses scale=2.7
- MiniFASNetWeights.V1SE: Squeeze-and-excitation version, uses scale=4.0
scale: Custom crop scale factor for face region. If None, uses the
default scale for the selected model variant.
Returns:
MiniFASNet: An initialized face anti-spoofing model.
An initialized face anti-spoofing model.
Example:
>>> from uniface.spoofing import create_spoofer, MiniFASNetWeights
>>> from uniface import RetinaFace
>>>
>>> # Create with default settings (V2 model)
>>> spoofer = create_spoofer()
>>>
>>> # Create with V1SE model
>>> spoofer = create_spoofer(model_name=MiniFASNetWeights.V1SE)
>>>
>>> # Create with custom scale
>>> spoofer = create_spoofer(scale=3.0)
>>>
>>> # Use with face detector
>>> detector = RetinaFace()
>>> faces = detector.detect(image)
>>> for face in faces:
... label_idx, score = spoofer.predict(image, face['bbox'])
... # label_idx: 0 = Fake, 1 = Real
... label = 'Real' if label_idx == 1 else 'Fake'
... print(f'{label}: {score:.2%}')
>>> label_idx, score = spoofer.predict(image, face.bbox)
>>> # label_idx: 0 = Fake, 1 = Real
"""
return MiniFASNet(model_name=model_name, scale=scale)
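Putting the factory together with a detector, following the attribute-style face.bbox access used in the updated examples (the image path is a placeholder):

import cv2
from uniface import RetinaFace
from uniface.spoofing import create_spoofer

detector = RetinaFace()
spoofer = create_spoofer()
image = cv2.imread('test.jpg')  # placeholder path
for face in detector.detect(image):
    label_idx, score = spoofer.predict(image, face.bbox)
    print('Real' if label_idx == 1 else 'Fake', f'{score:.2%}')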

View File

@@ -3,7 +3,6 @@
# GitHub: https://github.com/yakhyo
from abc import ABC, abstractmethod
from typing import List, Tuple, Union
import numpy as np
@@ -36,7 +35,7 @@ class BaseSpoofer(ABC):
raise NotImplementedError('Subclasses must implement the _initialize_model method.')
@abstractmethod
def preprocess(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> np.ndarray:
def preprocess(self, image: np.ndarray, bbox: list | np.ndarray) -> np.ndarray:
"""
Preprocess the input image for model inference.
@@ -55,7 +54,7 @@ class BaseSpoofer(ABC):
raise NotImplementedError('Subclasses must implement the preprocess method.')
@abstractmethod
def postprocess(self, outputs: np.ndarray) -> Tuple[int, float]:
def postprocess(self, outputs: np.ndarray) -> tuple[int, float]:
"""
Postprocess raw model outputs into prediction result.
@@ -73,7 +72,7 @@ class BaseSpoofer(ABC):
raise NotImplementedError('Subclasses must implement the postprocess method.')
@abstractmethod
def predict(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> Tuple[int, float]:
def predict(self, image: np.ndarray, bbox: list | np.ndarray) -> tuple[int, float]:
"""
Perform end-to-end anti-spoofing prediction on a face.
@@ -95,13 +94,13 @@ class BaseSpoofer(ABC):
>>> detector = RetinaFace()
>>> faces = detector.detect(image)
>>> for face in faces:
... label_idx, score = spoofer.predict(image, face['bbox'])
... label_idx, score = spoofer.predict(image, face.bbox)
... label = 'Real' if label_idx == 1 else 'Fake'
... print(f'{label}: {score:.2%}')
"""
raise NotImplementedError('Subclasses must implement the predict method.')
def __call__(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> Tuple[int, float]:
def __call__(self, image: np.ndarray, bbox: list | np.ndarray) -> tuple[int, float]:
"""
Provides a convenient, callable shortcut for the `predict` method.

View File

@@ -2,7 +2,6 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import List, Optional, Tuple, Union
import cv2
import numpy as np
@@ -59,7 +58,7 @@ class MiniFASNet(BaseSpoofer):
>>> # Detect faces and check if they are real
>>> faces = detector.detect(image)
>>> for face in faces:
... label_idx, score = spoofer.predict(image, face['bbox'])
... label_idx, score = spoofer.predict(image, face.bbox)
... # label_idx: 0 = Fake, 1 = Real
... label = 'Real' if label_idx == 1 else 'Fake'
... print(f'{label}: {score:.2%}')
@@ -68,7 +67,7 @@ class MiniFASNet(BaseSpoofer):
def __init__(
self,
model_name: MiniFASNetWeights = MiniFASNetWeights.V2,
scale: Optional[float] = None,
scale: float | None = None,
) -> None:
Logger.info(f'Initializing MiniFASNet with model={model_name.name}')
@@ -104,12 +103,12 @@ class MiniFASNet(BaseSpoofer):
Logger.error(f"Failed to load MiniFASNet model from '{self.model_path}'", exc_info=True)
raise RuntimeError(f'Failed to initialize MiniFASNet model: {e}') from e
def _xyxy_to_xywh(self, bbox: Union[List, np.ndarray]) -> List[int]:
def _xyxy_to_xywh(self, bbox: list | np.ndarray) -> list[int]:
"""Convert bounding box from [x1, y1, x2, y2] to [x, y, w, h] format."""
x1, y1, x2, y2 = bbox[:4]
return [int(x1), int(y1), int(x2 - x1), int(y2 - y1)]
def _crop_face(self, image: np.ndarray, bbox_xywh: List[int]) -> np.ndarray:
def _crop_face(self, image: np.ndarray, bbox_xywh: list[int]) -> np.ndarray:
"""
Crop and resize face region from image using scale factor.
@@ -147,7 +146,7 @@ class MiniFASNet(BaseSpoofer):
return resized
def preprocess(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> np.ndarray:
def preprocess(self, image: np.ndarray, bbox: list | np.ndarray) -> np.ndarray:
"""
Preprocess the input image for model inference.
@@ -181,7 +180,7 @@ class MiniFASNet(BaseSpoofer):
e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
return e_x / e_x.sum(axis=1, keepdims=True)
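From the softmax output, postprocessing reduces to an argmax plus the winning probability. A worked sketch with toy logits:

import numpy as np

logits = np.array([[0.2, 2.3]])             # one face, [fake, real] scores
e_x = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = e_x / e_x.sum(axis=1, keepdims=True)
label_idx = int(probs[0].argmax())          # 1 -> Real
score = float(probs[0, label_idx])          # ~0.89 confidence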
def postprocess(self, outputs: np.ndarray) -> Tuple[int, float]:
def postprocess(self, outputs: np.ndarray) -> tuple[int, float]:
"""
Postprocess raw model outputs into prediction result.
@@ -202,7 +201,7 @@ class MiniFASNet(BaseSpoofer):
return label_idx, score
def predict(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> Tuple[int, float]:
def predict(self, image: np.ndarray, bbox: list | np.ndarray) -> tuple[int, float]:
"""
Perform end-to-end anti-spoofing prediction on a face.

View File

@@ -2,11 +2,26 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import List, Tuple, Union
"""Visualization utilities for UniFace.
This module provides functions for drawing detection results, gaze directions,
and face parsing segmentation maps on images.
"""
from __future__ import annotations
import cv2
import numpy as np
__all__ = [
'FACE_PARSING_COLORS',
'FACE_PARSING_LABELS',
'draw_detections',
'draw_fancy_bbox',
'draw_gaze',
'vis_parsing_maps',
]
# Face parsing component names (19 classes)
FACE_PARSING_LABELS = [
'background',
@@ -57,23 +72,25 @@ FACE_PARSING_COLORS = [
def draw_detections(
*,
image: np.ndarray,
bboxes: Union[List[np.ndarray], List[List[float]]],
scores: Union[np.ndarray, List[float]],
landmarks: Union[List[np.ndarray], List[List[List[float]]]],
bboxes: list[np.ndarray] | list[list[float]],
scores: np.ndarray | list[float],
landmarks: list[np.ndarray] | list[list[list[float]]],
vis_threshold: float = 0.6,
draw_score: bool = False,
fancy_bbox: bool = True,
):
"""
Draws bounding boxes, landmarks, and optional scores on an image.
) -> None:
"""Draw bounding boxes, landmarks, and optional scores on an image.
Modifies the image in-place.
Args:
image: Input image to draw on.
bboxes: List of bounding boxes [x1, y1, x2, y2].
image: Input image to draw on (modified in-place).
bboxes: List of bounding boxes as [x1, y1, x2, y2].
scores: List of confidence scores.
landmarks: List of landmark sets with shape (5, 2).
vis_threshold: Confidence threshold for filtering. Defaults to 0.6.
draw_score: Whether to draw confidence scores. Defaults to False.
fancy_bbox: Use corner-style bounding boxes. Defaults to True.
"""
colors = [(0, 0, 255), (0, 255, 255), (255, 0, 255), (0, 255, 0), (255, 0, 0)]
@@ -134,19 +151,18 @@ def draw_detections(
def draw_fancy_bbox(
image: np.ndarray,
bbox: np.ndarray,
color: Tuple[int, int, int] = (0, 255, 0),
color: tuple[int, int, int] = (0, 255, 0),
thickness: int = 3,
proportion: float = 0.2,
):
"""
Draws a bounding box with fancy corners on an image.
) -> None:
"""Draw a bounding box with fancy corners on an image.
Args:
image: Input image to draw on.
image: Input image to draw on (modified in-place).
bbox: Bounding box coordinates [x1, y1, x2, y2].
color: Color of the bounding box. Defaults to green.
thickness: Thickness of the bounding box lines. Defaults to 3.
proportion: Proportion of the corner length to the width/height of the bounding box. Defaults to 0.2.
color: Color of the bounding box in BGR. Defaults to green.
thickness: Thickness of the corner lines. Defaults to 3.
proportion: Proportion of corner length to box dimensions. Defaults to 0.2.
"""
x1, y1, x2, y2 = map(int, bbox)
width = x2 - x1
@@ -177,15 +193,14 @@ def draw_fancy_bbox(
def draw_gaze(
image: np.ndarray,
bbox: np.ndarray,
pitch: np.ndarray,
yaw: np.ndarray,
pitch: np.ndarray | float,
yaw: np.ndarray | float,
*,
draw_bbox: bool = True,
fancy_bbox: bool = True,
draw_angles: bool = True,
):
"""
Draws gaze direction with optional bounding box on an image.
) -> None:
"""Draw gaze direction with optional bounding box on an image.
Args:
image: Input image to draw on (modified in-place).
@@ -194,7 +209,7 @@ def draw_gaze(
yaw: Horizontal gaze angle in radians.
draw_bbox: Whether to draw the bounding box. Defaults to True.
fancy_bbox: Use fancy corner-style bbox. Defaults to True.
draw_angles: Whether to display pitch/yaw values as text. Defaults to False.
draw_angles: Whether to display pitch/yaw values as text. Defaults to True.
"""
x_min, y_min, x_max, y_max = map(int, bbox[:4])
@@ -275,29 +290,25 @@ def vis_parsing_maps(
save_image: bool = False,
save_path: str = 'result.png',
) -> np.ndarray:
"""
Visualizes face parsing segmentation mask by overlaying colored regions on the image.
"""Visualize face parsing segmentation mask by overlaying colored regions.
Args:
image: Input face image in RGB format with shape (H, W, 3).
segmentation_mask: Segmentation mask with shape (H, W) where each pixel
value represents a facial component class (0-18).
value represents a facial component class (0-18).
save_image: Whether to save the visualization to disk. Defaults to False.
save_path: Path to save the visualization if save_image is True.
Returns:
np.ndarray: Blended image with segmentation overlay in BGR format.
Blended image with segmentation overlay in BGR format.
Example:
>>> import cv2
>>> from uniface.parsing import BiSeNet
>>> from uniface.visualization import vis_parsing_maps
>>>
>>> parser = BiSeNet()
>>> face_image = cv2.imread('face.jpg')
>>> mask = parser.parse(face_image)
>>>
>>> # Visualize
>>> face_rgb = cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)
>>> result = vis_parsing_maps(face_rgb, mask)
>>> cv2.imwrite('parsed_face.jpg', result)
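Tying the visualization helpers back to detection, a minimal end-to-end sketch. The image path is a placeholder, and face.score / face.landmarks are assumed attribute names extrapolated from the face.bbox pattern in the examples above:

import cv2
from uniface import RetinaFace
from uniface.visualization import draw_detections

detector = RetinaFace()
image = cv2.imread('test.jpg')  # placeholder path
faces = detector.detect(image)
draw_detections(
    image=image,
    bboxes=[face.bbox for face in faces],
    scores=[face.score for face in faces],           # attribute name assumed
    landmarks=[face.landmarks for face in faces],    # attribute name assumed
    vis_threshold=0.6,
)
cv2.imwrite('detections.jpg', image)  # draw_detections modifies the image in-place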