mirror of
https://github.com/yakhyo/uniface.git
synced 2025-12-30 00:52:25 +00:00
refactor: Standardize naming conventions (#47)
* refactor: Standardize naming conventions
* chore: Update the version and re-run experiments
* chore: Improve code quality tooling and documentation
  - Add pre-commit job to CI workflow for automated linting on PRs
  - Update uniface/__init__.py with copyright header, module docstring, and logically grouped exports
  - Revise CONTRIBUTING.md to reflect pre-commit handles all formatting
  - Remove redundant ruff check from CI (now handled by pre-commit)
  - Update build job Python version to 3.11 (matches requires-python)
Committed by GitHub
Parent: 64ad0d2f53
Commit: 50226041c9
16
.github/workflows/ci.yml
vendored
@@ -15,9 +15,20 @@ concurrency:
   cancel-in-progress: true
 
 jobs:
+  lint:
+    runs-on: ubuntu-latest
+    timeout-minutes: 5
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+      - uses: pre-commit/action@v3.0.1
+
   test:
     runs-on: ${{ matrix.os }}
     timeout-minutes: 15
+    needs: lint
 
     strategy:
       fail-fast: false
@@ -44,9 +55,6 @@ jobs:
         run: |
           python -c "import onnxruntime as ort; print('Available providers:', ort.get_available_providers())"
 
-      - name: Lint with ruff
-        run: ruff check .
-
       - name: Run tests
         run: pytest -v --tb=short
 
@@ -65,7 +73,7 @@ jobs:
       - name: Set up Python
         uses: actions/setup-python@v5
         with:
-          python-version: "3.10"
+          python-version: "3.11"
           cache: "pip"
 
       - name: Install build tools
1
.github/workflows/publish.yml
vendored
@@ -117,4 +117,3 @@ jobs:
|
||||
with:
|
||||
files: dist/*
|
||||
generate_release_notes: true
|
||||
|
||||
|
||||
40
.pre-commit-config.yaml
Normal file
@@ -0,0 +1,40 @@
+# Pre-commit configuration for UniFace
+# See https://pre-commit.com for more information
+# See https://pre-commit.com/hooks.html for more hooks
+
+repos:
+  # General file checks
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.6.0
+    hooks:
+      - id: trailing-whitespace
+      - id: end-of-file-fixer
+      - id: check-yaml
+      - id: check-toml
+      - id: check-added-large-files
+        args: ['--maxkb=1000']
+      - id: check-merge-conflict
+      - id: debug-statements
+      - id: check-ast
+
+  # Ruff - Fast Python linter and formatter
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    rev: v0.8.4
+    hooks:
+      - id: ruff
+        args: [--fix, --unsafe-fixes, --exit-non-zero-on-fix]
+      - id: ruff-format
+
+  # Security checks
+  - repo: https://github.com/PyCQA/bandit
+    rev: 1.7.10
+    hooks:
+      - id: bandit
+        args: [-c, pyproject.toml]
+        additional_dependencies: ['bandit[toml]']
+        exclude: ^tests/
+
+# Configuration
+ci:
+  autofix_commit_msg: 'style: auto-fix by pre-commit hooks'
+  autoupdate_commit_msg: 'chore: update pre-commit hooks'
183
CONTRIBUTING.md
@@ -16,33 +16,9 @@ Thank you for considering contributing to UniFace! We welcome contributions of a
|
||||
2. Create a new branch for your feature
|
||||
3. Write clear, documented code with type hints
|
||||
4. Add tests for new functionality
|
||||
5. Ensure all tests pass
|
||||
5. Ensure all tests pass and pre-commit hooks are satisfied
|
||||
6. Submit a pull request with a clear description
|
||||
|
||||
### Code Style
|
||||
|
||||
This project uses [Ruff](https://docs.astral.sh/ruff/) for linting and formatting.
|
||||
|
||||
```bash
|
||||
# Check for linting errors
|
||||
ruff check .
|
||||
|
||||
# Auto-fix linting errors
|
||||
ruff check . --fix
|
||||
|
||||
# Format code
|
||||
ruff format .
|
||||
```
|
||||
|
||||
**Guidelines:**
|
||||
- Follow PEP8 guidelines
|
||||
- Use type hints (Python 3.10+)
|
||||
- Write docstrings for public APIs
|
||||
- Line length: 120 characters
|
||||
- Keep code simple and readable
|
||||
|
||||
All PRs must pass `ruff check .` before merging.
|
||||
|
||||
## Development Setup
|
||||
|
||||
```bash
|
||||
@@ -51,31 +27,164 @@ cd uniface
|
||||
pip install -e ".[dev]"
|
||||
```
|
||||
|
||||
### Setting Up Pre-commit Hooks
|
||||
|
||||
We use [pre-commit](https://pre-commit.com/) to ensure code quality and consistency. Install and configure it:
|
||||
|
||||
```bash
|
||||
# Install pre-commit
|
||||
pip install pre-commit
|
||||
|
||||
# Install the git hooks
|
||||
pre-commit install
|
||||
|
||||
# (Optional) Run against all files
|
||||
pre-commit run --all-files
|
||||
```
|
||||
|
||||
Once installed, pre-commit will automatically run on every commit to check:
|
||||
|
||||
- Code formatting and linting (Ruff)
|
||||
- Security issues (Bandit)
|
||||
- General file hygiene (trailing whitespace, YAML/TOML validity, etc.)
|
||||
|
||||
**Note:** All PRs are automatically checked by CI. The merge button will only be available after all checks pass.
|
||||
|
||||
## Code Style
|
||||
|
||||
This project uses [Ruff](https://docs.astral.sh/ruff/) for linting and formatting, following modern Python best practices. Pre-commit handles all formatting automatically.
|
||||
|
||||
### Style Guidelines
|
||||
|
||||
#### General Rules
|
||||
|
||||
- **Line length:** 120 characters maximum
|
||||
- **Python version:** 3.11+ (use modern syntax)
|
||||
- **Quote style:** Single quotes for strings, double quotes for docstrings
|
||||
|
||||
#### Type Hints
|
||||
|
||||
Use modern Python 3.11+ type hints (PEP 585 and PEP 604):
|
||||
|
||||
```python
|
||||
# Preferred (modern)
|
||||
def process(items: list[str], config: dict[str, int] | None = None) -> tuple[int, str]:
|
||||
...
|
||||
|
||||
# Avoid (legacy)
|
||||
from typing import List, Dict, Optional, Tuple
|
||||
def process(items: List[str], config: Optional[Dict[str, int]] = None) -> Tuple[int, str]:
|
||||
...
|
||||
```
|
||||
|
||||
#### Docstrings
|
||||
|
||||
Use [Google-style docstrings](https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings) for all public APIs:
|
||||
|
||||
```python
|
||||
def detect_faces(image: np.ndarray, threshold: float = 0.5) -> list[Face]:
|
||||
"""Detect faces in an image.
|
||||
|
||||
Args:
|
||||
image: Input image as a numpy array with shape (H, W, C) in BGR format.
|
||||
threshold: Confidence threshold for filtering detections. Defaults to 0.5.
|
||||
|
||||
Returns:
|
||||
List of Face objects containing bounding boxes, confidence scores,
|
||||
and facial landmarks.
|
||||
|
||||
Raises:
|
||||
ValueError: If the input image has invalid dimensions.
|
||||
|
||||
Example:
|
||||
>>> from uniface import detect_faces
|
||||
>>> faces = detect_faces(image, threshold=0.8)
|
||||
>>> print(f"Found {len(faces)} faces")
|
||||
"""
|
||||
```
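An `Example:` section written in doctest form can also be executed as a test. The sketch below is a minimal illustration of the Google-style layout on a small, self-contained function; `clamp_threshold` is a hypothetical helper invented for this example and is not part of UniFace.

```python
from __future__ import annotations

import doctest


def clamp_threshold(value: float, low: float = 0.0, high: float = 1.0) -> float:
    """Clamp a confidence threshold into a closed range.

    Args:
        value: Threshold supplied by the caller.
        low: Smallest allowed value. Defaults to 0.0.
        high: Largest allowed value. Defaults to 1.0.

    Returns:
        The threshold limited to the range [low, high].

    Raises:
        ValueError: If ``low`` is greater than ``high``.

    Example:
        >>> clamp_threshold(1.5)
        1.0
        >>> clamp_threshold(0.25)
        0.25
    """
    if low > high:
        raise ValueError(f'low ({low}) must not exceed high ({high})')
    return max(low, min(high, value))


if __name__ == '__main__':
    # Runs the examples embedded in the docstrings of this module.
    doctest.testmod()
```

Running the file directly executes the two examples in the docstring and prints nothing when they pass.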
|
||||
|
||||
#### Import Order
|
||||
|
||||
Imports are automatically sorted by Ruff with the following order:
|
||||
|
||||
1. **Future** imports (`from __future__ import annotations`)
|
||||
2. **Standard library** (`os`, `sys`, `typing`, etc.)
|
||||
3. **Third-party** (`numpy`, `cv2`, `onnxruntime`, etc.)
|
||||
4. **First-party** (`uniface.*`)
|
||||
5. **Local** (relative imports like `.base`, `.models`)
|
||||
|
||||
```python
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
from typing import Any
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
from uniface.constants import RetinaFaceWeights
|
||||
from uniface.log import Logger
|
||||
|
||||
from .base import BaseDetector
|
||||
```
|
||||
|
||||
#### Code Comments
|
||||
|
||||
- Add comments for complex logic, magic numbers, and non-obvious behavior
|
||||
- Avoid comments that merely restate the code
|
||||
- Use `# TODO:` with issue links for planned improvements
|
||||
|
||||
```python
|
||||
# RetinaFace FPN strides and corresponding anchor sizes per level
|
||||
steps = [8, 16, 32]
|
||||
min_sizes = [[16, 32], [64, 128], [256, 512]]
|
||||
|
||||
# Add small epsilon to prevent division by zero
|
||||
similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-5)
|
||||
```
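The epsilon guard in the last comment can be wrapped in a small helper so the reason for the constant lives in one place. This is a dependency-free sketch using only the standard library rather than NumPy; the name `cosine_similarity` and the `eps` parameter are illustrative assumptions, not UniFace's API.

```python
from __future__ import annotations

import math
from collections.abc import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float], eps: float = 1e-5) -> float:
    """Cosine similarity with a small epsilon in the denominator.

    The epsilon keeps the result finite when either vector is all zeros.
    """
    if len(a) != len(b):
        raise ValueError('vectors must have the same length')
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Without eps, two zero vectors would raise ZeroDivisionError here.
    return dot / (norm_a * norm_b + eps)


print(cosine_similarity([0.0, 0.0], [0.0, 0.0]))
```

The trade-off is a slight bias: identical unit vectors score just under 1.0 instead of exactly 1.0.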
|
||||
|
||||
## Running Tests
|
||||
|
||||
```bash
|
||||
# Run all tests
|
||||
pytest tests/
|
||||
|
||||
# Run with verbose output
|
||||
pytest tests/ -v
|
||||
|
||||
# Run specific test file
|
||||
pytest tests/test_factory.py
|
||||
|
||||
# Run with coverage
|
||||
pytest tests/ --cov=uniface --cov-report=html
|
||||
```
|
||||
|
||||
## Adding New Features
|
||||
|
||||
When adding a new model or feature:
|
||||
|
||||
1. **Create the model class** in the appropriate submodule (e.g., `uniface/detection/`)
|
||||
2. **Add weight constants** to `uniface/constants.py` with URLs and SHA256 hashes
|
||||
3. **Export in `__init__.py`** files at both module and package levels
|
||||
4. **Write tests** in `tests/` directory
|
||||
5. **Add example usage** in `scripts/` or update existing notebooks
|
||||
6. **Update documentation** if needed
|
||||
|
||||
## Examples
|
||||
|
||||
Example notebooks demonstrating library usage:
|
||||
|
||||
| Example | Notebook |
|
||||
|---------|----------|
|
||||
| Face Detection | [face_detection.ipynb](examples/face_detection.ipynb) |
|
||||
| Face Alignment | [face_alignment.ipynb](examples/face_alignment.ipynb) |
|
||||
| Face Recognition | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
|
||||
| Face Verification | [face_verification.ipynb](examples/face_verification.ipynb) |
|
||||
| Face Search | [face_search.ipynb](examples/face_search.ipynb) |
|
||||
| Face Anonymization | [face_anonymization.ipynb](examples/face_anonymization.ipynb) |
|
||||
| Face Detection | [01_face_detection.ipynb](examples/01_face_detection.ipynb) |
|
||||
| Face Alignment | [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) |
|
||||
| Face Verification | [03_face_verification.ipynb](examples/03_face_verification.ipynb) |
|
||||
| Face Search | [04_face_search.ipynb](examples/04_face_search.ipynb) |
|
||||
| Face Analyzer | [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) |
|
||||
| Face Parsing | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) |
|
||||
| Face Anonymization | [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) |
|
||||
| Gaze Estimation | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) |
|
||||
|
||||
## Questions?
|
||||
|
||||
Open an issue or start a discussion on GitHub.
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
20
MODELS.md
@@ -34,7 +34,7 @@ detector = RetinaFace() # Uses MNET_V2
|
||||
# Specific model
|
||||
detector = RetinaFace(
|
||||
model_name=RetinaFaceWeights.MNET_025, # Fastest
|
||||
conf_thresh=0.5,
|
||||
confidence_threshold=0.5,
|
||||
nms_thresh=0.4,
|
||||
input_size=(640, 640)
|
||||
)
|
||||
@@ -63,14 +63,14 @@ from uniface.constants import SCRFDWeights
|
||||
# Fast real-time detection
|
||||
detector = SCRFD(
|
||||
model_name=SCRFDWeights.SCRFD_500M_KPS,
|
||||
conf_thresh=0.5,
|
||||
confidence_threshold=0.5,
|
||||
input_size=(640, 640)
|
||||
)
|
||||
|
||||
# High accuracy
|
||||
detector = SCRFD(
|
||||
model_name=SCRFDWeights.SCRFD_10G_KPS,
|
||||
conf_thresh=0.5
|
||||
confidence_threshold=0.5
|
||||
)
|
||||
```
|
||||
|
||||
@@ -99,29 +99,29 @@ from uniface.constants import YOLOv5FaceWeights
|
||||
# Lightweight/Mobile
|
||||
detector = YOLOv5Face(
|
||||
model_name=YOLOv5FaceWeights.YOLOV5N,
|
||||
conf_thresh=0.6,
|
||||
confidence_threshold=0.6,
|
||||
nms_thresh=0.5
|
||||
)
|
||||
|
||||
# Real-time detection (recommended)
|
||||
detector = YOLOv5Face(
|
||||
model_name=YOLOv5FaceWeights.YOLOV5S,
|
||||
conf_thresh=0.6,
|
||||
confidence_threshold=0.6,
|
||||
nms_thresh=0.5
|
||||
)
|
||||
|
||||
# High accuracy
|
||||
detector = YOLOv5Face(
|
||||
model_name=YOLOv5FaceWeights.YOLOV5M,
|
||||
conf_thresh=0.6
|
||||
confidence_threshold=0.6
|
||||
)
|
||||
|
||||
# Detect faces with landmarks
|
||||
faces = detector.detect(image)
|
||||
for face in faces:
|
||||
bbox = face['bbox'] # [x1, y1, x2, y2]
|
||||
confidence = face['confidence']
|
||||
landmarks = face['landmarks'] # 5-point landmarks (5, 2)
|
||||
bbox = face.bbox # [x1, y1, x2, y2]
|
||||
confidence = face.confidence
|
||||
landmarks = face.landmarks # 5-point landmarks (5, 2)
|
||||
```
|
||||
|
||||
---
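The hunk above replaces dictionary lookups (`face['bbox']`) with attribute access (`face.bbox`). The sketch below shows one way calling code could bridge the two shapes during a migration. The `Face` dataclass here is a hypothetical stand-in holding only the three fields visible in this diff; the real UniFace class may carry more fields or use different types.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class Face:
    """Hypothetical stand-in for a detection result (not the real uniface class)."""

    bbox: list[float]  # [x1, y1, x2, y2]
    confidence: float
    landmarks: list[list[float]]  # five (x, y) points


def face_from_legacy(data: dict[str, Any]) -> Face:
    """Convert an old dict-style detection into the attribute-style object."""
    return Face(
        bbox=data['bbox'],
        confidence=data['confidence'],
        landmarks=data['landmarks'],
    )


legacy = {
    'bbox': [10.0, 20.0, 110.0, 140.0],
    'confidence': 0.93,
    'landmarks': [[30.0, 50.0]] * 5,
}
face = face_from_legacy(legacy)
print(face.bbox, face.confidence)
```

A dataclass gives attribute access and a readable `repr` with little code, which is why it is used for the stand-in here.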
|
||||
@@ -466,7 +466,7 @@ spoofer = MiniFASNet(model_name=MiniFASNetWeights.V1SE)
|
||||
# Detect and check liveness
|
||||
faces = detector.detect(image)
|
||||
for face in faces:
|
||||
label_idx, score = spoofer.predict(image, face['bbox'])
|
||||
label_idx, score = spoofer.predict(image, face.bbox)
|
||||
# label_idx: 0 = Fake, 1 = Real
|
||||
label = 'Real' if label_idx == 1 else 'Fake'
|
||||
print(f"{label}: {score:.1%}")
|
||||
|
||||
@@ -545,7 +545,7 @@ from uniface.constants import RetinaFaceWeights, SCRFDWeights, YOLOv5FaceWeights
|
||||
# Fast detection (mobile/edge devices)
|
||||
detector = RetinaFace(
|
||||
model_name=RetinaFaceWeights.MNET_025,
|
||||
conf_thresh=0.7
|
||||
confidence_threshold=0.7
|
||||
)
|
||||
|
||||
# Balanced (recommended)
|
||||
@@ -556,14 +556,14 @@ detector = RetinaFace(
|
||||
# Real-time with high accuracy
|
||||
detector = YOLOv5Face(
|
||||
model_name=YOLOv5FaceWeights.YOLOV5S,
|
||||
conf_thresh=0.6,
|
||||
confidence_threshold=0.6,
|
||||
nms_thresh=0.5
|
||||
)
|
||||
|
||||
# High accuracy (server/GPU)
|
||||
detector = SCRFD(
|
||||
model_name=SCRFDWeights.SCRFD_10G_KPS,
|
||||
conf_thresh=0.5
|
||||
confidence_threshold=0.5
|
||||
)
|
||||
```
|
||||
|
||||
@@ -668,14 +668,14 @@ Explore interactive examples for common tasks:
|
||||
|
||||
| Example | Description | Notebook |
|
||||
|---------|-------------|----------|
|
||||
| **Face Detection** | Detect faces and facial landmarks | [face_detection.ipynb](examples/face_detection.ipynb) |
|
||||
| **Face Alignment** | Align and crop faces for recognition | [face_alignment.ipynb](examples/face_alignment.ipynb) |
|
||||
| **Face Recognition** | Extract face embeddings and compare faces | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
|
||||
| **Face Verification** | Compare two faces to verify identity | [face_verification.ipynb](examples/face_verification.ipynb) |
|
||||
| **Face Search** | Find a person in a group photo | [face_search.ipynb](examples/face_search.ipynb) |
|
||||
| **Face Parsing** | Segment face into semantic components | [face_parsing.ipynb](examples/face_parsing.ipynb) |
|
||||
| **Face Anonymization** | Blur or pixelate faces for privacy protection | [face_anonymization.ipynb](examples/face_anonymization.ipynb) |
|
||||
| **Gaze Estimation** | Estimate gaze direction | [gaze_estimation.ipynb](examples/gaze_estimation.ipynb) |
|
||||
| **Face Detection** | Detect faces and facial landmarks | [01_face_detection.ipynb](examples/01_face_detection.ipynb) |
|
||||
| **Face Alignment** | Align and crop faces for recognition | [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) |
|
||||
| **Face Verification** | Compare two faces to verify identity | [03_face_verification.ipynb](examples/03_face_verification.ipynb) |
|
||||
| **Face Search** | Find a person in a group photo | [04_face_search.ipynb](examples/04_face_search.ipynb) |
|
||||
| **Face Analyzer** | All-in-one detection, recognition & attributes | [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) |
|
||||
| **Face Parsing** | Segment face into semantic components | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) |
|
||||
| **Face Anonymization** | Blur or pixelate faces for privacy protection | [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) |
|
||||
| **Gaze Estimation** | Estimate gaze direction | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) |
|
||||
|
||||
### Additional Resources
|
||||
|
||||
|
||||
34
README.md
@@ -321,7 +321,7 @@ detector = RetinaFace()
|
||||
# Create with custom config
|
||||
detector = SCRFD(
|
||||
model_name=SCRFDWeights.SCRFD_10G_KPS, # SCRFDWeights.SCRFD_500M_KPS
|
||||
conf_thresh=0.4,
|
||||
confidence_threshold=0.4,
|
||||
input_size=(640, 640)
|
||||
)
|
||||
# Or with default settings: detector = SCRFD()
|
||||
@@ -340,16 +340,16 @@ from uniface.constants import RetinaFaceWeights, YOLOv5FaceWeights
|
||||
# Detection
|
||||
detector = RetinaFace(
|
||||
model_name=RetinaFaceWeights.MNET_V2,
|
||||
conf_thresh=0.5,
|
||||
nms_thresh=0.4
|
||||
confidence_threshold=0.5,
|
||||
nms_threshold=0.4
|
||||
)
|
||||
# Or detector = RetinaFace()
|
||||
|
||||
# YOLOv5-Face detection
|
||||
detector = YOLOv5Face(
|
||||
model_name=YOLOv5FaceWeights.YOLOV5S,
|
||||
conf_thresh=0.6,
|
||||
nms_thresh=0.5
|
||||
confidence_threshold=0.6,
|
||||
nms_threshold=0.5
|
||||
)
|
||||
# Or detector = YOLOv5Face()
|
||||
|
||||
@@ -365,7 +365,7 @@ recognizer = SphereFace() # Angular softmax alternative
|
||||
from uniface import detect_faces
|
||||
|
||||
# One-line face detection
|
||||
faces = detect_faces(image, method='retinaface', conf_thresh=0.8) # methods: retinaface, scrfd, yolov5face
|
||||
faces = detect_faces(image, method='retinaface', confidence_threshold=0.8) # methods: retinaface, scrfd, yolov5face
|
||||
```
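Callers that still pass the old keyword names (`conf_thresh`, `nms_thresh`) will need updating. As a hedged sketch of one migration aid for calling code, the helper below rewrites old names to the new ones shown in this README diff and emits a `DeprecationWarning`. `migrate_kwargs` and `RENAMED_KWARGS` are hypothetical names; nothing in this diff indicates that UniFace ships such a shim.

```python
from __future__ import annotations

import warnings
from typing import Any

# Old keyword -> new keyword, as renamed in this commit's README diff.
RENAMED_KWARGS = {
    'conf_thresh': 'confidence_threshold',
    'nms_thresh': 'nms_threshold',
}


def migrate_kwargs(kwargs: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of kwargs with the old names replaced (hypothetical helper)."""
    migrated: dict[str, Any] = {}
    for key, value in kwargs.items():
        new_key = RENAMED_KWARGS.get(key, key)
        if new_key in migrated:
            # Both the old and the new spelling were supplied.
            raise TypeError(f'{new_key!r} was given more than once')
        if new_key != key:
            warnings.warn(f'{key!r} is renamed to {new_key!r}', DeprecationWarning, stacklevel=2)
        migrated[new_key] = value
    return migrated


old_settings = {'conf_thresh': 0.5, 'nms_thresh': 0.4, 'input_size': (640, 640)}
print(migrate_kwargs(old_settings))
```

The result could then be unpacked into a constructor call, for example `RetinaFace(**migrate_kwargs(old_settings))`.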
|
||||
|
||||
### Key Parameters (quick reference)
|
||||
@@ -374,9 +374,9 @@ faces = detect_faces(image, method='retinaface', conf_thresh=0.8) # methods: re
|
||||
|
||||
| Class | Key params (defaults) | Notes |
|
||||
| -------------- | ------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------- |
|
||||
| `RetinaFace` | `model_name=RetinaFaceWeights.MNET_V2`, `conf_thresh=0.5`, `nms_thresh=0.4`, `input_size=(640, 640)`, `dynamic_size=False` | Supports 5-point landmarks |
|
||||
| `SCRFD` | `model_name=SCRFDWeights.SCRFD_10G_KPS`, `conf_thresh=0.5`, `nms_thresh=0.4`, `input_size=(640, 640)` | Supports 5-point landmarks |
|
||||
| `YOLOv5Face` | `model_name=YOLOv5FaceWeights.YOLOV5S`, `conf_thresh=0.6`, `nms_thresh=0.5`, `input_size=640` (fixed) | Supports 5-point landmarks; models: YOLOV5N/S/M; `input_size` must be 640 |
|
||||
| `RetinaFace` | `model_name=RetinaFaceWeights.MNET_V2`, `confidence_threshold=0.5`, `nms_threshold=0.4`, `input_size=(640, 640)`, `dynamic_size=False` | Supports 5-point landmarks |
|
||||
| `SCRFD` | `model_name=SCRFDWeights.SCRFD_10G_KPS`, `confidence_threshold=0.5`, `nms_threshold=0.4`, `input_size=(640, 640)` | Supports 5-point landmarks |
|
||||
| `YOLOv5Face` | `model_name=YOLOv5FaceWeights.YOLOV5S`, `confidence_threshold=0.6`, `nms_threshold=0.5`, `input_size=640` (fixed) | Supports 5-point landmarks; models: YOLOV5N/S/M; `input_size` must be 640 |
|
||||
|
||||
**Recognition**
|
||||
|
||||
@@ -454,14 +454,14 @@ Interactive examples covering common face analysis tasks:
|
||||
|
||||
| Example | Description | Notebook |
|
||||
|---------|-------------|----------|
|
||||
| **Face Detection** | Detect faces and facial landmarks | [face_detection.ipynb](examples/face_detection.ipynb) |
|
||||
| **Face Alignment** | Align and crop faces for recognition | [face_alignment.ipynb](examples/face_alignment.ipynb) |
|
||||
| **Face Recognition** | Extract face embeddings and compare faces | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
|
||||
| **Face Verification** | Compare two faces to verify identity | [face_verification.ipynb](examples/face_verification.ipynb) |
|
||||
| **Face Search** | Find a person in a group photo | [face_search.ipynb](examples/face_search.ipynb) |
|
||||
| **Face Parsing** | Segment face into semantic components | [face_parsing.ipynb](examples/face_parsing.ipynb) |
|
||||
| **Face Anonymization** | Blur or pixelate faces for privacy protection | [face_anonymization.ipynb](examples/face_anonymization.ipynb) |
|
||||
| **Gaze Estimation** | Estimate gaze direction from face images | [gaze_estimation.ipynb](examples/gaze_estimation.ipynb) |
|
||||
| **Face Detection** | Detect faces and facial landmarks | [01_face_detection.ipynb](examples/01_face_detection.ipynb) |
|
||||
| **Face Alignment** | Align and crop faces for recognition | [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) |
|
||||
| **Face Verification** | Compare two faces to verify identity | [03_face_verification.ipynb](examples/03_face_verification.ipynb) |
|
||||
| **Face Search** | Find a person in a group photo | [04_face_search.ipynb](examples/04_face_search.ipynb) |
|
||||
| **Face Analyzer** | All-in-one detection, recognition & attributes | [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) |
|
||||
| **Face Parsing** | Segment face into semantic components | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) |
|
||||
| **Face Anonymization** | Blur or pixelate faces for privacy protection | [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) |
|
||||
| **Gaze Estimation** | Estimate gaze direction from face images | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) |
|
||||
|
||||
### Webcam Face Detection
|
||||
|
||||
|
||||
@@ -44,7 +44,7 @@
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"1.6.0\n"
|
||||
"2.0.0\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
@@ -82,8 +82,8 @@
|
||||
],
|
||||
"source": [
|
||||
"detector = RetinaFace(\n",
|
||||
" conf_thresh=0.5,\n",
|
||||
" nms_thresh=0.4,\n",
|
||||
" confidence_threshold=0.5,\n",
|
||||
" nms_threshold=0.4,\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
@@ -48,7 +48,7 @@
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"1.6.0\n"
|
||||
"2.0.0\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
@@ -87,8 +87,8 @@
|
||||
],
|
||||
"source": [
|
||||
"detector = RetinaFace(\n",
|
||||
" conf_thresh=0.5,\n",
|
||||
" nms_thresh=0.4,\n",
|
||||
" confidence_threshold=0.5,\n",
|
||||
" nms_threshold=0.4,\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
@@ -37,7 +37,7 @@
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"1.6.0\n"
|
||||
"2.0.0\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
@@ -78,7 +78,7 @@
|
||||
],
|
||||
"source": [
|
||||
"analyzer = FaceAnalyzer(\n",
|
||||
" detector=RetinaFace(conf_thresh=0.5),\n",
|
||||
" detector=RetinaFace(confidence_threshold=0.5),\n",
|
||||
" recognizer=ArcFace()\n",
|
||||
")"
|
||||
]
|
||||
@@ -42,7 +42,7 @@
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"1.6.0\n"
|
||||
"2.0.0\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
@@ -74,7 +74,7 @@
|
||||
],
|
||||
"source": [
|
||||
"analyzer = FaceAnalyzer(\n",
|
||||
" detector=RetinaFace(conf_thresh=0.5),\n",
|
||||
" detector=RetinaFace(confidence_threshold=0.5),\n",
|
||||
" recognizer=ArcFace()\n",
|
||||
")"
|
||||
]
|
||||
@@ -44,7 +44,7 @@
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"1.6.0\n"
|
||||
"2.0.0\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
@@ -88,7 +88,7 @@
|
||||
],
|
||||
"source": [
|
||||
"analyzer = FaceAnalyzer(\n",
|
||||
" detector=RetinaFace(conf_thresh=0.5),\n",
|
||||
" detector=RetinaFace(confidence_threshold=0.5),\n",
|
||||
" recognizer=ArcFace(),\n",
|
||||
" age_gender=AgeGender()\n",
|
||||
")"
|
||||
@@ -46,7 +46,7 @@
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"UniFace version: 1.6.0\n"
|
||||
"UniFace version: 2.0.0\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
File diff suppressed because one or more lines are too long
@@ -44,7 +44,7 @@
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"UniFace version: 1.6.0\n"
|
||||
"UniFace version: 2.0.0\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
@@ -86,7 +86,7 @@
|
||||
],
|
||||
"source": [
|
||||
"# Initialize face detector\n",
|
||||
"detector = RetinaFace(conf_thresh=0.5)\n",
|
||||
"detector = RetinaFace(confidence_threshold=0.5)\n",
|
||||
"\n",
|
||||
"# Initialize gaze estimator (uses ResNet34 by default)\n",
|
||||
"gaze_estimator = MobileGaze()"
|
||||
@@ -1,6 +1,6 @@
|
||||
[project]
|
||||
name = "uniface"
|
||||
version = "1.6.0"
|
||||
version = "2.0.0"
|
||||
description = "UniFace: A Comprehensive Library for Face Detection, Recognition, Landmark Analysis, Face Parsing, Gaze Estimation, Age, and Gender Detection"
|
||||
readme = "README.md"
|
||||
license = { text = "MIT" }
|
||||
@@ -89,13 +89,60 @@ exclude = [
|
||||
|
||||
[tool.ruff.format]
|
||||
quote-style = "single"
|
||||
|
||||
docstring-code-format = true
|
||||
|
||||
[tool.ruff.lint]
|
||||
select = ["E", "F", "I", "W"]
|
||||
select = [
|
||||
"E", # pycodestyle errors
|
||||
"F", # pyflakes
|
||||
"I", # isort
|
||||
"W", # pycodestyle warnings
|
||||
"UP", # pyupgrade (modern Python syntax)
|
||||
"B", # flake8-bugbear
|
||||
"C4", # flake8-comprehensions
|
||||
"SIM", # flake8-simplify
|
||||
"RUF", # Ruff-specific rules
|
||||
]
|
||||
ignore = [
|
||||
"E501", # Line too long (handled by formatter)
|
||||
"B008", # Function call in default argument (common in FastAPI/Click)
|
||||
"SIM108", # Use ternary operator (can reduce readability)
|
||||
"RUF022", # Allow logical grouping in __all__ instead of alphabetical sorting
|
||||
]
|
||||
|
||||
[tool.ruff.lint.flake8-quotes]
|
||||
docstring-quotes = "double"
|
||||
|
||||
[tool.ruff.lint.isort]
|
||||
force-single-line = false
|
||||
force-sort-within-sections = true
|
||||
known-first-party = ["uniface"]
|
||||
section-order = [
|
||||
"future",
|
||||
"standard-library",
|
||||
"third-party",
|
||||
"first-party",
|
||||
"local-folder",
|
||||
]
|
||||
|
||||
[tool.ruff.lint.pydocstyle]
|
||||
convention = "google"
|
||||
|
||||
[tool.mypy]
|
||||
python_version = "3.11"
|
||||
warn_return_any = false
|
||||
warn_unused_ignores = true
|
||||
ignore_missing_imports = true
|
||||
exclude = ["tests/", "scripts/", "examples/"]
|
||||
# Disable strict return type checking for numpy operations
|
||||
disable_error_code = ["no-any-return"]
|
||||
|
||||
[tool.bandit]
|
||||
exclude_dirs = ["tests", "scripts", "examples"]
|
||||
skips = ["B101", "B614"] # B101: assert, B614: torch.jit.load (models are SHA256 verified)
|
||||
|
||||
[tool.pytest.ini_options]
|
||||
testpaths = ["tests"]
|
||||
python_files = ["test_*.py"]
|
||||
python_functions = ["test_*"]
|
||||
addopts = "-v --tb=short"
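The expanded Ruff `select` list in this hunk enables several rule families beyond the original `E`, `F`, `I`, `W`. As a rough illustration, the function below is written the way those families generally prefer, and the comments describe the older spellings each family typically flags. `collect_scores` is a hypothetical function made up for this example.

```python
from __future__ import annotations


# pyupgrade (UP): `Optional[List[float]]` from `typing` is typically rewritten
# to the built-in generic and union syntax used in this signature.
def collect_scores(faces: list[dict[str, float]], seen: list[float] | None = None) -> list[float]:
    """Gather distinct confidence values above 0.5 (hypothetical example)."""
    # flake8-bugbear (B): a mutable default such as `seen=[]` is flagged,
    # so the default is None and the list is created per call.
    if seen is None:
        seen = []

    # flake8-comprehensions (C4): `list(f['confidence'] for f in faces)` is
    # flagged in favour of a list comprehension.
    scores = [f['confidence'] for f in faces]

    for score in scores:
        # flake8-simplify (SIM): nested `if a:` / `if b:` blocks are flagged
        # in favour of a single combined condition.
        if score > 0.5 and score not in seen:
            seen.append(score)
    return seen


print(collect_scores([{'confidence': 0.9}, {'confidence': 0.3}, {'confidence': 0.9}]))
```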
|
||||
|
||||
@@ -28,9 +28,9 @@ def process_image(detector, image_path: Path, output_path: Path, threshold: floa
|
||||
faces = detector.detect(image)
|
||||
|
||||
# unpack face data for visualization
|
||||
bboxes = [f['bbox'] for f in faces]
|
||||
scores = [f['confidence'] for f in faces]
|
||||
landmarks = [f['landmarks'] for f in faces]
|
||||
bboxes = [f.bbox for f in faces]
|
||||
scores = [f.confidence for f in faces]
|
||||
landmarks = [f.landmarks for f in faces]
|
||||
draw_detections(
|
||||
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
|
||||
)
|
||||
|
||||
@@ -39,17 +39,17 @@ def process_image(
|
||||
if not faces:
|
||||
return
|
||||
|
||||
bboxes = [f['bbox'] for f in faces]
|
||||
scores = [f['confidence'] for f in faces]
|
||||
landmarks = [f['landmarks'] for f in faces]
|
||||
bboxes = [f.bbox for f in faces]
|
||||
scores = [f.confidence for f in faces]
|
||||
landmarks = [f.landmarks for f in faces]
|
||||
draw_detections(
|
||||
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
|
||||
)
|
||||
|
||||
for i, face in enumerate(faces):
|
||||
result = age_gender.predict(image, face['bbox'])
|
||||
result = age_gender.predict(image, face.bbox)
|
||||
print(f' Face {i + 1}: {result.sex}, {result.age} years old')
|
||||
draw_age_gender_label(image, face['bbox'], result.sex, result.age)
|
||||
draw_age_gender_label(image, face.bbox, result.sex, result.age)
|
||||
|
||||
os.makedirs(save_dir, exist_ok=True)
|
||||
output_path = os.path.join(save_dir, f'{Path(image_path).stem}_age_gender.jpg')
|
||||
@@ -74,16 +74,16 @@ def run_webcam(detector, age_gender, threshold: float = 0.6):
|
||||
faces = detector.detect(frame)
|
||||
|
||||
# unpack face data for visualization
|
||||
bboxes = [f['bbox'] for f in faces]
|
||||
scores = [f['confidence'] for f in faces]
|
||||
landmarks = [f['landmarks'] for f in faces]
|
||||
bboxes = [f.bbox for f in faces]
|
||||
scores = [f.confidence for f in faces]
|
||||
landmarks = [f.landmarks for f in faces]
|
||||
draw_detections(
|
||||
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
|
||||
)
|
||||
|
||||
for face in faces:
|
||||
result = age_gender.predict(frame, face['bbox'])
|
||||
draw_age_gender_label(frame, face['bbox'], result.sex, result.age)
|
||||
result = age_gender.predict(frame, face.bbox)
|
||||
draw_age_gender_label(frame, face.bbox, result.sex, result.age)
|
||||
|
||||
cv2.putText(
|
||||
frame,
|
||||
|
||||
@@ -33,9 +33,9 @@ def process_image(
|
||||
from uniface.visualization import draw_detections
|
||||
|
||||
preview = image.copy()
|
||||
bboxes = [face['bbox'] for face in faces]
|
||||
scores = [face['confidence'] for face in faces]
|
||||
landmarks = [face['landmarks'] for face in faces]
|
||||
bboxes = [face.bbox for face in faces]
|
||||
scores = [face.confidence for face in faces]
|
||||
landmarks = [face.landmarks for face in faces]
|
||||
draw_detections(preview, bboxes, scores, landmarks)
|
||||
|
||||
# Show preview
|
||||
@@ -157,7 +157,7 @@ Examples:
|
||||
|
||||
# Detection
|
||||
parser.add_argument(
|
||||
'--conf-thresh',
|
||||
'--confidence-threshold',
|
||||
type=float,
|
||||
default=0.5,
|
||||
help='Detection confidence threshold (default: 0.5)',
|
||||
@@ -183,8 +183,8 @@ Examples:
|
||||
color = tuple(color_values)
|
||||
|
||||
# Initialize detector
|
||||
print(f'Initializing face detector (conf_thresh={args.conf_thresh})...')
|
||||
detector = RetinaFace(conf_thresh=args.conf_thresh)
|
||||
print(f'Initializing face detector (confidence_threshold={args.confidence_threshold})...')
|
||||
detector = RetinaFace(confidence_threshold=args.confidence_threshold)
|
||||
|
||||
# Initialize blurrer
|
||||
print(f'Initializing blur method: {args.method}')
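The flag rename also changes the attribute name on the parsed namespace, because `argparse` derives the destination from the long option by dropping the leading dashes and replacing the remaining hyphens with underscores. A minimal sketch, assuming a trimmed-down parser that carries only this one option:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with only the renamed option (illustration only)."""
    parser = argparse.ArgumentParser(description='Trimmed-down parser for illustration')
    parser.add_argument(
        '--confidence-threshold',
        type=float,
        default=0.5,
        help='Detection confidence threshold (default: 0.5)',
    )
    return parser


# '--confidence-threshold' is stored as args.confidence_threshold.
args = build_parser().parse_args(['--confidence-threshold', '0.7'])
print(args.confidence_threshold)
```

The old `--conf-thresh` spelling is not a prefix of the new flag, so argparse's abbreviation matching does not accept it and existing command lines need updating.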
|
||||
|
||||
@@ -1,6 +1,15 @@
-# Face detection on image or webcam
-# Usage: python run_detection.py --image path/to/image.jpg
-#        python run_detection.py --webcam
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Face detection on image or webcam.
+
+Usage:
+    python run_detection.py --image path/to/image.jpg
+    python run_detection.py --webcam
+"""
+
+from __future__ import annotations

 import argparse
 import os
@@ -20,9 +29,9 @@ def process_image(detector, image_path: str, threshold: float = 0.6, save_dir: s
     faces = detector.detect(image)

     if faces:
-        bboxes = [face['bbox'] for face in faces]
-        scores = [face['confidence'] for face in faces]
-        landmarks = [face['landmarks'] for face in faces]
+        bboxes = [face.bbox for face in faces]
+        scores = [face.confidence for face in faces]
+        landmarks = [face.landmarks for face in faces]
         draw_detections(image, bboxes, scores, landmarks, vis_threshold=threshold)

     os.makedirs(save_dir, exist_ok=True)
@@ -48,9 +57,9 @@ def run_webcam(detector, threshold: float = 0.6):
         faces = detector.detect(frame)

         # unpack face data for visualization
-        bboxes = [f['bbox'] for f in faces]
-        scores = [f['confidence'] for f in faces]
-        landmarks = [f['landmarks'] for f in faces]
+        bboxes = [f.bbox for f in faces]
+        scores = [f.confidence for f in faces]
+        landmarks = [f.landmarks for f in faces]
         draw_detections(
             image=frame,
             bboxes=bboxes,

@@ -39,17 +39,17 @@ def process_image(
     if not faces:
         return

-    bboxes = [f['bbox'] for f in faces]
-    scores = [f['confidence'] for f in faces]
-    landmarks = [f['landmarks'] for f in faces]
+    bboxes = [f.bbox for f in faces]
+    scores = [f.confidence for f in faces]
+    landmarks = [f.landmarks for f in faces]
     draw_detections(
         image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
     )

     for i, face in enumerate(faces):
-        emotion, confidence = emotion_predictor.predict(image, face['landmarks'])
+        emotion, confidence = emotion_predictor.predict(image, face.landmarks)
         print(f' Face {i + 1}: {emotion} (confidence: {confidence:.3f})')
-        draw_emotion_label(image, face['bbox'], emotion, confidence)
+        draw_emotion_label(image, face.bbox, emotion, confidence)

     os.makedirs(save_dir, exist_ok=True)
     output_path = os.path.join(save_dir, f'{Path(image_path).stem}_emotion.jpg')
@@ -74,14 +74,16 @@ def run_webcam(detector, emotion_predictor, threshold: float = 0.6):
         faces = detector.detect(frame)

         # unpack face data for visualization
-        bboxes = [f['bbox'] for f in faces]
-        scores = [f['confidence'] for f in faces]
-        landmarks = [f['landmarks'] for f in faces]
-        draw_detections(frame, bboxes, scores, landmarks, vis_threshold=threshold)
+        bboxes = [f.bbox for f in faces]
+        scores = [f.confidence for f in faces]
+        landmarks = [f.landmarks for f in faces]
+        draw_detections(
+            image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
+        )

         for face in faces:
-            emotion, confidence = emotion_predictor.predict(frame, face['landmarks'])
-            draw_emotion_label(frame, face['bbox'], emotion, confidence)
+            emotion, confidence = emotion_predictor.predict(frame, face.landmarks)
+            draw_emotion_label(frame, face.bbox, emotion, confidence)

         cv2.putText(
             frame,

@@ -7,6 +7,7 @@ import os
 from pathlib import Path

 import cv2
+import numpy as np

 from uniface import RetinaFace
 from uniface.constants import ParsingWeights
@@ -14,7 +15,49 @@ from uniface.parsing import BiSeNet
 from uniface.visualization import vis_parsing_maps


-def process_image(detector, parser, image_path: str, save_dir: str = 'outputs'):
+def expand_bbox(
+    bbox: np.ndarray,
+    image_shape: tuple[int, int],
+    expand_ratio: float = 0.2,
+    expand_top_ratio: float = 0.4,
+) -> tuple[int, int, int, int]:
+    """
+    Expand bounding box to include full head region for face parsing.
+
+    Face detection typically returns tight face boxes, but face parsing
+    requires the full head including hair, ears, and neck.
+
+    Args:
+        bbox: Original bounding box [x1, y1, x2, y2].
+        image_shape: Image dimensions as (height, width).
+        expand_ratio: Expansion ratio for left, right, and bottom (default: 0.2 = 20%).
+        expand_top_ratio: Expansion ratio for top to capture hair/forehead (default: 0.4 = 40%).
+
+    Returns:
+        Tuple[int, int, int, int]: Expanded bbox (x1, y1, x2, y2) clamped to image bounds.
+    """
+    x1, y1, x2, y2 = map(int, bbox[:4])
+    height, width = image_shape[:2]
+
+    # Calculate face dimensions
+    face_width = x2 - x1
+    face_height = y2 - y1
+
+    # Calculate expansion amounts
+    expand_x = int(face_width * expand_ratio)
+    expand_y_bottom = int(face_height * expand_ratio)
+    expand_y_top = int(face_height * expand_top_ratio)
+
+    # Expand and clamp to image boundaries
+    new_x1 = max(0, x1 - expand_x)
+    new_y1 = max(0, y1 - expand_y_top)
+    new_x2 = min(width, x2 + expand_x)
+    new_y2 = min(height, y2 + expand_y_bottom)
+
+    return new_x1, new_y1, new_x2, new_y2
+
+
+def process_image(detector, parser, image_path: str, save_dir: str = 'outputs', expand_ratio: float = 0.2):
     image = cv2.imread(image_path)
     if image is None:
         print(f"Error: Failed to load image from '{image_path}'")
@@ -26,8 +69,8 @@ def process_image(detector, parser, image_path: str, save_dir: str = 'outputs'):
     result_image = image.copy()

     for i, face in enumerate(faces):
-        bbox = face['bbox']
-        x1, y1, x2, y2 = map(int, bbox[:4])
+        # Expand bbox to include full head for parsing
+        x1, y1, x2, y2 = expand_bbox(face.bbox, image.shape, expand_ratio=expand_ratio)
         face_crop = image[y1:y2, x1:x2]

         if face_crop.size == 0:
@@ -44,7 +87,7 @@ def process_image(detector, parser, image_path: str, save_dir: str = 'outputs'):
         # Place the visualization back on the original image
         result_image[y1:y2, x1:x2] = vis_result

-        # Draw bounding box
+        # Draw expanded bounding box
         cv2.rectangle(result_image, (x1, y1), (x2, y2), (0, 255, 0), 2)

     os.makedirs(save_dir, exist_ok=True)
@@ -53,7 +96,7 @@ def process_image(detector, parser, image_path: str, save_dir: str = 'outputs'):
     print(f'Output saved: {output_path}')


-def run_webcam(detector, parser):
+def run_webcam(detector, parser, expand_ratio: float = 0.2):
     cap = cv2.VideoCapture(0)
     if not cap.isOpened():
         print('Cannot open webcam')
@@ -70,8 +113,8 @@ def run_webcam(detector, parser):
         faces = detector.detect(frame)

         for face in faces:
-            bbox = face['bbox']
-            x1, y1, x2, y2 = map(int, bbox[:4])
+            # Expand bbox to include full head for parsing
+            x1, y1, x2, y2 = expand_bbox(face.bbox, frame.shape, expand_ratio=expand_ratio)
             face_crop = frame[y1:y2, x1:x2]

             if face_crop.size == 0:
@@ -87,7 +130,7 @@ def run_webcam(detector, parser):
             # Place the visualization back on the frame
             frame[y1:y2, x1:x2] = vis_result

-            # Draw bounding box
+            # Draw expanded bounding box
             cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)

         cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
@@ -108,6 +151,12 @@ def main():
     parser_arg.add_argument(
         '--model', type=str, default=ParsingWeights.RESNET18, choices=[ParsingWeights.RESNET18, ParsingWeights.RESNET34]
     )
+    parser_arg.add_argument(
+        '--expand-ratio',
+        type=float,
+        default=0.2,
+        help='Bbox expansion ratio for full head coverage (default: 0.2 = 20%%)',
+    )
     args = parser_arg.parse_args()

     if not args.image and not args.webcam:
@@ -117,9 +166,9 @@ def main():
     parser = BiSeNet(model_name=ParsingWeights.RESNET34)

     if args.webcam:
-        run_webcam(detector, parser)
+        run_webcam(detector, parser, expand_ratio=args.expand_ratio)
     else:
-        process_image(detector, parser, args.image, args.save_dir)
+        process_image(detector, parser, args.image, args.save_dir, expand_ratio=args.expand_ratio)


 if __name__ == '__main__':

||||
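Reviewer note: the expansion arithmetic that `expand_bbox` adds in this file can be checked by hand. The sketch below re-declares the function so it runs standalone; it takes a plain sequence instead of a NumPy array (an assumption made for portability), and the sample box and frame size are made-up values.

```python
from typing import Sequence


def expand_bbox(
    bbox: Sequence[float],
    image_shape: Sequence[int],
    expand_ratio: float = 0.2,
    expand_top_ratio: float = 0.4,
) -> tuple:
    """Grow a tight face box toward the full head, clamped to the image."""
    x1, y1, x2, y2 = map(int, bbox[:4])
    height, width = image_shape[:2]

    face_width = x2 - x1
    face_height = y2 - y1

    # Sides and bottom use expand_ratio; the top uses the larger expand_top_ratio.
    expand_x = int(face_width * expand_ratio)
    expand_y_bottom = int(face_height * expand_ratio)
    expand_y_top = int(face_height * expand_top_ratio)

    return (
        max(0, x1 - expand_x),
        max(0, y1 - expand_y_top),
        min(width, x2 + expand_x),
        min(height, y2 + expand_y_bottom),
    )


# Hypothetical 100x120 face box inside a frame of height 240 and width 320.
expanded = expand_bbox([100.0, 100.0, 200.0, 220.0], (240, 320, 3))
print(expanded)  # (80, 52, 220, 240)
```

With the defaults, the box grows by 20 px on each side, 48 px upward, and 24 px downward; the bottom edge would land at 244 and is clamped to the frame height of 240.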
@@ -29,7 +29,7 @@ def extract_reference_embedding(detector, recognizer, image_path: str) -> np.nda
     if not faces:
         raise RuntimeError('No faces found in reference image.')

-    landmarks = faces[0]['landmarks']
+    landmarks = faces[0].landmarks
     return recognizer.get_normalized_embedding(image, landmarks)


@@ -49,8 +49,8 @@ def run_webcam(detector, recognizer, ref_embedding: np.ndarray, threshold: float
         faces = detector.detect(frame)

         for face in faces:
-            bbox = face['bbox']
-            landmarks = face['landmarks']
+            bbox = face.bbox
+            landmarks = face.landmarks
             x1, y1, x2, y2 = map(int, bbox)

             embedding = recognizer.get_normalized_embedding(frame, landmarks)

@@ -24,7 +24,7 @@ def process_image(detector, gaze_estimator, image_path: str, save_dir: str = 'ou
     print(f'Detected {len(faces)} face(s)')

     for i, face in enumerate(faces):
-        bbox = face['bbox']
+        bbox = face.bbox
         x1, y1, x2, y2 = map(int, bbox[:4])
         face_crop = image[y1:y2, x1:x2]

@@ -60,7 +60,7 @@ def run_webcam(detector, gaze_estimator):
         faces = detector.detect(frame)

         for face in faces:
-            bbox = face['bbox']
+            bbox = face.bbox
             x1, y1, x2, y2 = map(int, bbox[:4])
             face_crop = frame[y1:y2, x1:x2]


@@ -24,7 +24,7 @@ def process_image(detector, landmarker, image_path: str, save_dir: str = 'output
         return

     for i, face in enumerate(faces):
-        bbox = face['bbox']
+        bbox = face.bbox
         x1, y1, x2, y2 = map(int, bbox)
         cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)

@@ -67,7 +67,7 @@ def run_webcam(detector, landmarker):
         faces = detector.detect(frame)

         for face in faces:
-            bbox = face['bbox']
+            bbox = face.bbox
             x1, y1, x2, y2 = map(int, bbox)
             cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)


@@ -70,13 +70,13 @@ def process_image(detector, spoofer, image_path: str, save_dir: str = 'outputs')

     # Run anti-spoofing on each face
     for i, face in enumerate(faces, 1):
-        label_idx, score = spoofer.predict(image, face['bbox'])
+        label_idx, score = spoofer.predict(image, face.bbox)
         # label_idx: 0 = Fake, 1 = Real
         label = 'Real' if label_idx == 1 else 'Fake'
         print(f' Face {i}: {label} ({score:.1%})')

         # Draw result on image
-        draw_spoofing_result(image, face['bbox'], label_idx, score)
+        draw_spoofing_result(image, face.bbox, label_idx, score)

     # Save output
     os.makedirs(save_dir, exist_ok=True)
@@ -128,8 +128,8 @@ def process_video(detector, spoofer, source, save_dir: str = 'outputs') -> None:

         # Run anti-spoofing on each face
         for face in faces:
-            label_idx, score = spoofer.predict(frame, face['bbox'])
-            draw_spoofing_result(frame, face['bbox'], label_idx, score)
+            label_idx, score = spoofer.predict(frame, face.bbox)
+            draw_spoofing_result(frame, face.bbox, label_idx, score)

         # Write frame
         writer.write(frame)

@@ -52,9 +52,9 @@ def process_video(
         faces = detector.detect(frame)
         total_faces += len(faces)

-        bboxes = [f['bbox'] for f in faces]
-        scores = [f['confidence'] for f in faces]
-        landmarks = [f['landmarks'] for f in faces]
+        bboxes = [f.bbox for f in faces]
+        scores = [f.confidence for f in faces]
+        landmarks = [f.landmarks for f in faces]
         draw_detections(
             image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
         )

@@ -1,3 +1,11 @@
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Tests for AgeGender attribute predictor."""
+
+from __future__ import annotations
+
 import numpy as np
 import pytest


@@ -1,3 +1,11 @@
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Tests for factory functions (create_detector, create_recognizer, etc.)."""
+
+from __future__ import annotations
+
 import numpy as np
 import pytest

@@ -35,8 +43,8 @@ def test_create_detector_with_config():
     detector = create_detector(
         'retinaface',
         model_name=RetinaFaceWeights.MNET_V2,
-        conf_thresh=0.8,
-        nms_thresh=0.3,
+        confidence_threshold=0.8,
+        nms_threshold=0.3,
     )
     assert detector is not None, 'Failed to create detector with custom config'

@@ -53,7 +61,7 @@ def test_create_detector_scrfd_with_model():
     """
     Test creating SCRFD detector with specific model.
     """
-    detector = create_detector('scrfd', model_name=SCRFDWeights.SCRFD_10G_KPS, conf_thresh=0.5)
+    detector = create_detector('scrfd', model_name=SCRFDWeights.SCRFD_10G_KPS, confidence_threshold=0.5)
     assert detector is not None, 'Failed to create SCRFD with specific model'


@@ -141,13 +149,13 @@ def test_detect_faces_with_threshold():
     Test detect_faces with custom confidence threshold.
     """
     mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
-    faces = detect_faces(mock_image, method='retinaface', conf_thresh=0.8)
+    faces = detect_faces(mock_image, method='retinaface', confidence_threshold=0.8)

     assert isinstance(faces, list), 'detect_faces should return a list'

     # All detections should respect threshold
     for face in faces:
-        assert face['confidence'] >= 0.8, 'All detections should meet confidence threshold'
+        assert face.confidence >= 0.8, 'All detections should meet confidence threshold'


 def test_detect_faces_default_method():
@@ -246,8 +254,8 @@ def test_detector_with_different_configs():
     """
     Test creating multiple detectors with different configurations.
     """
-    detector_high_thresh = create_detector('retinaface', conf_thresh=0.9)
-    detector_low_thresh = create_detector('retinaface', conf_thresh=0.3)
+    detector_high_thresh = create_detector('retinaface', confidence_threshold=0.9)
+    detector_low_thresh = create_detector('retinaface', confidence_threshold=0.3)

     mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)


@@ -1,3 +1,11 @@
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Tests for 106-point facial landmark detector."""
+
+from __future__ import annotations
+
 import numpy as np
 import pytest


@@ -2,6 +2,10 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo

+"""Tests for BiSeNet face parsing model."""
+
+from __future__ import annotations
+
 import numpy as np
 import pytest


@@ -1,3 +1,11 @@
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Tests for face recognition models (ArcFace, MobileFace, SphereFace)."""
+
+from __future__ import annotations
+
 import numpy as np
 import pytest


@@ -1,3 +1,11 @@
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Tests for RetinaFace detector."""
+
+from __future__ import annotations
+
 import numpy as np
 import pytest

@@ -9,9 +17,9 @@ from uniface.detection import RetinaFace
 def retinaface_model():
     return RetinaFace(
         model_name=RetinaFaceWeights.MNET_V2,
-        conf_thresh=0.5,
+        confidence_threshold=0.5,
         pre_nms_topk=5000,
-        nms_thresh=0.4,
+        nms_threshold=0.4,
         post_nms_topk=750,
     )

@@ -27,15 +35,15 @@ def test_inference_on_640x640_image(retinaface_model):
     assert isinstance(faces, list), 'Detections should be a list.'

     for face in faces:
-        assert isinstance(face, dict), 'Each detection should be a dictionary.'
-        assert 'bbox' in face, "Each detection should have a 'bbox' key."
-        assert 'confidence' in face, "Each detection should have a 'confidence' key."
-        assert 'landmarks' in face, "Each detection should have a 'landmarks' key."
+        # Face is a dataclass, check attributes exist
+        assert hasattr(face, 'bbox'), "Each detection should have a 'bbox' attribute."
+        assert hasattr(face, 'confidence'), "Each detection should have a 'confidence' attribute."
+        assert hasattr(face, 'landmarks'), "Each detection should have a 'landmarks' attribute."

-        bbox = face['bbox']
+        bbox = face.bbox
         assert len(bbox) == 4, 'BBox should have 4 values (x1, y1, x2, y2).'

-        landmarks = face['landmarks']
+        landmarks = face.landmarks
         assert len(landmarks) == 5, 'Should have 5 landmark points.'
         assert all(len(pt) == 2 for pt in landmarks), 'Each landmark should be (x, y).'

@@ -45,7 +53,7 @@ def test_confidence_threshold(retinaface_model):
     faces = retinaface_model.detect(mock_image)

     for face in faces:
-        confidence = face['confidence']
+        confidence = face.confidence
         assert confidence >= 0.5, f'Detection has confidence {confidence} below threshold 0.5'



||||
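Reviewer note: the updated assertions rely on `Face` exposing `bbox`, `confidence`, and `landmarks` as attributes. A minimal sketch of that access pattern, assuming a plain dataclass; the `Face` below is a hypothetical stand-in, not the class from `uniface`, which may carry more fields or extra methods:

```python
from dataclasses import dataclass


@dataclass
class Face:
    """Hypothetical stand-in with only the three fields the tests check."""

    bbox: list
    confidence: float
    landmarks: list


face = Face(bbox=[10.0, 20.0, 110.0, 140.0], confidence=0.93, landmarks=[[30.0, 60.0]] * 5)

# New style: attribute access.
x1, y1, x2, y2 = map(int, face.bbox)

# Old style: a plain dataclass defines no __getitem__, so dict-style indexing fails.
try:
    face['bbox']
    subscriptable = True
except TypeError:
    subscriptable = False
```

If the real `Face` behaves like this sketch, any caller still using `face['bbox']` fails with a `TypeError` after upgrading, which would explain why every script and test in this commit switches to attribute access.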
@@ -1,3 +1,11 @@
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Tests for SCRFD detector."""
+
+from __future__ import annotations
+
 import numpy as np
 import pytest

@@ -9,8 +17,8 @@ from uniface.detection import SCRFD
 def scrfd_model():
     return SCRFD(
         model_name=SCRFDWeights.SCRFD_500M_KPS,
-        conf_thresh=0.5,
-        nms_thresh=0.4,
+        confidence_threshold=0.5,
+        nms_threshold=0.4,
     )


@@ -25,15 +33,15 @@ def test_inference_on_640x640_image(scrfd_model):
     assert isinstance(faces, list), 'Detections should be a list.'

     for face in faces:
-        assert isinstance(face, dict), 'Each detection should be a dictionary.'
-        assert 'bbox' in face, "Each detection should have a 'bbox' key."
-        assert 'confidence' in face, "Each detection should have a 'confidence' key."
-        assert 'landmarks' in face, "Each detection should have a 'landmarks' key."
+        # Face is a dataclass, check attributes exist
+        assert hasattr(face, 'bbox'), "Each detection should have a 'bbox' attribute."
+        assert hasattr(face, 'confidence'), "Each detection should have a 'confidence' attribute."
+        assert hasattr(face, 'landmarks'), "Each detection should have a 'landmarks' attribute."

-        bbox = face['bbox']
+        bbox = face.bbox
         assert len(bbox) == 4, 'BBox should have 4 values (x1, y1, x2, y2).'

-        landmarks = face['landmarks']
+        landmarks = face.landmarks
         assert len(landmarks) == 5, 'Should have 5 landmark points.'
         assert all(len(pt) == 2 for pt in landmarks), 'Each landmark should be (x, y).'

@@ -43,7 +51,7 @@ def test_confidence_threshold(scrfd_model):
     faces = scrfd_model.detect(mock_image)

     for face in faces:
-        confidence = face['confidence']
+        confidence = face.confidence
         assert confidence >= 0.5, f'Detection has confidence {confidence} below threshold 0.5'


@@ -63,7 +71,7 @@ def test_different_input_sizes(scrfd_model):


 def test_scrfd_10g_model():
-    model = SCRFD(model_name=SCRFDWeights.SCRFD_10G_KPS, conf_thresh=0.5)
+    model = SCRFD(model_name=SCRFDWeights.SCRFD_10G_KPS, confidence_threshold=0.5)
     assert model is not None, 'SCRFD 10G model initialization failed.'

     mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)

@@ -1,3 +1,11 @@
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Tests for utility functions (compute_similarity, face_alignment, etc.)."""
+
+from __future__ import annotations
+
 import numpy as np
 import pytest

@@ -116,7 +124,7 @@ def test_compute_similarity_dtype():
     emb2 = emb2 / np.linalg.norm(emb2)

     similarity = compute_similarity(emb1, emb2)
-    assert isinstance(similarity, (float, np.floating)), f'Similarity should be float, got {type(similarity)}'
+    assert isinstance(similarity, float | np.floating), f'Similarity should be float, got {type(similarity)}'


 # face_alignment tests
@@ -259,4 +267,4 @@ def test_compute_similarity_with_recognition_embeddings():

     # Should be a valid similarity score
     assert -1.0 <= similarity <= 1.0
-    assert isinstance(similarity, (float, np.floating))
+    assert isinstance(similarity, float | np.floating)

||||
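Reviewer note: `isinstance(x, float | np.floating)` is evaluated at runtime, so `from __future__ import annotations` does not make it work on older interpreters; the union form needs Python 3.10 or later. A small sketch using only built-in types (substituting `int` for `np.floating` is an assumption made to keep it dependency-free):

```python
import sys

value = 0.5

# Tuple form: accepted by isinstance() on every Python 3 version.
tuple_check = isinstance(value, (float, int))

# Union form (PEP 604): accepted by isinstance() from Python 3.10 onward.
if sys.version_info >= (3, 10):
    union_check = isinstance(value, float | int)
else:
    union_check = tuple_check  # fall back where `float | int` would raise TypeError
```

Both forms give the same answer; the change is stylistic as long as the supported Python floor is 3.10 or higher.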
@@ -11,10 +11,24 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.

+"""UniFace: A comprehensive library for face analysis.
+
+This library provides unified APIs for:
+- Face detection (RetinaFace, SCRFD, YOLOv5Face)
+- Face recognition (ArcFace, MobileFace, SphereFace)
+- Facial landmarks (106-point detection)
+- Face parsing (semantic segmentation)
+- Gaze estimation
+- Age, gender, and emotion prediction
+- Face anti-spoofing
+- Privacy/anonymization
+"""
+
+from __future__ import annotations
+
 __license__ = 'MIT'
 __author__ = 'Yakhyokhuja Valikhujaev'
-__version__ = '1.6.0'
-
+__version__ = '2.0.0'

 from uniface.face_utils import compute_similarity, face_alignment
 from uniface.log import Logger, enable_logging
@@ -23,12 +37,6 @@ from uniface.visualization import draw_detections, vis_parsing_maps

 from .analyzer import FaceAnalyzer
 from .attribute import AgeGender, AttributeResult, FairFace
-from .face import Face
-
-try:
-    from .attribute import Emotion
-except ImportError:
-    Emotion = None  # PyTorch not installed
 from .detection import (
     SCRFD,
     RetinaFace,
@@ -37,6 +45,7 @@ from .detection import (
     detect_faces,
     list_available_detectors,
 )
+from .face import Face
 from .gaze import MobileGaze, create_gaze_estimator
 from .landmark import Landmark106, create_landmarker
 from .parsing import BiSeNet, create_face_parser
@@ -44,7 +53,15 @@ from .privacy import BlurFace, anonymize_faces
 from .recognition import ArcFace, MobileFace, SphereFace, create_recognizer
 from .spoofing import MiniFASNet, create_spoofer

+# Optional: Emotion requires PyTorch
+Emotion: type | None
+try:
+    from .attribute import Emotion
+except ImportError:
+    Emotion = None
+
 __all__ = [
+    # Metadata
     '__author__',
     '__license__',
     '__version__',
@@ -85,11 +102,11 @@ __all__ = [
     'BlurFace',
     'anonymize_faces',
     # Utilities
+    'Logger',
     'compute_similarity',
     'draw_detections',
-    'vis_parsing_maps',
+    'enable_logging',
     'face_alignment',
     'verify_model_weights',
-    'Logger',
-    'enable_logging',
+    'vis_parsing_maps',
 ]

||||
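Reviewer note: the `Emotion` block in this file is an optional-import fallback. A minimal sketch of the pattern follows; `hypothetical_heavy_backend`, `HeavyFeature`, and `create_heavy_feature` are made-up names standing in for an optional dependency, and the guard function is an illustration rather than anything in this commit:

```python
# Try the optional dependency; fall back to None so the package still imports.
try:
    from hypothetical_heavy_backend import HeavyFeature  # made-up module name
except ImportError:
    HeavyFeature = None


def create_heavy_feature():
    """Raise a clear error instead of "'NoneType' object is not callable"."""
    if HeavyFeature is None:
        raise RuntimeError('HeavyFeature requires the optional backend to be installed')
    return HeavyFeature()
```

Callers that import the name get `None` when the dependency is missing, so any code path that instantiates it should check for `None` first.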
@@ -2,7 +2,7 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo

-from typing import List, Optional
+from __future__ import annotations

 import numpy as np

@@ -17,14 +17,32 @@ __all__ = ['FaceAnalyzer']


 class FaceAnalyzer:
-    """Unified face analyzer combining detection, recognition, and attributes."""
+    """Unified face analyzer combining detection, recognition, and attributes.
+
+    This class provides a high-level interface for face analysis by combining
+    multiple components: face detection, recognition (embedding extraction),
+    and attribute prediction (age, gender, race).
+
+    Args:
+        detector: Face detector instance for detecting faces in images.
+        recognizer: Optional face recognizer for extracting embeddings.
+        age_gender: Optional age/gender predictor.
+        fairface: Optional FairFace predictor for demographics.
+
+    Example:
+        >>> from uniface import RetinaFace, ArcFace, FaceAnalyzer
+        >>> detector = RetinaFace()
+        >>> recognizer = ArcFace()
+        >>> analyzer = FaceAnalyzer(detector, recognizer=recognizer)
+        >>> faces = analyzer.analyze(image)
+    """

     def __init__(
         self,
         detector: BaseDetector,
-        recognizer: Optional[BaseRecognizer] = None,
-        age_gender: Optional[AgeGender] = None,
-        fairface: Optional[FairFace] = None,
+        recognizer: BaseRecognizer | None = None,
+        age_gender: AgeGender | None = None,
+        fairface: FairFace | None = None,
     ) -> None:
         self.detector = detector
         self.recognizer = recognizer
@@ -39,8 +57,18 @@ class FaceAnalyzer:
         if fairface:
             Logger.info(f' - FairFace enabled: {fairface.__class__.__name__}')

-    def analyze(self, image: np.ndarray) -> List[Face]:
-        """Analyze faces in an image."""
+    def analyze(self, image: np.ndarray) -> list[Face]:
+        """Analyze faces in an image.
+
+        Performs face detection and optionally extracts embeddings and
+        predicts attributes for each detected face.
+
+        Args:
+            image: Input image as numpy array with shape (H, W, C) in BGR format.
+
+        Returns:
+            List of Face objects with detection results and any predicted attributes.
+        """
         faces = self.detector.detect(image)
         Logger.debug(f'Detected {len(faces)} face(s)')


@@ -2,7 +2,9 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo

-from typing import Any, Dict, List, Union
+from __future__ import annotations
+
+from typing import Any

 import numpy as np

@@ -10,6 +12,7 @@ from uniface.attribute.age_gender import AgeGender
 from uniface.attribute.base import Attribute, AttributeResult
 from uniface.attribute.fairface import FairFace
 from uniface.constants import AgeGenderWeights, DDAMFNWeights, FairFaceWeights
+from uniface.face import Face

 # Emotion requires PyTorch - make it optional
 try:
@@ -32,17 +35,17 @@ __all__ = [

 # A mapping from model enums to their corresponding attribute classes
 _ATTRIBUTE_MODELS = {
-    **{model: AgeGender for model in AgeGenderWeights},
-    **{model: FairFace for model in FairFaceWeights},
+    **dict.fromkeys(AgeGenderWeights, AgeGender),
+    **dict.fromkeys(FairFaceWeights, FairFace),
 }

 # Add Emotion models only if PyTorch is available
 if _EMOTION_AVAILABLE:
-    _ATTRIBUTE_MODELS.update({model: Emotion for model in DDAMFNWeights})
+    _ATTRIBUTE_MODELS.update(dict.fromkeys(DDAMFNWeights, Emotion))


 def create_attribute_predictor(
-    model_name: Union[AgeGenderWeights, DDAMFNWeights, FairFaceWeights], **kwargs: Any
+    model_name: AgeGenderWeights | DDAMFNWeights | FairFaceWeights, **kwargs: Any
 ) -> Attribute:
     """
     Factory function to create an attribute predictor instance.
||||
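Reviewer note: swapping the dict comprehensions for `dict.fromkeys` should not change behavior, since iterating an `Enum` class yields its members in definition order. A quick sketch with hypothetical stand-ins (`DemoWeights` and `DemoModel` are illustrative names, not part of uniface):

```python
from enum import Enum


class DemoWeights(Enum):  # hypothetical stand-in for a weights enum
    SMALL = 'demo_small'
    LARGE = 'demo_large'


class DemoModel:  # hypothetical stand-in for a predictor class
    pass


# Old spelling: one comprehension per enum.
via_comprehension = {model: DemoModel for model in DemoWeights}

# New spelling: every enum member maps to the same value object.
via_fromkeys = dict.fromkeys(DemoWeights, DemoModel)
```

`dict.fromkeys` shares a single value object across all keys. That is harmless here because the value is a class, but it would be a pitfall with a mutable value such as a list.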
@@ -75,46 +78,36 @@ def create_attribute_predictor(
     return model_class(model_name=model_name, **kwargs)


-def predict_attributes(
-    image: np.ndarray, detections: List[Dict[str, np.ndarray]], predictor: Attribute
-) -> List[Dict[str, Any]]:
+def predict_attributes(image: np.ndarray, faces: list[Face], predictor: Attribute) -> list[Face]:
     """
     High-level API to predict attributes for multiple detected faces.

-    This function iterates through a list of face detections, runs the
-    specified attribute predictor on each one, and appends the results back
-    into the detection dictionary.
+    This function iterates through a list of Face objects, runs the
+    specified attribute predictor on each one, and updates the Face
+    objects with the predicted attributes.

     Args:
         image (np.ndarray): The full input image in BGR format.
-        detections (List[Dict]): A list of detection results, where each dict
-            must contain a 'bbox' and optionally 'landmark'.
+        faces (List[Face]): A list of Face objects from face detection.
         predictor (Attribute): An initialized attribute predictor instance,
             created by `create_attribute_predictor`.

     Returns:
-        The list of detections, where each dictionary is updated with a new
-        'attributes' key containing the prediction result.
+        List[Face]: The list of Face objects with updated attribute fields.
     """
-    for face in detections:
-        # Initialize attributes dict if it doesn't exist
-        if 'attributes' not in face:
-            face['attributes'] = {}
-
+    for face in faces:
         if isinstance(predictor, AgeGender):
-            result = predictor(image, face['bbox'])
-            face['attributes']['gender'] = result.gender
-            face['attributes']['sex'] = result.sex
-            face['attributes']['age'] = result.age
+            result = predictor(image, face.bbox)
+            face.gender = result.gender
+            face.age = result.age
         elif isinstance(predictor, FairFace):
-            result = predictor(image, face['bbox'])
-            face['attributes']['gender'] = result.gender
-            face['attributes']['sex'] = result.sex
-            face['attributes']['age_group'] = result.age_group
-            face['attributes']['race'] = result.race
+            result = predictor(image, face.bbox)
+            face.gender = result.gender
+            face.age_group = result.age_group
+            face.race = result.race
         elif isinstance(predictor, Emotion):
-            emotion, confidence = predictor(image, face['landmark'])
-            face['attributes']['emotion'] = emotion
-            face['attributes']['confidence'] = confidence
+            emotion, confidence = predictor(image, face.landmarks)
+            face.emotion = emotion
+            face.emotion_confidence = confidence

-    return detections
+    return faces

||||
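Reviewer note: after this change `predict_attributes` writes results onto each `Face` in place and returns the same list, rather than filling a nested `'attributes'` dict. A reduced sketch of that contract, mirroring only the AgeGender branch; `Face` and `StubAgeGender` below are hypothetical stand-ins, and the stub returns fixed values instead of running a model:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Face:  # hypothetical minimal stand-in
    bbox: list
    gender: Optional[int] = None
    age: Optional[int] = None


class StubAgeGender:  # hypothetical predictor that returns fixed values
    def __call__(self, image, bbox):
        return {'gender': 1, 'age': 30}


def predict_attributes(image, faces, predictor):
    """Update each Face in place and return the same list object."""
    for face in faces:
        result = predictor(image, face.bbox)
        face.gender = result['gender']
        face.age = result['age']
    return faces


faces = [Face(bbox=[0, 0, 10, 10]), Face(bbox=[5, 5, 20, 20])]
updated = predict_attributes(None, faces, StubAgeGender())
```

Because the list is mutated in place, callers can ignore the return value; code that previously read `face['attributes']['age']` would read `face.age` instead. The diff also drops the derived `sex` entry from the written fields.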
@@ -2,7 +2,6 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo

-from typing import List, Optional, Tuple, Union

 import cv2
 import numpy as np
@@ -35,7 +34,7 @@ class AgeGender(Attribute):
     def __init__(
         self,
         model_name: AgeGenderWeights = AgeGenderWeights.DEFAULT,
-        input_size: Optional[Tuple[int, int]] = None,
+        input_size: tuple[int, int] | None = None,
     ) -> None:
         """
         Initializes the AgeGender prediction model.
@@ -81,7 +80,7 @@ class AgeGender(Attribute):
             )
             raise RuntimeError(f'Failed to initialize AgeGender model: {e}') from e

-    def preprocess(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> np.ndarray:
+    def preprocess(self, image: np.ndarray, bbox: list | np.ndarray) -> np.ndarray:
         """
         Aligns the face based on the bounding box and preprocesses it for inference.

@@ -127,7 +126,7 @@ class AgeGender(Attribute):
         age = int(np.round(prediction[2] * 100))
         return AttributeResult(gender=gender, age=age)

-    def predict(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> AttributeResult:
+    def predict(self, image: np.ndarray, bbox: list | np.ndarray) -> AttributeResult:
         """
         Predicts age and gender for a single face specified by a bounding box.


@@ -4,7 +4,7 @@
 
 from abc import ABC, abstractmethod
 from dataclasses import dataclass
-from typing import Any, Optional
+from typing import Any
 
 import numpy as np
 
@@ -38,7 +38,7 @@ class AttributeResult:
|
||||
25
|
||||
|
||||
>>> # FairFace result
|
||||
>>> result = AttributeResult(gender=0, age_group="20-29", race="East Asian")
|
||||
>>> result = AttributeResult(gender=0, age_group='20-29', race='East Asian')
|
||||
>>> result.sex
|
||||
'Female'
|
||||
>>> result.race
|
||||
@@ -46,9 +46,9 @@ class AttributeResult:
     """
 
     gender: int
-    age: Optional[int] = None
-    age_group: Optional[str] = None
-    race: Optional[str] = None
+    age: int | None = None
+    age_group: str | None = None
+    race: str | None = None
 
     @property
     def sex(self) -> str:
|
||||
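The `Optional[X]` to `X | None` rewrite in the hunk above changes only how the annotations are spelled, not how the dataclass behaves. A minimal sketch of the resulting shape follows. `AttributeResultSketch` is a hypothetical stand-in, not the library class, and mapping any non-zero `gender` to `'Male'` is an assumption (the diff only shows that `gender=0` yields `'Female'`).

```python
from __future__ import annotations  # lets `int | None` annotations parse on older interpreters

from dataclasses import dataclass


@dataclass
class AttributeResultSketch:
    """Hypothetical stand-in mirroring the fields shown in the diff."""

    gender: int
    age: int | None = None
    age_group: str | None = None
    race: str | None = None

    @property
    def sex(self) -> str:
        # Assumption: 0 -> 'Female' (as in the docstring example), anything else -> 'Male'.
        return 'Female' if self.gender == 0 else 'Male'


result = AttributeResultSketch(gender=0, age_group='20-29', race='East Asian')
print(result.sex)  # Female
print(result.age)  # None
```

Fields that a given model does not predict stay `None`, which is why every attribute except `gender` has a default.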
@@ -2,7 +2,6 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import List, Tuple, Union
 
 import cv2
 import numpy as np
@@ -29,7 +28,7 @@ class Emotion(Attribute):
     def __init__(
         self,
         model_weights: DDAMFNWeights = DDAMFNWeights.AFFECNET7,
-        input_size: Tuple[int, int] = (112, 112),
+        input_size: tuple[int, int] = (112, 112),
     ) -> None:
         """
         Initializes the emotion recognition model.
@@ -81,7 +80,7 @@ class Emotion(Attribute):
|
||||
Logger.error(f"Failed to load Emotion model from '{self.model_path}'", exc_info=True)
|
||||
raise RuntimeError(f'Failed to initialize Emotion model: {e}') from e
|
||||
|
||||
def preprocess(self, image: np.ndarray, landmark: Union[List, np.ndarray]) -> torch.Tensor:
|
||||
def preprocess(self, image: np.ndarray, landmark: list | np.ndarray) -> torch.Tensor:
|
||||
"""
|
||||
Aligns the face using landmarks and preprocesses it into a tensor.
|
||||
|
||||
@@ -106,7 +105,7 @@ class Emotion(Attribute):
|
||||
|
||||
return torch.from_numpy(transposed_image).unsqueeze(0).to(self.device)
|
||||
|
||||
def postprocess(self, prediction: torch.Tensor) -> Tuple[str, float]:
|
||||
def postprocess(self, prediction: torch.Tensor) -> tuple[str, float]:
|
||||
"""
|
||||
Processes the raw model output to get the emotion label and confidence score.
|
||||
"""
|
||||
@@ -116,7 +115,7 @@ class Emotion(Attribute):
|
||||
confidence = float(probabilities[pred_index])
|
||||
return emotion_label, confidence
|
||||
|
||||
def predict(self, image: np.ndarray, landmark: Union[List, np.ndarray]) -> Tuple[str, float]:
|
||||
def predict(self, image: np.ndarray, landmark: list | np.ndarray) -> tuple[str, float]:
|
||||
"""
|
||||
Predicts the emotion from a single face specified by its landmarks.
|
||||
"""
|
||||
|
||||
@@ -2,7 +2,6 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import List, Optional, Tuple, Union
 
 import cv2
 import numpy as np
@@ -13,7 +12,7 @@ from uniface.log import Logger
 from uniface.model_store import verify_model_weights
 from uniface.onnx_utils import create_onnx_session
 
-__all__ = ['FairFace', 'RACE_LABELS', 'AGE_LABELS']
+__all__ = ['AGE_LABELS', 'RACE_LABELS', 'FairFace']
 
 # Label definitions
 RACE_LABELS = [
@@ -49,7 +48,7 @@ class FairFace(Attribute):
     def __init__(
         self,
         model_name: FairFaceWeights = FairFaceWeights.DEFAULT,
-        input_size: Optional[Tuple[int, int]] = None,
+        input_size: tuple[int, int] | None = None,
     ) -> None:
         """
         Initializes the FairFace prediction model.
@@ -82,7 +81,7 @@ class FairFace(Attribute):
|
||||
)
|
||||
raise RuntimeError(f'Failed to initialize FairFace model: {e}') from e
|
||||
|
||||
def preprocess(self, image: np.ndarray, bbox: Optional[Union[List, np.ndarray]] = None) -> np.ndarray:
|
||||
def preprocess(self, image: np.ndarray, bbox: list | np.ndarray | None = None) -> np.ndarray:
|
||||
"""
|
||||
Preprocesses the face image for inference.
|
||||
|
||||
@@ -130,7 +129,7 @@ class FairFace(Attribute):
|
||||
|
||||
return image
|
||||
|
||||
def postprocess(self, prediction: Tuple[np.ndarray, np.ndarray, np.ndarray]) -> AttributeResult:
|
||||
def postprocess(self, prediction: tuple[np.ndarray, np.ndarray, np.ndarray]) -> AttributeResult:
|
||||
"""
|
||||
Processes the raw model output to extract race, gender, and age.
|
||||
|
||||
@@ -162,7 +161,7 @@ class FairFace(Attribute):
|
||||
race=RACE_LABELS[race_idx],
|
||||
)
|
||||
|
||||
def predict(self, image: np.ndarray, bbox: Optional[Union[List, np.ndarray]] = None) -> AttributeResult:
|
||||
def predict(self, image: np.ndarray, bbox: list | np.ndarray | None = None) -> AttributeResult:
|
||||
"""
|
||||
Predicts race, gender, and age for a face.
|
||||
|
||||
|
||||
@@ -2,34 +2,42 @@
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import itertools
|
||||
import math
|
||||
from typing import List, Optional, Tuple
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
__all__ = [
|
||||
'resize_image',
|
||||
'generate_anchors',
|
||||
'non_max_suppression',
|
||||
'decode_boxes',
|
||||
'decode_landmarks',
|
||||
'distance2bbox',
|
||||
'distance2kps',
|
||||
'generate_anchors',
|
||||
'non_max_suppression',
|
||||
'resize_image',
|
||||
]
|
||||
|
||||
|
||||
def resize_image(frame, target_shape: Tuple[int, int] = (640, 640)) -> Tuple[np.ndarray, float]:
|
||||
"""
|
||||
Resize an image to fit within a target shape while keeping its aspect ratio.
|
||||
def resize_image(
|
||||
frame: np.ndarray,
|
||||
target_shape: tuple[int, int] = (640, 640),
|
||||
) -> tuple[np.ndarray, float]:
|
||||
"""Resize an image to fit within a target shape while keeping its aspect ratio.
|
||||
|
||||
The image is resized to fit within the target dimensions and placed on a
|
||||
blank canvas (zero-padded to target size).
|
||||
|
||||
Args:
|
||||
frame (np.ndarray): Input image.
|
||||
target_shape (Tuple[int, int]): Target size (width, height). Defaults to (640, 640).
|
||||
frame: Input image with shape (H, W, C).
|
||||
target_shape: Target size as (width, height). Defaults to (640, 640).
|
||||
|
||||
Returns:
|
||||
Tuple[np.ndarray, float]: Resized image on a blank canvas and the resize factor.
|
||||
A tuple containing:
|
||||
- Resized image on a blank canvas with shape (height, width, 3).
|
||||
- The resize factor as a float.
|
||||
"""
|
||||
width, height = target_shape
|
||||
|
||||
@@ -53,16 +61,16 @@ def resize_image(frame, target_shape: Tuple[int, int] = (640, 640)) -> Tuple[np.
|
||||
return image, resize_factor
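The `resize_image` docstring describes an aspect-preserving resize onto a zero-padded canvas. The sketch below illustrates that idea with NumPy only. It swaps in nearest-neighbour sampling for whatever interpolation the library uses, and pasting the image at the top-left corner is an assumption. The name `resize_image_sketch` is hypothetical.

```python
import numpy as np


def resize_image_sketch(frame: np.ndarray, target_shape=(640, 640)):
    """Fit `frame` (H, W, 3) inside `target_shape` (width, height), zero-padding the rest."""
    width, height = target_shape
    src_h, src_w = frame.shape[:2]

    # One scale factor for both axes keeps the aspect ratio.
    resize_factor = min(width / src_w, height / src_h)
    new_w = max(1, int(src_w * resize_factor))
    new_h = max(1, int(src_h * resize_factor))

    # Nearest-neighbour sampling: map each output pixel back to a source pixel.
    rows = np.minimum((np.arange(new_h) / resize_factor).astype(int), src_h - 1)
    cols = np.minimum((np.arange(new_w) / resize_factor).astype(int), src_w - 1)
    resized = frame[rows][:, cols]

    # Assumption: the resized image is pasted at the top-left of a black canvas.
    canvas = np.zeros((height, width, 3), dtype=frame.dtype)
    canvas[:new_h, :new_w] = resized
    return canvas, resize_factor


canvas, factor = resize_image_sketch(np.ones((50, 100, 3), dtype=np.uint8), (200, 200))
print(canvas.shape, factor)  # (200, 200, 3) 2.0
```

Detections computed on the canvas can be mapped back to the original image by dividing coordinates by the returned factor.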
|
||||
|
||||
|
||||
def generate_anchors(image_size: Tuple[int, int] = (640, 640)) -> np.ndarray:
|
||||
"""
|
||||
Generate anchor boxes for a given image size (RetinaFace specific).
|
||||
def generate_anchors(image_size: tuple[int, int] = (640, 640)) -> np.ndarray:
|
||||
"""Generate anchor boxes for a given image size (RetinaFace specific).
|
||||
|
||||
Args:
|
||||
image_size (Tuple[int, int]): Input image size (width, height). Defaults to (640, 640).
|
||||
image_size: Input image size as (width, height). Defaults to (640, 640).
|
||||
|
||||
Returns:
|
||||
np.ndarray: Anchor box coordinates as a NumPy array with shape (num_anchors, 4).
|
||||
Anchor box coordinates as a numpy array with shape (num_anchors, 4).
|
||||
"""
|
||||
# RetinaFace FPN strides and corresponding anchor sizes per level
|
||||
steps = [8, 16, 32]
|
||||
min_sizes = [[16, 32], [64, 128], [256, 512]]
|
||||
|
||||
@@ -85,16 +93,15 @@ def generate_anchors(image_size: Tuple[int, int] = (640, 640)) -> np.ndarray:
|
||||
return output
|
||||
|
||||
|
||||
def non_max_suppression(dets: np.ndarray, threshold: float) -> List[int]:
|
||||
"""
|
||||
Apply Non-Maximum Suppression (NMS) to reduce overlapping bounding boxes based on a threshold.
|
||||
def non_max_suppression(dets: np.ndarray, threshold: float) -> list[int]:
|
||||
"""Apply Non-Maximum Suppression (NMS) to reduce overlapping bounding boxes.
|
||||
|
||||
Args:
|
||||
dets (np.ndarray): Array of detections with each row as [x1, y1, x2, y2, score].
|
||||
threshold (float): IoU threshold for suppression.
|
||||
dets: Array of detections with each row as [x1, y1, x2, y2, score].
|
||||
threshold: IoU threshold for suppression.
|
||||
|
||||
Returns:
|
||||
List[int]: Indices of bounding boxes retained after suppression.
|
||||
Indices of bounding boxes retained after suppression.
|
||||
"""
|
||||
x1 = dets[:, 0]
|
||||
y1 = dets[:, 1]
|
||||
@@ -125,18 +132,22 @@ def non_max_suppression(dets: np.ndarray, threshold: float) -> List[int]:
|
||||
return keep
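For readers unfamiliar with the routine whose signature is shown here, the following is a minimal sketch of greedy IoU-based NMS over rows of `[x1, y1, x2, y2, score]`. It is not the library's implementation. In particular, box areas are computed without the `+ 1` pixel convention some implementations use, and `nms_sketch` is a hypothetical name.

```python
from __future__ import annotations

import numpy as np


def nms_sketch(dets: np.ndarray, threshold: float) -> list[int]:
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it too much, repeat."""
    x1, y1, x2, y2, scores = (dets[:, i] for i in range(5))
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # indices sorted by descending score

    keep: list[int] = []
    while order.size > 0:
        i = int(order[0])
        keep.append(i)
        rest = order[1:]

        # Intersection of the kept box with every remaining box.
        inter_w = np.maximum(0.0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]))
        inter_h = np.maximum(0.0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]))
        inter = inter_w * inter_h
        iou = inter / (areas[i] + areas[rest] - inter)

        # Only boxes at or below the IoU threshold survive to the next round.
        order = rest[iou <= threshold]
    return keep


dets = np.array([
    [0, 0, 10, 10, 0.9],
    [1, 1, 11, 11, 0.8],    # IoU with the first box is 81 / 119, about 0.68
    [20, 20, 30, 30, 0.7],  # no overlap with the others
])
print(nms_sketch(dets, threshold=0.4))  # [0, 2]
```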
|
||||
|
||||
|
||||
def decode_boxes(loc: np.ndarray, priors: np.ndarray, variances: Optional[List[float]] = None) -> np.ndarray:
|
||||
"""
|
||||
Decode locations from predictions using priors to undo
|
||||
the encoding done for offset regression at train time (RetinaFace specific).
|
||||
def decode_boxes(
|
||||
loc: np.ndarray,
|
||||
priors: np.ndarray,
|
||||
variances: list[float] | None = None,
|
||||
) -> np.ndarray:
|
||||
"""Decode locations from predictions using priors (RetinaFace specific).
|
||||
|
||||
Undoes the encoding done for offset regression at train time.
|
||||
|
||||
Args:
|
||||
loc (np.ndarray): Location predictions for loc layers, shape: [num_priors, 4]
|
||||
priors (np.ndarray): Prior boxes in center-offset form, shape: [num_priors, 4]
|
||||
variances (Optional[List[float]]): Variances of prior boxes. Defaults to [0.1, 0.2].
|
||||
loc: Location predictions for loc layers, shape: [num_priors, 4].
|
||||
priors: Prior boxes in center-offset form, shape: [num_priors, 4].
|
||||
variances: Variances of prior boxes. Defaults to [0.1, 0.2].
|
||||
|
||||
Returns:
|
||||
np.ndarray: Decoded bounding box predictions with shape [num_priors, 4]
|
||||
Decoded bounding box predictions with shape [num_priors, 4].
|
||||
"""
|
||||
if variances is None:
|
||||
variances = [0.1, 0.2]
|
||||
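The `decode_boxes` docstring refers to undoing the offset encoding applied at train time. The sketch below shows the standard SSD-style decoding that description usually denotes, with priors in `(cx, cy, w, h)` form. Returning corner-form `[x1, y1, x2, y2]` boxes is an assumption, since the diff states only the output shape, and `decode_boxes_sketch` is a hypothetical name.

```python
from __future__ import annotations

import numpy as np


def decode_boxes_sketch(
    loc: np.ndarray,
    priors: np.ndarray,
    variances: list[float] | None = None,
) -> np.ndarray:
    """Decode [num_priors, 4] offsets against (cx, cy, w, h) priors."""
    if variances is None:
        variances = [0.1, 0.2]

    # Centre offsets are scaled by the prior size; sizes are predicted in log space.
    centers = priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:]
    sizes = priors[:, 2:] * np.exp(loc[:, 2:] * variances[1])

    # Assumption: output is corner form [x1, y1, x2, y2].
    top_left = centers - sizes / 2
    return np.concatenate([top_left, top_left + sizes], axis=1)


priors = np.array([[0.5, 0.5, 0.2, 0.2]])
print(decode_boxes_sketch(np.zeros((1, 4)), priors))  # zero offsets return the prior itself as corners
```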
@@ -155,18 +166,19 @@ def decode_boxes(loc: np.ndarray, priors: np.ndarray, variances: Optional[List[f
|
||||
|
||||
|
||||
def decode_landmarks(
|
||||
predictions: np.ndarray, priors: np.ndarray, variances: Optional[List[float]] = None
|
||||
predictions: np.ndarray,
|
||||
priors: np.ndarray,
|
||||
variances: list[float] | None = None,
|
||||
) -> np.ndarray:
|
||||
"""
|
||||
Decode landmark predictions using prior boxes (RetinaFace specific).
|
||||
"""Decode landmark predictions using prior boxes (RetinaFace specific).
|
||||
|
||||
Args:
|
||||
predictions (np.ndarray): Landmark predictions, shape: [num_priors, 10]
|
||||
priors (np.ndarray): Prior boxes, shape: [num_priors, 4]
|
||||
variances (Optional[List[float]]): Scaling factors for landmark offsets. Defaults to [0.1, 0.2].
|
||||
predictions: Landmark predictions, shape: [num_priors, 10].
|
||||
priors: Prior boxes, shape: [num_priors, 4].
|
||||
variances: Scaling factors for landmark offsets. Defaults to [0.1, 0.2].
|
||||
|
||||
Returns:
|
||||
np.ndarray: Decoded landmarks, shape: [num_priors, 10]
|
||||
Decoded landmarks, shape: [num_priors, 10].
|
||||
"""
|
||||
if variances is None:
|
||||
variances = [0.1, 0.2]
|
||||
@@ -187,18 +199,21 @@ def decode_landmarks(
|
||||
return landmarks
|
||||
|
||||
|
||||
def distance2bbox(points: np.ndarray, distance: np.ndarray, max_shape: Optional[Tuple[int, int]] = None) -> np.ndarray:
|
||||
"""
|
||||
Decode distance prediction to bounding box (SCRFD specific).
|
||||
def distance2bbox(
|
||||
points: np.ndarray,
|
||||
distance: np.ndarray,
|
||||
max_shape: tuple[int, int] | None = None,
|
||||
) -> np.ndarray:
|
||||
"""Decode distance prediction to bounding box (SCRFD specific).
|
||||
|
||||
Args:
|
||||
points (np.ndarray): Anchor points with shape (n, 2), [x, y].
|
||||
distance (np.ndarray): Distance from the given point to 4
|
||||
boundaries (left, top, right, bottom) with shape (n, 4).
|
||||
max_shape (Optional[Tuple[int, int]]): Shape of the image (height, width) for clipping.
|
||||
points: Anchor points with shape (n, 2), [x, y].
|
||||
distance: Distance from the given point to 4 boundaries
|
||||
(left, top, right, bottom) with shape (n, 4).
|
||||
max_shape: Shape of the image (height, width) for clipping.
|
||||
|
||||
Returns:
|
||||
np.ndarray: Decoded bounding boxes with shape (n, 4) as [x1, y1, x2, y2].
|
||||
Decoded bounding boxes with shape (n, 4) as [x1, y1, x2, y2].
|
||||
"""
|
||||
x1 = points[:, 0] - distance[:, 0]
|
||||
y1 = points[:, 1] - distance[:, 1]
|
||||
@@ -219,17 +234,20 @@ def distance2bbox(points: np.ndarray, distance: np.ndarray, max_shape: Optional[
|
||||
return np.stack([x1, y1, x2, y2], axis=-1)
|
||||
|
||||
|
||||
def distance2kps(points: np.ndarray, distance: np.ndarray, max_shape: Optional[Tuple[int, int]] = None) -> np.ndarray:
|
||||
"""
|
||||
Decode distance prediction to keypoints (SCRFD specific).
|
||||
def distance2kps(
|
||||
points: np.ndarray,
|
||||
distance: np.ndarray,
|
||||
max_shape: tuple[int, int] | None = None,
|
||||
) -> np.ndarray:
|
||||
"""Decode distance prediction to keypoints (SCRFD specific).
|
||||
|
||||
Args:
|
||||
points (np.ndarray): Anchor points with shape (n, 2), [x, y].
|
||||
distance (np.ndarray): Distance from the given point to keypoints with shape (n, 2k).
|
||||
max_shape (Optional[Tuple[int, int]]): Shape of the image (height, width) for clipping.
|
||||
points: Anchor points with shape (n, 2), [x, y].
|
||||
distance: Distance from the given point to keypoints with shape (n, 2k).
|
||||
max_shape: Shape of the image (height, width) for clipping.
|
||||
|
||||
Returns:
|
||||
np.ndarray: Decoded keypoints with shape (n, 2k).
|
||||
Decoded keypoints with shape (n, 2k).
|
||||
"""
|
||||
preds = []
|
||||
for i in range(0, distance.shape[1], 2):
|
||||
|
||||
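The SCRFD helpers documented above decode per-anchor distances into boxes. A minimal sketch of `distance2bbox` follows. The `x1`/`y1` lines and the final `np.stack` match the context lines visible in the diff, while the `x2`/`y2` lines and the clipping behaviour for `max_shape` are assumptions based on the docstring. The name `distance2bbox_sketch` is hypothetical.

```python
from __future__ import annotations

import numpy as np


def distance2bbox_sketch(
    points: np.ndarray,
    distance: np.ndarray,
    max_shape: tuple[int, int] | None = None,
) -> np.ndarray:
    """Turn (left, top, right, bottom) distances from anchor points into [x1, y1, x2, y2]."""
    x1 = points[:, 0] - distance[:, 0]
    y1 = points[:, 1] - distance[:, 1]
    x2 = points[:, 0] + distance[:, 2]
    y2 = points[:, 1] + distance[:, 3]

    if max_shape is not None:
        # Assumption: max_shape is (height, width) and boxes are clipped to the image.
        height, width = max_shape
        x1, x2 = np.clip(x1, 0, width), np.clip(x2, 0, width)
        y1, y2 = np.clip(y1, 0, height), np.clip(y2, 0, height)

    return np.stack([x1, y1, x2, y2], axis=-1)


points = np.array([[100.0, 100.0]])
distance = np.array([[10.0, 20.0, 30.0, 40.0]])
print(distance2bbox_sketch(points, distance))              # one box: 90, 80, 130, 140
print(distance2bbox_sketch(points, distance, (120, 120)))  # right and bottom edges clipped to 120
```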
@@ -3,7 +3,6 @@
 # GitHub: https://github.com/yakhyo
 
 from enum import Enum
-from typing import Dict
 
 
 # fmt: off
@@ -142,7 +141,7 @@ class MiniFASNetWeights(str, Enum):
     V2 = "minifasnet_v2"
 
 
-MODEL_URLS: Dict[Enum, str] = {
+MODEL_URLS: dict[Enum, str] = {
     # RetinaFace
     RetinaFaceWeights.MNET_025: 'https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv1_0.25.onnx',
     RetinaFaceWeights.MNET_050: 'https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv1_0.50.onnx',
@@ -191,7 +190,7 @@ MODEL_URLS: Dict[Enum, str] = {
     MiniFASNetWeights.V2: 'https://github.com/yakhyo/face-anti-spoofing/releases/download/weights/MiniFASNetV2.onnx',
 }
 
-MODEL_SHA256: Dict[Enum, str] = {
+MODEL_SHA256: dict[Enum, str] = {
     # RetinaFace
     RetinaFaceWeights.MNET_025: 'b7a7acab55e104dce6f32cdfff929bd83946da5cd869b9e2e9bdffafd1b7e4a5',
     RetinaFaceWeights.MNET_050: 'd8977186f6037999af5b4113d42ba77a84a6ab0c996b17c713cc3d53b88bfc37',
|
||||
@@ -2,8 +2,9 @@
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Dict, List
|
||||
from typing import Any
|
||||
|
||||
import numpy as np
|
||||
|
||||
@@ -14,37 +15,40 @@ from .retinaface import RetinaFace
|
||||
from .scrfd import SCRFD
|
||||
from .yolov5 import YOLOv5Face
|
||||
|
||||
# Global cache for detector instances
|
||||
_detector_cache: Dict[str, BaseDetector] = {}
|
||||
# Global cache for detector instances (keyed by method name + config hash)
|
||||
_detector_cache: dict[str, BaseDetector] = {}
|
||||
|
||||
|
||||
def detect_faces(image: np.ndarray, method: str = 'retinaface', **kwargs) -> List[Face]:
|
||||
"""
|
||||
High-level face detection function.
|
||||
def detect_faces(image: np.ndarray, method: str = 'retinaface', **kwargs: Any) -> list[Face]:
|
||||
"""High-level face detection function.
|
||||
|
||||
Detects faces in an image using the specified detection method.
|
||||
Results are cached for repeated calls with the same configuration.
|
||||
|
||||
Args:
|
||||
image (np.ndarray): Input image as numpy array.
|
||||
method (str): Detection method to use. Options: 'retinaface', 'scrfd', 'yolov5face'.
|
||||
image: Input image as numpy array with shape (H, W, C) in BGR format.
|
||||
method: Detection method to use. Options: 'retinaface', 'scrfd', 'yolov5face'.
|
||||
**kwargs: Additional arguments passed to the detector.
|
||||
|
||||
Returns:
|
||||
List[Face]: A list of Face objects, each containing:
|
||||
- bbox (np.ndarray): [x1, y1, x2, y2] bounding box coordinates.
|
||||
- confidence (float): The confidence score of the detection.
|
||||
- landmarks (np.ndarray): 5-point facial landmarks with shape (5, 2).
|
||||
A list of Face objects, each containing:
|
||||
- bbox: [x1, y1, x2, y2] bounding box coordinates.
|
||||
- confidence: The confidence score of the detection.
|
||||
- landmarks: 5-point facial landmarks with shape (5, 2).
|
||||
|
||||
Example:
|
||||
>>> from uniface import detect_faces
|
||||
>>> image = cv2.imread("your_image.jpg")
|
||||
>>> faces = detect_faces(image, method='retinaface', conf_thresh=0.8)
|
||||
>>> import cv2
|
||||
>>> image = cv2.imread('your_image.jpg')
|
||||
>>> faces = detect_faces(image, method='retinaface', confidence_threshold=0.8)
|
||||
>>> for face in faces:
|
||||
... print(f"Found face with confidence: {face.confidence}")
|
||||
... print(f"BBox: {face.bbox}")
|
||||
... print(f'Found face with confidence: {face.confidence}')
|
||||
... print(f'BBox: {face.bbox}')
|
||||
"""
|
||||
method_name = method.lower()
|
||||
|
||||
sorted_kwargs = sorted(kwargs.items())
|
||||
cache_key = f'{method_name}_{str(sorted_kwargs)}'
|
||||
cache_key = f'{method_name}_{sorted_kwargs!s}'
|
||||
|
||||
if cache_key not in _detector_cache:
|
||||
# Pass kwargs to create the correctly configured detector
|
||||
@@ -54,49 +58,36 @@ def detect_faces(image: np.ndarray, method: str = 'retinaface', **kwargs) -> Lis
|
||||
return detector.detect(image)
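The cache key in `detect_faces` is built from the lower-cased method name plus the sorted keyword arguments, so the same configuration maps to the same cached detector instance regardless of argument order. The sketch below isolates that behaviour. `make_cache_key`, `get_or_create`, and the `factory` parameter are hypothetical names introduced for illustration, not part of the library.

```python
from __future__ import annotations

from typing import Any, Callable

_cache: dict[str, Any] = {}


def make_cache_key(method: str, **kwargs: Any) -> str:
    """Build the key the same way the diff does: method name + sorted kwargs."""
    method_name = method.lower()
    sorted_kwargs = sorted(kwargs.items())
    return f'{method_name}_{sorted_kwargs!s}'


def get_or_create(method: str, factory: Callable[..., Any], **kwargs: Any) -> Any:
    """Return the cached object for this configuration, creating it on first use."""
    key = make_cache_key(method, **kwargs)
    if key not in _cache:
        _cache[key] = factory(**kwargs)
    return _cache[key]


key_a = make_cache_key('RetinaFace', confidence_threshold=0.8, nms_threshold=0.4)
key_b = make_cache_key('retinaface', nms_threshold=0.4, confidence_threshold=0.8)
print(key_a == key_b)  # True
print(key_a)           # retinaface_[('confidence_threshold', 0.8), ('nms_threshold', 0.4)]
```

Because the key relies on the string form of each value, two configurations are treated as identical only when their keyword values print identically.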
|
||||
|
||||
|
||||
def create_detector(method: str = 'retinaface', **kwargs) -> BaseDetector:
|
||||
"""
|
||||
Factory function to create face detectors.
|
||||
def create_detector(method: str = 'retinaface', **kwargs: Any) -> BaseDetector:
|
||||
"""Factory function to create face detectors.
|
||||
|
||||
Args:
|
||||
method (str): Detection method. Options:
|
||||
method: Detection method. Options:
|
||||
- 'retinaface': RetinaFace detector (default)
|
||||
- 'scrfd': SCRFD detector (fast and accurate)
|
||||
- 'yolov5face': YOLOv5-Face detector (accurate with landmarks)
|
||||
**kwargs: Detector-specific parameters
|
||||
**kwargs: Detector-specific parameters.
|
||||
|
||||
Returns:
|
||||
BaseDetector: Initialized detector instance
|
||||
Initialized detector instance.
|
||||
|
||||
Raises:
|
||||
ValueError: If method is not supported
|
||||
ValueError: If method is not supported.
|
||||
|
||||
Examples:
|
||||
Example:
|
||||
>>> # Basic usage
|
||||
>>> detector = create_detector('retinaface')
|
||||
|
||||
>>> # SCRFD detector with custom parameters
|
||||
>>> from uniface.constants import SCRFDWeights
|
||||
>>> detector = create_detector(
|
||||
... 'scrfd',
|
||||
... model_name=SCRFDWeights.SCRFD_10G_KPS,
|
||||
... conf_thresh=0.8,
|
||||
... input_size=(640, 640)
|
||||
... 'scrfd', model_name=SCRFDWeights.SCRFD_10G_KPS, confidence_threshold=0.8, input_size=(640, 640)
|
||||
... )
|
||||
|
||||
>>> # RetinaFace detector
|
||||
>>> from uniface.constants import RetinaFaceWeights
|
||||
>>> detector = create_detector(
|
||||
... 'retinaface',
|
||||
... model_name=RetinaFaceWeights.MNET_V2,
|
||||
... conf_thresh=0.8,
|
||||
... nms_thresh=0.4
|
||||
... )
|
||||
|
||||
>>> # YOLOv5-Face detector
|
||||
>>> detector = create_detector(
|
||||
... 'yolov5face',
|
||||
... model_name=YOLOv5FaceWeights.YOLOV5S,
|
||||
... conf_thresh=0.25,
|
||||
... nms_thresh=0.45
|
||||
... 'retinaface', model_name=RetinaFaceWeights.MNET_V2, confidence_threshold=0.8, nms_threshold=0.4
|
||||
... )
|
||||
"""
|
||||
method = method.lower()
|
||||
@@ -115,12 +106,12 @@ def create_detector(method: str = 'retinaface', **kwargs) -> BaseDetector:
|
||||
raise ValueError(f"Unsupported detection method: '{method}'. Available methods: {available_methods}")
|
||||
|
||||
|
||||
def list_available_detectors() -> Dict[str, Dict[str, Any]]:
|
||||
"""
|
||||
List all available detection methods with their descriptions and parameters.
|
||||
def list_available_detectors() -> dict[str, dict[str, Any]]:
|
||||
"""List all available detection methods with their descriptions and parameters.
|
||||
|
||||
Returns:
|
||||
Dict[str, Dict[str, Any]]: Dictionary of detector information
|
||||
Dictionary mapping detector names to their information including
|
||||
description, landmark support, paper reference, and default parameters.
|
||||
"""
|
||||
return {
|
||||
'retinaface': {
|
||||
@@ -129,8 +120,8 @@ def list_available_detectors() -> Dict[str, Dict[str, Any]]:
|
||||
'paper': 'https://arxiv.org/abs/1905.00641',
|
||||
'default_params': {
|
||||
'model_name': 'mnet_v2',
|
||||
'conf_thresh': 0.5,
|
||||
'nms_thresh': 0.4,
|
||||
'confidence_threshold': 0.5,
|
||||
'nms_threshold': 0.4,
|
||||
'input_size': (640, 640),
|
||||
},
|
||||
},
|
||||
@@ -140,8 +131,8 @@ def list_available_detectors() -> Dict[str, Dict[str, Any]]:
|
||||
'paper': 'https://arxiv.org/abs/2105.04714',
|
||||
'default_params': {
|
||||
'model_name': 'scrfd_10g_kps',
|
||||
'conf_thresh': 0.5,
|
||||
'nms_thresh': 0.4,
|
||||
'confidence_threshold': 0.5,
|
||||
'nms_threshold': 0.4,
|
||||
'input_size': (640, 640),
|
||||
},
|
||||
},
|
||||
@@ -151,8 +142,8 @@ def list_available_detectors() -> Dict[str, Dict[str, Any]]:
|
||||
'paper': 'https://arxiv.org/abs/2105.12931',
|
||||
'default_params': {
|
||||
'model_name': 'yolov5s_face',
|
||||
'conf_thresh': 0.25,
|
||||
'nms_thresh': 0.45,
|
||||
'confidence_threshold': 0.25,
|
||||
'nms_threshold': 0.45,
|
||||
'input_size': 640,
|
||||
},
|
||||
},
|
||||
@@ -160,11 +151,11 @@ def list_available_detectors() -> Dict[str, Dict[str, Any]]:
|
||||
|
||||
|
||||
__all__ = [
|
||||
'detect_faces',
|
||||
'create_detector',
|
||||
'list_available_detectors',
|
||||
'SCRFD',
|
||||
'BaseDetector',
|
||||
'RetinaFace',
|
||||
'YOLOv5Face',
|
||||
'BaseDetector',
|
||||
'create_detector',
|
||||
'detect_faces',
|
||||
'list_available_detectors',
|
||||
]
|
||||
|
||||
@@ -2,40 +2,51 @@
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from abc import ABC, abstractmethod
|
||||
from typing import Any, Dict, List
|
||||
from typing import Any
|
||||
|
||||
import numpy as np
|
||||
|
||||
from uniface.face import Face
|
||||
|
||||
__all__ = ['BaseDetector']
|
||||
|
||||
|
||||
class BaseDetector(ABC):
|
||||
"""
|
||||
Abstract base class for all face detectors.
|
||||
"""Abstract base class for all face detectors.
|
||||
|
||||
This class defines the interface that all face detectors must implement,
|
||||
ensuring consistency across different detection methods.
|
||||
|
||||
Attributes:
|
||||
config: Dictionary containing detector configuration parameters.
|
||||
_supports_landmarks: Flag indicating if detector supports landmark detection.
|
||||
"""
|
||||
|
||||
def __init__(self, **kwargs):
|
||||
"""Initialize the detector with configuration parameters."""
|
||||
self.config = kwargs
|
||||
|
||||
@abstractmethod
|
||||
def detect(self, image: np.ndarray, **kwargs) -> List[Face]:
|
||||
"""
|
||||
Detect faces in an image.
|
||||
def __init__(self, **kwargs: Any) -> None:
|
||||
"""Initialize the detector with configuration parameters.
|
||||
|
||||
Args:
|
||||
image (np.ndarray): Input image as numpy array with shape (H, W, C)
|
||||
**kwargs: Additional detection parameters
|
||||
**kwargs: Detector-specific configuration parameters.
|
||||
"""
|
||||
self.config: dict[str, Any] = kwargs
|
||||
self._supports_landmarks: bool = False
|
||||
|
||||
@abstractmethod
|
||||
def detect(self, image: np.ndarray, **kwargs: Any) -> list[Face]:
|
||||
"""Detect faces in an image.
|
||||
|
||||
Args:
|
||||
image: Input image as numpy array with shape (H, W, C) in BGR format.
|
||||
**kwargs: Additional detection parameters.
|
||||
|
||||
Returns:
|
||||
List[Face]: List of detected Face objects, each containing:
|
||||
- bbox (np.ndarray): Bounding box coordinates with shape (4,) as [x1, y1, x2, y2]
|
||||
- confidence (float): Detection confidence score (0.0 to 1.0)
|
||||
- landmarks (np.ndarray): Facial landmarks with shape (5, 2) for 5-point landmarks
|
||||
List of detected Face objects, each containing:
|
||||
- bbox: Bounding box coordinates with shape (4,) as [x1, y1, x2, y2].
|
||||
- confidence: Detection confidence score (0.0 to 1.0).
|
||||
- landmarks: Facial landmarks with shape (5, 2) for 5-point landmarks.
|
||||
|
||||
Example:
|
||||
>>> faces = detector.detect(image)
|
||||
@@ -44,34 +55,29 @@ class BaseDetector(ABC):
|
||||
... confidence = face.confidence # float
|
||||
... landmarks = face.landmarks # np.ndarray with shape (5, 2)
|
||||
"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def preprocess(self, image: np.ndarray) -> np.ndarray:
|
||||
"""
|
||||
Preprocess input image for detection.
|
||||
"""Preprocess input image for detection.
|
||||
|
||||
Args:
|
||||
image (np.ndarray): Input image
|
||||
image: Input image with shape (H, W, C).
|
||||
|
||||
Returns:
|
||||
np.ndarray: Preprocessed image tensor
|
||||
Preprocessed image tensor ready for inference.
|
||||
"""
|
||||
pass
|
||||
|
||||
@abstractmethod
|
||||
def postprocess(self, outputs, **kwargs) -> Any:
|
||||
"""
|
||||
Postprocess model outputs to get final detections.
|
||||
def postprocess(self, outputs: Any, **kwargs: Any) -> Any:
|
||||
"""Postprocess model outputs to get final detections.
|
||||
|
||||
Args:
|
||||
outputs: Raw model outputs
|
||||
**kwargs: Additional postprocessing parameters
|
||||
outputs: Raw model outputs.
|
||||
**kwargs: Additional postprocessing parameters.
|
||||
|
||||
Returns:
|
||||
Any: Processed outputs (implementation-specific format, typically tuple of arrays)
|
||||
Processed outputs (implementation-specific format, typically tuple of arrays).
|
||||
"""
|
||||
pass
|
||||
|
||||
def __str__(self) -> str:
|
||||
"""String representation of the detector."""
|
||||
@@ -83,20 +89,18 @@ class BaseDetector(ABC):
|
||||
|
||||
@property
|
||||
def supports_landmarks(self) -> bool:
|
||||
"""
|
||||
Whether this detector supports landmark detection.
|
||||
"""Whether this detector supports landmark detection.
|
||||
|
||||
Returns:
|
||||
bool: True if landmarks are supported, False otherwise
|
||||
True if landmarks are supported, False otherwise.
|
||||
"""
|
||||
return hasattr(self, '_supports_landmarks') and self._supports_landmarks
|
||||
|
||||
def get_info(self) -> Dict[str, Any]:
|
||||
"""
|
||||
Get detector information and configuration.
|
||||
def get_info(self) -> dict[str, Any]:
|
||||
"""Get detector information and configuration.
|
||||
|
||||
Returns:
|
||||
Dict[str, Any]: Detector information
|
||||
Dictionary containing detector name, landmark support, and config.
|
||||
"""
|
||||
return {
|
||||
'name': self.__class__.__name__,
|
||||
|
||||
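The `BaseDetector` changes above set `_supports_landmarks` in `__init__` and keep `detect` abstract. The sketch below shows how a subclass plugs into such an interface. It is a simplified stand-in with only one abstract method, plain dictionaries instead of `Face` objects, and assumed `get_info` keys beyond `'name'`. `BaseDetectorSketch` and `WholeImageDetector` are hypothetical names.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any

import numpy as np


class BaseDetectorSketch(ABC):
    """Simplified stand-in for the abstract detector interface."""

    def __init__(self, **kwargs: Any) -> None:
        self.config: dict[str, Any] = kwargs
        self._supports_landmarks: bool = False

    @abstractmethod
    def detect(self, image: np.ndarray, **kwargs: Any) -> list[dict[str, Any]]:
        """Return one record per detected face."""

    @property
    def supports_landmarks(self) -> bool:
        return self._supports_landmarks

    def get_info(self) -> dict[str, Any]:
        # Assumption: key names other than 'name' are illustrative.
        return {
            'name': self.__class__.__name__,
            'supports_landmarks': self.supports_landmarks,
            'config': self.config,
        }


class WholeImageDetector(BaseDetectorSketch):
    """Toy detector that reports the full frame as a single face."""

    def detect(self, image: np.ndarray, **kwargs: Any) -> list[dict[str, Any]]:
        height, width = image.shape[:2]
        return [{'bbox': np.array([0, 0, width, height]), 'confidence': 1.0}]


detector = WholeImageDetector(confidence_threshold=0.5)
print(detector.get_info()['name'])                      # WholeImageDetector
print(detector.detect(np.zeros((4, 6, 3)))[0]['bbox'])  # [0 0 6 4]
```

Instantiating `BaseDetectorSketch` directly raises `TypeError`, which is what forces every concrete detector to implement `detect`.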
@@ -2,7 +2,9 @@
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
from typing import Any, List, Literal, Tuple
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import Any, Literal
|
||||
|
||||
import numpy as np
|
||||
|
||||
@@ -32,8 +34,8 @@ class RetinaFace(BaseDetector):
|
||||
|
||||
Args:
|
||||
model_name (RetinaFaceWeights): Model weights to use. Defaults to `RetinaFaceWeights.MNET_V2`.
|
||||
conf_thresh (float): Confidence threshold for filtering detections. Defaults to 0.5.
|
||||
nms_thresh (float): Non-maximum suppression (NMS) IoU threshold. Defaults to 0.4.
|
||||
confidence_threshold (float): Confidence threshold for filtering detections. Defaults to 0.5.
|
||||
nms_threshold (float): Non-maximum suppression (NMS) IoU threshold. Defaults to 0.4.
|
||||
input_size (Tuple[int, int]): Fixed input size (width, height) if `dynamic_size=False`.
|
||||
Defaults to (640, 640).
|
||||
Note: Non-default sizes may cause slower inference and CoreML compatibility issues.
|
||||
@@ -44,8 +46,8 @@ class RetinaFace(BaseDetector):
|
||||
|
||||
Attributes:
|
||||
model_name (RetinaFaceWeights): Selected model variant.
|
||||
conf_thresh (float): Threshold for confidence-based filtering.
|
||||
nms_thresh (float): IoU threshold used for NMS.
|
||||
confidence_threshold (float): Threshold for confidence-based filtering.
|
||||
nms_threshold (float): IoU threshold used for NMS.
|
||||
pre_nms_topk (int): Limit on proposals before applying NMS.
|
||||
post_nms_topk (int): Limit on retained detections after NMS.
|
||||
dynamic_size (bool): Flag indicating dynamic or static input sizing.
|
||||
@@ -63,23 +65,23 @@ class RetinaFace(BaseDetector):
|
||||
self,
|
||||
*,
|
||||
model_name: RetinaFaceWeights = RetinaFaceWeights.MNET_V2,
|
||||
conf_thresh: float = 0.5,
|
||||
nms_thresh: float = 0.4,
|
||||
input_size: Tuple[int, int] = (640, 640),
|
||||
confidence_threshold: float = 0.5,
|
||||
nms_threshold: float = 0.4,
|
||||
input_size: tuple[int, int] = (640, 640),
|
||||
**kwargs: Any,
|
||||
) -> None:
|
||||
super().__init__(
|
||||
model_name=model_name,
|
||||
conf_thresh=conf_thresh,
|
||||
nms_thresh=nms_thresh,
|
||||
confidence_threshold=confidence_threshold,
|
||||
nms_threshold=nms_threshold,
|
||||
input_size=input_size,
|
||||
**kwargs,
|
||||
)
|
||||
self._supports_landmarks = True # RetinaFace supports landmarks
|
||||
|
||||
self.model_name = model_name
|
||||
self.conf_thresh = conf_thresh
|
||||
self.nms_thresh = nms_thresh
|
||||
self.confidence_threshold = confidence_threshold
|
||||
self.nms_threshold = nms_threshold
|
||||
self.input_size = input_size
|
||||
|
||||
# Advanced options from kwargs
|
||||
@@ -88,8 +90,8 @@ class RetinaFace(BaseDetector):
|
||||
self.dynamic_size = kwargs.get('dynamic_size', False)
|
||||
|
||||
Logger.info(
|
||||
f'Initializing RetinaFace with model={self.model_name}, conf_thresh={self.conf_thresh}, '
|
||||
f'nms_thresh={self.nms_thresh}, input_size={self.input_size}'
|
||||
f'Initializing RetinaFace with model={self.model_name}, confidence_threshold={self.confidence_threshold}, '
|
||||
f'nms_threshold={self.nms_threshold}, input_size={self.input_size}'
|
||||
)
|
||||
|
||||
# Get path to model weights
|
||||
@@ -105,14 +107,13 @@ class RetinaFace(BaseDetector):
|
||||
self._initialize_model(self._model_path)
|
||||
|
||||
def _initialize_model(self, model_path: str) -> None:
|
||||
"""
|
||||
Initializes an ONNX model session from the given path.
|
||||
"""Initialize an ONNX model session from the given path.
|
||||
|
||||
Args:
|
||||
model_path (str): The file path to the ONNX model.
|
||||
model_path: The file path to the ONNX model.
|
||||
|
||||
Raises:
|
||||
RuntimeError: If the model fails to load, logs an error and raises an exception.
|
||||
RuntimeError: If the model fails to load.
|
||||
"""
|
||||
try:
|
||||
self.session = create_onnx_session(model_path)
|
||||
@@ -137,14 +138,14 @@ class RetinaFace(BaseDetector):
         image = np.expand_dims(image, axis=0)  # Add batch dimension (1, C, H, W)
         return image
 
-    def inference(self, input_tensor: np.ndarray) -> List[np.ndarray]:
+    def inference(self, input_tensor: np.ndarray) -> list[np.ndarray]:
         """Perform model inference on the preprocessed image tensor.
 
         Args:
-            input_tensor (np.ndarray): Preprocessed input tensor.
+            input_tensor: Preprocessed input tensor with shape (1, C, H, W).
 
         Returns:
-            Tuple[np.ndarray, np.ndarray]: Raw model outputs.
+            List of raw model outputs (location, confidence, landmarks).
         """
         return self.session.run(self.output_names, {self.input_names: input_tensor})
 
@@ -155,7 +156,7 @@ class RetinaFace(BaseDetector):
         max_num: int = 0,
         metric: Literal['default', 'max'] = 'max',
         center_weight: float = 2.0,
-    ) -> List[Face]:
+    ) -> list[Face]:
         """
         Perform face detection on an input image and return bounding boxes and facial landmarks.
 
@@ -240,41 +241,43 @@ class RetinaFace(BaseDetector):
         return faces
 
     def postprocess(
-        self, outputs: List[np.ndarray], resize_factor: float, shape: Tuple[int, int]
-    ) -> Tuple[np.ndarray, np.ndarray]:
-        """
-        Process the model outputs into final detection results.
+        self,
+        outputs: list[np.ndarray],
+        resize_factor: float,
+        shape: tuple[int, int],
+    ) -> tuple[np.ndarray, np.ndarray]:
+        """Process the model outputs into final detection results.
 
         Args:
-            outputs (List[np.ndarray]): Raw outputs from the detection model.
+            outputs: Raw outputs from the detection model containing:
                 - outputs[0]: Location predictions (bounding box coordinates).
                 - outputs[1]: Class confidence scores.
                 - outputs[2]: Landmark predictions.
-            resize_factor (float): Factor used to resize the input image during preprocessing.
-            shape (Tuple[int, int]): Original shape of the image as (height, width).
+            resize_factor: Factor used to resize the input image during preprocessing.
+            shape: Original shape of the image as (width, height).
 
         Returns:
-            Tuple[np.ndarray, np.ndarray]: Processed results containing:
-                - detections (np.ndarray): Array of detected bounding boxes with confidence scores.
-                    Shape: (num_detections, 5), where each row is [x_min, y_min, x_max, y_max, score].
-                - landmarks (np.ndarray): Array of detected facial landmarks.
-                    Shape: (num_detections, 5, 2), where each row contains 5 landmark points (x, y).
+            A tuple containing:
+                - detections: Array of detected bounding boxes with confidence scores,
+                  shape (num_detections, 5), each row is [x1, y1, x2, y2, score].
+                - landmarks: Array of detected facial landmarks,
+                  shape (num_detections, 5, 2), each row contains 5 landmark points (x, y).
         """
-        loc, conf, landmarks = (
+        location_predictions, confidence_scores, landmark_predictions = (
             outputs[0].squeeze(0),
             outputs[1].squeeze(0),
             outputs[2].squeeze(0),
         )
 
         # Decode boxes and landmarks
-        boxes = decode_boxes(loc, self._priors)
-        landmarks = decode_landmarks(landmarks, self._priors)
+        boxes = decode_boxes(location_predictions, self._priors)
+        landmarks = decode_landmarks(landmark_predictions, self._priors)
 
         boxes, landmarks = self._scale_detections(boxes, landmarks, resize_factor, shape=(shape[0], shape[1]))
 
         # Extract confidence scores for the face class
-        scores = conf[:, 1]
-        mask = scores > self.conf_thresh
+        scores = confidence_scores[:, 1]
+        mask = scores > self.confidence_threshold
 
         # Filter by confidence threshold
         boxes, landmarks, scores = boxes[mask], landmarks[mask], scores[mask]
@@ -285,7 +288,7 @@ class RetinaFace(BaseDetector):
 
         # Apply NMS
         detections = np.hstack((boxes, scores[:, np.newaxis])).astype(np.float32, copy=False)
-        keep = non_max_suppression(detections, self.nms_thresh)
+        keep = non_max_suppression(detections, self.nms_threshold)
         detections, landmarks = detections[keep], landmarks[keep]
 
         # Keep top-k detections
@@ -303,9 +306,9 @@ class RetinaFace(BaseDetector):
         boxes: np.ndarray,
         landmarks: np.ndarray,
         resize_factor: float,
-        shape: Tuple[int, int],
-    ) -> Tuple[np.ndarray, np.ndarray]:
-        # Scale bounding boxes and landmarks to the original image size.
+        shape: tuple[int, int],
+    ) -> tuple[np.ndarray, np.ndarray]:
+        """Scale bounding boxes and landmarks to the original image size."""
         bbox_scale = np.array([shape[0], shape[1]] * 2)
         boxes = boxes * bbox_scale / resize_factor
 

@@ -2,7 +2,9 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Any, List, Literal, Tuple
+from __future__ import annotations
+
+from typing import Any, Literal
 
 import numpy as np
 
@@ -29,8 +31,8 @@ class SCRFD(BaseDetector):
     Args:
         model_name (SCRFDWeights): Predefined model enum (e.g., `SCRFD_10G_KPS`).
             Specifies the SCRFD variant to load. Defaults to SCRFD_10G_KPS.
-        conf_thresh (float): Confidence threshold for filtering detections. Defaults to 0.5.
-        nms_thresh (float): Non-Maximum Suppression threshold. Defaults to 0.4.
+        confidence_threshold (float): Confidence threshold for filtering detections. Defaults to 0.5.
+        nms_threshold (float): Non-Maximum Suppression threshold. Defaults to 0.4.
         input_size (Tuple[int, int]): Input image size (width, height).
             Defaults to (640, 640).
             Note: Non-default sizes may cause slower inference and CoreML compatibility issues.
@@ -38,10 +40,10 @@ class SCRFD(BaseDetector):
 
     Attributes:
         model_name (SCRFDWeights): Selected model variant.
-        conf_thresh (float): Threshold used to filter low-confidence detections.
-        nms_thresh (float): Threshold used during NMS to suppress overlapping boxes.
+        confidence_threshold (float): Threshold used to filter low-confidence detections.
+        nms_threshold (float): Threshold used during NMS to suppress overlapping boxes.
         input_size (Tuple[int, int]): Image size to which inputs are resized before inference.
-        _fmc (int): Number of feature map levels used in the model.
+        _num_feature_maps (int): Number of feature map levels used in the model.
         _feat_stride_fpn (List[int]): Feature map strides corresponding to each detection level.
         _num_anchors (int): Number of anchors per feature location.
         _center_cache (Dict): Cached anchor centers for efficient forward passes.
@@ -56,35 +58,35 @@ class SCRFD(BaseDetector):
         self,
         *,
         model_name: SCRFDWeights = SCRFDWeights.SCRFD_10G_KPS,
-        conf_thresh: float = 0.5,
-        nms_thresh: float = 0.4,
-        input_size: Tuple[int, int] = (640, 640),
+        confidence_threshold: float = 0.5,
+        nms_threshold: float = 0.4,
+        input_size: tuple[int, int] = (640, 640),
         **kwargs: Any,
     ) -> None:
         super().__init__(
             model_name=model_name,
-            conf_thresh=conf_thresh,
-            nms_thresh=nms_thresh,
+            confidence_threshold=confidence_threshold,
+            nms_threshold=nms_threshold,
             input_size=input_size,
             **kwargs,
         )
         self._supports_landmarks = True  # SCRFD supports landmarks
 
         self.model_name = model_name
-        self.conf_thresh = conf_thresh
-        self.nms_thresh = nms_thresh
+        self.confidence_threshold = confidence_threshold
+        self.nms_threshold = nms_threshold
         self.input_size = input_size
 
         # ------- SCRFD model params ------
-        self._fmc = 3
+        self._num_feature_maps = 3
         self._feat_stride_fpn = [8, 16, 32]
         self._num_anchors = 2
         self._center_cache = {}
         # ---------------------------------
 
         Logger.info(
-            f'Initializing SCRFD with model={self.model_name}, conf_thresh={self.conf_thresh}, '
-            f'nms_thresh={self.nms_thresh}, input_size={self.input_size}'
+            f'Initializing SCRFD with model={self.model_name}, confidence_threshold={self.confidence_threshold}, '
+            f'nms_threshold={self.nms_threshold}, input_size={self.input_size}'
         )
 
         # Get path to model weights
@@ -95,14 +97,13 @@ class SCRFD(BaseDetector):
         self._initialize_model(self._model_path)
 
     def _initialize_model(self, model_path: str) -> None:
-        """
-        Initializes an ONNX model session from the given path.
+        """Initialize an ONNX model session from the given path.
 
         Args:
-            model_path (str): The file path to the ONNX model.
+            model_path: The file path to the ONNX model.
 
         Raises:
-            RuntimeError: If the model fails to load, logs an error and raises an exception.
+            RuntimeError: If the model fails to load.
         """
         try:
             self.session = create_onnx_session(model_path)
@@ -113,14 +114,14 @@ class SCRFD(BaseDetector):
             Logger.error(f"Failed to load model from '{model_path}': {e}", exc_info=True)
             raise RuntimeError(f"Failed to initialize model session for '{model_path}'") from e
 
-    def preprocess(self, image: np.ndarray) -> Tuple[np.ndarray, Tuple[int, int]]:
+    def preprocess(self, image: np.ndarray) -> np.ndarray:
         """Preprocess image for inference.
 
         Args:
-            image (np.ndarray): Input image
+            image: Input image with shape (H, W, C).
 
         Returns:
-            Tuple[np.ndarray, Tuple[int, int]]: Preprocessed blob and input size
+            Preprocessed image tensor with shape (1, C, H, W).
         """
         image = image.astype(np.float32)
         image = (image - 127.5) / 127.5
@@ -129,29 +130,42 @@ class SCRFD(BaseDetector):
 
         return image
 
-    def inference(self, input_tensor: np.ndarray) -> List[np.ndarray]:
+    def inference(self, input_tensor: np.ndarray) -> list[np.ndarray]:
         """Perform model inference on the preprocessed image tensor.
 
         Args:
-            input_tensor (np.ndarray): Preprocessed input tensor.
+            input_tensor: Preprocessed input tensor with shape (1, C, H, W).
 
         Returns:
-            Tuple[np.ndarray, np.ndarray]: Raw model outputs.
+            List of raw model outputs.
         """
         return self.session.run(self.output_names, {self.input_names: input_tensor})
 
-    def postprocess(self, outputs: List[np.ndarray], image_size: Tuple[int, int]):
-        scores_list = []
+    def postprocess(
+        self,
+        outputs: list[np.ndarray],
+        image_size: tuple[int, int],
+    ) -> tuple[list[np.ndarray], list[np.ndarray], list[np.ndarray]]:
+        """Process model outputs into detection results.
+
+        Args:
+            outputs: Raw outputs from the detection model.
+            image_size: Size of the input image as (height, width).
+
+        Returns:
+            Tuple of (scores_list, bboxes_list, landmarks_list).
+        """
+        scores_list: list[np.ndarray] = []
         bboxes_list = []
         kpss_list = []
 
         image_size = image_size
 
-        fmc = self._fmc
+        num_feature_maps = self._num_feature_maps
         for idx, stride in enumerate(self._feat_stride_fpn):
             scores = outputs[idx]
-            bbox_preds = outputs[fmc + idx] * stride
-            kps_preds = outputs[2 * fmc + idx] * stride
+            bbox_preds = outputs[num_feature_maps + idx] * stride
+            kps_preds = outputs[2 * num_feature_maps + idx] * stride
 
             # Generate anchors
             fm_height = image_size[0] // stride
@@ -171,7 +185,7 @@ class SCRFD(BaseDetector):
                 if len(self._center_cache) < 100:
                     self._center_cache[cache_key] = anchor_centers
 
-            pos_indices = np.where(scores >= self.conf_thresh)[0]
+            pos_indices = np.where(scores >= self.confidence_threshold)[0]
             if len(pos_indices) == 0:
                 continue
 
@@ -193,7 +207,7 @@ class SCRFD(BaseDetector):
         max_num: int = 0,
         metric: Literal['default', 'max'] = 'max',
         center_weight: float = 2.0,
-    ) -> List[Face]:
+    ) -> list[Face]:
         """
         Perform face detection on an input image and return bounding boxes and facial landmarks.
 
@@ -247,7 +261,7 @@ class SCRFD(BaseDetector):
         pre_det = np.hstack((bboxes, scores)).astype(np.float32, copy=False)
         pre_det = pre_det[order, :]
 
-        keep = non_max_suppression(pre_det, threshold=self.nms_thresh)
+        keep = non_max_suppression(pre_det, threshold=self.nms_threshold)
 
         detections = pre_det[keep, :]
         landmarks = landmarks[order, :, :]

@@ -2,7 +2,7 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Any, List, Literal, Tuple
+from typing import Any, Literal
 
 import cv2
 import numpy as np
@@ -30,8 +30,8 @@ class YOLOv5Face(BaseDetector):
     Args:
         model_name (YOLOv5FaceWeights): Predefined model enum (e.g., `YOLOV5S`).
             Specifies the YOLOv5-Face variant to load. Defaults to YOLOV5S.
-        conf_thresh (float): Confidence threshold for filtering detections. Defaults to 0.6.
-        nms_thresh (float): Non-Maximum Suppression threshold. Defaults to 0.5.
+        confidence_threshold (float): Confidence threshold for filtering detections. Defaults to 0.6.
+        nms_threshold (float): Non-Maximum Suppression threshold. Defaults to 0.5.
         input_size (int): Input image size. Defaults to 640.
             Note: ONNX model is fixed at 640. Changing this will cause inference errors.
         **kwargs: Advanced options:
@@ -39,8 +39,8 @@ class YOLOv5Face(BaseDetector):
 
     Attributes:
         model_name (YOLOv5FaceWeights): Selected model variant.
-        conf_thresh (float): Threshold used to filter low-confidence detections.
-        nms_thresh (float): Threshold used during NMS to suppress overlapping boxes.
+        confidence_threshold (float): Threshold used to filter low-confidence detections.
+        nms_threshold (float): Threshold used during NMS to suppress overlapping boxes.
         input_size (int): Image size to which inputs are resized before inference.
         max_det (int): Maximum number of detections to return.
         _model_path (str): Absolute path to the downloaded/verified model weights.
@@ -54,15 +54,15 @@ class YOLOv5Face(BaseDetector):
         self,
         *,
         model_name: YOLOv5FaceWeights = YOLOv5FaceWeights.YOLOV5S,
-        conf_thresh: float = 0.6,
-        nms_thresh: float = 0.5,
+        confidence_threshold: float = 0.6,
+        nms_threshold: float = 0.5,
         input_size: int = 640,
         **kwargs: Any,
     ) -> None:
         super().__init__(
             model_name=model_name,
-            conf_thresh=conf_thresh,
-            nms_thresh=nms_thresh,
+            confidence_threshold=confidence_threshold,
+            nms_threshold=nms_threshold,
             input_size=input_size,
             **kwargs,
         )
@@ -75,16 +75,16 @@ class YOLOv5Face(BaseDetector):
             )
 
         self.model_name = model_name
-        self.conf_thresh = conf_thresh
-        self.nms_thresh = nms_thresh
+        self.confidence_threshold = confidence_threshold
+        self.nms_threshold = nms_threshold
         self.input_size = input_size
 
         # Advanced options from kwargs
         self.max_det = kwargs.get('max_det', 750)
 
         Logger.info(
-            f'Initializing YOLOv5Face with model={self.model_name}, conf_thresh={self.conf_thresh}, '
-            f'nms_thresh={self.nms_thresh}, input_size={self.input_size}'
+            f'Initializing YOLOv5Face with model={self.model_name}, confidence_threshold={self.confidence_threshold}, '
+            f'nms_threshold={self.nms_threshold}, input_size={self.input_size}'
         )
 
         # Get path to model weights
@@ -113,7 +113,7 @@ class YOLOv5Face(BaseDetector):
             Logger.error(f"Failed to load model from '{model_path}': {e}", exc_info=True)
             raise RuntimeError(f"Failed to initialize model session for '{model_path}'") from e
 
-    def preprocess(self, image: np.ndarray) -> Tuple[np.ndarray, float, Tuple[int, int]]:
+    def preprocess(self, image: np.ndarray) -> tuple[np.ndarray, float, tuple[int, int]]:
         """
         Preprocess image for inference.
 
@@ -154,7 +154,7 @@ class YOLOv5Face(BaseDetector):
 
         return img_batch, scale, (pad_w, pad_h)
 
-    def inference(self, input_tensor: np.ndarray) -> List[np.ndarray]:
+    def inference(self, input_tensor: np.ndarray) -> list[np.ndarray]:
         """Perform model inference on the preprocessed image tensor.
 
         Args:
@@ -169,8 +169,8 @@ class YOLOv5Face(BaseDetector):
         self,
         predictions: np.ndarray,
         scale: float,
-        padding: Tuple[int, int],
-    ) -> Tuple[np.ndarray, np.ndarray]:
+        padding: tuple[int, int],
+    ) -> tuple[np.ndarray, np.ndarray]:
         """
         Postprocess model predictions.
 
@@ -190,7 +190,7 @@ class YOLOv5Face(BaseDetector):
         predictions = predictions[0]  # Remove batch dimension
 
         # Filter by confidence
-        mask = predictions[:, 4] >= self.conf_thresh
+        mask = predictions[:, 4] >= self.confidence_threshold
         predictions = predictions[mask]
 
         if len(predictions) == 0:
@@ -207,7 +207,7 @@ class YOLOv5Face(BaseDetector):
 
         # Apply NMS
         detections_for_nms = np.hstack((boxes, scores[:, None])).astype(np.float32, copy=False)
-        keep = non_max_suppression(detections_for_nms, self.nms_thresh)
+        keep = non_max_suppression(detections_for_nms, self.nms_threshold)
 
         if len(keep) == 0:
             return np.array([]), np.array([])
@@ -260,7 +260,7 @@ class YOLOv5Face(BaseDetector):
         max_num: int = 0,
         metric: Literal['default', 'max'] = 'max',
         center_weight: float = 2.0,
-    ) -> List[Face]:
+    ) -> list[Face]:
         """
         Perform face detection on an input image and return bounding boxes and facial landmarks.
 

@@ -2,8 +2,9 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
+from __future__ import annotations
+
 from dataclasses import dataclass, fields
-from typing import Optional
 
 import numpy as np
 
@@ -29,6 +30,8 @@ class Face:
         age: Predicted exact age in years (optional, from AgeGender model).
         age_group: Predicted age range like "20-29" (optional, from FairFace).
         race: Predicted race/ethnicity (optional, from FairFace).
+        emotion: Predicted emotion label (optional, from Emotion model).
+        emotion_confidence: Confidence score for emotion prediction (optional).
 
     Properties:
         sex: Gender as a human-readable string ("Female" or "Male").
@@ -42,13 +45,15 @@ class Face:
     landmarks: np.ndarray
 
     # Optional attributes
-    embedding: Optional[np.ndarray] = None
-    gender: Optional[int] = None
-    age: Optional[int] = None
-    age_group: Optional[str] = None
-    race: Optional[str] = None
+    embedding: np.ndarray | None = None
+    gender: int | None = None
+    age: int | None = None
+    age_group: str | None = None
+    race: str | None = None
+    emotion: str | None = None
+    emotion_confidence: float | None = None
 
-    def compute_similarity(self, other: 'Face') -> float:
+    def compute_similarity(self, other: Face) -> float:
         """Compute cosine similarity with another face."""
         if self.embedding is None or other.embedding is None:
             raise ValueError('Both faces must have embeddings for similarity computation')
@@ -59,7 +64,7 @@ class Face:
         return {f.name: getattr(self, f.name) for f in fields(self)}
 
     @property
-    def sex(self) -> Optional[str]:
+    def sex(self) -> str | None:
         """Get gender as a string label (Female or Male)."""
         if self.gender is None:
             return None
@@ -85,6 +90,8 @@ class Face:
             parts.append(f'sex={self.sex}')
         if self.race is not None:
             parts.append(f'race={self.race}')
+        if self.emotion is not None:
+            parts.append(f'emotion={self.emotion}')
         if self.embedding is not None:
             parts.append(f'embedding_dim={self.embedding.shape[0]}')
         return ', '.join(parts) + ')'

@@ -2,21 +2,21 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Tuple, Union
+from __future__ import annotations
 
 import cv2
 import numpy as np
 from skimage.transform import SimilarityTransform
 
 __all__ = [
-    'face_alignment',
-    'compute_similarity',
     'bbox_center_alignment',
+    'compute_similarity',
+    'face_alignment',
     'transform_points_2d',
 ]
 
 
-# Reference alignment for facial landmarks (ArcFace)
+# Standard 5-point facial landmark reference for ArcFace alignment (112x112)
 reference_alignment: np.ndarray = np.array(
     [
         [38.2946, 51.6963],
@@ -29,22 +29,25 @@ reference_alignment: np.ndarray = np.array(
 )
 
 
-def estimate_norm(landmark: np.ndarray, image_size: Union[int, Tuple[int, int]] = 112) -> Tuple[np.ndarray, np.ndarray]:
-    """
-    Estimate the normalization transformation matrix for facial landmarks.
+def estimate_norm(
+    landmark: np.ndarray,
+    image_size: int | tuple[int, int] = 112,
+) -> tuple[np.ndarray, np.ndarray]:
+    """Estimate the normalization transformation matrix for facial landmarks.
 
     Args:
-        landmark (np.ndarray): Array of shape (5, 2) representing the coordinates of the facial landmarks.
-        image_size (Union[int, Tuple[int, int]], optional): The size of the output image.
-            Can be an integer (for square images) or a tuple (width, height). Default is 112.
+        landmark: Array of shape (5, 2) representing the coordinates of the facial landmarks.
+        image_size: The size of the output image. Can be an integer (for square images)
+            or a tuple (width, height). Default is 112.
 
     Returns:
-        np.ndarray: The 2x3 transformation matrix for aligning the landmarks.
-        np.ndarray: The 2x3 inverse transformation matrix for aligning the landmarks.
+        A tuple containing:
+            - The 2x3 transformation matrix for aligning the landmarks.
+            - The 2x3 inverse transformation matrix.
 
     Raises:
         AssertionError: If the input landmark array does not have the shape (5, 2)
-                        or if image_size is not a multiple of 112 or 128.
+            or if image_size is not a multiple of 112 or 128.
     """
     assert landmark.shape == (5, 2), 'Landmark array must have shape (5, 2).'
 
@@ -80,23 +83,23 @@ def estimate_norm(landmark: np.ndarray, image_size: Union[int, Tuple[int, int]]
 def face_alignment(
     image: np.ndarray,
     landmark: np.ndarray,
-    image_size: Union[int, Tuple[int, int]] = 112,
-) -> Tuple[np.ndarray, np.ndarray]:
-    """
-    Align the face in the input image based on the given facial landmarks.
+    image_size: int | tuple[int, int] = 112,
+) -> tuple[np.ndarray, np.ndarray]:
+    """Align the face in the input image based on the given facial landmarks.
 
     Args:
-        image (np.ndarray): Input image as a NumPy array.
-        landmark (np.ndarray): Array of shape (5, 2) representing the coordinates of the facial landmarks.
-        image_size (Union[int, Tuple[int, int]], optional): The size of the aligned output image.
-            Can be an integer (for square images) or a tuple (width, height). Default is 112.
+        image: Input image as a NumPy array with shape (H, W, C).
+        landmark: Array of shape (5, 2) representing the facial landmark coordinates.
+        image_size: The size of the aligned output image. Can be an integer
+            (for square images) or a tuple (width, height). Default is 112.
 
     Returns:
-        np.ndarray: The aligned face as a NumPy array.
-        np.ndarray: The 2x3 transformation matrix used for alignment.
+        A tuple containing:
+            - The aligned face as a NumPy array.
+            - The 2x3 inverse transformation matrix used for alignment.
     """
     # Get the transformation matrix
-    M, M_inv = estimate_norm(landmark, image_size)
+    transform_matrix, inverse_transform = estimate_norm(landmark, image_size)
 
     # Handle both int and tuple for warpAffine output size
     if isinstance(image_size, int):
@@ -105,44 +108,50 @@ def face_alignment(
         output_size = image_size
 
     # Warp the input image to align the face
-    warped = cv2.warpAffine(image, M, output_size, borderValue=0.0)
+    warped = cv2.warpAffine(image, transform_matrix, output_size, borderValue=0.0)
 
-    return warped, M_inv
+    return warped, inverse_transform
 
 
 def compute_similarity(feat1: np.ndarray, feat2: np.ndarray, normalized: bool = False) -> np.float32:
-    """Computing Similarity between two faces.
+    """Compute cosine similarity between two face embeddings.
 
     Args:
-        feat1 (np.ndarray): First embedding.
-        feat2 (np.ndarray): Second embedding.
-        normalized (bool): Set True if the embeddings are already L2 normalized.
+        feat1: First embedding vector.
+        feat2: Second embedding vector.
+        normalized: Set True if the embeddings are already L2 normalized.
 
     Returns:
-        np.float32: Cosine similarity.
+        Cosine similarity score in range [-1, 1].
     """
     feat1 = feat1.ravel()
     feat2 = feat2.ravel()
     if normalized:
         return np.dot(feat1, feat2)
-    else:
-        return np.dot(feat1, feat2) / (np.linalg.norm(feat1) * np.linalg.norm(feat2) + 1e-5)
+    # Add small epsilon to prevent division by zero
+    return np.dot(feat1, feat2) / (np.linalg.norm(feat1) * np.linalg.norm(feat2) + 1e-5)
 
 
-def bbox_center_alignment(image, center, output_size, scale, rotation):
-    """
-    Apply center-based alignment, scaling, and rotation to an image.
+def bbox_center_alignment(
+    image: np.ndarray,
+    center: tuple[float, float],
+    output_size: int,
+    scale: float,
+    rotation: float,
+) -> tuple[np.ndarray, np.ndarray]:
+    """Apply center-based alignment, scaling, and rotation to an image.
 
     Args:
-        image (np.ndarray): Input image.
-        center (Tuple[float, float]): Center point (e.g., face center from bbox).
-        output_size (int): Desired output image size (square).
-        scale (float): Scaling factor to zoom in/out.
-        rotation (float): Rotation angle in degrees (clockwise).
+        image: Input image with shape (H, W, C).
+        center: Center point (x, y), e.g., face center from bbox.
+        output_size: Desired output image size (square).
+        scale: Scaling factor to zoom in/out.
+        rotation: Rotation angle in degrees (clockwise).
 
     Returns:
-        cropped (np.ndarray): Aligned and cropped image.
-        M (np.ndarray): 2x3 affine transform matrix used.
+        A tuple containing:
+            - Aligned and cropped image with shape (output_size, output_size, C).
+            - 2x3 affine transform matrix used.
     """
 
     # Convert rotation from degrees to radians
@@ -175,15 +184,14 @@ def bbox_center_alignment(image, center, output_size, scale, rotation):
 
 
 def transform_points_2d(points: np.ndarray, transform: np.ndarray) -> np.ndarray:
-    """
-    Apply a 2D affine transformation to an array of 2D points.
+    """Apply a 2D affine transformation to an array of 2D points.
 
     Args:
-        points (np.ndarray): An (N, 2) array of 2D points.
-        transform (np.ndarray): A (2, 3) affine transformation matrix.
+        points: An (N, 2) array of 2D points.
+        transform: A (2, 3) affine transformation matrix.
 
     Returns:
-        np.ndarray: Transformed (N, 2) array of points.
+        Transformed (N, 2) array of points.
     """
     transformed = np.zeros_like(points, dtype=np.float32)
     for i in range(points.shape[0]):

@@ -34,10 +34,7 @@ def create_gaze_estimator(method: str = 'mobilegaze', **kwargs) -> BaseGazeEstim
 
         >>> # Create with MobileNetV2 backbone
         >>> from uniface.constants import GazeWeights
-        >>> estimator = create_gaze_estimator(
-        ...     'mobilegaze',
-        ...     model_name=GazeWeights.MOBILENET_V2
-        ... )
+        >>> estimator = create_gaze_estimator('mobilegaze', model_name=GazeWeights.MOBILENET_V2)
 
         >>> # Use the estimator
         >>> pitch, yaw = estimator.estimate(face_crop)
@@ -51,4 +48,4 @@ def create_gaze_estimator(method: str = 'mobilegaze', **kwargs) -> BaseGazeEstim
         raise ValueError(f"Unsupported gaze estimation method: '{method}'. Available: {available}")
 
 
-__all__ = ['create_gaze_estimator', 'MobileGaze', 'BaseGazeEstimator']
+__all__ = ['BaseGazeEstimator', 'MobileGaze', 'create_gaze_estimator']

@@ -3,7 +3,6 @@
 # GitHub: https://github.com/yakhyo
 
 from abc import ABC, abstractmethod
-from typing import Tuple
 
 import numpy as np
 
@@ -54,7 +53,7 @@ class BaseGazeEstimator(ABC):
         raise NotImplementedError('Subclasses must implement the preprocess method.')
 
     @abstractmethod
-    def postprocess(self, outputs: Tuple[np.ndarray, np.ndarray]) -> Tuple[float, float]:
+    def postprocess(self, outputs: tuple[np.ndarray, np.ndarray]) -> tuple[float, float]:
         """
         Postprocess raw model outputs into gaze angles.
 
@@ -71,7 +70,7 @@ class BaseGazeEstimator(ABC):
         raise NotImplementedError('Subclasses must implement the postprocess method.')
 
     @abstractmethod
-    def estimate(self, face_image: np.ndarray) -> Tuple[float, float]:
+    def estimate(self, face_image: np.ndarray) -> tuple[float, float]:
         """
         Perform end-to-end gaze estimation on a face image.
 
@@ -91,11 +90,11 @@ class BaseGazeEstimator(ABC):
         Example:
             >>> estimator = create_gaze_estimator()
             >>> pitch, yaw = estimator.estimate(face_crop)
-            >>> print(f"Looking: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°")
+            >>> print(f'Looking: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°')
         """
         raise NotImplementedError('Subclasses must implement the estimate method.')
 
-    def __call__(self, face_image: np.ndarray) -> Tuple[float, float]:
+    def __call__(self, face_image: np.ndarray) -> tuple[float, float]:
         """
         Provides a convenient, callable shortcut for the `estimate` method.
 

@@ -2,7 +2,6 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Tuple
 
 import cv2
 import numpy as np
@@ -54,17 +53,17 @@ class MobileGaze(BaseGazeEstimator):
         >>> # Detect faces and estimate gaze for each
         >>> faces = detector.detect(image)
         >>> for face in faces:
-        ...     bbox = face['bbox']
+        ...     bbox = face.bbox
         ...     x1, y1, x2, y2 = map(int, bbox[:4])
         ...     face_crop = image[y1:y2, x1:x2]
         ...     pitch, yaw = gaze_estimator.estimate(face_crop)
-        ...     print(f"Gaze: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°")
+        ...     print(f'Gaze: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°')
     """
 
     def __init__(
         self,
         model_name: GazeWeights = GazeWeights.RESNET34,
-        input_size: Tuple[int, int] = (448, 448),
+        input_size: tuple[int, int] = (448, 448),
     ) -> None:
         Logger.info(f'Initializing MobileGaze with model={model_name}, input_size={input_size}')
 
@@ -143,7 +142,7 @@ class MobileGaze(BaseGazeEstimator):
|
||||
e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
|
||||
return e_x / e_x.sum(axis=1, keepdims=True)
|
||||
|
||||
def postprocess(self, outputs: Tuple[np.ndarray, np.ndarray]) -> Tuple[np.ndarray, np.ndarray]:
|
||||
def postprocess(self, outputs: tuple[np.ndarray, np.ndarray]) -> tuple[np.ndarray, np.ndarray]:
|
||||
"""
|
||||
Postprocess raw model outputs into gaze angles.
|
||||
|
||||
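Note on the softmax context lines in this hunk: subtracting the row maximum before exponentiating is the usual way to keep `exp` from overflowing on large logits. The sketch below restates the same idea in plain Python so it runs without NumPy. The function name `softmax_rows` and the sample inputs are illustrative and not part of the repository.

```python
import math


def softmax_rows(rows: list[list[float]]) -> list[list[float]]:
    """Row-wise softmax that subtracts each row's maximum before exponentiating."""
    result = []
    for row in rows:
        row_max = max(row)
        # Shifting by the maximum does not change the softmax value, but it
        # keeps every exponent <= 0, so math.exp cannot overflow.
        exps = [math.exp(value - row_max) for value in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


# math.exp(1000.0) on its own would raise OverflowError; the shifted form is fine.
probs = softmax_rows([[1000.0, 1001.0, 1002.0], [0.0, 0.0, 0.0]])
```

Each returned row sums to 1, and equal logits produce a uniform distribution.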
@@ -173,7 +172,7 @@ class MobileGaze(BaseGazeEstimator):
|
||||
|
||||
return pitch, yaw
|
||||
|
||||
def estimate(self, face_image: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
|
||||
def estimate(self, face_image: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
|
||||
"""
|
||||
Perform end-to-end gaze estimation on a face image.
|
||||
|
||||
|
||||
@@ -25,4 +25,4 @@ def create_landmarker(method: str = '2d106det', **kwargs) -> BaseLandmarker:
|
||||
raise ValueError(f"Unsupported method: '{method}'. Available: {available}")
|
||||
|
||||
|
||||
__all__ = ['create_landmarker', 'Landmark106', 'BaseLandmarker']
|
||||
__all__ = ['BaseLandmarker', 'Landmark106', 'create_landmarker']
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
from typing import Tuple
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
@@ -46,7 +45,7 @@ class Landmark106(BaseLandmarker):
|
||||
def __init__(
|
||||
self,
|
||||
model_name: LandmarkWeights = LandmarkWeights.DEFAULT,
|
||||
input_size: Tuple[int, int] = (192, 192),
|
||||
input_size: tuple[int, int] = (192, 192),
|
||||
) -> None:
|
||||
Logger.info(f'Initializing Facial Landmark with model={model_name}, input_size={input_size}')
|
||||
self.input_size = input_size
|
||||
@@ -85,7 +84,7 @@ class Landmark106(BaseLandmarker):
|
||||
Logger.error(f"Failed to load landmark model from '{self.model_path}'", exc_info=True)
|
||||
raise RuntimeError(f'Failed to initialize landmark model: {e}') from e
|
||||
|
||||
def preprocess(self, image: np.ndarray, bbox: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
|
||||
def preprocess(self, image: np.ndarray, bbox: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
|
||||
"""Prepares a face crop for inference.
|
||||
|
||||
This method takes a face bounding box, performs a center alignment to
|
||||
|
||||
@@ -1,21 +1,41 @@
|
||||
# Copyright 2025 Yakhyokhuja Valikhujaev
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
"""Logging utilities for UniFace.
|
||||
|
||||
This module provides a centralized logger for the UniFace library,
|
||||
allowing users to enable verbose logging when debugging or developing.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
|
||||
__all__ = ['Logger', 'enable_logging']
|
||||
|
||||
# Create logger for uniface
|
||||
Logger = logging.getLogger('uniface')
|
||||
Logger.setLevel(logging.WARNING) # Only show warnings/errors by default
|
||||
Logger.addHandler(logging.NullHandler())
|
||||
|
||||
|
||||
def enable_logging(level=logging.INFO):
|
||||
"""
|
||||
Enable verbose logging for uniface.
|
||||
def enable_logging(level: int = logging.INFO) -> None:
|
||||
"""Enable verbose logging for uniface.
|
||||
|
||||
Configures the logger to output messages to stdout with timestamps.
|
||||
Call this function to see informational messages during model loading
|
||||
and inference.
|
||||
|
||||
Args:
|
||||
level: Logging level (logging.DEBUG, logging.INFO, etc.)
|
||||
level: Logging level. Defaults to logging.INFO.
|
||||
Common values: logging.DEBUG, logging.INFO, logging.WARNING.
|
||||
|
||||
Example:
|
||||
>>> from uniface import enable_logging
|
||||
>>> import logging
|
||||
>>> enable_logging() # Show INFO logs
|
||||
>>> enable_logging(level=logging.DEBUG) # Show DEBUG logs
|
||||
"""
|
||||
Logger.handlers.clear()
|
||||
handler = logging.StreamHandler()
|
||||
|
||||
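Note on the logging pattern in this file: a library logger that is silent by default (a `NullHandler` plus a `WARNING` level) with an opt-in function that swaps in a real handler. The sketch below is a minimal stand-alone version under stated assumptions: the logger name `example_lib` and the format string are hypothetical, and the remainder of the real `enable_logging` is not visible in this hunk.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger('example_lib')
logger.setLevel(logging.WARNING)  # quiet unless the user opts in
logger.addHandler(logging.NullHandler())


def enable_logging(level: int = logging.INFO) -> None:
    """Replace any existing handlers with a single stream handler at `level`."""
    # Clearing first means repeated calls do not stack duplicate handlers.
    logger.handlers.clear()
    # With no argument, StreamHandler writes to sys.stderr; pass sys.stdout
    # explicitly if stdout is the intended destination.
    handler = logging.StreamHandler()
    # Assumed format; the timestamp comes from %(asctime)s.
    handler.setFormatter(logging.Formatter('%(asctime)s - %(levelname)s - %(message)s'))
    logger.addHandler(handler)
    logger.setLevel(level)


enable_logging()
enable_logging(logging.DEBUG)  # second call still leaves exactly one handler
```

Because the handler list is cleared on every call, changing the level later does not produce duplicated log lines.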
@@ -2,6 +2,15 @@
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
"""Model weight management for UniFace.
|
||||
|
||||
This module handles downloading, caching, and verifying model weights
|
||||
using SHA-256 checksums for integrity validation.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from enum import Enum
|
||||
import hashlib
|
||||
import os
|
||||
|
||||
@@ -14,33 +23,32 @@ from uniface.log import Logger
|
||||
__all__ = ['verify_model_weights']
|
||||
|
||||
|
||||
def verify_model_weights(model_name: str, root: str = '~/.uniface/models') -> str:
|
||||
"""
|
||||
Ensure model weights are present, downloading and verifying them using SHA-256 if necessary.
|
||||
def verify_model_weights(model_name: Enum, root: str = '~/.uniface/models') -> str:
|
||||
"""Ensure model weights are present, downloading and verifying them if necessary.
|
||||
|
||||
Given a model identifier from an Enum class (e.g., `RetinaFaceWeights.MNET_V2`), this function checks if
|
||||
the corresponding `.onnx` weight file exists locally. If not, it downloads the file from a predefined URL.
|
||||
After download, the file’s integrity is verified using a SHA-256 hash. If verification fails, the file is deleted
|
||||
and an error is raised.
|
||||
Given a model identifier from an Enum class (e.g., `RetinaFaceWeights.MNET_V2`),
|
||||
this function checks if the corresponding weight file exists locally. If not,
|
||||
it downloads the file from a predefined URL and verifies its integrity using
|
||||
a SHA-256 hash.
|
||||
|
||||
Args:
|
||||
model_name (Enum): Model weight identifier (e.g., `RetinaFaceWeights.MNET_V2`, `ArcFaceWeights.RESNET`, etc.).
|
||||
root (str, optional): Directory to store or locate the model weights. Defaults to '~/.uniface/models'.
|
||||
model_name: Model weight identifier enum (e.g., `RetinaFaceWeights.MNET_V2`).
|
||||
root: Directory to store or locate the model weights.
|
||||
Defaults to '~/.uniface/models'.
|
||||
|
||||
Returns:
|
||||
str: Absolute path to the verified model weights file.
|
||||
Absolute path to the verified model weights file.
|
||||
|
||||
Raises:
|
||||
ValueError: If the model is unknown or SHA-256 verification fails.
|
||||
ConnectionError: If downloading the file fails.
|
||||
|
||||
Examples:
|
||||
>>> from uniface.models import RetinaFaceWeights, verify_model_weights
|
||||
>>> verify_model_weights(RetinaFaceWeights.MNET_V2)
|
||||
Example:
|
||||
>>> from uniface.constants import RetinaFaceWeights
|
||||
>>> from uniface.model_store import verify_model_weights
|
||||
>>> path = verify_model_weights(RetinaFaceWeights.MNET_V2)
|
||||
>>> print(path)
|
||||
'/home/user/.uniface/models/retinaface_mnet_v2.onnx'
|
||||
|
||||
>>> verify_model_weights(RetinaFaceWeights.RESNET34, root='/custom/dir')
|
||||
'/custom/dir/retinaface_r34.onnx'
|
||||
"""
|
||||
|
||||
root = os.path.expanduser(root)
|
||||
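Note on the integrity check described in the docstring above: the verification code itself falls outside this hunk, so the following is only a sketch of how a SHA-256 check on a downloaded file is commonly written with `hashlib`. The helper name `sha256_matches`, the chunk size, and the temporary file are assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def sha256_matches(path: str, expected_hex: str, chunk_size: int = 8192) -> bool:
    """Hash the file in fixed-size chunks and compare with the expected hex digest."""
    digest = hashlib.sha256()
    with open(path, 'rb') as file:
        # Reading in chunks avoids loading a large weights file into memory at once.
        for chunk in iter(lambda: file.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex


# Demonstration with a throwaway file instead of real model weights.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'not a real model')
    tmp_path = tmp.name

expected = hashlib.sha256(b'not a real model').hexdigest()
ok = sha256_matches(tmp_path, expected)
bad = sha256_matches(tmp_path, '0' * 64)
os.remove(tmp_path)
```

A caller following the docstring's contract would delete the file and raise `ValueError` when the comparison fails.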
@@ -73,10 +81,16 @@ def verify_model_weights(model_name: str, root: str = '~/.uniface/models') -> st
|
||||
return model_path
|
||||
|
||||
|
||||
def download_file(url: str, dest_path: str) -> None:
|
||||
"""Download a file from a URL in chunks and save it to the destination path."""
|
||||
def download_file(url: str, dest_path: str, timeout: int = 30) -> None:
|
||||
"""Download a file from a URL in chunks and save it to the destination path.
|
||||
|
||||
Args:
|
||||
url: URL to download from.
|
||||
dest_path: Local file path to save to.
|
||||
timeout: Connection timeout in seconds. Defaults to 30.
|
||||
"""
|
||||
try:
|
||||
response = requests.get(url, stream=True)
|
||||
response = requests.get(url, stream=True, timeout=timeout)
|
||||
response.raise_for_status()
|
||||
with (
|
||||
open(dest_path, 'wb') as file,
|
||||
|
||||
@@ -2,16 +2,23 @@
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
from typing import List
|
||||
"""ONNX Runtime utilities for UniFace.
|
||||
|
||||
This module provides helper functions for creating and managing ONNX Runtime
|
||||
inference sessions with automatic hardware acceleration detection.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import onnxruntime as ort
|
||||
|
||||
from uniface.log import Logger
|
||||
|
||||
__all__ = ['create_onnx_session', 'get_available_providers']
|
||||
|
||||
def get_available_providers() -> List[str]:
|
||||
"""
|
||||
Get list of available ONNX Runtime execution providers for the current platform.
|
||||
|
||||
def get_available_providers() -> list[str]:
|
||||
"""Get list of available ONNX Runtime execution providers.
|
||||
|
||||
Automatically detects and prioritizes hardware acceleration:
|
||||
- CoreML on Apple Silicon (M1/M2/M3/M4)
|
||||
@@ -19,13 +26,12 @@ def get_available_providers() -> List[str]:
|
||||
- CPU as fallback (always available)
|
||||
|
||||
Returns:
|
||||
List[str]: Ordered list of execution providers to use
|
||||
Ordered list of execution providers to use.
|
||||
|
||||
Examples:
|
||||
Example:
|
||||
>>> providers = get_available_providers()
|
||||
>>> # On M4 Mac: ['CoreMLExecutionProvider', 'CPUExecutionProvider']
|
||||
>>> # On Linux with CUDA: ['CUDAExecutionProvider', 'CPUExecutionProvider']
|
||||
>>> # On CPU-only: ['CPUExecutionProvider']
|
||||
"""
|
||||
available = ort.get_available_providers()
|
||||
providers = []
|
||||
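Note on provider selection: the selection logic between `providers = []` and `return providers` is elided from this diff. The sketch below is one way to get the ordering shown in the docstring examples, written as a pure function so it runs without `onnxruntime`. The function name `pick_providers` and the exact preference order are assumptions.

```python
def pick_providers(available: list[str]) -> list[str]:
    """Order execution providers by preference, always ending with the CPU fallback."""
    # Assumed preference order: hardware accelerators first.
    preferred = ['CoreMLExecutionProvider', 'CUDAExecutionProvider']
    providers = [name for name in preferred if name in available]
    # The CPU provider is appended unconditionally as the fallback.
    providers.append('CPUExecutionProvider')
    return providers


mac = pick_providers(['CoreMLExecutionProvider', 'CPUExecutionProvider'])
linux_cuda = pick_providers(['CUDAExecutionProvider', 'CPUExecutionProvider'])
cpu_only = pick_providers(['CPUExecutionProvider'])
```

In real use the `available` argument would come from `ort.get_available_providers()`, as in the context lines above.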
@@ -48,26 +54,28 @@ def get_available_providers() -> List[str]:
|
||||
return providers
|
||||
|
||||
|
||||
def create_onnx_session(model_path: str, providers: List[str] = None) -> ort.InferenceSession:
|
||||
"""
|
||||
Create an ONNX Runtime inference session with optimal provider selection.
|
||||
def create_onnx_session(
|
||||
model_path: str,
|
||||
providers: list[str] | None = None,
|
||||
) -> ort.InferenceSession:
|
||||
"""Create an ONNX Runtime inference session with optimal provider selection.
|
||||
|
||||
Args:
|
||||
model_path (str): Path to the ONNX model file
|
||||
providers (List[str], optional): List of providers to use.
|
||||
If None, automatically detects best available providers.
|
||||
model_path: Path to the ONNX model file.
|
||||
providers: List of execution providers to use. If None, automatically
|
||||
detects best available providers.
|
||||
|
||||
Returns:
|
||||
ort.InferenceSession: Configured ONNX Runtime session
|
||||
Configured ONNX Runtime session.
|
||||
|
||||
Raises:
|
||||
RuntimeError: If session creation fails
|
||||
RuntimeError: If session creation fails.
|
||||
|
||||
Examples:
|
||||
>>> session = create_onnx_session("model.onnx")
|
||||
Example:
|
||||
>>> session = create_onnx_session('model.onnx')
|
||||
>>> # Automatically uses best available providers
|
||||
|
||||
>>> session = create_onnx_session("model.onnx", providers=["CPUExecutionProvider"])
|
||||
>>> session = create_onnx_session('model.onnx', providers=['CPUExecutionProvider'])
|
||||
>>> # Force CPU-only execution
|
||||
"""
|
||||
if providers is None:
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
from typing import Union
|
||||
from __future__ import annotations
|
||||
|
||||
from uniface.constants import ParsingWeights
|
||||
|
||||
@@ -13,38 +13,29 @@ __all__ = ['BaseFaceParser', 'BiSeNet', 'create_face_parser']
|
||||
|
||||
|
||||
def create_face_parser(
|
||||
model_name: Union[str, ParsingWeights] = ParsingWeights.RESNET18,
|
||||
model_name: str | ParsingWeights = ParsingWeights.RESNET18,
|
||||
) -> BaseFaceParser:
|
||||
"""
|
||||
Factory function to create a face parsing model instance.
|
||||
"""Factory function to create a face parsing model instance.
|
||||
|
||||
This function provides a convenient way to instantiate face parsing models
|
||||
without directly importing the specific model classes. It supports both
|
||||
string-based and enum-based model selection.
|
||||
without directly importing the specific model classes.
|
||||
|
||||
Args:
|
||||
model_name (Union[str, ParsingWeights]): The face parsing model to create.
|
||||
Can be either a string or a ParsingWeights enum value.
|
||||
Available options:
|
||||
model_name: The face parsing model to create. Can be either a string
|
||||
or a ParsingWeights enum value. Available options:
|
||||
- 'parsing_resnet18' or ParsingWeights.RESNET18 (default)
|
||||
- 'parsing_resnet34' or ParsingWeights.RESNET34
|
||||
|
||||
Returns:
|
||||
BaseFaceParser: An instance of the requested face parsing model.
|
||||
An instance of the requested face parsing model.
|
||||
|
||||
Raises:
|
||||
ValueError: If the model_name is not recognized.
|
||||
|
||||
Examples:
|
||||
>>> # Using enum
|
||||
Example:
|
||||
>>> from uniface.parsing import create_face_parser
|
||||
>>> from uniface.constants import ParsingWeights
|
||||
>>> parser = create_face_parser(ParsingWeights.RESNET18)
|
||||
>>>
|
||||
>>> # Using string
|
||||
>>> parser = create_face_parser('parsing_resnet18')
|
||||
>>>
|
||||
>>> # Parse a face image
|
||||
>>> mask = parser.parse(face_crop)
|
||||
"""
|
||||
# Convert string to enum if necessary
|
||||
|
||||
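Note on the string-or-enum argument: the conversion body is cut off after the `# Convert string to enum if necessary` comment, so this is a hedged sketch of how such a conversion is often done. The `ParsingWeights` class here is a stand-in whose values are assumed to equal the strings listed in the docstring, and `resolve_model_name` is a hypothetical helper.

```python
from __future__ import annotations

from enum import Enum


class ParsingWeights(Enum):
    """Stand-in for the real enum in uniface.constants; values are assumed."""

    RESNET18 = 'parsing_resnet18'
    RESNET34 = 'parsing_resnet34'


def resolve_model_name(model_name: str | ParsingWeights) -> ParsingWeights:
    """Return the enum member for either an enum member or its string value."""
    if isinstance(model_name, ParsingWeights):
        return model_name
    try:
        # Enum lookup by value raises ValueError for unknown strings.
        return ParsingWeights(model_name)
    except ValueError:
        available = [member.value for member in ParsingWeights]
        raise ValueError(f"Unknown face parsing model: '{model_name}'. Available: {available}") from None


from_string = resolve_model_name('parsing_resnet34')
from_enum = resolve_model_name(ParsingWeights.RESNET18)
```

Listing the valid values in the error message matches the style of the `Unsupported method` errors raised by the other factory functions in this commit.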
@@ -3,7 +3,6 @@
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
from abc import ABC, abstractmethod
|
||||
from typing import Tuple
|
||||
|
||||
import numpy as np
|
||||
|
||||
@@ -53,7 +52,7 @@ class BaseFaceParser(ABC):
|
||||
raise NotImplementedError('Subclasses must implement the preprocess method.')
|
||||
|
||||
@abstractmethod
|
||||
def postprocess(self, outputs: np.ndarray, original_size: Tuple[int, int]) -> np.ndarray:
|
||||
def postprocess(self, outputs: np.ndarray, original_size: tuple[int, int]) -> np.ndarray:
|
||||
"""
|
||||
Postprocess raw model outputs into a segmentation mask.
|
||||
|
||||
@@ -89,7 +88,7 @@ class BaseFaceParser(ABC):
|
||||
Example:
|
||||
>>> parser = create_face_parser()
|
||||
>>> mask = parser.parse(face_crop)
|
||||
>>> print(f"Mask shape: {mask.shape}, unique classes: {np.unique(mask)}")
|
||||
>>> print(f'Mask shape: {mask.shape}, unique classes: {np.unique(mask)}')
|
||||
"""
|
||||
raise NotImplementedError('Subclasses must implement the parse method.')
|
||||
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
from typing import Tuple
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
@@ -54,17 +53,17 @@ class BiSeNet(BaseFaceParser):
|
||||
>>> # Detect faces and parse each face
|
||||
>>> faces = detector.detect(image)
|
||||
>>> for face in faces:
|
||||
... bbox = face['bbox']
|
||||
... bbox = face.bbox
|
||||
... x1, y1, x2, y2 = map(int, bbox[:4])
|
||||
... face_crop = image[y1:y2, x1:x2]
|
||||
... mask = parser.parse(face_crop)
|
||||
... print(f"Mask shape: {mask.shape}, unique classes: {np.unique(mask)}")
|
||||
... print(f'Mask shape: {mask.shape}, unique classes: {np.unique(mask)}')
|
||||
"""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
model_name: ParsingWeights = ParsingWeights.RESNET18,
|
||||
input_size: Tuple[int, int] = (512, 512),
|
||||
input_size: tuple[int, int] = (512, 512),
|
||||
) -> None:
|
||||
Logger.info(f'Initializing BiSeNet with model={model_name}, input_size={input_size}')
|
||||
|
||||
@@ -127,7 +126,7 @@ class BiSeNet(BaseFaceParser):
|
||||
|
||||
return image
|
||||
|
||||
def postprocess(self, outputs: np.ndarray, original_size: Tuple[int, int]) -> np.ndarray:
|
||||
def postprocess(self, outputs: np.ndarray, original_size: tuple[int, int]) -> np.ndarray:
|
||||
"""
|
||||
Postprocess model output to segmentation mask.
|
||||
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
from typing import Optional
|
||||
from __future__ import annotations
|
||||
|
||||
import numpy as np
|
||||
|
||||
@@ -11,11 +11,11 @@ from .blur import BlurFace
|
||||
|
||||
def anonymize_faces(
|
||||
image: np.ndarray,
|
||||
detector: Optional[object] = None,
|
||||
detector: object | None = None,
|
||||
method: str = 'pixelate',
|
||||
blur_strength: float = 3.0,
|
||||
pixel_blocks: int = 10,
|
||||
conf_thresh: float = 0.5,
|
||||
confidence_threshold: float = 0.5,
|
||||
**kwargs,
|
||||
) -> np.ndarray:
|
||||
"""One-line face anonymization with automatic detection.
|
||||
@@ -26,7 +26,7 @@ def anonymize_faces(
|
||||
method (str): Blur method name. Defaults to 'pixelate'.
|
||||
blur_strength (float): Blur intensity. Defaults to 3.0.
|
||||
pixel_blocks (int): Block count for pixelate. Defaults to 10.
|
||||
conf_thresh (float): Detection confidence threshold. Defaults to 0.5.
|
||||
confidence_threshold (float): Detection confidence threshold. Defaults to 0.5.
|
||||
**kwargs: Additional detector arguments.
|
||||
|
||||
Returns:
|
||||
@@ -40,7 +40,7 @@ def anonymize_faces(
|
||||
try:
|
||||
from uniface import RetinaFace
|
||||
|
||||
detector = RetinaFace(conf_thresh=conf_thresh, **kwargs)
|
||||
detector = RetinaFace(confidence_threshold=confidence_threshold, **kwargs)
|
||||
except ImportError as err:
|
||||
raise ImportError('Could not import RetinaFace. Please ensure UniFace is properly installed.') from err
|
||||
|
||||
|
||||
@@ -2,12 +2,17 @@
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
from typing import Dict, List, Tuple, Union
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import TYPE_CHECKING, ClassVar
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
__all__ = ['BlurFace']
|
||||
if TYPE_CHECKING:
|
||||
pass
|
||||
|
||||
__all__ = ['BlurFace', 'EllipticalBlur']
|
||||
|
||||
|
||||
def _gaussian_blur(region: np.ndarray, strength: float = 3.0) -> np.ndarray:
|
||||
@@ -32,7 +37,7 @@ def _pixelate_blur(region: np.ndarray, blocks: int = 10) -> np.ndarray:
|
||||
return cv2.resize(temp, (w, h), interpolation=cv2.INTER_NEAREST)
|
||||
|
||||
|
||||
def _blackout_blur(region: np.ndarray, color: Tuple[int, int, int] = (0, 0, 0)) -> np.ndarray:
|
||||
def _blackout_blur(region: np.ndarray, color: tuple[int, int, int] = (0, 0, 0)) -> np.ndarray:
|
||||
"""Replace region with solid color."""
|
||||
return np.full_like(region, color)
|
||||
|
||||
@@ -55,7 +60,7 @@ class EllipticalBlur:
|
||||
def __call__(
|
||||
self,
|
||||
image: np.ndarray,
|
||||
bboxes: List[Union[Tuple, List]],
|
||||
bboxes: list[tuple | list],
|
||||
inplace: bool = False,
|
||||
) -> np.ndarray:
|
||||
if not inplace:
|
||||
@@ -98,14 +103,14 @@ class BlurFace:
|
||||
>>> anonymized = blurrer.anonymize(image, faces)
|
||||
"""
|
||||
|
||||
VALID_METHODS = {'gaussian', 'pixelate', 'blackout', 'elliptical', 'median'}
|
||||
VALID_METHODS: ClassVar[set[str]] = {'gaussian', 'pixelate', 'blackout', 'elliptical', 'median'}
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
method: str = 'pixelate',
|
||||
blur_strength: float = 3.0,
|
||||
pixel_blocks: int = 15,
|
||||
color: Tuple[int, int, int] = (0, 0, 0),
|
||||
color: tuple[int, int, int] = (0, 0, 0),
|
||||
margin: int = 20,
|
||||
):
|
||||
self.method = method.lower()
|
||||
@@ -121,6 +126,7 @@ class BlurFace:
|
||||
self._elliptical = EllipticalBlur(blur_strength, margin)
|
||||
|
||||
def _blur_region(self, region: np.ndarray) -> np.ndarray:
|
||||
"""Apply blur to a single region based on the configured method."""
|
||||
if self.method == 'gaussian':
|
||||
return _gaussian_blur(region, self._blur_strength)
|
||||
elif self.method == 'median':
|
||||
@@ -129,11 +135,12 @@ class BlurFace:
|
||||
return _pixelate_blur(region, self._pixel_blocks)
|
||||
elif self.method == 'blackout':
|
||||
return _blackout_blur(region, self._color)
|
||||
return region # Fallback (should not reach here)
|
||||
|
||||
def anonymize(
|
||||
self,
|
||||
image: np.ndarray,
|
||||
faces: List[Dict],
|
||||
faces: list,
|
||||
inplace: bool = False,
|
||||
) -> np.ndarray:
|
||||
"""Anonymize faces in an image.
|
||||
@@ -149,13 +156,13 @@ class BlurFace:
|
||||
if not faces:
|
||||
return image if inplace else image.copy()
|
||||
|
||||
bboxes = [face['bbox'] for face in faces]
|
||||
bboxes = [face.bbox for face in faces]
|
||||
return self.blur_regions(image, bboxes, inplace)
|
||||
|
||||
def blur_regions(
|
||||
self,
|
||||
image: np.ndarray,
|
||||
bboxes: List[Union[Tuple, List]],
|
||||
bboxes: list[tuple | list],
|
||||
inplace: bool = False,
|
||||
) -> np.ndarray:
|
||||
"""Blur specific rectangular regions in an image.
|
||||
|
||||
@@ -34,10 +34,7 @@ def create_recognizer(method: str = 'arcface', **kwargs) -> BaseRecognizer:
|
||||
|
||||
>>> # Create a specific MobileFace recognizer
|
||||
>>> from uniface.constants import MobileFaceWeights
|
||||
>>> recognizer = create_recognizer(
|
||||
... 'mobileface',
|
||||
... model_name=MobileFaceWeights.MNET_V2
|
||||
... )
|
||||
>>> recognizer = create_recognizer('mobileface', model_name=MobileFaceWeights.MNET_V2)
|
||||
|
||||
>>> # Create a SphereFace recognizer
|
||||
>>> recognizer = create_recognizer('sphereface')
|
||||
@@ -55,4 +52,4 @@ def create_recognizer(method: str = 'arcface', **kwargs) -> BaseRecognizer:
|
||||
raise ValueError(f"Unsupported method: '{method}'. Available: {available}")
|
||||
|
||||
|
||||
__all__ = ['create_recognizer', 'BaseRecognizer', 'ArcFace', 'MobileFace', 'SphereFace']
|
||||
__all__ = ['ArcFace', 'BaseRecognizer', 'MobileFace', 'SphereFace', 'create_recognizer']
|
||||
|
||||
@@ -2,9 +2,10 @@
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from abc import ABC, abstractmethod
|
||||
from dataclasses import dataclass
|
||||
from typing import List, Tuple, Union
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
@@ -13,16 +14,22 @@ from uniface.face_utils import face_alignment
|
||||
from uniface.log import Logger
|
||||
from uniface.onnx_utils import create_onnx_session
|
||||
|
||||
__all__ = ['BaseRecognizer', 'PreprocessConfig']
|
||||
|
||||
|
||||
@dataclass
|
||||
class PreprocessConfig:
|
||||
"""
|
||||
Configuration for preprocessing images before feeding them into the model.
|
||||
"""Configuration for preprocessing images before feeding them into the model.
|
||||
|
||||
Attributes:
|
||||
input_mean: Mean value(s) for normalization.
|
||||
input_std: Standard deviation value(s) for normalization.
|
||||
input_size: Target image size as (height, width).
|
||||
"""
|
||||
|
||||
input_mean: Union[float, List[float]] = 127.5
|
||||
input_std: Union[float, List[float]] = 127.5
|
||||
input_size: Tuple[int, int] = (112, 112)
|
||||
input_mean: float | list[float] = 127.5
|
||||
input_std: float | list[float] = 127.5
|
||||
input_size: tuple[int, int] = (112, 112)
|
||||
|
||||
|
||||
class BaseRecognizer(ABC):
|
||||
@@ -94,7 +101,7 @@ class BaseRecognizer(ABC):
|
||||
"""
|
||||
resized_img = cv2.resize(face_img, self.input_size)
|
||||
|
||||
if isinstance(self.input_std, (list, tuple)):
|
||||
if isinstance(self.input_std, list | tuple):
|
||||
# Per-channel normalization
|
||||
rgb_img = cv2.cvtColor(resized_img, cv2.COLOR_BGR2RGB).astype(np.float32)
|
||||
normalized_img = (rgb_img - np.array(self.input_mean, dtype=np.float32)) / np.array(
|
||||
@@ -116,13 +123,14 @@ class BaseRecognizer(ABC):
|
||||
|
||||
return blob
|
||||
|
||||
def get_embedding(self, image: np.ndarray, landmarks: np.ndarray = None) -> np.ndarray:
|
||||
"""
|
||||
Extracts face embedding from an image.
|
||||
def get_embedding(self, image: np.ndarray, landmarks: np.ndarray | None = None) -> np.ndarray:
|
||||
"""Extract face embedding from an image.
|
||||
|
||||
Args:
|
||||
image: Input face image (BGR format). If already aligned (112x112), landmarks can be None.
|
||||
landmarks: Facial landmarks (5 points for alignment). Optional if image is already aligned.
|
||||
image: Input face image in BGR format. If already aligned (112x112),
|
||||
landmarks can be None.
|
||||
landmarks: Facial landmarks (5 points for alignment). Optional if
|
||||
image is already aligned.
|
||||
|
||||
Returns:
|
||||
Face embedding vector (typically 512-dimensional).
|
||||
@@ -141,15 +149,14 @@ class BaseRecognizer(ABC):
|
||||
return embedding
|
||||
|
||||
def get_normalized_embedding(self, image: np.ndarray, landmarks: np.ndarray) -> np.ndarray:
|
||||
"""
|
||||
Extracts a l2 normalized face embedding vector from an image.
|
||||
"""Extract an L2-normalized face embedding vector from an image.
|
||||
|
||||
Args:
|
||||
image: Input face image (BGR format).
|
||||
image: Input face image in BGR format.
|
||||
landmarks: Facial landmarks (5 points for alignment).
|
||||
|
||||
Returns:
|
||||
Normalized face embedding vector (typically 512-dimensional).
|
||||
L2-normalized face embedding vector (typically 512-dimensional).
|
||||
"""
|
||||
embedding = self.get_embedding(image, landmarks)
|
||||
norm = np.linalg.norm(embedding)
|
||||
|
||||
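Note on L2 normalization: the hunk ends right after the norm is computed, so the division step is not shown. The sketch below is a dependency-free illustration of the operation. The function name `l2_normalize` and the `eps` guard against a zero vector are assumptions and may differ from what the repository does.

```python
import math


def l2_normalize(vector: list[float], eps: float = 1e-12) -> list[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vector))
    # The eps floor (an assumption of this sketch) avoids dividing by zero
    # when the input is the zero vector.
    scale = max(norm, eps)
    return [v / scale for v in vector]


unit = l2_normalize([3.0, 4.0])
zero = l2_normalize([0.0, 0.0])
```

With unit-length embeddings, the dot product of two vectors equals their cosine similarity, which is the usual reason for normalizing face embeddings before comparison.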
@@ -2,7 +2,7 @@
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
from typing import Optional
|
||||
from __future__ import annotations
|
||||
|
||||
from uniface.constants import ArcFaceWeights, MobileFaceWeights, SphereFaceWeights
|
||||
from uniface.model_store import verify_model_weights
|
||||
@@ -34,7 +34,7 @@ class ArcFace(BaseRecognizer):
|
||||
def __init__(
|
||||
self,
|
||||
model_name: ArcFaceWeights = ArcFaceWeights.MNET,
|
||||
preprocessing: Optional[PreprocessConfig] = None,
|
||||
preprocessing: PreprocessConfig | None = None,
|
||||
) -> None:
|
||||
if preprocessing is None:
|
||||
preprocessing = PreprocessConfig(input_mean=127.5, input_std=127.5, input_size=(112, 112))
|
||||
@@ -64,7 +64,7 @@ class MobileFace(BaseRecognizer):
|
||||
def __init__(
|
||||
self,
|
||||
model_name: MobileFaceWeights = MobileFaceWeights.MNET_V2,
|
||||
preprocessing: Optional[PreprocessConfig] = None,
|
||||
preprocessing: PreprocessConfig | None = None,
|
||||
) -> None:
|
||||
if preprocessing is None:
|
||||
preprocessing = PreprocessConfig(input_mean=127.5, input_std=127.5, input_size=(112, 112))
|
||||
@@ -94,7 +94,7 @@ class SphereFace(BaseRecognizer):
|
||||
def __init__(
|
||||
self,
|
||||
model_name: SphereFaceWeights = SphereFaceWeights.SPHERE20,
|
||||
preprocessing: Optional[PreprocessConfig] = None,
|
||||
preprocessing: PreprocessConfig | None = None,
|
||||
) -> None:
|
||||
if preprocessing is None:
|
||||
preprocessing = PreprocessConfig(input_mean=127.5, input_std=127.5, input_size=(112, 112))
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
from typing import Optional
|
||||
from __future__ import annotations
|
||||
|
||||
from uniface.constants import MiniFASNetWeights
|
||||
|
||||
@@ -19,46 +19,27 @@ __all__ = [
|
||||
|
||||
def create_spoofer(
|
||||
model_name: MiniFASNetWeights = MiniFASNetWeights.V2,
|
||||
scale: Optional[float] = None,
|
||||
scale: float | None = None,
|
||||
) -> MiniFASNet:
|
||||
"""
|
||||
Factory function to create a face anti-spoofing model.
|
||||
"""Factory function to create a face anti-spoofing model.
|
||||
|
||||
This is a convenience function that creates a MiniFASNet instance
|
||||
with the specified model variant and optional custom scale.
|
||||
|
||||
Args:
|
||||
model_name (MiniFASNetWeights): The model variant to use.
|
||||
Options:
|
||||
- MiniFASNetWeights.V2: Improved version (default), uses scale=2.7
|
||||
- MiniFASNetWeights.V1SE: Squeeze-and-excitation version, uses scale=4.0
|
||||
Defaults to MiniFASNetWeights.V2.
|
||||
scale (Optional[float]): Custom crop scale factor for face region.
|
||||
If None, uses the default scale for the selected model variant.
|
||||
model_name: The model variant to use. Options:
|
||||
- MiniFASNetWeights.V2: Improved version (default), uses scale=2.7
|
||||
- MiniFASNetWeights.V1SE: Squeeze-and-excitation version, uses scale=4.0
|
||||
scale: Custom crop scale factor for face region. If None, uses the
|
||||
default scale for the selected model variant.
|
||||
|
||||
Returns:
|
||||
MiniFASNet: An initialized face anti-spoofing model.
|
||||
An initialized face anti-spoofing model.
|
||||
|
||||
Example:
|
||||
>>> from uniface.spoofing import create_spoofer, MiniFASNetWeights
|
||||
>>> from uniface import RetinaFace
|
||||
>>>
|
||||
>>> # Create with default settings (V2 model)
|
||||
>>> spoofer = create_spoofer()
|
||||
>>>
|
||||
>>> # Create with V1SE model
|
||||
>>> spoofer = create_spoofer(model_name=MiniFASNetWeights.V1SE)
|
||||
>>>
|
||||
>>> # Create with custom scale
|
||||
>>> spoofer = create_spoofer(scale=3.0)
|
||||
>>>
|
||||
>>> # Use with face detector
|
||||
>>> detector = RetinaFace()
|
||||
>>> faces = detector.detect(image)
|
||||
>>> for face in faces:
|
||||
... label_idx, score = spoofer.predict(image, face['bbox'])
|
||||
... # label_idx: 0 = Fake, 1 = Real
|
||||
... label = 'Real' if label_idx == 1 else 'Fake'
|
||||
... print(f'{label}: {score:.2%}')
|
||||
>>> label_idx, score = spoofer.predict(image, face.bbox)
|
||||
>>> # label_idx: 0 = Fake, 1 = Real
|
||||
"""
|
||||
return MiniFASNet(model_name=model_name, scale=scale)
|
||||
|
||||
@@ -3,7 +3,6 @@
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
from abc import ABC, abstractmethod
|
||||
from typing import List, Tuple, Union
|
||||
|
||||
import numpy as np
|
||||
|
||||
@@ -36,7 +35,7 @@ class BaseSpoofer(ABC):
|
||||
raise NotImplementedError('Subclasses must implement the _initialize_model method.')
|
||||
|
||||
@abstractmethod
|
||||
def preprocess(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> np.ndarray:
|
||||
def preprocess(self, image: np.ndarray, bbox: list | np.ndarray) -> np.ndarray:
|
||||
"""
|
||||
Preprocess the input image for model inference.
|
||||
|
||||
@@ -55,7 +54,7 @@ class BaseSpoofer(ABC):
|
||||
raise NotImplementedError('Subclasses must implement the preprocess method.')
|
||||
|
||||
@abstractmethod
|
||||
def postprocess(self, outputs: np.ndarray) -> Tuple[int, float]:
|
||||
def postprocess(self, outputs: np.ndarray) -> tuple[int, float]:
|
||||
"""
|
||||
Postprocess raw model outputs into prediction result.
|
||||
|
||||
@@ -73,7 +72,7 @@ class BaseSpoofer(ABC):
|
||||
raise NotImplementedError('Subclasses must implement the postprocess method.')
|
||||
|
||||
@abstractmethod
|
||||
def predict(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> Tuple[int, float]:
|
||||
def predict(self, image: np.ndarray, bbox: list | np.ndarray) -> tuple[int, float]:
|
||||
"""
|
||||
Perform end-to-end anti-spoofing prediction on a face.
|
||||
|
||||
@@ -95,13 +94,13 @@ class BaseSpoofer(ABC):
|
||||
>>> detector = RetinaFace()
|
||||
>>> faces = detector.detect(image)
|
||||
>>> for face in faces:
|
||||
... label_idx, score = spoofer.predict(image, face['bbox'])
|
||||
... label_idx, score = spoofer.predict(image, face.bbox)
|
||||
... label = 'Real' if label_idx == 1 else 'Fake'
|
||||
... print(f'{label}: {score:.2%}')
|
||||
"""
|
||||
raise NotImplementedError('Subclasses must implement the predict method.')
|
||||
|
||||
def __call__(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> Tuple[int, float]:
|
||||
def __call__(self, image: np.ndarray, bbox: list | np.ndarray) -> tuple[int, float]:
|
||||
"""
|
||||
Provides a convenient, callable shortcut for the `predict` method.
|
||||
|
||||
|
||||
@@ -2,7 +2,6 @@
|
||||
# Author: Yakhyokhuja Valikhujaev
|
||||
# GitHub: https://github.com/yakhyo
|
||||
|
||||
from typing import List, Optional, Tuple, Union
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
@@ -59,7 +58,7 @@ class MiniFASNet(BaseSpoofer):
         >>> # Detect faces and check if they are real
         >>> faces = detector.detect(image)
         >>> for face in faces:
-        ...     label_idx, score = spoofer.predict(image, face['bbox'])
+        ...     label_idx, score = spoofer.predict(image, face.bbox)
         ...     # label_idx: 0 = Fake, 1 = Real
         ...     label = 'Real' if label_idx == 1 else 'Fake'
         ...     print(f'{label}: {score:.2%}')
@@ -68,7 +67,7 @@ class MiniFASNet(BaseSpoofer):
     def __init__(
         self,
         model_name: MiniFASNetWeights = MiniFASNetWeights.V2,
-        scale: Optional[float] = None,
+        scale: float | None = None,
     ) -> None:
         Logger.info(f'Initializing MiniFASNet with model={model_name.name}')

@@ -104,12 +103,12 @@ class MiniFASNet(BaseSpoofer):
             Logger.error(f"Failed to load MiniFASNet model from '{self.model_path}'", exc_info=True)
             raise RuntimeError(f'Failed to initialize MiniFASNet model: {e}') from e

-    def _xyxy_to_xywh(self, bbox: Union[List, np.ndarray]) -> List[int]:
+    def _xyxy_to_xywh(self, bbox: list | np.ndarray) -> list[int]:
         """Convert bounding box from [x1, y1, x2, y2] to [x, y, w, h] format."""
         x1, y1, x2, y2 = bbox[:4]
         return [int(x1), int(y1), int(x2 - x1), int(y2 - y1)]

-    def _crop_face(self, image: np.ndarray, bbox_xywh: List[int]) -> np.ndarray:
+    def _crop_face(self, image: np.ndarray, bbox_xywh: list[int]) -> np.ndarray:
         """
         Crop and resize face region from image using scale factor.

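The `_xyxy_to_xywh` helper in this hunk can be tried in isolation. The sketch below copies the same arithmetic into a free function (an illustrative change, since the original is a method) and uses a plain list in place of a NumPy array.

```python
def xyxy_to_xywh(bbox):
    """Convert [x1, y1, x2, y2] to [x, y, w, h]; extra entries (e.g. a score) are ignored."""
    x1, y1, x2, y2 = bbox[:4]
    # int() truncates toward zero, so fractional widths and heights are cut, not rounded.
    return [int(x1), int(y1), int(x2 - x1), int(y2 - y1)]


print(xyxy_to_xywh([10.0, 20.0, 110.5, 220.0, 0.99]))
```

Slicing with `bbox[:4]` is what lets the same helper accept detector outputs that append a confidence score after the four coordinates.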
@@ -147,7 +146,7 @@ class MiniFASNet(BaseSpoofer):

         return resized

-    def preprocess(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> np.ndarray:
+    def preprocess(self, image: np.ndarray, bbox: list | np.ndarray) -> np.ndarray:
         """
         Preprocess the input image for model inference.

@@ -181,7 +180,7 @@ class MiniFASNet(BaseSpoofer):
         e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
         return e_x / e_x.sum(axis=1, keepdims=True)

-    def postprocess(self, outputs: np.ndarray) -> Tuple[int, float]:
+    def postprocess(self, outputs: np.ndarray) -> tuple[int, float]:
         """
         Postprocess raw model outputs into prediction result.

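The softmax context lines in this hunk subtract the row maximum before exponentiating, which keeps `exp` from overflowing. The sketch below shows the same idea in pure Python for a single row of logits. The `postprocess` function here is an assumption about the elided body (argmax of the softmax, with that probability as the score), not the library's confirmed implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]  # shifting by the max avoids overflow
    total = sum(exps)
    return [e / total for e in exps]


def postprocess(logits):
    """Assumed behaviour: pick the most probable class and report its probability."""
    probs = softmax(logits)
    label_idx = max(range(len(probs)), key=probs.__getitem__)
    return label_idx, probs[label_idx]


# math.exp(1000.0) alone would raise OverflowError; the shifted form does not.
label_idx, score = postprocess([1000.0, 1002.0])
label = 'Real' if label_idx == 1 else 'Fake'
print(f'{label}: {score:.2%}')
```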
@@ -202,7 +201,7 @@ class MiniFASNet(BaseSpoofer):

         return label_idx, score

-    def predict(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> Tuple[int, float]:
+    def predict(self, image: np.ndarray, bbox: list | np.ndarray) -> tuple[int, float]:
         """
         Perform end-to-end anti-spoofing prediction on a face.

@@ -2,11 +2,26 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo

-from typing import List, Tuple, Union
+"""Visualization utilities for UniFace.
+
+This module provides functions for drawing detection results, gaze directions,
+and face parsing segmentation maps on images.
+"""
+
+from __future__ import annotations

 import cv2
 import numpy as np

+__all__ = [
+    'FACE_PARSING_COLORS',
+    'FACE_PARSING_LABELS',
+    'draw_detections',
+    'draw_fancy_bbox',
+    'draw_gaze',
+    'vis_parsing_maps',
+]
+
 # Face parsing component names (19 classes)
 FACE_PARSING_LABELS = [
     'background',
@@ -57,23 +72,25 @@ FACE_PARSING_COLORS = [
 def draw_detections(
     *,
     image: np.ndarray,
-    bboxes: Union[List[np.ndarray], List[List[float]]],
-    scores: Union[np.ndarray, List[float]],
-    landmarks: Union[List[np.ndarray], List[List[List[float]]]],
+    bboxes: list[np.ndarray] | list[list[float]],
+    scores: np.ndarray | list[float],
+    landmarks: list[np.ndarray] | list[list[list[float]]],
     vis_threshold: float = 0.6,
     draw_score: bool = False,
     fancy_bbox: bool = True,
-):
-    """
-    Draws bounding boxes, landmarks, and optional scores on an image.
+) -> None:
+    """Draw bounding boxes, landmarks, and optional scores on an image.
+
+    Modifies the image in-place.

     Args:
-        image: Input image to draw on.
-        bboxes: List of bounding boxes [x1, y1, x2, y2].
+        image: Input image to draw on (modified in-place).
+        bboxes: List of bounding boxes as [x1, y1, x2, y2].
         scores: List of confidence scores.
         landmarks: List of landmark sets with shape (5, 2).
         vis_threshold: Confidence threshold for filtering. Defaults to 0.6.
         draw_score: Whether to draw confidence scores. Defaults to False.
+        fancy_bbox: Use corner-style bounding boxes. Defaults to True.
     """
     colors = [(0, 0, 255), (0, 255, 255), (255, 0, 255), (0, 255, 0), (255, 0, 0)]

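The bare `*` at the start of the `draw_detections` parameter list makes every argument keyword-only, and `vis_threshold` is documented as a confidence filter. The sketch below illustrates both with a hypothetical `select_detections` helper. It assumes a strict `score > vis_threshold` comparison, which the diff does not show.

```python
def select_detections(*, bboxes, scores, vis_threshold=0.6):
    """Hypothetical helper: keep boxes whose score exceeds vis_threshold (assumed strict)."""
    return [bbox for bbox, score in zip(bboxes, scores) if score > vis_threshold]


boxes = [[0, 0, 10, 10], [5, 5, 50, 50]]
kept = select_detections(bboxes=boxes, scores=[0.3, 0.9])
print(kept)

try:
    select_detections(boxes, [0.3, 0.9])  # positional use is rejected
except TypeError as exc:
    print(f'TypeError: {exc}')
```

Keyword-only parameters guard against silently swapping `bboxes`, `scores`, and `landmarks`, which are all list-like and easy to pass in the wrong order.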
@@ -134,19 +151,18 @@ def draw_detections(
 def draw_fancy_bbox(
     image: np.ndarray,
     bbox: np.ndarray,
-    color: Tuple[int, int, int] = (0, 255, 0),
+    color: tuple[int, int, int] = (0, 255, 0),
     thickness: int = 3,
     proportion: float = 0.2,
-):
-    """
-    Draws a bounding box with fancy corners on an image.
+) -> None:
+    """Draw a bounding box with fancy corners on an image.

     Args:
-        image: Input image to draw on.
+        image: Input image to draw on (modified in-place).
         bbox: Bounding box coordinates [x1, y1, x2, y2].
-        color: Color of the bounding box. Defaults to green.
-        thickness: Thickness of the bounding box lines. Defaults to 3.
-        proportion: Proportion of the corner length to the width/height of the bounding box. Defaults to 0.2.
+        color: Color of the bounding box in BGR. Defaults to green.
+        thickness: Thickness of the corner lines. Defaults to 3.
+        proportion: Proportion of corner length to box dimensions. Defaults to 0.2.
     """
     x1, y1, x2, y2 = map(int, bbox)
     width = x2 - x1
@@ -177,15 +193,14 @@ def draw_fancy_bbox(
 def draw_gaze(
     image: np.ndarray,
     bbox: np.ndarray,
-    pitch: np.ndarray,
-    yaw: np.ndarray,
+    pitch: np.ndarray | float,
+    yaw: np.ndarray | float,
     *,
     draw_bbox: bool = True,
     fancy_bbox: bool = True,
     draw_angles: bool = True,
-):
-    """
-    Draws gaze direction with optional bounding box on an image.
+) -> None:
+    """Draw gaze direction with optional bounding box on an image.

     Args:
         image: Input image to draw on (modified in-place).
@@ -194,7 +209,7 @@ def draw_gaze(
         yaw: Horizontal gaze angle in radians.
         draw_bbox: Whether to draw the bounding box. Defaults to True.
         fancy_bbox: Use fancy corner-style bbox. Defaults to True.
-        draw_angles: Whether to display pitch/yaw values as text. Defaults to False.
+        draw_angles: Whether to display pitch/yaw values as text. Defaults to True.
     """
     x_min, y_min, x_max, y_max = map(int, bbox[:4])

@@ -275,29 +290,25 @@ def vis_parsing_maps(
     save_image: bool = False,
     save_path: str = 'result.png',
 ) -> np.ndarray:
-    """
-    Visualizes face parsing segmentation mask by overlaying colored regions on the image.
+    """Visualize face parsing segmentation mask by overlaying colored regions.

     Args:
         image: Input face image in RGB format with shape (H, W, 3).
         segmentation_mask: Segmentation mask with shape (H, W) where each pixel
-            value represents a facial component class (0-18).
+            value represents a facial component class (0-18).
         save_image: Whether to save the visualization to disk. Defaults to False.
         save_path: Path to save the visualization if save_image is True.

     Returns:
-        np.ndarray: Blended image with segmentation overlay in BGR format.
+        Blended image with segmentation overlay in BGR format.

     Example:
         >>> import cv2
         >>> from uniface.parsing import BiSeNet
         >>> from uniface.visualization import vis_parsing_maps
-        >>>
         >>> parser = BiSeNet()
         >>> face_image = cv2.imread('face.jpg')
         >>> mask = parser.parse(face_image)
-        >>>
-        >>> # Visualize
         >>> face_rgb = cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)
         >>> result = vis_parsing_maps(face_rgb, mask)
         >>> cv2.imwrite('parsed_face.jpg', result)