Mirror of https://github.com/yakhyo/uniface.git
Synced 2025-12-30 09:02:25 +00:00
refactor: Standardize naming conventions (#47)
* refactor: Standardize naming conventions
* chore: Update the version and re-run experiments
* chore: Improve code quality tooling and documentation
  - Add pre-commit job to CI workflow for automated linting on PRs
  - Update uniface/__init__.py with copyright header, module docstring, and logically grouped exports
  - Revise CONTRIBUTING.md to reflect that pre-commit handles all formatting
  - Remove redundant ruff check from CI (now handled by pre-commit)
  - Update build job Python version to 3.11 (matches requires-python)
committed by GitHub
parent 64ad0d2f53
commit 50226041c9
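In practical terms, the standardization means two things for callers: detector keyword arguments are spelled out (`conf_thresh` becomes `confidence_threshold`, `nms_thresh` becomes `nms_threshold`), and detection results are accessed as attributes rather than dict keys. A minimal before/after sketch assembled from the changes below (the import and call pattern are taken from the repo's scripts; the image path is illustrative):

```python
import cv2

from uniface import RetinaFace

image = cv2.imread('photo.jpg')  # illustrative path; any BGR image works

# uniface 1.6.0 (before):
#   detector = RetinaFace(conf_thresh=0.5, nms_thresh=0.4)
#   bbox = faces[0]['bbox']

# uniface 2.0.0 (after): spelled-out keywords, attribute access on results
detector = RetinaFace(confidence_threshold=0.5, nms_threshold=0.4)
faces = detector.detect(image)
for face in faces:
    print(face.bbox, face.confidence, face.landmarks.shape)
```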
.github/workflows/ci.yml (16 changes, vendored)

@@ -15,9 +15,20 @@ concurrency:
   cancel-in-progress: true
 
 jobs:
+  lint:
+    runs-on: ubuntu-latest
+    timeout-minutes: 5
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+      - uses: pre-commit/action@v3.0.1
+
   test:
     runs-on: ${{ matrix.os }}
     timeout-minutes: 15
+    needs: lint
 
     strategy:
       fail-fast: false
@@ -44,9 +55,6 @@ jobs:
         run: |
           python -c "import onnxruntime as ort; print('Available providers:', ort.get_available_providers())"
 
-      - name: Lint with ruff
-        run: ruff check .
-
       - name: Run tests
         run: pytest -v --tb=short
 
@@ -65,7 +73,7 @@ jobs:
       - name: Set up Python
         uses: actions/setup-python@v5
         with:
-          python-version: "3.10"
+          python-version: "3.11"
           cache: "pip"
 
       - name: Install build tools
.github/workflows/publish.yml (1 change, vendored)

@@ -117,4 +117,3 @@ jobs:
         with:
           files: dist/*
           generate_release_notes: true
-
.pre-commit-config.yaml (new file, 40 lines)

@@ -0,0 +1,40 @@
+# Pre-commit configuration for UniFace
+# See https://pre-commit.com for more information
+# See https://pre-commit.com/hooks.html for more hooks
+
+repos:
+  # General file checks
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.6.0
+    hooks:
+      - id: trailing-whitespace
+      - id: end-of-file-fixer
+      - id: check-yaml
+      - id: check-toml
+      - id: check-added-large-files
+        args: ['--maxkb=1000']
+      - id: check-merge-conflict
+      - id: debug-statements
+      - id: check-ast
+
+  # Ruff - Fast Python linter and formatter
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    rev: v0.8.4
+    hooks:
+      - id: ruff
+        args: [--fix, --unsafe-fixes, --exit-non-zero-on-fix]
+      - id: ruff-format
+
+  # Security checks
+  - repo: https://github.com/PyCQA/bandit
+    rev: 1.7.10
+    hooks:
+      - id: bandit
+        args: [-c, pyproject.toml]
+        additional_dependencies: ['bandit[toml]']
+        exclude: ^tests/
+
+# Configuration
+ci:
+  autofix_commit_msg: 'style: auto-fix by pre-commit hooks'
+  autoupdate_commit_msg: 'chore: update pre-commit hooks'
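To make the general hooks concrete: `trailing-whitespace` and `end-of-file-fixer` repair files in place, while `debug-statements` and `check-ast` fail the commit outright. A hypothetical snippet the former of those two would guard against (illustrative, not from the repository):

```python
# Hypothetical module that the debug-statements hook would reject at commit time.


def normalize(image):
    import pdb

    pdb.set_trace()  # leftover debugger call: debug-statements fails the commit here
    return image / 255.0
```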
CONTRIBUTING.md (183 changes)

@@ -16,33 +16,9 @@ Thank you for considering contributing to UniFace! We welcome contributions of a
 2. Create a new branch for your feature
 3. Write clear, documented code with type hints
 4. Add tests for new functionality
-5. Ensure all tests pass
+5. Ensure all tests pass and pre-commit hooks are satisfied
 6. Submit a pull request with a clear description
 
-### Code Style
-
-This project uses [Ruff](https://docs.astral.sh/ruff/) for linting and formatting.
-
-```bash
-# Check for linting errors
-ruff check .
-
-# Auto-fix linting errors
-ruff check . --fix
-
-# Format code
-ruff format .
-```
-
-**Guidelines:**
-- Follow PEP8 guidelines
-- Use type hints (Python 3.10+)
-- Write docstrings for public APIs
-- Line length: 120 characters
-- Keep code simple and readable
-
-All PRs must pass `ruff check .` before merging.
-
 ## Development Setup
 
 ```bash
@@ -51,31 +27,164 @@ cd uniface
 pip install -e ".[dev]"
 ```
 
+### Setting Up Pre-commit Hooks
+
+We use [pre-commit](https://pre-commit.com/) to ensure code quality and consistency. Install and configure it:
+
+```bash
+# Install pre-commit
+pip install pre-commit
+
+# Install the git hooks
+pre-commit install
+
+# (Optional) Run against all files
+pre-commit run --all-files
+```
+
+Once installed, pre-commit will automatically run on every commit to check:
+
+- Code formatting and linting (Ruff)
+- Security issues (Bandit)
+- General file hygiene (trailing whitespace, YAML/TOML validity, etc.)
+
+**Note:** All PRs are automatically checked by CI. The merge button will only be available after all checks pass.
+
+## Code Style
+
+This project uses [Ruff](https://docs.astral.sh/ruff/) for linting and formatting, following modern Python best practices. Pre-commit handles all formatting automatically.
+
+### Style Guidelines
+
+#### General Rules
+
+- **Line length:** 120 characters maximum
+- **Python version:** 3.11+ (use modern syntax)
+- **Quote style:** Single quotes for strings, double quotes for docstrings
+
+#### Type Hints
+
+Use modern Python 3.11+ type hints (PEP 585 and PEP 604):
+
+```python
+# Preferred (modern)
+def process(items: list[str], config: dict[str, int] | None = None) -> tuple[int, str]:
+    ...
+
+# Avoid (legacy)
+from typing import List, Dict, Optional, Tuple
+def process(items: List[str], config: Optional[Dict[str, int]] = None) -> Tuple[int, str]:
+    ...
+```
+
+#### Docstrings
+
+Use [Google-style docstrings](https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings) for all public APIs:
+
+```python
+def detect_faces(image: np.ndarray, threshold: float = 0.5) -> list[Face]:
+    """Detect faces in an image.
+
+    Args:
+        image: Input image as a numpy array with shape (H, W, C) in BGR format.
+        threshold: Confidence threshold for filtering detections. Defaults to 0.5.
+
+    Returns:
+        List of Face objects containing bounding boxes, confidence scores,
+        and facial landmarks.
+
+    Raises:
+        ValueError: If the input image has invalid dimensions.
+
+    Example:
+        >>> from uniface import detect_faces
+        >>> faces = detect_faces(image, threshold=0.8)
+        >>> print(f"Found {len(faces)} faces")
+    """
+```
+
+#### Import Order
+
+Imports are automatically sorted by Ruff with the following order:
+
+1. **Future** imports (`from __future__ import annotations`)
+2. **Standard library** (`os`, `sys`, `typing`, etc.)
+3. **Third-party** (`numpy`, `cv2`, `onnxruntime`, etc.)
+4. **First-party** (`uniface.*`)
+5. **Local** (relative imports like `.base`, `.models`)
+
+```python
+from __future__ import annotations
+
+import os
+from typing import Any
+
+import cv2
+import numpy as np
+
+from uniface.constants import RetinaFaceWeights
+from uniface.log import Logger
+
+from .base import BaseDetector
+```
+
+#### Code Comments
+
+- Add comments for complex logic, magic numbers, and non-obvious behavior
+- Avoid comments that merely restate the code
+- Use `# TODO:` with issue links for planned improvements
+
+```python
+# RetinaFace FPN strides and corresponding anchor sizes per level
+steps = [8, 16, 32]
+min_sizes = [[16, 32], [64, 128], [256, 512]]
+
+# Add small epsilon to prevent division by zero
+similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-5)
+```
+
 ## Running Tests
 
 ```bash
+# Run all tests
 pytest tests/
+
+# Run with verbose output
+pytest tests/ -v
+
+# Run specific test file
+pytest tests/test_factory.py
+
+# Run with coverage
+pytest tests/ --cov=uniface --cov-report=html
 ```
+
+## Adding New Features
+
+When adding a new model or feature:
+
+1. **Create the model class** in the appropriate submodule (e.g., `uniface/detection/`)
+2. **Add weight constants** to `uniface/constants.py` with URLs and SHA256 hashes
+3. **Export in `__init__.py`** files at both module and package levels
+4. **Write tests** in `tests/` directory
+5. **Add example usage** in `scripts/` or update existing notebooks
+6. **Update documentation** if needed
+
 ## Examples
 
 Example notebooks demonstrating library usage:
 
 | Example | Notebook |
 |---------|----------|
-| Face Detection | [face_detection.ipynb](examples/face_detection.ipynb) |
-| Face Alignment | [face_alignment.ipynb](examples/face_alignment.ipynb) |
-| Face Recognition | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
-| Face Verification | [face_verification.ipynb](examples/face_verification.ipynb) |
-| Face Search | [face_search.ipynb](examples/face_search.ipynb) |
-| Face Anonymization | [face_anonymization.ipynb](examples/face_anonymization.ipynb) |
+| Face Detection | [01_face_detection.ipynb](examples/01_face_detection.ipynb) |
+| Face Alignment | [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) |
+| Face Verification | [03_face_verification.ipynb](examples/03_face_verification.ipynb) |
+| Face Search | [04_face_search.ipynb](examples/04_face_search.ipynb) |
+| Face Analyzer | [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) |
+| Face Parsing | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) |
+| Face Anonymization | [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) |
+| Gaze Estimation | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) |
 
 ## Questions?
 
 Open an issue or start a discussion on GitHub.
MODELS.md (20 changes)

@@ -34,7 +34,7 @@ detector = RetinaFace() # Uses MNET_V2
 # Specific model
 detector = RetinaFace(
     model_name=RetinaFaceWeights.MNET_025,  # Fastest
-    conf_thresh=0.5,
+    confidence_threshold=0.5,
     nms_thresh=0.4,
     input_size=(640, 640)
 )
@@ -63,14 +63,14 @@ from uniface.constants import SCRFDWeights
 # Fast real-time detection
 detector = SCRFD(
     model_name=SCRFDWeights.SCRFD_500M_KPS,
-    conf_thresh=0.5,
+    confidence_threshold=0.5,
     input_size=(640, 640)
 )
 
 # High accuracy
 detector = SCRFD(
     model_name=SCRFDWeights.SCRFD_10G_KPS,
-    conf_thresh=0.5
+    confidence_threshold=0.5
 )
 ```
 
@@ -99,29 +99,29 @@ from uniface.constants import YOLOv5FaceWeights
 # Lightweight/Mobile
 detector = YOLOv5Face(
     model_name=YOLOv5FaceWeights.YOLOV5N,
-    conf_thresh=0.6,
+    confidence_threshold=0.6,
     nms_thresh=0.5
 )
 
 # Real-time detection (recommended)
 detector = YOLOv5Face(
     model_name=YOLOv5FaceWeights.YOLOV5S,
-    conf_thresh=0.6,
+    confidence_threshold=0.6,
     nms_thresh=0.5
 )
 
 # High accuracy
 detector = YOLOv5Face(
     model_name=YOLOv5FaceWeights.YOLOV5M,
-    conf_thresh=0.6
+    confidence_threshold=0.6
 )
 
 # Detect faces with landmarks
 faces = detector.detect(image)
 for face in faces:
-    bbox = face['bbox']  # [x1, y1, x2, y2]
-    confidence = face['confidence']
-    landmarks = face['landmarks']  # 5-point landmarks (5, 2)
+    bbox = face.bbox  # [x1, y1, x2, y2]
+    confidence = face.confidence
+    landmarks = face.landmarks  # 5-point landmarks (5, 2)
 ```
 
 ---
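The switch from `face['bbox']` to `face.bbox` implies detections are now structured objects rather than dicts. A minimal sketch of what such a container could look like; the three field names come from this diff, but the exact class definition in `uniface` is an assumption:

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Face:
    """Hypothetical sketch of the detection result implied by the attribute access above."""

    bbox: np.ndarray       # [x1, y1, x2, y2]
    confidence: float      # detection score
    landmarks: np.ndarray  # 5-point landmarks, shape (5, 2)
```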
@@ -466,7 +466,7 @@ spoofer = MiniFASNet(model_name=MiniFASNetWeights.V1SE)
 # Detect and check liveness
 faces = detector.detect(image)
 for face in faces:
-    label_idx, score = spoofer.predict(image, face['bbox'])
+    label_idx, score = spoofer.predict(image, face.bbox)
     # label_idx: 0 = Fake, 1 = Real
     label = 'Real' if label_idx == 1 else 'Fake'
     print(f"{label}: {score:.1%}")
@@ -545,7 +545,7 @@ from uniface.constants import RetinaFaceWeights, SCRFDWeights, YOLOv5FaceWeights
 # Fast detection (mobile/edge devices)
 detector = RetinaFace(
     model_name=RetinaFaceWeights.MNET_025,
-    conf_thresh=0.7
+    confidence_threshold=0.7
 )
 
 # Balanced (recommended)
@@ -556,14 +556,14 @@ detector = RetinaFace(
 # Real-time with high accuracy
 detector = YOLOv5Face(
     model_name=YOLOv5FaceWeights.YOLOV5S,
-    conf_thresh=0.6,
+    confidence_threshold=0.6,
     nms_thresh=0.5
 )
 
 # High accuracy (server/GPU)
 detector = SCRFD(
     model_name=SCRFDWeights.SCRFD_10G_KPS,
-    conf_thresh=0.5
+    confidence_threshold=0.5
 )
 ```
 
@@ -668,14 +668,14 @@ Explore interactive examples for common tasks:
 
 | Example | Description | Notebook |
 |---------|-------------|----------|
-| **Face Detection** | Detect faces and facial landmarks | [face_detection.ipynb](examples/face_detection.ipynb) |
-| **Face Alignment** | Align and crop faces for recognition | [face_alignment.ipynb](examples/face_alignment.ipynb) |
-| **Face Recognition** | Extract face embeddings and compare faces | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
-| **Face Verification** | Compare two faces to verify identity | [face_verification.ipynb](examples/face_verification.ipynb) |
-| **Face Search** | Find a person in a group photo | [face_search.ipynb](examples/face_search.ipynb) |
-| **Face Parsing** | Segment face into semantic components | [face_parsing.ipynb](examples/face_parsing.ipynb) |
-| **Face Anonymization** | Blur or pixelate faces for privacy protection | [face_anonymization.ipynb](examples/face_anonymization.ipynb) |
-| **Gaze Estimation** | Estimate gaze direction | [gaze_estimation.ipynb](examples/gaze_estimation.ipynb) |
+| **Face Detection** | Detect faces and facial landmarks | [01_face_detection.ipynb](examples/01_face_detection.ipynb) |
+| **Face Alignment** | Align and crop faces for recognition | [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) |
+| **Face Verification** | Compare two faces to verify identity | [03_face_verification.ipynb](examples/03_face_verification.ipynb) |
+| **Face Search** | Find a person in a group photo | [04_face_search.ipynb](examples/04_face_search.ipynb) |
+| **Face Analyzer** | All-in-one detection, recognition & attributes | [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) |
+| **Face Parsing** | Segment face into semantic components | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) |
+| **Face Anonymization** | Blur or pixelate faces for privacy protection | [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) |
+| **Gaze Estimation** | Estimate gaze direction | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) |
 
 ### Additional Resources
 
README.md (34 changes)

@@ -321,7 +321,7 @@ detector = RetinaFace()
 # Create with custom config
 detector = SCRFD(
     model_name=SCRFDWeights.SCRFD_10G_KPS,  # SCRFDWeights.SCRFD_500M_KPS
-    conf_thresh=0.4,
+    confidence_threshold=0.4,
     input_size=(640, 640)
 )
 # Or with defaults settings: detector = SCRFD()
@@ -340,16 +340,16 @@ from uniface.constants import RetinaFaceWeights, YOLOv5FaceWeights
 # Detection
 detector = RetinaFace(
     model_name=RetinaFaceWeights.MNET_V2,
-    conf_thresh=0.5,
-    nms_thresh=0.4
+    confidence_threshold=0.5,
+    nms_threshold=0.4
 )
 # Or detector = RetinaFace()
 
 # YOLOv5-Face detection
 detector = YOLOv5Face(
     model_name=YOLOv5FaceWeights.YOLOV5S,
-    conf_thresh=0.6,
-    nms_thresh=0.5
+    confidence_threshold=0.6,
+    nms_threshold=0.5
 )
 # Or detector = YOLOv5Face
 
@@ -365,7 +365,7 @@ recognizer = SphereFace() # Angular softmax alternative
 from uniface import detect_faces
 
 # One-line face detection
-faces = detect_faces(image, method='retinaface', conf_thresh=0.8)  # methods: retinaface, scrfd, yolov5face
+faces = detect_faces(image, method='retinaface', confidence_threshold=0.8)  # methods: retinaface, scrfd, yolov5face
 ```
 
 ### Key Parameters (quick reference)
@@ -374,9 +374,9 @@ faces = detect_faces(image, method='retinaface', conf_thresh=0.8) # methods: re
 
 | Class | Key params (defaults) | Notes |
 | -------------- | ------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------- |
-| `RetinaFace` | `model_name=RetinaFaceWeights.MNET_V2`, `conf_thresh=0.5`, `nms_thresh=0.4`, `input_size=(640, 640)`, `dynamic_size=False` | Supports 5-point landmarks |
-| `SCRFD` | `model_name=SCRFDWeights.SCRFD_10G_KPS`, `conf_thresh=0.5`, `nms_thresh=0.4`, `input_size=(640, 640)` | Supports 5-point landmarks |
-| `YOLOv5Face` | `model_name=YOLOv5FaceWeights.YOLOV5S`, `conf_thresh=0.6`, `nms_thresh=0.5`, `input_size=640` (fixed) | Supports 5-point landmarks; models: YOLOV5N/S/M; `input_size` must be 640 |
+| `RetinaFace` | `model_name=RetinaFaceWeights.MNET_V2`, `confidence_threshold=0.5`, `nms_threshold=0.4`, `input_size=(640, 640)`, `dynamic_size=False` | Supports 5-point landmarks |
+| `SCRFD` | `model_name=SCRFDWeights.SCRFD_10G_KPS`, `confidence_threshold=0.5`, `nms_threshold=0.4`, `input_size=(640, 640)` | Supports 5-point landmarks |
+| `YOLOv5Face` | `model_name=YOLOv5FaceWeights.YOLOV5S`, `confidence_threshold=0.6`, `nms_threshold=0.5`, `input_size=640` (fixed) | Supports 5-point landmarks; models: YOLOV5N/S/M; `input_size` must be 640 |
 
 **Recognition**
 
@@ -454,14 +454,14 @@ Interactive examples covering common face analysis tasks:
 
 | Example | Description | Notebook |
 |---------|-------------|----------|
-| **Face Detection** | Detect faces and facial landmarks | [face_detection.ipynb](examples/face_detection.ipynb) |
-| **Face Alignment** | Align and crop faces for recognition | [face_alignment.ipynb](examples/face_alignment.ipynb) |
-| **Face Recognition** | Extract face embeddings and compare faces | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
-| **Face Verification** | Compare two faces to verify identity | [face_verification.ipynb](examples/face_verification.ipynb) |
-| **Face Search** | Find a person in a group photo | [face_search.ipynb](examples/face_search.ipynb) |
-| **Face Parsing** | Segment face into semantic components | [face_parsing.ipynb](examples/face_parsing.ipynb) |
-| **Face Anonymization** | Blur or pixelate faces for privacy protection | [face_anonymization.ipynb](examples/face_anonymization.ipynb) |
-| **Gaze Estimation** | Estimate gaze direction from face images | [gaze_estimation.ipynb](examples/gaze_estimation.ipynb) |
+| **Face Detection** | Detect faces and facial landmarks | [01_face_detection.ipynb](examples/01_face_detection.ipynb) |
+| **Face Alignment** | Align and crop faces for recognition | [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) |
+| **Face Verification** | Compare two faces to verify identity | [03_face_verification.ipynb](examples/03_face_verification.ipynb) |
+| **Face Search** | Find a person in a group photo | [04_face_search.ipynb](examples/04_face_search.ipynb) |
+| **Face Analyzer** | All-in-one detection, recognition & attributes | [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) |
+| **Face Parsing** | Segment face into semantic components | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) |
+| **Face Anonymization** | Blur or pixelate faces for privacy protection | [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) |
+| **Gaze Estimation** | Estimate gaze direction from face images | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) |
 
 ### Webcam Face Detection
 
examples/*.ipynb

@@ -44,7 +44,7 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "1.6.0\n"
+      "2.0.0\n"
      ]
     }
    ],
@@ -82,8 +82,8 @@
    ],
    "source": [
     "detector = RetinaFace(\n",
-    "    conf_thresh=0.5,\n",
-    "    nms_thresh=0.4,\n",
+    "    confidence_threshold=0.5,\n",
+    "    nms_threshold=0.4,\n",
     ")"
    ]
   },
@@ -48,7 +48,7 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "1.6.0\n"
+      "2.0.0\n"
      ]
     }
    ],
@@ -87,8 +87,8 @@
    ],
    "source": [
     "detector = RetinaFace(\n",
-    "    conf_thresh=0.5,\n",
-    "    nms_thresh=0.4,\n",
+    "    confidence_threshold=0.5,\n",
+    "    nms_threshold=0.4,\n",
     ")"
    ]
   },
@@ -37,7 +37,7 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "1.6.0\n"
+      "2.0.0\n"
      ]
     }
    ],
@@ -78,7 +78,7 @@
    ],
    "source": [
     "analyzer = FaceAnalyzer(\n",
-    "    detector=RetinaFace(conf_thresh=0.5),\n",
+    "    detector=RetinaFace(confidence_threshold=0.5),\n",
     "    recognizer=ArcFace()\n",
     ")"
    ]
@@ -42,7 +42,7 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "1.6.0\n"
+      "2.0.0\n"
      ]
     }
    ],
@@ -74,7 +74,7 @@
    ],
    "source": [
     "analyzer = FaceAnalyzer(\n",
-    "    detector=RetinaFace(conf_thresh=0.5),\n",
+    "    detector=RetinaFace(confidence_threshold=0.5),\n",
     "    recognizer=ArcFace()\n",
     ")"
    ]
@@ -44,7 +44,7 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "1.6.0\n"
+      "2.0.0\n"
      ]
     }
    ],
@@ -88,7 +88,7 @@
    ],
    "source": [
     "analyzer = FaceAnalyzer(\n",
-    "    detector=RetinaFace(conf_thresh=0.5),\n",
+    "    detector=RetinaFace(confidence_threshold=0.5),\n",
     "    recognizer=ArcFace(),\n",
     "    age_gender=AgeGender()\n",
     ")"
@@ -46,7 +46,7 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "UniFace version: 1.6.0\n"
+      "UniFace version: 2.0.0\n"
      ]
     }
    ],

File diff suppressed because one or more lines are too long

@@ -44,7 +44,7 @@
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "UniFace version: 1.6.0\n"
+      "UniFace version: 2.0.0\n"
      ]
     }
    ],
@@ -86,7 +86,7 @@
    ],
    "source": [
     "# Initialize face detector\n",
-    "detector = RetinaFace(conf_thresh=0.5)\n",
+    "detector = RetinaFace(confidence_threshold=0.5)\n",
     "\n",
     "# Initialize gaze estimator (uses ResNet34 by default)\n",
     "gaze_estimator = MobileGaze()"
pyproject.toml

@@ -1,6 +1,6 @@
 [project]
 name = "uniface"
-version = "1.6.0"
+version = "2.0.0"
 description = "UniFace: A Comprehensive Library for Face Detection, Recognition, Landmark Analysis, Face Parsing, Gaze Estimation, Age, and Gender Detection"
 readme = "README.md"
 license = { text = "MIT" }
@@ -89,13 +89,60 @@ exclude = [
 
 [tool.ruff.format]
 quote-style = "single"
+docstring-code-format = true
 
 [tool.ruff.lint]
-select = ["E", "F", "I", "W"]
+select = [
+    "E",    # pycodestyle errors
+    "F",    # pyflakes
+    "I",    # isort
+    "W",    # pycodestyle warnings
+    "UP",   # pyupgrade (modern Python syntax)
+    "B",    # flake8-bugbear
+    "C4",   # flake8-comprehensions
+    "SIM",  # flake8-simplify
+    "RUF",  # Ruff-specific rules
+]
+ignore = [
+    "E501",    # Line too long (handled by formatter)
+    "B008",    # Function call in default argument (common in FastAPI/Click)
+    "SIM108",  # Use ternary operator (can reduce readability)
+    "RUF022",  # Allow logical grouping in __all__ instead of alphabetical sorting
+]
 
 [tool.ruff.lint.flake8-quotes]
 docstring-quotes = "double"
 
 [tool.ruff.lint.isort]
+force-single-line = false
+force-sort-within-sections = true
 known-first-party = ["uniface"]
+section-order = [
+    "future",
+    "standard-library",
+    "third-party",
+    "first-party",
+    "local-folder",
+]
+
+[tool.ruff.lint.pydocstyle]
+convention = "google"
+
+[tool.mypy]
+python_version = "3.11"
+warn_return_any = false
+warn_unused_ignores = true
+ignore_missing_imports = true
+exclude = ["tests/", "scripts/", "examples/"]
+# Disable strict return type checking for numpy operations
+disable_error_code = ["no-any-return"]
+
+[tool.bandit]
+exclude_dirs = ["tests", "scripts", "examples"]
+skips = ["B101", "B614"]  # B101: assert, B614: torch.jit.load (models are SHA256 verified)
+
+[tool.pytest.ini_options]
+testpaths = ["tests"]
+python_files = ["test_*.py"]
+python_functions = ["test_*"]
+addopts = "-v --tb=short"
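For orientation, a short sketch of the kind of Python the newly enabled rule families target; the functions and names are illustrative, not from the repository:

```python
# UP (pyupgrade): prefer builtin generics (list[int]) over typing.List/Dict/Optional.
# C4 (flake8-comprehensions): prefer comprehensions over list()/set() around generators.
# SIM (flake8-simplify): prefer returning a condition over if/else returning True/False.


def face_ids(faces: list) -> set:
    # C401 would flag set(f.id for f in faces); a set comprehension is the fix.
    return {f.id for f in faces}


def is_confident(score: float, threshold: float = 0.5) -> bool:
    # SIM103 would flag `if score >= threshold: return True ... return False`.
    return score >= threshold
```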
@@ -28,9 +28,9 @@ def process_image(detector, image_path: Path, output_path: Path, threshold: floa
     faces = detector.detect(image)
 
     # unpack face data for visualization
-    bboxes = [f['bbox'] for f in faces]
-    scores = [f['confidence'] for f in faces]
-    landmarks = [f['landmarks'] for f in faces]
+    bboxes = [f.bbox for f in faces]
+    scores = [f.confidence for f in faces]
+    landmarks = [f.landmarks for f in faces]
     draw_detections(
         image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
     )
@@ -39,17 +39,17 @@ def process_image(
     if not faces:
         return
 
-    bboxes = [f['bbox'] for f in faces]
-    scores = [f['confidence'] for f in faces]
-    landmarks = [f['landmarks'] for f in faces]
+    bboxes = [f.bbox for f in faces]
+    scores = [f.confidence for f in faces]
+    landmarks = [f.landmarks for f in faces]
     draw_detections(
         image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
     )
 
     for i, face in enumerate(faces):
-        result = age_gender.predict(image, face['bbox'])
+        result = age_gender.predict(image, face.bbox)
         print(f'  Face {i + 1}: {result.sex}, {result.age} years old')
-        draw_age_gender_label(image, face['bbox'], result.sex, result.age)
+        draw_age_gender_label(image, face.bbox, result.sex, result.age)
 
     os.makedirs(save_dir, exist_ok=True)
     output_path = os.path.join(save_dir, f'{Path(image_path).stem}_age_gender.jpg')
@@ -74,16 +74,16 @@ def run_webcam(detector, age_gender, threshold: float = 0.6):
         faces = detector.detect(frame)
 
         # unpack face data for visualization
-        bboxes = [f['bbox'] for f in faces]
-        scores = [f['confidence'] for f in faces]
-        landmarks = [f['landmarks'] for f in faces]
+        bboxes = [f.bbox for f in faces]
+        scores = [f.confidence for f in faces]
+        landmarks = [f.landmarks for f in faces]
         draw_detections(
             image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
         )
 
         for face in faces:
-            result = age_gender.predict(frame, face['bbox'])
-            draw_age_gender_label(frame, face['bbox'], result.sex, result.age)
+            result = age_gender.predict(frame, face.bbox)
+            draw_age_gender_label(frame, face.bbox, result.sex, result.age)
 
         cv2.putText(
             frame,
@@ -33,9 +33,9 @@ def process_image(
     from uniface.visualization import draw_detections
 
     preview = image.copy()
-    bboxes = [face['bbox'] for face in faces]
-    scores = [face['confidence'] for face in faces]
-    landmarks = [face['landmarks'] for face in faces]
+    bboxes = [face.bbox for face in faces]
+    scores = [face.confidence for face in faces]
+    landmarks = [face.landmarks for face in faces]
     draw_detections(preview, bboxes, scores, landmarks)
 
     # Show preview
@@ -157,7 +157,7 @@ Examples:
 
     # Detection
     parser.add_argument(
-        '--conf-thresh',
+        '--confidence-threshold',
         type=float,
         default=0.5,
         help='Detection confidence threshold (default: 0.5)',
@@ -183,8 +183,8 @@ Examples:
     color = tuple(color_values)
 
     # Initialize detector
-    print(f'Initializing face detector (conf_thresh={args.conf_thresh})...')
-    detector = RetinaFace(conf_thresh=args.conf_thresh)
+    print(f'Initializing face detector (confidence_threshold={args.confidence_threshold})...')
+    detector = RetinaFace(confidence_threshold=args.confidence_threshold)
 
     # Initialize blurrer
     print(f'Initializing blur method: {args.method}')
@@ -1,6 +1,15 @@
-# Face detection on image or webcam
-# Usage: python run_detection.py --image path/to/image.jpg
-#        python run_detection.py --webcam
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Face detection on image or webcam.
+
+Usage:
+    python run_detection.py --image path/to/image.jpg
+    python run_detection.py --webcam
+"""
+
+from __future__ import annotations
 
 import argparse
 import os
@@ -20,9 +29,9 @@ def process_image(detector, image_path: str, threshold: float = 0.6, save_dir: s
     faces = detector.detect(image)
 
     if faces:
-        bboxes = [face['bbox'] for face in faces]
-        scores = [face['confidence'] for face in faces]
-        landmarks = [face['landmarks'] for face in faces]
+        bboxes = [face.bbox for face in faces]
+        scores = [face.confidence for face in faces]
+        landmarks = [face.landmarks for face in faces]
         draw_detections(image, bboxes, scores, landmarks, vis_threshold=threshold)
 
         os.makedirs(save_dir, exist_ok=True)
@@ -48,9 +57,9 @@ def run_webcam(detector, threshold: float = 0.6):
         faces = detector.detect(frame)
 
         # unpack face data for visualization
-        bboxes = [f['bbox'] for f in faces]
-        scores = [f['confidence'] for f in faces]
-        landmarks = [f['landmarks'] for f in faces]
+        bboxes = [f.bbox for f in faces]
+        scores = [f.confidence for f in faces]
+        landmarks = [f.landmarks for f in faces]
         draw_detections(
             image=frame,
             bboxes=bboxes,
@@ -39,17 +39,17 @@ def process_image(
     if not faces:
         return
 
-    bboxes = [f['bbox'] for f in faces]
-    scores = [f['confidence'] for f in faces]
-    landmarks = [f['landmarks'] for f in faces]
+    bboxes = [f.bbox for f in faces]
+    scores = [f.confidence for f in faces]
+    landmarks = [f.landmarks for f in faces]
     draw_detections(
         image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
     )
 
     for i, face in enumerate(faces):
-        emotion, confidence = emotion_predictor.predict(image, face['landmarks'])
+        emotion, confidence = emotion_predictor.predict(image, face.landmarks)
         print(f'  Face {i + 1}: {emotion} (confidence: {confidence:.3f})')
-        draw_emotion_label(image, face['bbox'], emotion, confidence)
+        draw_emotion_label(image, face.bbox, emotion, confidence)
 
     os.makedirs(save_dir, exist_ok=True)
     output_path = os.path.join(save_dir, f'{Path(image_path).stem}_emotion.jpg')
@@ -74,14 +74,16 @@ def run_webcam(detector, emotion_predictor, threshold: float = 0.6):
         faces = detector.detect(frame)
 
         # unpack face data for visualization
-        bboxes = [f['bbox'] for f in faces]
-        scores = [f['confidence'] for f in faces]
-        landmarks = [f['landmarks'] for f in faces]
-        draw_detections(frame, bboxes, scores, landmarks, vis_threshold=threshold)
+        bboxes = [f.bbox for f in faces]
+        scores = [f.confidence for f in faces]
+        landmarks = [f.landmarks for f in faces]
+        draw_detections(
+            image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
+        )
 
         for face in faces:
-            emotion, confidence = emotion_predictor.predict(frame, face['landmarks'])
-            draw_emotion_label(frame, face['bbox'], emotion, confidence)
+            emotion, confidence = emotion_predictor.predict(frame, face.landmarks)
+            draw_emotion_label(frame, face.bbox, emotion, confidence)
 
         cv2.putText(
             frame,
@@ -7,6 +7,7 @@ import os
 from pathlib import Path
 
 import cv2
+import numpy as np
 
 from uniface import RetinaFace
 from uniface.constants import ParsingWeights
@@ -14,7 +15,49 @@ from uniface.parsing import BiSeNet
 from uniface.visualization import vis_parsing_maps
 
 
-def process_image(detector, parser, image_path: str, save_dir: str = 'outputs'):
+def expand_bbox(
+    bbox: np.ndarray,
+    image_shape: tuple[int, int],
+    expand_ratio: float = 0.2,
+    expand_top_ratio: float = 0.4,
+) -> tuple[int, int, int, int]:
+    """
+    Expand bounding box to include full head region for face parsing.
+
+    Face detection typically returns tight face boxes, but face parsing
+    requires the full head including hair, ears, and neck.
+
+    Args:
+        bbox: Original bounding box [x1, y1, x2, y2].
+        image_shape: Image dimensions as (height, width).
+        expand_ratio: Expansion ratio for left, right, and bottom (default: 0.2 = 20%).
+        expand_top_ratio: Expansion ratio for top to capture hair/forehead (default: 0.4 = 40%).
+
+    Returns:
+        Tuple[int, int, int, int]: Expanded bbox (x1, y1, x2, y2) clamped to image bounds.
+    """
+    x1, y1, x2, y2 = map(int, bbox[:4])
+    height, width = image_shape[:2]
+
+    # Calculate face dimensions
+    face_width = x2 - x1
+    face_height = y2 - y1
+
+    # Calculate expansion amounts
+    expand_x = int(face_width * expand_ratio)
+    expand_y_bottom = int(face_height * expand_ratio)
+    expand_y_top = int(face_height * expand_top_ratio)
+
+    # Expand and clamp to image boundaries
+    new_x1 = max(0, x1 - expand_x)
+    new_y1 = max(0, y1 - expand_y_top)
+    new_x2 = min(width, x2 + expand_x)
+    new_y2 = min(height, y2 + expand_y_bottom)
+
+    return new_x1, new_y1, new_x2, new_y2
+
+
+def process_image(detector, parser, image_path: str, save_dir: str = 'outputs', expand_ratio: float = 0.2):
     image = cv2.imread(image_path)
     if image is None:
         print(f"Error: Failed to load image from '{image_path}'")
@@ -26,8 +69,8 @@ def process_image(detector, parser, image_path: str, save_dir: str = 'outputs'):
     result_image = image.copy()
 
     for i, face in enumerate(faces):
-        bbox = face['bbox']
-        x1, y1, x2, y2 = map(int, bbox[:4])
+        # Expand bbox to include full head for parsing
+        x1, y1, x2, y2 = expand_bbox(face.bbox, image.shape, expand_ratio=expand_ratio)
         face_crop = image[y1:y2, x1:x2]
 
         if face_crop.size == 0:
@@ -44,7 +87,7 @@ def process_image(detector, parser, image_path: str, save_dir: str = 'outputs'):
         # Place the visualization back on the original image
         result_image[y1:y2, x1:x2] = vis_result
 
-        # Draw bounding box
+        # Draw expanded bounding box
         cv2.rectangle(result_image, (x1, y1), (x2, y2), (0, 255, 0), 2)
 
     os.makedirs(save_dir, exist_ok=True)
@@ -53,7 +96,7 @@ def process_image(detector, parser, image_path: str, save_dir: str = 'outputs'):
     print(f'Output saved: {output_path}')
 
 
-def run_webcam(detector, parser):
+def run_webcam(detector, parser, expand_ratio: float = 0.2):
     cap = cv2.VideoCapture(0)
     if not cap.isOpened():
         print('Cannot open webcam')
@@ -70,8 +113,8 @@ def run_webcam(detector, parser):
         faces = detector.detect(frame)
 
         for face in faces:
-            bbox = face['bbox']
-            x1, y1, x2, y2 = map(int, bbox[:4])
+            # Expand bbox to include full head for parsing
+            x1, y1, x2, y2 = expand_bbox(face.bbox, frame.shape, expand_ratio=expand_ratio)
             face_crop = frame[y1:y2, x1:x2]
 
             if face_crop.size == 0:
@@ -87,7 +130,7 @@ def run_webcam(detector, parser):
             # Place the visualization back on the frame
             frame[y1:y2, x1:x2] = vis_result
 
-            # Draw bounding box
+            # Draw expanded bounding box
             cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
 
         cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
@@ -108,6 +151,12 @@ def main():
     parser_arg.add_argument(
         '--model', type=str, default=ParsingWeights.RESNET18, choices=[ParsingWeights.RESNET18, ParsingWeights.RESNET34]
     )
+    parser_arg.add_argument(
+        '--expand-ratio',
+        type=float,
+        default=0.2,
+        help='Bbox expansion ratio for full head coverage (default: 0.2 = 20%%)',
+    )
     args = parser_arg.parse_args()
 
     if not args.image and not args.webcam:
@@ -117,9 +166,9 @@ def main():
     parser = BiSeNet(model_name=ParsingWeights.RESNET34)
 
     if args.webcam:
-        run_webcam(detector, parser)
+        run_webcam(detector, parser, expand_ratio=args.expand_ratio)
     else:
-        process_image(detector, parser, args.image, args.save_dir)
+        process_image(detector, parser, args.image, args.save_dir, expand_ratio=args.expand_ratio)
 
 
 if __name__ == '__main__':
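A quick usage sketch for the new `expand_bbox` helper, assuming the function above is in scope; the box and frame values are illustrative:

```python
import numpy as np

# Tight detector box (hypothetical values) on a 1280x720 frame.
bbox = np.array([500.0, 200.0, 620.0, 360.0])
frame_shape = (720, 1280)  # (height, width)

# Box is 120x160 px: expands 24 px left/right, 32 px down, 64 px up, clamped to the frame.
x1, y1, x2, y2 = expand_bbox(bbox, frame_shape, expand_ratio=0.2, expand_top_ratio=0.4)
print(x1, y1, x2, y2)  # 476 136 644 392
```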
@@ -29,7 +29,7 @@ def extract_reference_embedding(detector, recognizer, image_path: str) -> np.nda
     if not faces:
         raise RuntimeError('No faces found in reference image.')
 
-    landmarks = faces[0]['landmarks']
+    landmarks = faces[0].landmarks
     return recognizer.get_normalized_embedding(image, landmarks)
 
 
@@ -49,8 +49,8 @@ def run_webcam(detector, recognizer, ref_embedding: np.ndarray, threshold: float
         faces = detector.detect(frame)
 
         for face in faces:
-            bbox = face['bbox']
-            landmarks = face['landmarks']
+            bbox = face.bbox
+            landmarks = face.landmarks
             x1, y1, x2, y2 = map(int, bbox)
 
             embedding = recognizer.get_normalized_embedding(frame, landmarks)
@@ -24,7 +24,7 @@ def process_image(detector, gaze_estimator, image_path: str, save_dir: str = 'ou
     print(f'Detected {len(faces)} face(s)')
 
     for i, face in enumerate(faces):
-        bbox = face['bbox']
+        bbox = face.bbox
         x1, y1, x2, y2 = map(int, bbox[:4])
         face_crop = image[y1:y2, x1:x2]
 
@@ -60,7 +60,7 @@ def run_webcam(detector, gaze_estimator):
         faces = detector.detect(frame)
 
         for face in faces:
-            bbox = face['bbox']
+            bbox = face.bbox
             x1, y1, x2, y2 = map(int, bbox[:4])
             face_crop = frame[y1:y2, x1:x2]
 
@@ -24,7 +24,7 @@ def process_image(detector, landmarker, image_path: str, save_dir: str = 'output
     return
 
     for i, face in enumerate(faces):
-        bbox = face['bbox']
+        bbox = face.bbox
         x1, y1, x2, y2 = map(int, bbox)
         cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
 
@@ -67,7 +67,7 @@ def run_webcam(detector, landmarker):
         faces = detector.detect(frame)
 
         for face in faces:
-            bbox = face['bbox']
+            bbox = face.bbox
             x1, y1, x2, y2 = map(int, bbox)
             cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
 
@@ -70,13 +70,13 @@ def process_image(detector, spoofer, image_path: str, save_dir: str = 'outputs')
 
     # Run anti-spoofing on each face
     for i, face in enumerate(faces, 1):
-        label_idx, score = spoofer.predict(image, face['bbox'])
+        label_idx, score = spoofer.predict(image, face.bbox)
         # label_idx: 0 = Fake, 1 = Real
         label = 'Real' if label_idx == 1 else 'Fake'
         print(f'  Face {i}: {label} ({score:.1%})')
 
         # Draw result on image
-        draw_spoofing_result(image, face['bbox'], label_idx, score)
+        draw_spoofing_result(image, face.bbox, label_idx, score)
 
     # Save output
     os.makedirs(save_dir, exist_ok=True)
@@ -128,8 +128,8 @@ def process_video(detector, spoofer, source, save_dir: str = 'outputs') -> None:
 
     # Run anti-spoofing on each face
     for face in faces:
-        label_idx, score = spoofer.predict(frame, face['bbox'])
-        draw_spoofing_result(frame, face['bbox'], label_idx, score)
+        label_idx, score = spoofer.predict(frame, face.bbox)
+        draw_spoofing_result(frame, face.bbox, label_idx, score)
 
         # Write frame
         writer.write(frame)
@@ -52,9 +52,9 @@ def process_video(
         faces = detector.detect(frame)
         total_faces += len(faces)
 
-        bboxes = [f['bbox'] for f in faces]
-        scores = [f['confidence'] for f in faces]
-        landmarks = [f['landmarks'] for f in faces]
+        bboxes = [f.bbox for f in faces]
+        scores = [f.confidence for f in faces]
+        landmarks = [f.landmarks for f in faces]
         draw_detections(
             image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
         )
@@ -1,3 +1,11 @@
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Tests for AgeGender attribute predictor."""
+
+from __future__ import annotations
+
 import numpy as np
 import pytest
 
@@ -1,3 +1,11 @@
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Tests for factory functions (create_detector, create_recognizer, etc.)."""
+
+from __future__ import annotations
+
 import numpy as np
 import pytest
 
@@ -35,8 +43,8 @@ def test_create_detector_with_config():
 detector = create_detector(
     'retinaface',
     model_name=RetinaFaceWeights.MNET_V2,
-    conf_thresh=0.8,
-    nms_thresh=0.3,
+    confidence_threshold=0.8,
+    nms_threshold=0.3,
 )
 assert detector is not None, 'Failed to create detector with custom config'
 
@@ -53,7 +61,7 @@ def test_create_detector_scrfd_with_model():
 """
 Test creating SCRFD detector with specific model.
 """
-detector = create_detector('scrfd', model_name=SCRFDWeights.SCRFD_10G_KPS, conf_thresh=0.5)
+detector = create_detector('scrfd', model_name=SCRFDWeights.SCRFD_10G_KPS, confidence_threshold=0.5)
 assert detector is not None, 'Failed to create SCRFD with specific model'
 
 
@@ -141,13 +149,13 @@ def test_detect_faces_with_threshold():
 Test detect_faces with custom confidence threshold.
 """
 mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
-faces = detect_faces(mock_image, method='retinaface', conf_thresh=0.8)
+faces = detect_faces(mock_image, method='retinaface', confidence_threshold=0.8)
 
 assert isinstance(faces, list), 'detect_faces should return a list'
 
 # All detections should respect threshold
 for face in faces:
-    assert face['confidence'] >= 0.8, 'All detections should meet confidence threshold'
+    assert face.confidence >= 0.8, 'All detections should meet confidence threshold'
 
 
 def test_detect_faces_default_method():
@@ -246,8 +254,8 @@ def test_detector_with_different_configs():
 """
 Test creating multiple detectors with different configurations.
 """
-detector_high_thresh = create_detector('retinaface', conf_thresh=0.9)
-detector_low_thresh = create_detector('retinaface', conf_thresh=0.3)
+detector_high_thresh = create_detector('retinaface', confidence_threshold=0.9)
+detector_low_thresh = create_detector('retinaface', confidence_threshold=0.3)
 
 mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
 
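The test updates above track a pure keyword rename; a minimal before/after sketch, using only names that appear in this diff:

from uniface import create_detector
from uniface.constants import RetinaFaceWeights

# Pre-2.0 keywords: conf_thresh / nms_thresh (rejected after this commit)
# 2.0 keywords, as exercised by the updated tests:
detector = create_detector(
    'retinaface',
    model_name=RetinaFaceWeights.MNET_V2,
    confidence_threshold=0.8,
    nms_threshold=0.3,
)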
@@ -1,3 +1,11 @@
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Tests for 106-point facial landmark detector."""
+
+from __future__ import annotations
+
 import numpy as np
 import pytest
 
@@ -2,6 +2,10 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
+"""Tests for BiSeNet face parsing model."""
+
+from __future__ import annotations
+
 import numpy as np
 import pytest
 
@@ -1,3 +1,11 @@
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Tests for face recognition models (ArcFace, MobileFace, SphereFace)."""
+
+from __future__ import annotations
+
 import numpy as np
 import pytest
 
@@ -1,3 +1,11 @@
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Tests for RetinaFace detector."""
+
+from __future__ import annotations
+
 import numpy as np
 import pytest
 
@@ -9,9 +17,9 @@ from uniface.detection import RetinaFace
 def retinaface_model():
     return RetinaFace(
         model_name=RetinaFaceWeights.MNET_V2,
-        conf_thresh=0.5,
+        confidence_threshold=0.5,
         pre_nms_topk=5000,
-        nms_thresh=0.4,
+        nms_threshold=0.4,
         post_nms_topk=750,
     )
 
@@ -27,15 +35,15 @@ def test_inference_on_640x640_image(retinaface_model):
 assert isinstance(faces, list), 'Detections should be a list.'
 
 for face in faces:
-    assert isinstance(face, dict), 'Each detection should be a dictionary.'
-    assert 'bbox' in face, "Each detection should have a 'bbox' key."
-    assert 'confidence' in face, "Each detection should have a 'confidence' key."
-    assert 'landmarks' in face, "Each detection should have a 'landmarks' key."
+    # Face is a dataclass, check attributes exist
+    assert hasattr(face, 'bbox'), "Each detection should have a 'bbox' attribute."
+    assert hasattr(face, 'confidence'), "Each detection should have a 'confidence' attribute."
+    assert hasattr(face, 'landmarks'), "Each detection should have a 'landmarks' attribute."
 
-    bbox = face['bbox']
+    bbox = face.bbox
     assert len(bbox) == 4, 'BBox should have 4 values (x1, y1, x2, y2).'
 
-    landmarks = face['landmarks']
+    landmarks = face.landmarks
     assert len(landmarks) == 5, 'Should have 5 landmark points.'
     assert all(len(pt) == 2 for pt in landmarks), 'Each landmark should be (x, y).'
 
@@ -45,7 +53,7 @@ def test_confidence_threshold(retinaface_model):
 faces = retinaface_model.detect(mock_image)
 
 for face in faces:
-    confidence = face['confidence']
+    confidence = face.confidence
     assert confidence >= 0.5, f'Detection has confidence {confidence} below threshold 0.5'
 
 
@@ -1,3 +1,11 @@
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Tests for SCRFD detector."""
+
+from __future__ import annotations
+
 import numpy as np
 import pytest
 
@@ -9,8 +17,8 @@ from uniface.detection import SCRFD
 def scrfd_model():
     return SCRFD(
         model_name=SCRFDWeights.SCRFD_500M_KPS,
-        conf_thresh=0.5,
-        nms_thresh=0.4,
+        confidence_threshold=0.5,
+        nms_threshold=0.4,
     )
 
 
@@ -25,15 +33,15 @@ def test_inference_on_640x640_image(scrfd_model):
 assert isinstance(faces, list), 'Detections should be a list.'
 
 for face in faces:
-    assert isinstance(face, dict), 'Each detection should be a dictionary.'
-    assert 'bbox' in face, "Each detection should have a 'bbox' key."
-    assert 'confidence' in face, "Each detection should have a 'confidence' key."
-    assert 'landmarks' in face, "Each detection should have a 'landmarks' key."
+    # Face is a dataclass, check attributes exist
+    assert hasattr(face, 'bbox'), "Each detection should have a 'bbox' attribute."
+    assert hasattr(face, 'confidence'), "Each detection should have a 'confidence' attribute."
+    assert hasattr(face, 'landmarks'), "Each detection should have a 'landmarks' attribute."
 
-    bbox = face['bbox']
+    bbox = face.bbox
     assert len(bbox) == 4, 'BBox should have 4 values (x1, y1, x2, y2).'
 
-    landmarks = face['landmarks']
+    landmarks = face.landmarks
     assert len(landmarks) == 5, 'Should have 5 landmark points.'
     assert all(len(pt) == 2 for pt in landmarks), 'Each landmark should be (x, y).'
 
@@ -43,7 +51,7 @@ def test_confidence_threshold(scrfd_model):
 faces = scrfd_model.detect(mock_image)
 
 for face in faces:
-    confidence = face['confidence']
+    confidence = face.confidence
     assert confidence >= 0.5, f'Detection has confidence {confidence} below threshold 0.5'
 
 
@@ -63,7 +71,7 @@ def test_different_input_sizes(scrfd_model):
 
 
 def test_scrfd_10g_model():
-    model = SCRFD(model_name=SCRFDWeights.SCRFD_10G_KPS, conf_thresh=0.5)
+    model = SCRFD(model_name=SCRFDWeights.SCRFD_10G_KPS, confidence_threshold=0.5)
     assert model is not None, 'SCRFD 10G model initialization failed.'
 
     mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
@@ -1,3 +1,11 @@
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Tests for utility functions (compute_similarity, face_alignment, etc.)."""
+
+from __future__ import annotations
+
 import numpy as np
 import pytest
 
@@ -116,7 +124,7 @@ def test_compute_similarity_dtype():
 emb2 = emb2 / np.linalg.norm(emb2)
 
 similarity = compute_similarity(emb1, emb2)
-assert isinstance(similarity, (float, np.floating)), f'Similarity should be float, got {type(similarity)}'
+assert isinstance(similarity, float | np.floating), f'Similarity should be float, got {type(similarity)}'
 
 
 # face_alignment tests
@@ -259,4 +267,4 @@ def test_compute_similarity_with_recognition_embeddings():
 
 # Should be a valid similarity score
 assert -1.0 <= similarity <= 1.0
-assert isinstance(similarity, (float, np.floating))
+assert isinstance(similarity, float | np.floating)
@@ -11,10 +11,24 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+"""UniFace: A comprehensive library for face analysis.
+
+This library provides unified APIs for:
+- Face detection (RetinaFace, SCRFD, YOLOv5Face)
+- Face recognition (ArcFace, MobileFace, SphereFace)
+- Facial landmarks (106-point detection)
+- Face parsing (semantic segmentation)
+- Gaze estimation
+- Age, gender, and emotion prediction
+- Face anti-spoofing
+- Privacy/anonymization
+"""
+
+from __future__ import annotations
 
 __license__ = 'MIT'
 __author__ = 'Yakhyokhuja Valikhujaev'
-__version__ = '1.6.0'
+__version__ = '2.0.0'
 
 
 from uniface.face_utils import compute_similarity, face_alignment
 from uniface.log import Logger, enable_logging
@@ -23,12 +37,6 @@ from uniface.visualization import draw_detections, vis_parsing_maps
 
 from .analyzer import FaceAnalyzer
 from .attribute import AgeGender, AttributeResult, FairFace
-from .face import Face
-
-try:
-    from .attribute import Emotion
-except ImportError:
-    Emotion = None  # PyTorch not installed
 from .detection import (
     SCRFD,
     RetinaFace,
@@ -37,6 +45,7 @@ from .detection import (
     detect_faces,
     list_available_detectors,
 )
+from .face import Face
 from .gaze import MobileGaze, create_gaze_estimator
 from .landmark import Landmark106, create_landmarker
 from .parsing import BiSeNet, create_face_parser
@@ -44,7 +53,15 @@ from .privacy import BlurFace, anonymize_faces
 from .recognition import ArcFace, MobileFace, SphereFace, create_recognizer
 from .spoofing import MiniFASNet, create_spoofer
 
+# Optional: Emotion requires PyTorch
+Emotion: type | None
+try:
+    from .attribute import Emotion
+except ImportError:
+    Emotion = None
+
 __all__ = [
+    # Metadata
     '__author__',
     '__license__',
     '__version__',
@@ -85,11 +102,11 @@ __all__ = [
     'BlurFace',
     'anonymize_faces',
     # Utilities
+    'Logger',
     'compute_similarity',
     'draw_detections',
-    'vis_parsing_maps',
+    'enable_logging',
     'face_alignment',
     'verify_model_weights',
-    'Logger',
-    'enable_logging',
+    'vis_parsing_maps',
 ]
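The relocated Emotion import above follows a common optional-dependency pattern: the name is always bound, but is None when the heavy extra (PyTorch) is missing, so downstream code can feature-test it. A self-contained sketch of the same pattern (the absolute import path is inferred from the diff):

from __future__ import annotations

Emotion: type | None
try:
    from uniface.attribute import Emotion  # requires the PyTorch extra
except ImportError:
    Emotion = None  # torch (or uniface itself) not installed

if Emotion is None:
    print('Emotion prediction unavailable; other predictors still work')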
@@ -2,7 +2,7 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import List, Optional
+from __future__ import annotations
 
 import numpy as np
 
@@ -17,14 +17,32 @@ __all__ = ['FaceAnalyzer']
 
 
 class FaceAnalyzer:
-    """Unified face analyzer combining detection, recognition, and attributes."""
+    """Unified face analyzer combining detection, recognition, and attributes.
+
+    This class provides a high-level interface for face analysis by combining
+    multiple components: face detection, recognition (embedding extraction),
+    and attribute prediction (age, gender, race).
+
+    Args:
+        detector: Face detector instance for detecting faces in images.
+        recognizer: Optional face recognizer for extracting embeddings.
+        age_gender: Optional age/gender predictor.
+        fairface: Optional FairFace predictor for demographics.
+
+    Example:
+        >>> from uniface import RetinaFace, ArcFace, FaceAnalyzer
+        >>> detector = RetinaFace()
+        >>> recognizer = ArcFace()
+        >>> analyzer = FaceAnalyzer(detector, recognizer=recognizer)
+        >>> faces = analyzer.analyze(image)
+    """
 
     def __init__(
         self,
         detector: BaseDetector,
-        recognizer: Optional[BaseRecognizer] = None,
-        age_gender: Optional[AgeGender] = None,
-        fairface: Optional[FairFace] = None,
+        recognizer: BaseRecognizer | None = None,
+        age_gender: AgeGender | None = None,
+        fairface: FairFace | None = None,
     ) -> None:
         self.detector = detector
         self.recognizer = recognizer
@@ -39,8 +57,18 @@ class FaceAnalyzer:
     if fairface:
         Logger.info(f' - FairFace enabled: {fairface.__class__.__name__}')
 
-    def analyze(self, image: np.ndarray) -> List[Face]:
-        """Analyze faces in an image."""
+    def analyze(self, image: np.ndarray) -> list[Face]:
+        """Analyze faces in an image.
+
+        Performs face detection and optionally extracts embeddings and
+        predicts attributes for each detected face.
+
+        Args:
+            image: Input image as numpy array with shape (H, W, C) in BGR format.
+
+        Returns:
+            List of Face objects with detection results and any predicted attributes.
+        """
         faces = self.detector.detect(image)
         Logger.debug(f'Detected {len(faces)} face(s)')
 
@@ -2,7 +2,9 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Any, Dict, List, Union
+from __future__ import annotations
+
+from typing import Any
 
 import numpy as np
 
@@ -10,6 +12,7 @@ from uniface.attribute.age_gender import AgeGender
 from uniface.attribute.base import Attribute, AttributeResult
 from uniface.attribute.fairface import FairFace
 from uniface.constants import AgeGenderWeights, DDAMFNWeights, FairFaceWeights
+from uniface.face import Face
 
 # Emotion requires PyTorch - make it optional
 try:
@@ -32,17 +35,17 @@ __all__ = [
 
 # A mapping from model enums to their corresponding attribute classes
 _ATTRIBUTE_MODELS = {
-    **{model: AgeGender for model in AgeGenderWeights},
-    **{model: FairFace for model in FairFaceWeights},
+    **dict.fromkeys(AgeGenderWeights, AgeGender),
+    **dict.fromkeys(FairFaceWeights, FairFace),
 }
 
 # Add Emotion models only if PyTorch is available
 if _EMOTION_AVAILABLE:
-    _ATTRIBUTE_MODELS.update({model: Emotion for model in DDAMFNWeights})
+    _ATTRIBUTE_MODELS.update(dict.fromkeys(DDAMFNWeights, Emotion))
 
 
 def create_attribute_predictor(
-    model_name: Union[AgeGenderWeights, DDAMFNWeights, FairFaceWeights], **kwargs: Any
+    model_name: AgeGenderWeights | DDAMFNWeights | FairFaceWeights, **kwargs: Any
 ) -> Attribute:
     """
     Factory function to create an attribute predictor instance.
@@ -75,46 +78,36 @@ def create_attribute_predictor(
     return model_class(model_name=model_name, **kwargs)
 
 
-def predict_attributes(
-    image: np.ndarray, detections: List[Dict[str, np.ndarray]], predictor: Attribute
-) -> List[Dict[str, Any]]:
+def predict_attributes(image: np.ndarray, faces: list[Face], predictor: Attribute) -> list[Face]:
     """
     High-level API to predict attributes for multiple detected faces.
 
-    This function iterates through a list of face detections, runs the
-    specified attribute predictor on each one, and appends the results back
-    into the detection dictionary.
+    This function iterates through a list of Face objects, runs the
+    specified attribute predictor on each one, and updates the Face
+    objects with the predicted attributes.
 
     Args:
         image (np.ndarray): The full input image in BGR format.
-        detections (List[Dict]): A list of detection results, where each dict
-            must contain a 'bbox' and optionally 'landmark'.
+        faces (List[Face]): A list of Face objects from face detection.
         predictor (Attribute): An initialized attribute predictor instance,
             created by `create_attribute_predictor`.
 
     Returns:
-        The list of detections, where each dictionary is updated with a new
-        'attributes' key containing the prediction result.
+        List[Face]: The list of Face objects with updated attribute fields.
     """
-    for face in detections:
-        # Initialize attributes dict if it doesn't exist
-        if 'attributes' not in face:
-            face['attributes'] = {}
-
+    for face in faces:
         if isinstance(predictor, AgeGender):
-            result = predictor(image, face['bbox'])
-            face['attributes']['gender'] = result.gender
-            face['attributes']['sex'] = result.sex
-            face['attributes']['age'] = result.age
+            result = predictor(image, face.bbox)
+            face.gender = result.gender
+            face.age = result.age
         elif isinstance(predictor, FairFace):
-            result = predictor(image, face['bbox'])
-            face['attributes']['gender'] = result.gender
-            face['attributes']['sex'] = result.sex
-            face['attributes']['age_group'] = result.age_group
-            face['attributes']['race'] = result.race
+            result = predictor(image, face.bbox)
+            face.gender = result.gender
+            face.age_group = result.age_group
+            face.race = result.race
        elif isinstance(predictor, Emotion):
-            emotion, confidence = predictor(image, face['landmark'])
-            face['attributes']['emotion'] = emotion
-            face['attributes']['confidence'] = confidence
+            emotion, confidence = predictor(image, face.landmarks)
+            face.emotion = emotion
+            face.emotion_confidence = confidence
 
-    return detections
+    return faces
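The dict.fromkeys rewrite above is behavior-preserving: with a value argument it builds the same mapping as the comprehension it replaces, every enum member mapped to the same class object. A quick check with stand-in names:

from enum import Enum

class Weights(Enum):  # stand-in for AgeGenderWeights
    A = 'a'
    B = 'b'

class AgeGender:  # stand-in for the predictor class
    pass

assert {m: AgeGender for m in Weights} == dict.fromkeys(Weights, AgeGender)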
@@ -2,7 +2,6 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import List, Optional, Tuple, Union
 
 import cv2
 import numpy as np
@@ -35,7 +34,7 @@ class AgeGender(Attribute):
     def __init__(
         self,
         model_name: AgeGenderWeights = AgeGenderWeights.DEFAULT,
-        input_size: Optional[Tuple[int, int]] = None,
+        input_size: tuple[int, int] | None = None,
     ) -> None:
         """
         Initializes the AgeGender prediction model.
@@ -81,7 +80,7 @@ class AgeGender(Attribute):
         )
         raise RuntimeError(f'Failed to initialize AgeGender model: {e}') from e
 
-    def preprocess(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> np.ndarray:
+    def preprocess(self, image: np.ndarray, bbox: list | np.ndarray) -> np.ndarray:
         """
         Aligns the face based on the bounding box and preprocesses it for inference.
 
@@ -127,7 +126,7 @@ class AgeGender(Attribute):
         age = int(np.round(prediction[2] * 100))
         return AttributeResult(gender=gender, age=age)
 
-    def predict(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> AttributeResult:
+    def predict(self, image: np.ndarray, bbox: list | np.ndarray) -> AttributeResult:
         """
         Predicts age and gender for a single face specified by a bounding box.
 
@@ -4,7 +4,7 @@
 
 from abc import ABC, abstractmethod
 from dataclasses import dataclass
-from typing import Any, Optional
+from typing import Any
 
 import numpy as np
 
@@ -38,7 +38,7 @@ class AttributeResult:
     25
 
     >>> # FairFace result
-    >>> result = AttributeResult(gender=0, age_group="20-29", race="East Asian")
+    >>> result = AttributeResult(gender=0, age_group='20-29', race='East Asian')
     >>> result.sex
     'Female'
     >>> result.race
@@ -46,9 +46,9 @@ class AttributeResult:
     """
 
     gender: int
-    age: Optional[int] = None
-    age_group: Optional[str] = None
-    race: Optional[str] = None
+    age: int | None = None
+    age_group: str | None = None
+    race: str | None = None
 
     @property
     def sex(self) -> str:
@@ -2,7 +2,6 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import List, Tuple, Union
 
 import cv2
 import numpy as np
@@ -29,7 +28,7 @@ class Emotion(Attribute):
     def __init__(
         self,
         model_weights: DDAMFNWeights = DDAMFNWeights.AFFECNET7,
-        input_size: Tuple[int, int] = (112, 112),
+        input_size: tuple[int, int] = (112, 112),
     ) -> None:
         """
         Initializes the emotion recognition model.
@@ -81,7 +80,7 @@ class Emotion(Attribute):
         Logger.error(f"Failed to load Emotion model from '{self.model_path}'", exc_info=True)
         raise RuntimeError(f'Failed to initialize Emotion model: {e}') from e
 
-    def preprocess(self, image: np.ndarray, landmark: Union[List, np.ndarray]) -> torch.Tensor:
+    def preprocess(self, image: np.ndarray, landmark: list | np.ndarray) -> torch.Tensor:
         """
         Aligns the face using landmarks and preprocesses it into a tensor.
 
@@ -106,7 +105,7 @@ class Emotion(Attribute):
 
         return torch.from_numpy(transposed_image).unsqueeze(0).to(self.device)
 
-    def postprocess(self, prediction: torch.Tensor) -> Tuple[str, float]:
+    def postprocess(self, prediction: torch.Tensor) -> tuple[str, float]:
         """
         Processes the raw model output to get the emotion label and confidence score.
         """
@@ -116,7 +115,7 @@ class Emotion(Attribute):
         confidence = float(probabilities[pred_index])
         return emotion_label, confidence
 
-    def predict(self, image: np.ndarray, landmark: Union[List, np.ndarray]) -> Tuple[str, float]:
+    def predict(self, image: np.ndarray, landmark: list | np.ndarray) -> tuple[str, float]:
         """
         Predicts the emotion from a single face specified by its landmarks.
         """
@@ -2,7 +2,6 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import List, Optional, Tuple, Union
 
 import cv2
 import numpy as np
@@ -13,7 +12,7 @@ from uniface.log import Logger
 from uniface.model_store import verify_model_weights
 from uniface.onnx_utils import create_onnx_session
 
-__all__ = ['FairFace', 'RACE_LABELS', 'AGE_LABELS']
+__all__ = ['AGE_LABELS', 'RACE_LABELS', 'FairFace']
 
 # Label definitions
 RACE_LABELS = [
@@ -49,7 +48,7 @@ class FairFace(Attribute):
     def __init__(
         self,
         model_name: FairFaceWeights = FairFaceWeights.DEFAULT,
-        input_size: Optional[Tuple[int, int]] = None,
+        input_size: tuple[int, int] | None = None,
     ) -> None:
         """
         Initializes the FairFace prediction model.
@@ -82,7 +81,7 @@ class FairFace(Attribute):
         )
         raise RuntimeError(f'Failed to initialize FairFace model: {e}') from e
 
-    def preprocess(self, image: np.ndarray, bbox: Optional[Union[List, np.ndarray]] = None) -> np.ndarray:
+    def preprocess(self, image: np.ndarray, bbox: list | np.ndarray | None = None) -> np.ndarray:
         """
         Preprocesses the face image for inference.
 
@@ -130,7 +129,7 @@ class FairFace(Attribute):
 
         return image
 
-    def postprocess(self, prediction: Tuple[np.ndarray, np.ndarray, np.ndarray]) -> AttributeResult:
+    def postprocess(self, prediction: tuple[np.ndarray, np.ndarray, np.ndarray]) -> AttributeResult:
         """
         Processes the raw model output to extract race, gender, and age.
 
@@ -162,7 +161,7 @@ class FairFace(Attribute):
         race=RACE_LABELS[race_idx],
     )
 
-    def predict(self, image: np.ndarray, bbox: Optional[Union[List, np.ndarray]] = None) -> AttributeResult:
+    def predict(self, image: np.ndarray, bbox: list | np.ndarray | None = None) -> AttributeResult:
         """
         Predicts race, gender, and age for a face.
 
@@ -2,34 +2,42 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
+from __future__ import annotations
+
 import itertools
 import math
-from typing import List, Optional, Tuple
 
 import cv2
 import numpy as np
 
 __all__ = [
-    'resize_image',
-    'generate_anchors',
-    'non_max_suppression',
     'decode_boxes',
     'decode_landmarks',
     'distance2bbox',
     'distance2kps',
+    'generate_anchors',
+    'non_max_suppression',
+    'resize_image',
 ]
 
 
-def resize_image(frame, target_shape: Tuple[int, int] = (640, 640)) -> Tuple[np.ndarray, float]:
-    """
-    Resize an image to fit within a target shape while keeping its aspect ratio.
-
+def resize_image(
+    frame: np.ndarray,
+    target_shape: tuple[int, int] = (640, 640),
+) -> tuple[np.ndarray, float]:
+    """Resize an image to fit within a target shape while keeping its aspect ratio.
+
+    The image is resized to fit within the target dimensions and placed on a
+    blank canvas (zero-padded to target size).
+
     Args:
-        frame (np.ndarray): Input image.
-        target_shape (Tuple[int, int]): Target size (width, height). Defaults to (640, 640).
+        frame: Input image with shape (H, W, C).
+        target_shape: Target size as (width, height). Defaults to (640, 640).
 
     Returns:
-        Tuple[np.ndarray, float]: Resized image on a blank canvas and the resize factor.
+        A tuple containing:
+        - Resized image on a blank canvas with shape (height, width, 3).
+        - The resize factor as a float.
     """
     width, height = target_shape
 
@@ -53,16 +61,16 @@ def resize_image(frame, target_shape: Tuple[int, int] = (640, 640)) -> Tuple[np.
     return image, resize_factor
 
 
-def generate_anchors(image_size: Tuple[int, int] = (640, 640)) -> np.ndarray:
-    """
-    Generate anchor boxes for a given image size (RetinaFace specific).
+def generate_anchors(image_size: tuple[int, int] = (640, 640)) -> np.ndarray:
+    """Generate anchor boxes for a given image size (RetinaFace specific).
 
     Args:
-        image_size (Tuple[int, int]): Input image size (width, height). Defaults to (640, 640).
+        image_size: Input image size as (width, height). Defaults to (640, 640).
 
     Returns:
-        np.ndarray: Anchor box coordinates as a NumPy array with shape (num_anchors, 4).
+        Anchor box coordinates as a numpy array with shape (num_anchors, 4).
     """
+    # RetinaFace FPN strides and corresponding anchor sizes per level
     steps = [8, 16, 32]
     min_sizes = [[16, 32], [64, 128], [256, 512]]
 
@@ -85,16 +93,15 @@ def generate_anchors(image_size: Tuple[int, int] = (640, 640)) -> np.ndarray:
     return output
 
 
-def non_max_suppression(dets: np.ndarray, threshold: float) -> List[int]:
-    """
-    Apply Non-Maximum Suppression (NMS) to reduce overlapping bounding boxes based on a threshold.
+def non_max_suppression(dets: np.ndarray, threshold: float) -> list[int]:
+    """Apply Non-Maximum Suppression (NMS) to reduce overlapping bounding boxes.
 
     Args:
-        dets (np.ndarray): Array of detections with each row as [x1, y1, x2, y2, score].
-        threshold (float): IoU threshold for suppression.
+        dets: Array of detections with each row as [x1, y1, x2, y2, score].
+        threshold: IoU threshold for suppression.
 
     Returns:
-        List[int]: Indices of bounding boxes retained after suppression.
+        Indices of bounding boxes retained after suppression.
     """
     x1 = dets[:, 0]
     y1 = dets[:, 1]
@@ -125,18 +132,22 @@ def non_max_suppression(dets: np.ndarray, threshold: float) -> List[int]:
     return keep
 
 
-def decode_boxes(loc: np.ndarray, priors: np.ndarray, variances: Optional[List[float]] = None) -> np.ndarray:
-    """
-    Decode locations from predictions using priors to undo
-    the encoding done for offset regression at train time (RetinaFace specific).
+def decode_boxes(
+    loc: np.ndarray,
+    priors: np.ndarray,
+    variances: list[float] | None = None,
+) -> np.ndarray:
+    """Decode locations from predictions using priors (RetinaFace specific).
+
+    Undoes the encoding done for offset regression at train time.
 
     Args:
-        loc (np.ndarray): Location predictions for loc layers, shape: [num_priors, 4]
-        priors (np.ndarray): Prior boxes in center-offset form, shape: [num_priors, 4]
-        variances (Optional[List[float]]): Variances of prior boxes. Defaults to [0.1, 0.2].
+        loc: Location predictions for loc layers, shape: [num_priors, 4].
+        priors: Prior boxes in center-offset form, shape: [num_priors, 4].
+        variances: Variances of prior boxes. Defaults to [0.1, 0.2].
 
     Returns:
-        np.ndarray: Decoded bounding box predictions with shape [num_priors, 4]
+        Decoded bounding box predictions with shape [num_priors, 4].
     """
     if variances is None:
         variances = [0.1, 0.2]
@@ -155,18 +166,19 @@ def decode_boxes(loc: np.ndarray, priors: np.ndarray, variances: Optional[List[f
 
 
 def decode_landmarks(
-    predictions: np.ndarray, priors: np.ndarray, variances: Optional[List[float]] = None
+    predictions: np.ndarray,
+    priors: np.ndarray,
+    variances: list[float] | None = None,
 ) -> np.ndarray:
-    """
-    Decode landmark predictions using prior boxes (RetinaFace specific).
+    """Decode landmark predictions using prior boxes (RetinaFace specific).
 
     Args:
-        predictions (np.ndarray): Landmark predictions, shape: [num_priors, 10]
-        priors (np.ndarray): Prior boxes, shape: [num_priors, 4]
-        variances (Optional[List[float]]): Scaling factors for landmark offsets. Defaults to [0.1, 0.2].
+        predictions: Landmark predictions, shape: [num_priors, 10].
+        priors: Prior boxes, shape: [num_priors, 4].
+        variances: Scaling factors for landmark offsets. Defaults to [0.1, 0.2].
 
     Returns:
-        np.ndarray: Decoded landmarks, shape: [num_priors, 10]
+        Decoded landmarks, shape: [num_priors, 10].
     """
     if variances is None:
         variances = [0.1, 0.2]
@@ -187,18 +199,21 @@ def decode_landmarks(
     return landmarks
 
 
-def distance2bbox(points: np.ndarray, distance: np.ndarray, max_shape: Optional[Tuple[int, int]] = None) -> np.ndarray:
-    """
-    Decode distance prediction to bounding box (SCRFD specific).
+def distance2bbox(
+    points: np.ndarray,
+    distance: np.ndarray,
+    max_shape: tuple[int, int] | None = None,
+) -> np.ndarray:
+    """Decode distance prediction to bounding box (SCRFD specific).
 
     Args:
-        points (np.ndarray): Anchor points with shape (n, 2), [x, y].
-        distance (np.ndarray): Distance from the given point to 4
-            boundaries (left, top, right, bottom) with shape (n, 4).
-        max_shape (Optional[Tuple[int, int]]): Shape of the image (height, width) for clipping.
+        points: Anchor points with shape (n, 2), [x, y].
+        distance: Distance from the given point to 4 boundaries
+            (left, top, right, bottom) with shape (n, 4).
+        max_shape: Shape of the image (height, width) for clipping.
 
     Returns:
-        np.ndarray: Decoded bounding boxes with shape (n, 4) as [x1, y1, x2, y2].
+        Decoded bounding boxes with shape (n, 4) as [x1, y1, x2, y2].
     """
     x1 = points[:, 0] - distance[:, 0]
     y1 = points[:, 1] - distance[:, 1]
@@ -219,17 +234,20 @@ def distance2bbox(points: np.ndarray, distance: np.ndarray, max_shape: Optional[
     return np.stack([x1, y1, x2, y2], axis=-1)
 
 
-def distance2kps(points: np.ndarray, distance: np.ndarray, max_shape: Optional[Tuple[int, int]] = None) -> np.ndarray:
-    """
-    Decode distance prediction to keypoints (SCRFD specific).
+def distance2kps(
+    points: np.ndarray,
+    distance: np.ndarray,
+    max_shape: tuple[int, int] | None = None,
+) -> np.ndarray:
+    """Decode distance prediction to keypoints (SCRFD specific).
 
     Args:
-        points (np.ndarray): Anchor points with shape (n, 2), [x, y].
-        distance (np.ndarray): Distance from the given point to keypoints with shape (n, 2k).
-        max_shape (Optional[Tuple[int, int]]): Shape of the image (height, width) for clipping.
+        points: Anchor points with shape (n, 2), [x, y].
+        distance: Distance from the given point to keypoints with shape (n, 2k).
+        max_shape: Shape of the image (height, width) for clipping.
 
     Returns:
-        np.ndarray: Decoded keypoints with shape (n, 2k).
+        Decoded keypoints with shape (n, 2k).
     """
     preds = []
     for i in range(0, distance.shape[1], 2):
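The decoding arithmetic in distance2bbox is visible in the hunk above; a tiny numpy walk-through of the same formula (the x2/y2 additions mirror the subtraction lines shown, per the standard SCRFD decoding):

import numpy as np

points = np.array([[100.0, 80.0]])               # anchor point (x, y)
distance = np.array([[30.0, 20.0, 40.0, 50.0]])  # left, top, right, bottom

x1 = points[:, 0] - distance[:, 0]
y1 = points[:, 1] - distance[:, 1]
x2 = points[:, 0] + distance[:, 2]
y2 = points[:, 1] + distance[:, 3]

print(np.stack([x1, y1, x2, y2], axis=-1))  # [[ 70.  60. 140. 130.]]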
@@ -3,7 +3,6 @@
 # GitHub: https://github.com/yakhyo
 
 from enum import Enum
-from typing import Dict
 
 
 # fmt: off
@@ -142,7 +141,7 @@ class MiniFASNetWeights(str, Enum):
     V2 = "minifasnet_v2"
 
 
-MODEL_URLS: Dict[Enum, str] = {
+MODEL_URLS: dict[Enum, str] = {
     # RetinaFace
     RetinaFaceWeights.MNET_025: 'https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv1_0.25.onnx',
     RetinaFaceWeights.MNET_050: 'https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv1_0.50.onnx',
@@ -191,7 +190,7 @@ MODEL_URLS: Dict[Enum, str] = {
     MiniFASNetWeights.V2: 'https://github.com/yakhyo/face-anti-spoofing/releases/download/weights/MiniFASNetV2.onnx',
 }
 
-MODEL_SHA256: Dict[Enum, str] = {
+MODEL_SHA256: dict[Enum, str] = {
     # RetinaFace
     RetinaFaceWeights.MNET_025: 'b7a7acab55e104dce6f32cdfff929bd83946da5cd869b9e2e9bdffafd1b7e4a5',
     RetinaFaceWeights.MNET_050: 'd8977186f6037999af5b4113d42ba77a84a6ab0c996b17c713cc3d53b88bfc37',
@@ -2,8 +2,9 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Any, Dict, List
+from __future__ import annotations
+
+from typing import Any
 
 import numpy as np
 
@@ -14,37 +15,40 @@ from .retinaface import RetinaFace
 from .scrfd import SCRFD
 from .yolov5 import YOLOv5Face
 
-# Global cache for detector instances
-_detector_cache: Dict[str, BaseDetector] = {}
+# Global cache for detector instances (keyed by method name + config hash)
+_detector_cache: dict[str, BaseDetector] = {}
 
 
-def detect_faces(image: np.ndarray, method: str = 'retinaface', **kwargs) -> List[Face]:
-    """
-    High-level face detection function.
+def detect_faces(image: np.ndarray, method: str = 'retinaface', **kwargs: Any) -> list[Face]:
+    """High-level face detection function.
+
+    Detects faces in an image using the specified detection method.
+    Results are cached for repeated calls with the same configuration.
 
     Args:
-        image (np.ndarray): Input image as numpy array.
-        method (str): Detection method to use. Options: 'retinaface', 'scrfd', 'yolov5face'.
+        image: Input image as numpy array with shape (H, W, C) in BGR format.
+        method: Detection method to use. Options: 'retinaface', 'scrfd', 'yolov5face'.
        **kwargs: Additional arguments passed to the detector.
 
     Returns:
-        List[Face]: A list of Face objects, each containing:
-            - bbox (np.ndarray): [x1, y1, x2, y2] bounding box coordinates.
-            - confidence (float): The confidence score of the detection.
-            - landmarks (np.ndarray): 5-point facial landmarks with shape (5, 2).
+        A list of Face objects, each containing:
+            - bbox: [x1, y1, x2, y2] bounding box coordinates.
+            - confidence: The confidence score of the detection.
+            - landmarks: 5-point facial landmarks with shape (5, 2).
 
     Example:
         >>> from uniface import detect_faces
-        >>> image = cv2.imread("your_image.jpg")
-        >>> faces = detect_faces(image, method='retinaface', conf_thresh=0.8)
+        >>> import cv2
+        >>> image = cv2.imread('your_image.jpg')
+        >>> faces = detect_faces(image, method='retinaface', confidence_threshold=0.8)
        >>> for face in faces:
-        ...     print(f"Found face with confidence: {face.confidence}")
-        ...     print(f"BBox: {face.bbox}")
+        ...     print(f'Found face with confidence: {face.confidence}')
+        ...     print(f'BBox: {face.bbox}')
    """
    method_name = method.lower()
 
    sorted_kwargs = sorted(kwargs.items())
-    cache_key = f'{method_name}_{str(sorted_kwargs)}'
+    cache_key = f'{method_name}_{sorted_kwargs!s}'
 
    if cache_key not in _detector_cache:
        # Pass kwargs to create the correctly configured detector
@@ -54,49 +58,36 @@ def detect_faces(image: np.ndarray, method: str = 'retinaface', **kwargs) -> Lis
     return detector.detect(image)
 
 
-def create_detector(method: str = 'retinaface', **kwargs) -> BaseDetector:
-    """
-    Factory function to create face detectors.
+def create_detector(method: str = 'retinaface', **kwargs: Any) -> BaseDetector:
+    """Factory function to create face detectors.
 
     Args:
-        method (str): Detection method. Options:
+        method: Detection method. Options:
            - 'retinaface': RetinaFace detector (default)
            - 'scrfd': SCRFD detector (fast and accurate)
            - 'yolov5face': YOLOv5-Face detector (accurate with landmarks)
-        **kwargs: Detector-specific parameters
+        **kwargs: Detector-specific parameters.
 
     Returns:
-        BaseDetector: Initialized detector instance
+        Initialized detector instance.
 
     Raises:
-        ValueError: If method is not supported
+        ValueError: If method is not supported.
 
-    Examples:
+    Example:
        >>> # Basic usage
        >>> detector = create_detector('retinaface')
 
        >>> # SCRFD detector with custom parameters
+        >>> from uniface.constants import SCRFDWeights
        >>> detector = create_detector(
-        ...     'scrfd',
-        ...     model_name=SCRFDWeights.SCRFD_10G_KPS,
-        ...     conf_thresh=0.8,
-        ...     input_size=(640, 640)
+        ...     'scrfd', model_name=SCRFDWeights.SCRFD_10G_KPS, confidence_threshold=0.8, input_size=(640, 640)
        ... )
 
        >>> # RetinaFace detector
+        >>> from uniface.constants import RetinaFaceWeights
        >>> detector = create_detector(
-        ...     'retinaface',
-        ...     model_name=RetinaFaceWeights.MNET_V2,
-        ...     conf_thresh=0.8,
-        ...     nms_thresh=0.4
-        ... )
-
-        >>> # YOLOv5-Face detector
-        >>> detector = create_detector(
-        ...     'yolov5face',
-        ...     model_name=YOLOv5FaceWeights.YOLOV5S,
-        ...     conf_thresh=0.25,
-        ...     nms_thresh=0.45
+        ...     'retinaface', model_name=RetinaFaceWeights.MNET_V2, confidence_threshold=0.8, nms_threshold=0.4
        ... )
    """
    method = method.lower()
@@ -115,12 +106,12 @@ def create_detector(method: str = 'retinaface', **kwargs) -> BaseDetector:
     raise ValueError(f"Unsupported detection method: '{method}'. Available methods: {available_methods}")
 
 
-def list_available_detectors() -> Dict[str, Dict[str, Any]]:
-    """
-    List all available detection methods with their descriptions and parameters.
+def list_available_detectors() -> dict[str, dict[str, Any]]:
+    """List all available detection methods with their descriptions and parameters.
 
     Returns:
-        Dict[str, Dict[str, Any]]: Dictionary of detector information
+        Dictionary mapping detector names to their information including
+        description, landmark support, paper reference, and default parameters.
     """
     return {
         'retinaface': {
@@ -129,8 +120,8 @@ def list_available_detectors() -> Dict[str, Dict[str, Any]]:
             'paper': 'https://arxiv.org/abs/1905.00641',
             'default_params': {
                 'model_name': 'mnet_v2',
-                'conf_thresh': 0.5,
-                'nms_thresh': 0.4,
+                'confidence_threshold': 0.5,
+                'nms_threshold': 0.4,
                 'input_size': (640, 640),
             },
         },
@@ -140,8 +131,8 @@ def list_available_detectors() -> Dict[str, Dict[str, Any]]:
             'paper': 'https://arxiv.org/abs/2105.04714',
             'default_params': {
                 'model_name': 'scrfd_10g_kps',
-                'conf_thresh': 0.5,
-                'nms_thresh': 0.4,
+                'confidence_threshold': 0.5,
+                'nms_threshold': 0.4,
                 'input_size': (640, 640),
             },
         },
@@ -151,8 +142,8 @@ def list_available_detectors() -> Dict[str, Dict[str, Any]]:
             'paper': 'https://arxiv.org/abs/2105.12931',
             'default_params': {
                 'model_name': 'yolov5s_face',
-                'conf_thresh': 0.25,
-                'nms_thresh': 0.45,
+                'confidence_threshold': 0.25,
+                'nms_threshold': 0.45,
                 'input_size': 640,
             },
         },
@@ -160,11 +151,11 @@ def list_available_detectors() -> Dict[str, Dict[str, Any]]:
 
 
 __all__ = [
-    'detect_faces',
-    'create_detector',
-    'list_available_detectors',
     'SCRFD',
+    'BaseDetector',
     'RetinaFace',
     'YOLOv5Face',
-    'BaseDetector',
+    'create_detector',
+    'detect_faces',
+    'list_available_detectors',
 ]
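The cache key built in detect_faces sorts the kwargs first, which makes the cache insensitive to keyword order. A standalone illustration of the same construction (the helper name here is hypothetical):

def make_cache_key(method: str, **kwargs) -> str:
    # Mirrors detect_faces: sorted kwargs -> order-independent key
    sorted_kwargs = sorted(kwargs.items())
    return f'{method.lower()}_{sorted_kwargs!s}'

k1 = make_cache_key('retinaface', confidence_threshold=0.8, nms_threshold=0.4)
k2 = make_cache_key('retinaface', nms_threshold=0.4, confidence_threshold=0.8)
assert k1 == k2  # the same detector instance would be reused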
@@ -2,40 +2,51 @@
|
|||||||
# Author: Yakhyokhuja Valikhujaev
|
# Author: Yakhyokhuja Valikhujaev
|
||||||
# GitHub: https://github.com/yakhyo
|
# GitHub: https://github.com/yakhyo
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
from abc import ABC, abstractmethod
|
from abc import ABC, abstractmethod
|
||||||
from typing import Any, Dict, List
|
from typing import Any
|
||||||
|
|
||||||
import numpy as np
|
import numpy as np
|
||||||
|
|
||||||
from uniface.face import Face
|
from uniface.face import Face
|
||||||
|
|
||||||
|
__all__ = ['BaseDetector']
|
||||||
|
|
||||||
|
|
||||||
class BaseDetector(ABC):
|
class BaseDetector(ABC):
|
||||||
"""
|
"""Abstract base class for all face detectors.
|
||||||
Abstract base class for all face detectors.
|
|
||||||
|
|
||||||
This class defines the interface that all face detectors must implement,
|
This class defines the interface that all face detectors must implement,
|
||||||
ensuring consistency across different detection methods.
|
ensuring consistency across different detection methods.
|
||||||
|
|
||||||
|
Attributes:
|
||||||
|
config: Dictionary containing detector configuration parameters.
|
||||||
|
_supports_landmarks: Flag indicating if detector supports landmark detection.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
def __init__(self, **kwargs):
|
def __init__(self, **kwargs: Any) -> None:
|
||||||
"""Initialize the detector with configuration parameters."""
|
"""Initialize the detector with configuration parameters.
|
||||||
self.config = kwargs
|
|
||||||
|
|
||||||
@abstractmethod
|
|
||||||
def detect(self, image: np.ndarray, **kwargs) -> List[Face]:
|
|
||||||
"""
|
|
||||||
Detect faces in an image.
|
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
image (np.ndarray): Input image as numpy array with shape (H, W, C)
|
**kwargs: Detector-specific configuration parameters.
|
||||||
**kwargs: Additional detection parameters
|
"""
|
||||||
|
self.config: dict[str, Any] = kwargs
|
||||||
|
self._supports_landmarks: bool = False
|
||||||
|
|
||||||
|
@abstractmethod
|
||||||
|
def detect(self, image: np.ndarray, **kwargs: Any) -> list[Face]:
|
||||||
|
"""Detect faces in an image.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
image: Input image as numpy array with shape (H, W, C) in BGR format.
|
||||||
|
**kwargs: Additional detection parameters.
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
List[Face]: List of detected Face objects, each containing:
|
List of detected Face objects, each containing:
|
||||||
- bbox (np.ndarray): Bounding box coordinates with shape (4,) as [x1, y1, x2, y2]
|
- bbox: Bounding box coordinates with shape (4,) as [x1, y1, x2, y2].
|
||||||
- confidence (float): Detection confidence score (0.0 to 1.0)
|
- confidence: Detection confidence score (0.0 to 1.0).
|
||||||
- landmarks (np.ndarray): Facial landmarks with shape (5, 2) for 5-point landmarks
|
- landmarks: Facial landmarks with shape (5, 2) for 5-point landmarks.
|
||||||
|
|
||||||
Example:
|
Example:
|
||||||
>>> faces = detector.detect(image)
|
>>> faces = detector.detect(image)
|
||||||
@@ -44,34 +55,29 @@ class BaseDetector(ABC):
|
|||||||
... confidence = face.confidence # float
|
... confidence = face.confidence # float
|
||||||
... landmarks = face.landmarks # np.ndarray with shape (5, 2)
|
... landmarks = face.landmarks # np.ndarray with shape (5, 2)
|
||||||
"""
|
"""
|
||||||
pass
|
|
||||||
|
|
||||||
@abstractmethod
|
@abstractmethod
|
||||||
def preprocess(self, image: np.ndarray) -> np.ndarray:
|
def preprocess(self, image: np.ndarray) -> np.ndarray:
|
||||||
"""
|
"""Preprocess input image for detection.
|
||||||
Preprocess input image for detection.
|
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
image (np.ndarray): Input image
|
image: Input image with shape (H, W, C).
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
np.ndarray: Preprocessed image tensor
|
Preprocessed image tensor ready for inference.
|
||||||
"""
|
"""
|
||||||
pass
|
|
||||||
|
|
||||||
@abstractmethod
|
@abstractmethod
|
||||||
def postprocess(self, outputs, **kwargs) -> Any:
|
def postprocess(self, outputs: Any, **kwargs: Any) -> Any:
|
||||||
"""
|
"""Postprocess model outputs to get final detections.
|
||||||
Postprocess model outputs to get final detections.
|
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
outputs: Raw model outputs
|
outputs: Raw model outputs.
|
||||||
**kwargs: Additional postprocessing parameters
|
**kwargs: Additional postprocessing parameters.
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
Any: Processed outputs (implementation-specific format, typically tuple of arrays)
|
Processed outputs (implementation-specific format, typically tuple of arrays).
|
||||||
"""
|
"""
|
||||||
pass
|
|
||||||
|
|
||||||
def __str__(self) -> str:
|
def __str__(self) -> str:
|
||||||
"""String representation of the detector."""
|
"""String representation of the detector."""
|
||||||
@@ -83,20 +89,18 @@ class BaseDetector(ABC):
|
|||||||
|
|
||||||
@property
|
@property
|
||||||
def supports_landmarks(self) -> bool:
|
def supports_landmarks(self) -> bool:
|
||||||
"""
|
"""Whether this detector supports landmark detection.
|
||||||
Whether this detector supports landmark detection.
|
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
bool: True if landmarks are supported, False otherwise
|
True if landmarks are supported, False otherwise.
|
||||||
"""
|
"""
|
||||||
return hasattr(self, '_supports_landmarks') and self._supports_landmarks
|
return hasattr(self, '_supports_landmarks') and self._supports_landmarks
|
||||||
|
|
||||||
def get_info(self) -> Dict[str, Any]:
|
def get_info(self) -> dict[str, Any]:
|
||||||
"""
|
"""Get detector information and configuration.
|
||||||
Get detector information and configuration.
|
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
Dict[str, Any]: Detector information
|
Dictionary containing detector name, landmark support, and config.
|
||||||
"""
|
"""
|
||||||
return {
|
return {
|
||||||
'name': self.__class__.__name__,
|
'name': self.__class__.__name__,
|
||||||
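Because `__init__` now owns `config` and `_supports_landmarks`, a concrete detector only needs to fill in the three abstract methods. A minimal subclass sketch against the interface above (the module path and the `Face` constructor keywords are assumptions based on the documented fields; the detection logic is a placeholder):

    import numpy as np

    from uniface.detection.base import BaseDetector  # module path assumed
    from uniface.face import Face

    class CenterBoxDetector(BaseDetector):
        """Toy detector: always returns one centered box."""

        def preprocess(self, image: np.ndarray) -> np.ndarray:
            return image.astype(np.float32)[None]  # add batch dimension

        def postprocess(self, outputs, **kwargs):
            return outputs  # nothing to decode in this toy example

        def detect(self, image: np.ndarray, **kwargs) -> list:
            h, w = image.shape[:2]
            bbox = np.array([w * 0.25, h * 0.25, w * 0.75, h * 0.75])
            return [Face(bbox=bbox, confidence=1.0, landmarks=np.zeros((5, 2)))]
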
@@ -2,7 +2,9 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Any, List, Literal, Tuple
+from __future__ import annotations
+
+from typing import Any, Literal
 
 import numpy as np
 
@@ -32,8 +34,8 @@ class RetinaFace(BaseDetector):
 
     Args:
         model_name (RetinaFaceWeights): Model weights to use. Defaults to `RetinaFaceWeights.MNET_V2`.
-        conf_thresh (float): Confidence threshold for filtering detections. Defaults to 0.5.
-        nms_thresh (float): Non-maximum suppression (NMS) IoU threshold. Defaults to 0.4.
+        confidence_threshold (float): Confidence threshold for filtering detections. Defaults to 0.5.
+        nms_threshold (float): Non-maximum suppression (NMS) IoU threshold. Defaults to 0.4.
         input_size (Tuple[int, int]): Fixed input size (width, height) if `dynamic_size=False`.
             Defaults to (640, 640).
             Note: Non-default sizes may cause slower inference and CoreML compatibility issues.
@@ -44,8 +46,8 @@ class RetinaFace(BaseDetector):
 
     Attributes:
         model_name (RetinaFaceWeights): Selected model variant.
-        conf_thresh (float): Threshold for confidence-based filtering.
-        nms_thresh (float): IoU threshold used for NMS.
+        confidence_threshold (float): Threshold for confidence-based filtering.
+        nms_threshold (float): IoU threshold used for NMS.
         pre_nms_topk (int): Limit on proposals before applying NMS.
         post_nms_topk (int): Limit on retained detections after NMS.
         dynamic_size (bool): Flag indicating dynamic or static input sizing.
@@ -63,23 +65,23 @@ class RetinaFace(BaseDetector):
         self,
         *,
         model_name: RetinaFaceWeights = RetinaFaceWeights.MNET_V2,
-        conf_thresh: float = 0.5,
-        nms_thresh: float = 0.4,
-        input_size: Tuple[int, int] = (640, 640),
+        confidence_threshold: float = 0.5,
+        nms_threshold: float = 0.4,
+        input_size: tuple[int, int] = (640, 640),
         **kwargs: Any,
     ) -> None:
         super().__init__(
             model_name=model_name,
-            conf_thresh=conf_thresh,
-            nms_thresh=nms_thresh,
+            confidence_threshold=confidence_threshold,
+            nms_threshold=nms_threshold,
             input_size=input_size,
             **kwargs,
         )
         self._supports_landmarks = True  # RetinaFace supports landmarks
 
         self.model_name = model_name
-        self.conf_thresh = conf_thresh
-        self.nms_thresh = nms_thresh
+        self.confidence_threshold = confidence_threshold
+        self.nms_threshold = nms_threshold
         self.input_size = input_size
 
         # Advanced options from kwargs
@@ -88,8 +90,8 @@ class RetinaFace(BaseDetector):
         self.dynamic_size = kwargs.get('dynamic_size', False)
 
         Logger.info(
-            f'Initializing RetinaFace with model={self.model_name}, conf_thresh={self.conf_thresh}, '
-            f'nms_thresh={self.nms_thresh}, input_size={self.input_size}'
+            f'Initializing RetinaFace with model={self.model_name}, confidence_threshold={self.confidence_threshold}, '
+            f'nms_threshold={self.nms_threshold}, input_size={self.input_size}'
         )
 
         # Get path to model weights
@@ -105,14 +107,13 @@ class RetinaFace(BaseDetector):
         self._initialize_model(self._model_path)
 
     def _initialize_model(self, model_path: str) -> None:
-        """
-        Initializes an ONNX model session from the given path.
+        """Initialize an ONNX model session from the given path.
 
         Args:
-            model_path (str): The file path to the ONNX model.
+            model_path: The file path to the ONNX model.
 
         Raises:
-            RuntimeError: If the model fails to load, logs an error and raises an exception.
+            RuntimeError: If the model fails to load.
         """
         try:
             self.session = create_onnx_session(model_path)
@@ -137,14 +138,14 @@ class RetinaFace(BaseDetector):
         image = np.expand_dims(image, axis=0)  # Add batch dimension (1, C, H, W)
         return image
 
-    def inference(self, input_tensor: np.ndarray) -> List[np.ndarray]:
+    def inference(self, input_tensor: np.ndarray) -> list[np.ndarray]:
         """Perform model inference on the preprocessed image tensor.
 
         Args:
-            input_tensor (np.ndarray): Preprocessed input tensor.
+            input_tensor: Preprocessed input tensor with shape (1, C, H, W).
 
         Returns:
-            Tuple[np.ndarray, np.ndarray]: Raw model outputs.
+            List of raw model outputs (location, confidence, landmarks).
         """
         return self.session.run(self.output_names, {self.input_names: input_tensor})
 
@@ -155,7 +156,7 @@ class RetinaFace(BaseDetector):
         max_num: int = 0,
         metric: Literal['default', 'max'] = 'max',
         center_weight: float = 2.0,
-    ) -> List[Face]:
+    ) -> list[Face]:
         """
         Perform face detection on an input image and return bounding boxes and facial landmarks.
 
@@ -240,41 +241,43 @@ class RetinaFace(BaseDetector):
         return faces
 
     def postprocess(
-        self, outputs: List[np.ndarray], resize_factor: float, shape: Tuple[int, int]
-    ) -> Tuple[np.ndarray, np.ndarray]:
-        """
-        Process the model outputs into final detection results.
+        self,
+        outputs: list[np.ndarray],
+        resize_factor: float,
+        shape: tuple[int, int],
+    ) -> tuple[np.ndarray, np.ndarray]:
+        """Process the model outputs into final detection results.
 
         Args:
-            outputs (List[np.ndarray]): Raw outputs from the detection model.
+            outputs: Raw outputs from the detection model containing:
                 - outputs[0]: Location predictions (bounding box coordinates).
                 - outputs[1]: Class confidence scores.
                 - outputs[2]: Landmark predictions.
-            resize_factor (float): Factor used to resize the input image during preprocessing.
-            shape (Tuple[int, int]): Original shape of the image as (height, width).
+            resize_factor: Factor used to resize the input image during preprocessing.
+            shape: Original shape of the image as (width, height).
 
         Returns:
-            Tuple[np.ndarray, np.ndarray]: Processed results containing:
-                - detections (np.ndarray): Array of detected bounding boxes with confidence scores.
-                    Shape: (num_detections, 5), where each row is [x_min, y_min, x_max, y_max, score].
-                - landmarks (np.ndarray): Array of detected facial landmarks.
-                    Shape: (num_detections, 5, 2), where each row contains 5 landmark points (x, y).
+            A tuple containing:
+                - detections: Array of detected bounding boxes with confidence scores,
+                    shape (num_detections, 5), each row is [x1, y1, x2, y2, score].
+                - landmarks: Array of detected facial landmarks,
                    shape (num_detections, 5, 2), each row contains 5 landmark points (x, y).
         """
-        loc, conf, landmarks = (
+        location_predictions, confidence_scores, landmark_predictions = (
             outputs[0].squeeze(0),
             outputs[1].squeeze(0),
             outputs[2].squeeze(0),
         )
 
         # Decode boxes and landmarks
-        boxes = decode_boxes(loc, self._priors)
-        landmarks = decode_landmarks(landmarks, self._priors)
+        boxes = decode_boxes(location_predictions, self._priors)
+        landmarks = decode_landmarks(landmark_predictions, self._priors)
 
         boxes, landmarks = self._scale_detections(boxes, landmarks, resize_factor, shape=(shape[0], shape[1]))
 
         # Extract confidence scores for the face class
-        scores = conf[:, 1]
-        mask = scores > self.conf_thresh
+        scores = confidence_scores[:, 1]
+        mask = scores > self.confidence_threshold
 
         # Filter by confidence threshold
         boxes, landmarks, scores = boxes[mask], landmarks[mask], scores[mask]
@@ -285,7 +288,7 @@ class RetinaFace(BaseDetector):
 
         # Apply NMS
         detections = np.hstack((boxes, scores[:, np.newaxis])).astype(np.float32, copy=False)
-        keep = non_max_suppression(detections, self.nms_thresh)
+        keep = non_max_suppression(detections, self.nms_threshold)
         detections, landmarks = detections[keep], landmarks[keep]
 
         # Keep top-k detections
@@ -303,9 +306,9 @@ class RetinaFace(BaseDetector):
         boxes: np.ndarray,
         landmarks: np.ndarray,
         resize_factor: float,
-        shape: Tuple[int, int],
-    ) -> Tuple[np.ndarray, np.ndarray]:
-        # Scale bounding boxes and landmarks to the original image size.
+        shape: tuple[int, int],
+    ) -> tuple[np.ndarray, np.ndarray]:
+        """Scale bounding boxes and landmarks to the original image size."""
         bbox_scale = np.array([shape[0], shape[1]] * 2)
         boxes = boxes * bbox_scale / resize_factor
 

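With the keyword renames applied throughout the class, construction and detection look like this minimal sketch (the top-level import path and the image file are assumptions):

    import cv2

    from uniface import RetinaFace  # import path assumed
    from uniface.constants import RetinaFaceWeights

    detector = RetinaFace(
        model_name=RetinaFaceWeights.MNET_V2,
        confidence_threshold=0.5,  # formerly conf_thresh
        nms_threshold=0.4,         # formerly nms_thresh
    )
    image = cv2.imread('group_photo.jpg')  # hypothetical input
    for face in detector.detect(image):
        print(face.bbox, face.confidence, face.landmarks.shape)
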
@@ -2,7 +2,9 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Any, List, Literal, Tuple
+from __future__ import annotations
+
+from typing import Any, Literal
 
 import numpy as np
 
@@ -29,8 +31,8 @@ class SCRFD(BaseDetector):
     Args:
         model_name (SCRFDWeights): Predefined model enum (e.g., `SCRFD_10G_KPS`).
             Specifies the SCRFD variant to load. Defaults to SCRFD_10G_KPS.
-        conf_thresh (float): Confidence threshold for filtering detections. Defaults to 0.5.
-        nms_thresh (float): Non-Maximum Suppression threshold. Defaults to 0.4.
+        confidence_threshold (float): Confidence threshold for filtering detections. Defaults to 0.5.
+        nms_threshold (float): Non-Maximum Suppression threshold. Defaults to 0.4.
         input_size (Tuple[int, int]): Input image size (width, height).
             Defaults to (640, 640).
             Note: Non-default sizes may cause slower inference and CoreML compatibility issues.
@@ -38,10 +40,10 @@ class SCRFD(BaseDetector):
 
     Attributes:
         model_name (SCRFDWeights): Selected model variant.
-        conf_thresh (float): Threshold used to filter low-confidence detections.
-        nms_thresh (float): Threshold used during NMS to suppress overlapping boxes.
+        confidence_threshold (float): Threshold used to filter low-confidence detections.
+        nms_threshold (float): Threshold used during NMS to suppress overlapping boxes.
         input_size (Tuple[int, int]): Image size to which inputs are resized before inference.
-        _fmc (int): Number of feature map levels used in the model.
+        _num_feature_maps (int): Number of feature map levels used in the model.
         _feat_stride_fpn (List[int]): Feature map strides corresponding to each detection level.
         _num_anchors (int): Number of anchors per feature location.
         _center_cache (Dict): Cached anchor centers for efficient forward passes.
@@ -56,35 +58,35 @@ class SCRFD(BaseDetector):
         self,
         *,
         model_name: SCRFDWeights = SCRFDWeights.SCRFD_10G_KPS,
-        conf_thresh: float = 0.5,
-        nms_thresh: float = 0.4,
-        input_size: Tuple[int, int] = (640, 640),
+        confidence_threshold: float = 0.5,
+        nms_threshold: float = 0.4,
+        input_size: tuple[int, int] = (640, 640),
         **kwargs: Any,
     ) -> None:
         super().__init__(
             model_name=model_name,
-            conf_thresh=conf_thresh,
-            nms_thresh=nms_thresh,
+            confidence_threshold=confidence_threshold,
+            nms_threshold=nms_threshold,
             input_size=input_size,
             **kwargs,
         )
         self._supports_landmarks = True  # SCRFD supports landmarks
 
         self.model_name = model_name
-        self.conf_thresh = conf_thresh
-        self.nms_thresh = nms_thresh
+        self.confidence_threshold = confidence_threshold
+        self.nms_threshold = nms_threshold
         self.input_size = input_size
 
         # ------- SCRFD model params ------
-        self._fmc = 3
+        self._num_feature_maps = 3
         self._feat_stride_fpn = [8, 16, 32]
         self._num_anchors = 2
         self._center_cache = {}
         # ---------------------------------
 
         Logger.info(
-            f'Initializing SCRFD with model={self.model_name}, conf_thresh={self.conf_thresh}, '
-            f'nms_thresh={self.nms_thresh}, input_size={self.input_size}'
+            f'Initializing SCRFD with model={self.model_name}, confidence_threshold={self.confidence_threshold}, '
+            f'nms_threshold={self.nms_threshold}, input_size={self.input_size}'
         )
 
         # Get path to model weights
@@ -95,14 +97,13 @@ class SCRFD(BaseDetector):
         self._initialize_model(self._model_path)
 
     def _initialize_model(self, model_path: str) -> None:
-        """
-        Initializes an ONNX model session from the given path.
+        """Initialize an ONNX model session from the given path.
 
         Args:
-            model_path (str): The file path to the ONNX model.
+            model_path: The file path to the ONNX model.
 
         Raises:
-            RuntimeError: If the model fails to load, logs an error and raises an exception.
+            RuntimeError: If the model fails to load.
         """
         try:
             self.session = create_onnx_session(model_path)
@@ -113,14 +114,14 @@ class SCRFD(BaseDetector):
             Logger.error(f"Failed to load model from '{model_path}': {e}", exc_info=True)
             raise RuntimeError(f"Failed to initialize model session for '{model_path}'") from e
 
-    def preprocess(self, image: np.ndarray) -> Tuple[np.ndarray, Tuple[int, int]]:
+    def preprocess(self, image: np.ndarray) -> np.ndarray:
         """Preprocess image for inference.
 
         Args:
-            image (np.ndarray): Input image
+            image: Input image with shape (H, W, C).
 
         Returns:
-            Tuple[np.ndarray, Tuple[int, int]]: Preprocessed blob and input size
+            Preprocessed image tensor with shape (1, C, H, W).
         """
         image = image.astype(np.float32)
         image = (image - 127.5) / 127.5
@@ -129,29 +130,42 @@ class SCRFD(BaseDetector):
 
         return image
 
-    def inference(self, input_tensor: np.ndarray) -> List[np.ndarray]:
+    def inference(self, input_tensor: np.ndarray) -> list[np.ndarray]:
         """Perform model inference on the preprocessed image tensor.
 
         Args:
-            input_tensor (np.ndarray): Preprocessed input tensor.
+            input_tensor: Preprocessed input tensor with shape (1, C, H, W).
 
         Returns:
-            Tuple[np.ndarray, np.ndarray]: Raw model outputs.
+            List of raw model outputs.
         """
         return self.session.run(self.output_names, {self.input_names: input_tensor})
 
-    def postprocess(self, outputs: List[np.ndarray], image_size: Tuple[int, int]):
-        scores_list = []
+    def postprocess(
+        self,
+        outputs: list[np.ndarray],
+        image_size: tuple[int, int],
+    ) -> tuple[list[np.ndarray], list[np.ndarray], list[np.ndarray]]:
+        """Process model outputs into detection results.
+
+        Args:
+            outputs: Raw outputs from the detection model.
+            image_size: Size of the input image as (height, width).
+
+        Returns:
+            Tuple of (scores_list, bboxes_list, landmarks_list).
+        """
+        scores_list: list[np.ndarray] = []
         bboxes_list = []
         kpss_list = []
 
         image_size = image_size
 
-        fmc = self._fmc
+        num_feature_maps = self._num_feature_maps
         for idx, stride in enumerate(self._feat_stride_fpn):
             scores = outputs[idx]
-            bbox_preds = outputs[fmc + idx] * stride
-            kps_preds = outputs[2 * fmc + idx] * stride
+            bbox_preds = outputs[num_feature_maps + idx] * stride
+            kps_preds = outputs[2 * num_feature_maps + idx] * stride
 
             # Generate anchors
             fm_height = image_size[0] // stride
@@ -171,7 +185,7 @@ class SCRFD(BaseDetector):
             if len(self._center_cache) < 100:
                 self._center_cache[cache_key] = anchor_centers
 
-            pos_indices = np.where(scores >= self.conf_thresh)[0]
+            pos_indices = np.where(scores >= self.confidence_threshold)[0]
             if len(pos_indices) == 0:
                 continue
 
@@ -193,7 +207,7 @@ class SCRFD(BaseDetector):
         max_num: int = 0,
         metric: Literal['default', 'max'] = 'max',
         center_weight: float = 2.0,
-    ) -> List[Face]:
+    ) -> list[Face]:
         """
         Perform face detection on an input image and return bounding boxes and facial landmarks.
 
@@ -247,7 +261,7 @@ class SCRFD(BaseDetector):
         pre_det = np.hstack((bboxes, scores)).astype(np.float32, copy=False)
         pre_det = pre_det[order, :]
 
-        keep = non_max_suppression(pre_det, threshold=self.nms_thresh)
+        keep = non_max_suppression(pre_det, threshold=self.nms_threshold)
 
         detections = pre_det[keep, :]
         landmarks = landmarks[order, :, :]

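The `_fmc` → `_num_feature_maps` rename makes the output indexing self-describing: for three stride levels the ONNX session returns nine arrays, grouped as scores, box offsets, then keypoint offsets. A small standalone sketch of that indexing scheme (not tied to the class):

    # For num_feature_maps = 3 and strides [8, 16, 32], outputs are grouped as:
    #   outputs[idx]                        -> scores at stride level idx
    #   outputs[num_feature_maps + idx]     -> bbox offsets (multiplied by stride)
    #   outputs[2 * num_feature_maps + idx] -> keypoint offsets (multiplied by stride)
    num_feature_maps = 3
    for idx, stride in enumerate([8, 16, 32]):
        print(f'stride {stride}: scores -> {idx}, '
              f'boxes -> {num_feature_maps + idx}, kps -> {2 * num_feature_maps + idx}')
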
@@ -2,7 +2,7 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Any, List, Literal, Tuple
+from typing import Any, Literal
 
 import cv2
 import numpy as np
@@ -30,8 +30,8 @@ class YOLOv5Face(BaseDetector):
     Args:
         model_name (YOLOv5FaceWeights): Predefined model enum (e.g., `YOLOV5S`).
             Specifies the YOLOv5-Face variant to load. Defaults to YOLOV5S.
-        conf_thresh (float): Confidence threshold for filtering detections. Defaults to 0.6.
-        nms_thresh (float): Non-Maximum Suppression threshold. Defaults to 0.5.
+        confidence_threshold (float): Confidence threshold for filtering detections. Defaults to 0.6.
+        nms_threshold (float): Non-Maximum Suppression threshold. Defaults to 0.5.
         input_size (int): Input image size. Defaults to 640.
             Note: ONNX model is fixed at 640. Changing this will cause inference errors.
         **kwargs: Advanced options:
@@ -39,8 +39,8 @@ class YOLOv5Face(BaseDetector):
 
     Attributes:
         model_name (YOLOv5FaceWeights): Selected model variant.
-        conf_thresh (float): Threshold used to filter low-confidence detections.
-        nms_thresh (float): Threshold used during NMS to suppress overlapping boxes.
+        confidence_threshold (float): Threshold used to filter low-confidence detections.
+        nms_threshold (float): Threshold used during NMS to suppress overlapping boxes.
         input_size (int): Image size to which inputs are resized before inference.
         max_det (int): Maximum number of detections to return.
         _model_path (str): Absolute path to the downloaded/verified model weights.
@@ -54,15 +54,15 @@ class YOLOv5Face(BaseDetector):
         self,
         *,
         model_name: YOLOv5FaceWeights = YOLOv5FaceWeights.YOLOV5S,
-        conf_thresh: float = 0.6,
-        nms_thresh: float = 0.5,
+        confidence_threshold: float = 0.6,
+        nms_threshold: float = 0.5,
         input_size: int = 640,
         **kwargs: Any,
     ) -> None:
         super().__init__(
             model_name=model_name,
-            conf_thresh=conf_thresh,
-            nms_thresh=nms_thresh,
+            confidence_threshold=confidence_threshold,
+            nms_threshold=nms_threshold,
             input_size=input_size,
             **kwargs,
         )
@@ -75,16 +75,16 @@ class YOLOv5Face(BaseDetector):
         )
 
         self.model_name = model_name
-        self.conf_thresh = conf_thresh
-        self.nms_thresh = nms_thresh
+        self.confidence_threshold = confidence_threshold
+        self.nms_threshold = nms_threshold
         self.input_size = input_size
 
         # Advanced options from kwargs
         self.max_det = kwargs.get('max_det', 750)
 
         Logger.info(
-            f'Initializing YOLOv5Face with model={self.model_name}, conf_thresh={self.conf_thresh}, '
-            f'nms_thresh={self.nms_thresh}, input_size={self.input_size}'
+            f'Initializing YOLOv5Face with model={self.model_name}, confidence_threshold={self.confidence_threshold}, '
+            f'nms_threshold={self.nms_threshold}, input_size={self.input_size}'
         )
 
         # Get path to model weights
@@ -113,7 +113,7 @@ class YOLOv5Face(BaseDetector):
             Logger.error(f"Failed to load model from '{model_path}': {e}", exc_info=True)
             raise RuntimeError(f"Failed to initialize model session for '{model_path}'") from e
 
-    def preprocess(self, image: np.ndarray) -> Tuple[np.ndarray, float, Tuple[int, int]]:
+    def preprocess(self, image: np.ndarray) -> tuple[np.ndarray, float, tuple[int, int]]:
         """
         Preprocess image for inference.
 
@@ -154,7 +154,7 @@ class YOLOv5Face(BaseDetector):
 
         return img_batch, scale, (pad_w, pad_h)
 
-    def inference(self, input_tensor: np.ndarray) -> List[np.ndarray]:
+    def inference(self, input_tensor: np.ndarray) -> list[np.ndarray]:
         """Perform model inference on the preprocessed image tensor.
 
         Args:
@@ -169,8 +169,8 @@ class YOLOv5Face(BaseDetector):
         self,
         predictions: np.ndarray,
         scale: float,
-        padding: Tuple[int, int],
-    ) -> Tuple[np.ndarray, np.ndarray]:
+        padding: tuple[int, int],
+    ) -> tuple[np.ndarray, np.ndarray]:
         """
         Postprocess model predictions.
 
@@ -190,7 +190,7 @@ class YOLOv5Face(BaseDetector):
         predictions = predictions[0]  # Remove batch dimension
 
         # Filter by confidence
-        mask = predictions[:, 4] >= self.conf_thresh
+        mask = predictions[:, 4] >= self.confidence_threshold
         predictions = predictions[mask]
 
         if len(predictions) == 0:
@@ -207,7 +207,7 @@ class YOLOv5Face(BaseDetector):
 
         # Apply NMS
         detections_for_nms = np.hstack((boxes, scores[:, None])).astype(np.float32, copy=False)
-        keep = non_max_suppression(detections_for_nms, self.nms_thresh)
+        keep = non_max_suppression(detections_for_nms, self.nms_threshold)
 
         if len(keep) == 0:
             return np.array([]), np.array([])
@@ -260,7 +260,7 @@ class YOLOv5Face(BaseDetector):
         max_num: int = 0,
         metric: Literal['default', 'max'] = 'max',
         center_weight: float = 2.0,
-    ) -> List[Face]:
+    ) -> list[Face]:
         """
         Perform face detection on an input image and return bounding boxes and facial landmarks.
 

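YOLOv5Face follows the same rename pattern but keeps its own defaults. A construction sketch (the top-level import path is an assumption):

    from uniface import YOLOv5Face  # import path assumed

    detector = YOLOv5Face(
        confidence_threshold=0.6,  # this detector's default is higher than RetinaFace's
        nms_threshold=0.5,
        input_size=640,            # the ONNX model is fixed at 640
    )
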
@@ -2,8 +2,9 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
+from __future__ import annotations
+
 from dataclasses import dataclass, fields
-from typing import Optional
 
 import numpy as np
 
@@ -29,6 +30,8 @@ class Face:
         age: Predicted exact age in years (optional, from AgeGender model).
         age_group: Predicted age range like "20-29" (optional, from FairFace).
         race: Predicted race/ethnicity (optional, from FairFace).
+        emotion: Predicted emotion label (optional, from Emotion model).
+        emotion_confidence: Confidence score for emotion prediction (optional).
 
     Properties:
         sex: Gender as a human-readable string ("Female" or "Male").
@@ -42,13 +45,15 @@ class Face:
     landmarks: np.ndarray
 
     # Optional attributes
-    embedding: Optional[np.ndarray] = None
-    gender: Optional[int] = None
-    age: Optional[int] = None
-    age_group: Optional[str] = None
-    race: Optional[str] = None
+    embedding: np.ndarray | None = None
+    gender: int | None = None
+    age: int | None = None
+    age_group: str | None = None
+    race: str | None = None
+    emotion: str | None = None
+    emotion_confidence: float | None = None
 
-    def compute_similarity(self, other: 'Face') -> float:
+    def compute_similarity(self, other: Face) -> float:
         """Compute cosine similarity with another face."""
         if self.embedding is None or other.embedding is None:
             raise ValueError('Both faces must have embeddings for similarity computation')
@@ -59,7 +64,7 @@ class Face:
         return {f.name: getattr(self, f.name) for f in fields(self)}
 
     @property
-    def sex(self) -> Optional[str]:
+    def sex(self) -> str | None:
         """Get gender as a string label (Female or Male)."""
         if self.gender is None:
             return None
@@ -85,6 +90,8 @@ class Face:
             parts.append(f'sex={self.sex}')
         if self.race is not None:
             parts.append(f'race={self.race}')
+        if self.emotion is not None:
+            parts.append(f'emotion={self.emotion}')
         if self.embedding is not None:
             parts.append(f'embedding_dim={self.embedding.shape[0]}')
         return ', '.join(parts) + ')'

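The new optional `emotion` fields follow the same `| None` dataclass pattern, and `compute_similarity` still requires embeddings on both faces. A sketch using the dataclass directly (constructor keywords follow the documented fields; embedding size and values are illustrative):

    import numpy as np

    from uniface.face import Face

    a = Face(
        bbox=np.array([10, 10, 50, 50]),
        confidence=0.99,
        landmarks=np.zeros((5, 2)),
        embedding=np.random.rand(512),  # embedding size is illustrative
        emotion='happy',
        emotion_confidence=0.93,
    )
    b = Face(bbox=np.array([60, 10, 100, 50]), confidence=0.97,
             landmarks=np.zeros((5, 2)), embedding=np.random.rand(512))
    print(a.compute_similarity(b))  # cosine similarity
    print(a)                        # repr now includes emotion=happy
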
@@ -2,21 +2,21 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Tuple, Union
+from __future__ import annotations
 
 import cv2
 import numpy as np
 from skimage.transform import SimilarityTransform
 
 __all__ = [
-    'face_alignment',
-    'compute_similarity',
     'bbox_center_alignment',
+    'compute_similarity',
+    'face_alignment',
     'transform_points_2d',
 ]
 
 
-# Reference alignment for facial landmarks (ArcFace)
+# Standard 5-point facial landmark reference for ArcFace alignment (112x112)
 reference_alignment: np.ndarray = np.array(
     [
         [38.2946, 51.6963],
@@ -29,18 +29,21 @@ reference_alignment: np.ndarray = np.array(
 )
 
 
-def estimate_norm(landmark: np.ndarray, image_size: Union[int, Tuple[int, int]] = 112) -> Tuple[np.ndarray, np.ndarray]:
-    """
-    Estimate the normalization transformation matrix for facial landmarks.
+def estimate_norm(
+    landmark: np.ndarray,
+    image_size: int | tuple[int, int] = 112,
+) -> tuple[np.ndarray, np.ndarray]:
+    """Estimate the normalization transformation matrix for facial landmarks.
 
     Args:
-        landmark (np.ndarray): Array of shape (5, 2) representing the coordinates of the facial landmarks.
-        image_size (Union[int, Tuple[int, int]], optional): The size of the output image.
-            Can be an integer (for square images) or a tuple (width, height). Default is 112.
+        landmark: Array of shape (5, 2) representing the coordinates of the facial landmarks.
+        image_size: The size of the output image. Can be an integer (for square images)
+            or a tuple (width, height). Default is 112.
 
     Returns:
-        np.ndarray: The 2x3 transformation matrix for aligning the landmarks.
-        np.ndarray: The 2x3 inverse transformation matrix for aligning the landmarks.
+        A tuple containing:
+            - The 2x3 transformation matrix for aligning the landmarks.
+            - The 2x3 inverse transformation matrix.
 
     Raises:
         AssertionError: If the input landmark array does not have the shape (5, 2)
@@ -80,23 +83,23 @@ def estimate_norm(landmark: np.ndarray, image_size: Union[int, Tuple[int, int]]
 def face_alignment(
     image: np.ndarray,
     landmark: np.ndarray,
-    image_size: Union[int, Tuple[int, int]] = 112,
-) -> Tuple[np.ndarray, np.ndarray]:
-    """
-    Align the face in the input image based on the given facial landmarks.
+    image_size: int | tuple[int, int] = 112,
+) -> tuple[np.ndarray, np.ndarray]:
+    """Align the face in the input image based on the given facial landmarks.
 
     Args:
-        image (np.ndarray): Input image as a NumPy array.
-        landmark (np.ndarray): Array of shape (5, 2) representing the coordinates of the facial landmarks.
-        image_size (Union[int, Tuple[int, int]], optional): The size of the aligned output image.
-            Can be an integer (for square images) or a tuple (width, height). Default is 112.
+        image: Input image as a NumPy array with shape (H, W, C).
+        landmark: Array of shape (5, 2) representing the facial landmark coordinates.
+        image_size: The size of the aligned output image. Can be an integer
+            (for square images) or a tuple (width, height). Default is 112.
 
     Returns:
-        np.ndarray: The aligned face as a NumPy array.
-        np.ndarray: The 2x3 transformation matrix used for alignment.
+        A tuple containing:
+            - The aligned face as a NumPy array.
+            - The 2x3 inverse transformation matrix used for alignment.
     """
     # Get the transformation matrix
-    M, M_inv = estimate_norm(landmark, image_size)
+    transform_matrix, inverse_transform = estimate_norm(landmark, image_size)
 
     # Handle both int and tuple for warpAffine output size
     if isinstance(image_size, int):
@@ -105,44 +108,50 @@ def face_alignment(
         output_size = image_size
 
     # Warp the input image to align the face
-    warped = cv2.warpAffine(image, M, output_size, borderValue=0.0)
+    warped = cv2.warpAffine(image, transform_matrix, output_size, borderValue=0.0)
 
-    return warped, M_inv
+    return warped, inverse_transform
 
 
 def compute_similarity(feat1: np.ndarray, feat2: np.ndarray, normalized: bool = False) -> np.float32:
-    """Computing Similarity between two faces.
+    """Compute cosine similarity between two face embeddings.
 
     Args:
-        feat1 (np.ndarray): First embedding.
-        feat2 (np.ndarray): Second embedding.
-        normalized (bool): Set True if the embeddings are already L2 normalized.
+        feat1: First embedding vector.
+        feat2: Second embedding vector.
+        normalized: Set True if the embeddings are already L2 normalized.
 
     Returns:
-        np.float32: Cosine similarity.
+        Cosine similarity score in range [-1, 1].
     """
     feat1 = feat1.ravel()
     feat2 = feat2.ravel()
     if normalized:
         return np.dot(feat1, feat2)
-    else:
-        return np.dot(feat1, feat2) / (np.linalg.norm(feat1) * np.linalg.norm(feat2) + 1e-5)
+    # Add small epsilon to prevent division by zero
+    return np.dot(feat1, feat2) / (np.linalg.norm(feat1) * np.linalg.norm(feat2) + 1e-5)
 
 
-def bbox_center_alignment(image, center, output_size, scale, rotation):
-    """
-    Apply center-based alignment, scaling, and rotation to an image.
+def bbox_center_alignment(
+    image: np.ndarray,
+    center: tuple[float, float],
+    output_size: int,
+    scale: float,
+    rotation: float,
+) -> tuple[np.ndarray, np.ndarray]:
+    """Apply center-based alignment, scaling, and rotation to an image.
 
     Args:
-        image (np.ndarray): Input image.
-        center (Tuple[float, float]): Center point (e.g., face center from bbox).
-        output_size (int): Desired output image size (square).
-        scale (float): Scaling factor to zoom in/out.
-        rotation (float): Rotation angle in degrees (clockwise).
+        image: Input image with shape (H, W, C).
+        center: Center point (x, y), e.g., face center from bbox.
+        output_size: Desired output image size (square).
+        scale: Scaling factor to zoom in/out.
+        rotation: Rotation angle in degrees (clockwise).
 
     Returns:
-        cropped (np.ndarray): Aligned and cropped image.
-        M (np.ndarray): 2x3 affine transform matrix used.
+        A tuple containing:
+            - Aligned and cropped image with shape (output_size, output_size, C).
+            - 2x3 affine transform matrix used.
     """
 
     # Convert rotation from degrees to radians
@@ -175,15 +184,14 @@ def bbox_center_alignment(image, center, output_size, scale, rotation):
 
 
 def transform_points_2d(points: np.ndarray, transform: np.ndarray) -> np.ndarray:
-    """
-    Apply a 2D affine transformation to an array of 2D points.
+    """Apply a 2D affine transformation to an array of 2D points.
 
     Args:
-        points (np.ndarray): An (N, 2) array of 2D points.
-        transform (np.ndarray): A (2, 3) affine transformation matrix.
+        points: An (N, 2) array of 2D points.
+        transform: A (2, 3) affine transformation matrix.
 
     Returns:
-        np.ndarray: Transformed (N, 2) array of points.
+        Transformed (N, 2) array of points.
     """
     transformed = np.zeros_like(points, dtype=np.float32)
     for i in range(points.shape[0]):

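`face_alignment` returning the inverse transform pairs naturally with `transform_points_2d`: points measured on the aligned 112x112 crop can be mapped back into original image coordinates. A sketch (module path, image file, and landmark values are assumptions):

    import cv2
    import numpy as np

    from uniface.face_utils import face_alignment, transform_points_2d  # path assumed

    image = cv2.imread('portrait.jpg')  # hypothetical input
    landmarks = np.array(  # 5-point landmarks from a detector, illustrative values
        [[120.0, 140.0], [180.0, 138.0], [150.0, 170.0], [128.0, 200.0], [175.0, 198.0]],
        dtype=np.float32,
    )
    aligned, inverse_transform = face_alignment(image, landmarks, image_size=112)

    # Map a point found on the crop back into original image coordinates.
    crop_points = np.array([[56.0, 56.0]], dtype=np.float32)
    original_points = transform_points_2d(crop_points, inverse_transform)
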
@@ -34,10 +34,7 @@ def create_gaze_estimator(method: str = 'mobilegaze', **kwargs) -> BaseGazeEstimator:
 
         >>> # Create with MobileNetV2 backbone
         >>> from uniface.constants import GazeWeights
-        >>> estimator = create_gaze_estimator(
-        ...     'mobilegaze',
-        ...     model_name=GazeWeights.MOBILENET_V2
-        ... )
+        >>> estimator = create_gaze_estimator('mobilegaze', model_name=GazeWeights.MOBILENET_V2)
 
         >>> # Use the estimator
         >>> pitch, yaw = estimator.estimate(face_crop)
@@ -51,4 +48,4 @@ def create_gaze_estimator(method: str = 'mobilegaze', **kwargs) -> BaseGazeEstimator:
     raise ValueError(f"Unsupported gaze estimation method: '{method}'. Available: {available}")
 
 
-__all__ = ['create_gaze_estimator', 'MobileGaze', 'BaseGazeEstimator']
+__all__ = ['BaseGazeEstimator', 'MobileGaze', 'create_gaze_estimator']

@@ -3,7 +3,6 @@
 # GitHub: https://github.com/yakhyo
 
 from abc import ABC, abstractmethod
-from typing import Tuple
 
 import numpy as np
 
@@ -54,7 +53,7 @@ class BaseGazeEstimator(ABC):
         raise NotImplementedError('Subclasses must implement the preprocess method.')
 
     @abstractmethod
-    def postprocess(self, outputs: Tuple[np.ndarray, np.ndarray]) -> Tuple[float, float]:
+    def postprocess(self, outputs: tuple[np.ndarray, np.ndarray]) -> tuple[float, float]:
         """
         Postprocess raw model outputs into gaze angles.
 
@@ -71,7 +70,7 @@ class BaseGazeEstimator(ABC):
         raise NotImplementedError('Subclasses must implement the postprocess method.')
 
     @abstractmethod
-    def estimate(self, face_image: np.ndarray) -> Tuple[float, float]:
+    def estimate(self, face_image: np.ndarray) -> tuple[float, float]:
         """
         Perform end-to-end gaze estimation on a face image.
 
@@ -91,11 +90,11 @@ class BaseGazeEstimator(ABC):
         Example:
             >>> estimator = create_gaze_estimator()
            >>> pitch, yaw = estimator.estimate(face_crop)
-            >>> print(f"Looking: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°")
+            >>> print(f'Looking: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°')
         """
         raise NotImplementedError('Subclasses must implement the estimate method.')
 
-    def __call__(self, face_image: np.ndarray) -> Tuple[float, float]:
+    def __call__(self, face_image: np.ndarray) -> tuple[float, float]:
         """
         Provides a convenient, callable shortcut for the `estimate` method.
 

@@ -2,7 +2,6 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Tuple
 
 import cv2
 import numpy as np
@@ -54,17 +53,17 @@ class MobileGaze(BaseGazeEstimator):
         >>> # Detect faces and estimate gaze for each
         >>> faces = detector.detect(image)
         >>> for face in faces:
-        ...     bbox = face['bbox']
+        ...     bbox = face.bbox
         ...     x1, y1, x2, y2 = map(int, bbox[:4])
         ...     face_crop = image[y1:y2, x1:x2]
         ...     pitch, yaw = gaze_estimator.estimate(face_crop)
-        ...     print(f"Gaze: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°")
+        ...     print(f'Gaze: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°')
     """
 
     def __init__(
         self,
         model_name: GazeWeights = GazeWeights.RESNET34,
-        input_size: Tuple[int, int] = (448, 448),
+        input_size: tuple[int, int] = (448, 448),
     ) -> None:
         Logger.info(f'Initializing MobileGaze with model={model_name}, input_size={input_size}')
 
@@ -143,7 +142,7 @@ class MobileGaze(BaseGazeEstimator):
         e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
         return e_x / e_x.sum(axis=1, keepdims=True)
 
-    def postprocess(self, outputs: Tuple[np.ndarray, np.ndarray]) -> Tuple[np.ndarray, np.ndarray]:
+    def postprocess(self, outputs: tuple[np.ndarray, np.ndarray]) -> tuple[np.ndarray, np.ndarray]:
         """
         Postprocess raw model outputs into gaze angles.
 
@@ -173,7 +172,7 @@ class MobileGaze(BaseGazeEstimator):
 
         return pitch, yaw
 
-    def estimate(self, face_image: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
+    def estimate(self, face_image: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
         """
         Perform end-to-end gaze estimation on a face image.
 

@@ -25,4 +25,4 @@ def create_landmarker(method: str = '2d106det', **kwargs) -> BaseLandmarker:
     raise ValueError(f"Unsupported method: '{method}'. Available: {available}")
 
 
-__all__ = ['create_landmarker', 'Landmark106', 'BaseLandmarker']
+__all__ = ['BaseLandmarker', 'Landmark106', 'create_landmarker']

@@ -2,7 +2,6 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Tuple
 
 import cv2
 import numpy as np
@@ -46,7 +45,7 @@ class Landmark106(BaseLandmarker):
     def __init__(
         self,
         model_name: LandmarkWeights = LandmarkWeights.DEFAULT,
-        input_size: Tuple[int, int] = (192, 192),
+        input_size: tuple[int, int] = (192, 192),
     ) -> None:
         Logger.info(f'Initializing Facial Landmark with model={model_name}, input_size={input_size}')
         self.input_size = input_size
@@ -85,7 +84,7 @@ class Landmark106(BaseLandmarker):
             Logger.error(f"Failed to load landmark model from '{self.model_path}'", exc_info=True)
             raise RuntimeError(f'Failed to initialize landmark model: {e}') from e
 
-    def preprocess(self, image: np.ndarray, bbox: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
+    def preprocess(self, image: np.ndarray, bbox: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
         """Prepares a face crop for inference.
 
         This method takes a face bounding box, performs a center alignment to

@@ -1,21 +1,41 @@
+# Copyright 2025 Yakhyokhuja Valikhujaev
+# Author: Yakhyokhuja Valikhujaev
+# GitHub: https://github.com/yakhyo
+
+"""Logging utilities for UniFace.
+
+This module provides a centralized logger for the UniFace library,
+allowing users to enable verbose logging when debugging or developing.
+"""
+
+from __future__ import annotations
+
 import logging
 
+__all__ = ['Logger', 'enable_logging']
+
 # Create logger for uniface
 Logger = logging.getLogger('uniface')
 Logger.setLevel(logging.WARNING)  # Only show warnings/errors by default
 Logger.addHandler(logging.NullHandler())
 
 
-def enable_logging(level=logging.INFO):
-    """
-    Enable verbose logging for uniface.
+def enable_logging(level: int = logging.INFO) -> None:
+    """Enable verbose logging for uniface.
+
+    Configures the logger to output messages to stdout with timestamps.
+    Call this function to see informational messages during model loading
+    and inference.
 
     Args:
-        level: Logging level (logging.DEBUG, logging.INFO, etc.)
+        level: Logging level. Defaults to logging.INFO.
+            Common values: logging.DEBUG, logging.INFO, logging.WARNING.
 
     Example:
         >>> from uniface import enable_logging
+        >>> import logging
        >>> enable_logging()  # Show INFO logs
+        >>> enable_logging(level=logging.DEBUG)  # Show DEBUG logs
     """
     Logger.handlers.clear()
     handler = logging.StreamHandler()

|
|||||||
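The rewritten `enable_logging` is used exactly as its new docstring shows; a short replay:

```python
import logging

from uniface import enable_logging

enable_logging()                     # INFO-level messages from the 'uniface' logger
enable_logging(level=logging.DEBUG)  # more verbose output while debugging
```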
@@ -2,6 +2,15 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
+"""Model weight management for UniFace.
+
+This module handles downloading, caching, and verifying model weights
+using SHA-256 checksums for integrity validation.
+"""
+
+from __future__ import annotations
+
+from enum import Enum
 import hashlib
 import os
 
@@ -14,33 +23,32 @@ from uniface.log import Logger
 __all__ = ['verify_model_weights']
 
 
-def verify_model_weights(model_name: str, root: str = '~/.uniface/models') -> str:
-    """
-    Ensure model weights are present, downloading and verifying them using SHA-256 if necessary.
+def verify_model_weights(model_name: Enum, root: str = '~/.uniface/models') -> str:
+    """Ensure model weights are present, downloading and verifying them if necessary.
 
-    Given a model identifier from an Enum class (e.g., `RetinaFaceWeights.MNET_V2`), this function checks if
-    the corresponding `.onnx` weight file exists locally. If not, it downloads the file from a predefined URL.
-    After download, the file’s integrity is verified using a SHA-256 hash. If verification fails, the file is deleted
-    and an error is raised.
+    Given a model identifier from an Enum class (e.g., `RetinaFaceWeights.MNET_V2`),
+    this function checks if the corresponding weight file exists locally. If not,
+    it downloads the file from a predefined URL and verifies its integrity using
+    a SHA-256 hash.
 
     Args:
-        model_name (Enum): Model weight identifier (e.g., `RetinaFaceWeights.MNET_V2`, `ArcFaceWeights.RESNET`, etc.).
-        root (str, optional): Directory to store or locate the model weights. Defaults to '~/.uniface/models'.
+        model_name: Model weight identifier enum (e.g., `RetinaFaceWeights.MNET_V2`).
+        root: Directory to store or locate the model weights.
+            Defaults to '~/.uniface/models'.
 
     Returns:
-        str: Absolute path to the verified model weights file.
+        Absolute path to the verified model weights file.
 
     Raises:
         ValueError: If the model is unknown or SHA-256 verification fails.
         ConnectionError: If downloading the file fails.
 
-    Examples:
-        >>> from uniface.models import RetinaFaceWeights, verify_model_weights
-        >>> verify_model_weights(RetinaFaceWeights.MNET_V2)
+    Example:
+        >>> from uniface.constants import RetinaFaceWeights
+        >>> from uniface.model_store import verify_model_weights
+        >>> path = verify_model_weights(RetinaFaceWeights.MNET_V2)
+        >>> print(path)
         '/home/user/.uniface/models/retinaface_mnet_v2.onnx'
 
-        >>> verify_model_weights(RetinaFaceWeights.RESNET34, root='/custom/dir')
-        '/custom/dir/retinaface_r34.onnx'
     """
-
     root = os.path.expanduser(root)
@@ -73,10 +81,16 @@ def verify_model_weights(model_name: str, root: str = '~/.uniface/models') -> st
     return model_path
 
 
-def download_file(url: str, dest_path: str) -> None:
-    """Download a file from a URL in chunks and save it to the destination path."""
+def download_file(url: str, dest_path: str, timeout: int = 30) -> None:
+    """Download a file from a URL in chunks and save it to the destination path.
+
+    Args:
+        url: URL to download from.
+        dest_path: Local file path to save to.
+        timeout: Connection timeout in seconds. Defaults to 30.
+    """
     try:
-        response = requests.get(url, stream=True)
+        response = requests.get(url, stream=True, timeout=timeout)
         response.raise_for_status()
         with (
             open(dest_path, 'wb') as file,
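A quick replay of the updated `verify_model_weights` docstring, showing the download-then-verify flow (the printed path depends on the local home directory):

```python
from uniface.constants import RetinaFaceWeights
from uniface.model_store import verify_model_weights

# Downloads the weights on first use, then checks the SHA-256 hash;
# raises ValueError on a hash mismatch, ConnectionError on network failure.
path = verify_model_weights(RetinaFaceWeights.MNET_V2)
print(path)  # e.g. /home/user/.uniface/models/retinaface_mnet_v2.onnx
```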
@@ -2,16 +2,23 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import List
+"""ONNX Runtime utilities for UniFace.
+
+This module provides helper functions for creating and managing ONNX Runtime
+inference sessions with automatic hardware acceleration detection.
+"""
+
+from __future__ import annotations
 
 import onnxruntime as ort
 
 from uniface.log import Logger
 
+__all__ = ['create_onnx_session', 'get_available_providers']
+
 
-def get_available_providers() -> List[str]:
-    """
-    Get list of available ONNX Runtime execution providers for the current platform.
+def get_available_providers() -> list[str]:
+    """Get list of available ONNX Runtime execution providers.
 
     Automatically detects and prioritizes hardware acceleration:
     - CoreML on Apple Silicon (M1/M2/M3/M4)
@@ -19,13 +26,12 @@ def get_available_providers() -> List[str]:
     - CPU as fallback (always available)
 
     Returns:
-        List[str]: Ordered list of execution providers to use
+        Ordered list of execution providers to use.
 
-    Examples:
+    Example:
         >>> providers = get_available_providers()
         >>> # On M4 Mac: ['CoreMLExecutionProvider', 'CPUExecutionProvider']
         >>> # On Linux with CUDA: ['CUDAExecutionProvider', 'CPUExecutionProvider']
-        >>> # On CPU-only: ['CPUExecutionProvider']
     """
     available = ort.get_available_providers()
     providers = []
@@ -48,26 +54,28 @@ def get_available_providers() -> List[str]:
     return providers
 
 
-def create_onnx_session(model_path: str, providers: List[str] = None) -> ort.InferenceSession:
-    """
-    Create an ONNX Runtime inference session with optimal provider selection.
+def create_onnx_session(
+    model_path: str,
+    providers: list[str] | None = None,
+) -> ort.InferenceSession:
+    """Create an ONNX Runtime inference session with optimal provider selection.
 
     Args:
-        model_path (str): Path to the ONNX model file
-        providers (List[str], optional): List of providers to use.
-            If None, automatically detects best available providers.
+        model_path: Path to the ONNX model file.
+        providers: List of execution providers to use. If None, automatically
+            detects best available providers.
 
     Returns:
-        ort.InferenceSession: Configured ONNX Runtime session
+        Configured ONNX Runtime session.
 
     Raises:
-        RuntimeError: If session creation fails
+        RuntimeError: If session creation fails.
 
-    Examples:
-        >>> session = create_onnx_session("model.onnx")
+    Example:
+        >>> session = create_onnx_session('model.onnx')
         >>> # Automatically uses best available providers
 
-        >>> session = create_onnx_session("model.onnx", providers=["CPUExecutionProvider"])
+        >>> session = create_onnx_session('model.onnx', providers=['CPUExecutionProvider'])
         >>> # Force CPU-only execution
     """
     if providers is None:
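The reshaped `create_onnx_session` signature in use, following its docstring ('model.onnx' is a placeholder path):

```python
from uniface.onnx_utils import create_onnx_session, get_available_providers

print(get_available_providers())
# e.g. ['CoreMLExecutionProvider', 'CPUExecutionProvider'] on Apple Silicon

session = create_onnx_session('model.onnx')  # auto-selects best providers
cpu_only = create_onnx_session('model.onnx', providers=['CPUExecutionProvider'])
```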
@@ -2,7 +2,7 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Union
+from __future__ import annotations
 
 from uniface.constants import ParsingWeights
 
@@ -13,38 +13,29 @@ __all__ = ['BaseFaceParser', 'BiSeNet', 'create_face_parser']
 
 
 def create_face_parser(
-    model_name: Union[str, ParsingWeights] = ParsingWeights.RESNET18,
+    model_name: str | ParsingWeights = ParsingWeights.RESNET18,
 ) -> BaseFaceParser:
-    """
-    Factory function to create a face parsing model instance.
+    """Factory function to create a face parsing model instance.
 
     This function provides a convenient way to instantiate face parsing models
-    without directly importing the specific model classes. It supports both
-    string-based and enum-based model selection.
+    without directly importing the specific model classes.
 
     Args:
-        model_name (Union[str, ParsingWeights]): The face parsing model to create.
-            Can be either a string or a ParsingWeights enum value.
-            Available options:
+        model_name: The face parsing model to create. Can be either a string
+            or a ParsingWeights enum value. Available options:
             - 'parsing_resnet18' or ParsingWeights.RESNET18 (default)
             - 'parsing_resnet34' or ParsingWeights.RESNET34
 
     Returns:
-        BaseFaceParser: An instance of the requested face parsing model.
+        An instance of the requested face parsing model.
 
     Raises:
         ValueError: If the model_name is not recognized.
 
-    Examples:
-        >>> # Using enum
+    Example:
         >>> from uniface.parsing import create_face_parser
         >>> from uniface.constants import ParsingWeights
         >>> parser = create_face_parser(ParsingWeights.RESNET18)
-        >>>
-        >>> # Using string
-        >>> parser = create_face_parser('parsing_resnet18')
-        >>>
-        >>> # Parse a face image
         >>> mask = parser.parse(face_crop)
     """
     # Convert string to enum if necessary
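The trimmed factory docstring in use; both selection styles from the options list still work ('face.jpg' is a hypothetical pre-cropped face image):

```python
import cv2

from uniface.constants import ParsingWeights
from uniface.parsing import create_face_parser

parser = create_face_parser(ParsingWeights.RESNET18)  # or 'parsing_resnet18'
face_crop = cv2.imread('face.jpg')
mask = parser.parse(face_crop)  # per-pixel class indices (19 classes)
```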
@@ -3,7 +3,6 @@
 # GitHub: https://github.com/yakhyo
 
 from abc import ABC, abstractmethod
-from typing import Tuple
 
 import numpy as np
 
@@ -53,7 +52,7 @@ class BaseFaceParser(ABC):
         raise NotImplementedError('Subclasses must implement the preprocess method.')
 
     @abstractmethod
-    def postprocess(self, outputs: np.ndarray, original_size: Tuple[int, int]) -> np.ndarray:
+    def postprocess(self, outputs: np.ndarray, original_size: tuple[int, int]) -> np.ndarray:
         """
         Postprocess raw model outputs into a segmentation mask.
 
@@ -89,7 +88,7 @@ class BaseFaceParser(ABC):
         Example:
             >>> parser = create_face_parser()
             >>> mask = parser.parse(face_crop)
-            >>> print(f"Mask shape: {mask.shape}, unique classes: {np.unique(mask)}")
+            >>> print(f'Mask shape: {mask.shape}, unique classes: {np.unique(mask)}')
         """
         raise NotImplementedError('Subclasses must implement the parse method.')
 
@@ -2,7 +2,6 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Tuple
 
 import cv2
 import numpy as np
@@ -54,17 +53,17 @@ class BiSeNet(BaseFaceParser):
         >>> # Detect faces and parse each face
         >>> faces = detector.detect(image)
         >>> for face in faces:
-        ...     bbox = face['bbox']
+        ...     bbox = face.bbox
         ...     x1, y1, x2, y2 = map(int, bbox[:4])
         ...     face_crop = image[y1:y2, x1:x2]
         ...     mask = parser.parse(face_crop)
-        ...     print(f"Mask shape: {mask.shape}, unique classes: {np.unique(mask)}")
+        ...     print(f'Mask shape: {mask.shape}, unique classes: {np.unique(mask)}')
     """
 
     def __init__(
         self,
         model_name: ParsingWeights = ParsingWeights.RESNET18,
-        input_size: Tuple[int, int] = (512, 512),
+        input_size: tuple[int, int] = (512, 512),
     ) -> None:
         Logger.info(f'Initializing BiSeNet with model={model_name}, input_size={input_size}')
 
@@ -127,7 +126,7 @@ class BiSeNet(BaseFaceParser):
 
         return image
 
-    def postprocess(self, outputs: np.ndarray, original_size: Tuple[int, int]) -> np.ndarray:
+    def postprocess(self, outputs: np.ndarray, original_size: tuple[int, int]) -> np.ndarray:
         """
         Postprocess model output to segmentation mask.
 
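The `face.bbox` access in the updated examples reflects detections now being objects rather than dicts. A sketch of the per-face parsing loop from the BiSeNet docstring, with imports added and 'group.jpg' as a hypothetical input:

```python
import cv2
import numpy as np

from uniface import RetinaFace
from uniface.parsing import BiSeNet

detector = RetinaFace()
parser = BiSeNet()
image = cv2.imread('group.jpg')

for face in detector.detect(image):
    x1, y1, x2, y2 = map(int, face.bbox[:4])
    mask = parser.parse(image[y1:y2, x1:x2])
    print(f'Mask shape: {mask.shape}, unique classes: {np.unique(mask)}')
```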
@@ -2,7 +2,7 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Optional
+from __future__ import annotations
 
 import numpy as np
 
@@ -11,11 +11,11 @@ from .blur import BlurFace
 
 def anonymize_faces(
     image: np.ndarray,
-    detector: Optional[object] = None,
+    detector: object | None = None,
     method: str = 'pixelate',
     blur_strength: float = 3.0,
     pixel_blocks: int = 10,
-    conf_thresh: float = 0.5,
+    confidence_threshold: float = 0.5,
     **kwargs,
 ) -> np.ndarray:
     """One-line face anonymization with automatic detection.
@@ -26,7 +26,7 @@ def anonymize_faces(
         method (str): Blur method name. Defaults to 'pixelate'.
         blur_strength (float): Blur intensity. Defaults to 3.0.
         pixel_blocks (int): Block count for pixelate. Defaults to 10.
-        conf_thresh (float): Detection confidence threshold. Defaults to 0.5.
+        confidence_threshold (float): Detection confidence threshold. Defaults to 0.5.
         **kwargs: Additional detector arguments.
 
     Returns:
@@ -40,7 +40,7 @@ def anonymize_faces(
         try:
             from uniface import RetinaFace
 
-            detector = RetinaFace(conf_thresh=conf_thresh, **kwargs)
+            detector = RetinaFace(confidence_threshold=confidence_threshold, **kwargs)
         except ImportError as err:
             raise ImportError('Could not import RetinaFace. Please ensure UniFace is properly installed.') from err
 
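The `conf_thresh` to `confidence_threshold` rename flows through to the `RetinaFace` constructor. A one-line replay; the import path of `anonymize_faces` is assumed here, since the diff only shows this module's relative import of `.blur`:

```python
import cv2

from uniface.privacy import anonymize_faces  # import path assumed

image = cv2.imread('photo.jpg')  # hypothetical input
anonymized = anonymize_faces(image, method='pixelate', confidence_threshold=0.5)
cv2.imwrite('photo_anonymized.jpg', anonymized)
```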
@@ -2,12 +2,17 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Dict, List, Tuple, Union
+from __future__ import annotations
+
+from typing import TYPE_CHECKING, ClassVar
 
 import cv2
 import numpy as np
 
-__all__ = ['BlurFace']
+if TYPE_CHECKING:
+    pass
+
+__all__ = ['BlurFace', 'EllipticalBlur']
 
 
 def _gaussian_blur(region: np.ndarray, strength: float = 3.0) -> np.ndarray:
@@ -32,7 +37,7 @@ def _pixelate_blur(region: np.ndarray, blocks: int = 10) -> np.ndarray:
     return cv2.resize(temp, (w, h), interpolation=cv2.INTER_NEAREST)
 
 
-def _blackout_blur(region: np.ndarray, color: Tuple[int, int, int] = (0, 0, 0)) -> np.ndarray:
+def _blackout_blur(region: np.ndarray, color: tuple[int, int, int] = (0, 0, 0)) -> np.ndarray:
     """Replace region with solid color."""
     return np.full_like(region, color)
 
@@ -55,7 +60,7 @@ class EllipticalBlur:
     def __call__(
         self,
         image: np.ndarray,
-        bboxes: List[Union[Tuple, List]],
+        bboxes: list[tuple | list],
         inplace: bool = False,
     ) -> np.ndarray:
         if not inplace:
@@ -98,14 +103,14 @@ class BlurFace:
         >>> anonymized = blurrer.anonymize(image, faces)
     """
 
-    VALID_METHODS = {'gaussian', 'pixelate', 'blackout', 'elliptical', 'median'}
+    VALID_METHODS: ClassVar[set[str]] = {'gaussian', 'pixelate', 'blackout', 'elliptical', 'median'}
 
     def __init__(
         self,
         method: str = 'pixelate',
         blur_strength: float = 3.0,
         pixel_blocks: int = 15,
-        color: Tuple[int, int, int] = (0, 0, 0),
+        color: tuple[int, int, int] = (0, 0, 0),
         margin: int = 20,
     ):
         self.method = method.lower()
@@ -121,6 +126,7 @@ class BlurFace:
             self._elliptical = EllipticalBlur(blur_strength, margin)
 
     def _blur_region(self, region: np.ndarray) -> np.ndarray:
+        """Apply blur to a single region based on the configured method."""
         if self.method == 'gaussian':
             return _gaussian_blur(region, self._blur_strength)
         elif self.method == 'median':
@@ -129,11 +135,12 @@ class BlurFace:
             return _pixelate_blur(region, self._pixel_blocks)
         elif self.method == 'blackout':
             return _blackout_blur(region, self._color)
+        return region  # Fallback (should not reach here)
 
     def anonymize(
         self,
         image: np.ndarray,
-        faces: List[Dict],
+        faces: list,
         inplace: bool = False,
     ) -> np.ndarray:
         """Anonymize faces in an image.
@@ -149,13 +156,13 @@ class BlurFace:
         if not faces:
             return image if inplace else image.copy()
 
-        bboxes = [face['bbox'] for face in faces]
+        bboxes = [face.bbox for face in faces]
         return self.blur_regions(image, bboxes, inplace)
 
     def blur_regions(
         self,
         image: np.ndarray,
-        bboxes: List[Union[Tuple, List]],
+        bboxes: list[tuple | list],
         inplace: bool = False,
     ) -> np.ndarray:
         """Blur specific rectangular regions in an image.
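`BlurFace.anonymize` now reads `face.bbox` from detection objects rather than indexing dicts. A minimal sketch; the import path for `BlurFace` is assumed, same caveat as above:

```python
import cv2

from uniface import RetinaFace
from uniface.privacy import BlurFace  # import path assumed

detector = RetinaFace()
blurrer = BlurFace(method='gaussian', blur_strength=3.0)  # see VALID_METHODS

image = cv2.imread('photo.jpg')  # hypothetical input
faces = detector.detect(image)
anonymized = blurrer.anonymize(image, faces)  # uses face.bbox internally
```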
@@ -34,10 +34,7 @@ def create_recognizer(method: str = 'arcface', **kwargs) -> BaseRecognizer:
 
         >>> # Create a specific MobileFace recognizer
         >>> from uniface.constants import MobileFaceWeights
-        >>> recognizer = create_recognizer(
-        ...     'mobileface',
-        ...     model_name=MobileFaceWeights.MNET_V2
-        ... )
+        >>> recognizer = create_recognizer('mobileface', model_name=MobileFaceWeights.MNET_V2)
 
         >>> # Create a SphereFace recognizer
         >>> recognizer = create_recognizer('sphereface')
@@ -55,4 +52,4 @@ def create_recognizer(method: str = 'arcface', **kwargs) -> BaseRecognizer:
     raise ValueError(f"Unsupported method: '{method}'. Available: {available}")
 
 
-__all__ = ['create_recognizer', 'BaseRecognizer', 'ArcFace', 'MobileFace', 'SphereFace']
+__all__ = ['ArcFace', 'BaseRecognizer', 'MobileFace', 'SphereFace', 'create_recognizer']
@@ -2,9 +2,10 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
+from __future__ import annotations
+
 from abc import ABC, abstractmethod
 from dataclasses import dataclass
-from typing import List, Tuple, Union
 
 import cv2
 import numpy as np
@@ -13,16 +14,22 @@ from uniface.face_utils import face_alignment
 from uniface.log import Logger
 from uniface.onnx_utils import create_onnx_session
 
+__all__ = ['BaseRecognizer', 'PreprocessConfig']
+
 
 @dataclass
 class PreprocessConfig:
-    """
-    Configuration for preprocessing images before feeding them into the model.
+    """Configuration for preprocessing images before feeding them into the model.
+
+    Attributes:
+        input_mean: Mean value(s) for normalization.
+        input_std: Standard deviation value(s) for normalization.
+        input_size: Target image size as (height, width).
     """
 
-    input_mean: Union[float, List[float]] = 127.5
-    input_std: Union[float, List[float]] = 127.5
-    input_size: Tuple[int, int] = (112, 112)
+    input_mean: float | list[float] = 127.5
+    input_std: float | list[float] = 127.5
+    input_size: tuple[int, int] = (112, 112)
 
 
 class BaseRecognizer(ABC):
@@ -94,7 +101,7 @@ class BaseRecognizer(ABC):
         """
         resized_img = cv2.resize(face_img, self.input_size)
 
-        if isinstance(self.input_std, (list, tuple)):
+        if isinstance(self.input_std, list | tuple):
             # Per-channel normalization
             rgb_img = cv2.cvtColor(resized_img, cv2.COLOR_BGR2RGB).astype(np.float32)
             normalized_img = (rgb_img - np.array(self.input_mean, dtype=np.float32)) / np.array(
@@ -116,13 +123,14 @@ class BaseRecognizer(ABC):
 
         return blob
 
-    def get_embedding(self, image: np.ndarray, landmarks: np.ndarray = None) -> np.ndarray:
-        """
-        Extracts face embedding from an image.
+    def get_embedding(self, image: np.ndarray, landmarks: np.ndarray | None = None) -> np.ndarray:
+        """Extract face embedding from an image.
 
         Args:
-            image: Input face image (BGR format). If already aligned (112x112), landmarks can be None.
-            landmarks: Facial landmarks (5 points for alignment). Optional if image is already aligned.
+            image: Input face image in BGR format. If already aligned (112x112),
+                landmarks can be None.
+            landmarks: Facial landmarks (5 points for alignment). Optional if
+                image is already aligned.
 
         Returns:
             Face embedding vector (typically 512-dimensional).
@@ -141,15 +149,14 @@ class BaseRecognizer(ABC):
         return embedding
 
     def get_normalized_embedding(self, image: np.ndarray, landmarks: np.ndarray) -> np.ndarray:
-        """
-        Extracts a l2 normalized face embedding vector from an image.
+        """Extract an L2-normalized face embedding vector from an image.
 
         Args:
-            image: Input face image (BGR format).
+            image: Input face image in BGR format.
             landmarks: Facial landmarks (5 points for alignment).
 
         Returns:
-            Normalized face embedding vector (typically 512-dimensional).
+            L2-normalized face embedding vector (typically 512-dimensional).
         """
         embedding = self.get_embedding(image, landmarks)
         norm = np.linalg.norm(embedding)
@@ -2,7 +2,7 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Optional
+from __future__ import annotations
 
 from uniface.constants import ArcFaceWeights, MobileFaceWeights, SphereFaceWeights
 from uniface.model_store import verify_model_weights
@@ -34,7 +34,7 @@ class ArcFace(BaseRecognizer):
     def __init__(
         self,
         model_name: ArcFaceWeights = ArcFaceWeights.MNET,
-        preprocessing: Optional[PreprocessConfig] = None,
+        preprocessing: PreprocessConfig | None = None,
     ) -> None:
         if preprocessing is None:
             preprocessing = PreprocessConfig(input_mean=127.5, input_std=127.5, input_size=(112, 112))
@@ -64,7 +64,7 @@ class MobileFace(BaseRecognizer):
     def __init__(
         self,
         model_name: MobileFaceWeights = MobileFaceWeights.MNET_V2,
-        preprocessing: Optional[PreprocessConfig] = None,
+        preprocessing: PreprocessConfig | None = None,
     ) -> None:
         if preprocessing is None:
             preprocessing = PreprocessConfig(input_mean=127.5, input_std=127.5, input_size=(112, 112))
@@ -94,7 +94,7 @@ class SphereFace(BaseRecognizer):
     def __init__(
         self,
         model_name: SphereFaceWeights = SphereFaceWeights.SPHERE20,
-        preprocessing: Optional[PreprocessConfig] = None,
+        preprocessing: PreprocessConfig | None = None,
     ) -> None:
         if preprocessing is None:
             preprocessing = PreprocessConfig(input_mean=127.5, input_std=127.5, input_size=(112, 112))
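With `get_normalized_embedding` returning unit-length vectors, face comparison reduces to a dot product. A sketch assuming `ArcFace` is exported from the package root and that detections expose a `landmarks` attribute alongside `bbox` (attribute name assumed):

```python
import cv2
import numpy as np

from uniface import ArcFace, RetinaFace  # ArcFace root export assumed

detector = RetinaFace()
recognizer = ArcFace()

img_a = cv2.imread('person_a.jpg')  # hypothetical inputs
img_b = cv2.imread('person_b.jpg')
face_a = detector.detect(img_a)[0]
face_b = detector.detect(img_b)[0]

emb_a = recognizer.get_normalized_embedding(img_a, face_a.landmarks)
emb_b = recognizer.get_normalized_embedding(img_b, face_b.landmarks)
print(float(np.dot(emb_a, emb_b)))  # cosine similarity of L2-normalized vectors
```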
@@ -2,7 +2,7 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import Optional
+from __future__ import annotations
 
 from uniface.constants import MiniFASNetWeights
 
@@ -19,46 +19,27 @@ __all__ = [
 
 def create_spoofer(
     model_name: MiniFASNetWeights = MiniFASNetWeights.V2,
-    scale: Optional[float] = None,
+    scale: float | None = None,
 ) -> MiniFASNet:
-    """
-    Factory function to create a face anti-spoofing model.
+    """Factory function to create a face anti-spoofing model.
 
     This is a convenience function that creates a MiniFASNet instance
     with the specified model variant and optional custom scale.
 
     Args:
-        model_name (MiniFASNetWeights): The model variant to use.
-            Options:
+        model_name: The model variant to use. Options:
             - MiniFASNetWeights.V2: Improved version (default), uses scale=2.7
             - MiniFASNetWeights.V1SE: Squeeze-and-excitation version, uses scale=4.0
-            Defaults to MiniFASNetWeights.V2.
-        scale (Optional[float]): Custom crop scale factor for face region.
-            If None, uses the default scale for the selected model variant.
+        scale: Custom crop scale factor for face region. If None, uses the
+            default scale for the selected model variant.
 
     Returns:
-        MiniFASNet: An initialized face anti-spoofing model.
+        An initialized face anti-spoofing model.
 
     Example:
         >>> from uniface.spoofing import create_spoofer, MiniFASNetWeights
-        >>> from uniface import RetinaFace
-        >>>
-        >>> # Create with default settings (V2 model)
         >>> spoofer = create_spoofer()
-        >>>
-        >>> # Create with V1SE model
-        >>> spoofer = create_spoofer(model_name=MiniFASNetWeights.V1SE)
-        >>>
-        >>> # Create with custom scale
-        >>> spoofer = create_spoofer(scale=3.0)
-        >>>
-        >>> # Use with face detector
-        >>> detector = RetinaFace()
-        >>> faces = detector.detect(image)
-        >>> for face in faces:
-        ...     label_idx, score = spoofer.predict(image, face['bbox'])
-        ...     # label_idx: 0 = Fake, 1 = Real
-        ...     label = 'Real' if label_idx == 1 else 'Fake'
-        ...     print(f'{label}: {score:.2%}')
+        >>> label_idx, score = spoofer.predict(image, face.bbox)
+        >>> # label_idx: 0 = Fake, 1 = Real
     """
     return MiniFASNet(model_name=model_name, scale=scale)
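The compressed spoofing example, made self-contained again (grounded in the docstrings above; 'selfie.jpg' is a hypothetical input):

```python
import cv2

from uniface import RetinaFace
from uniface.spoofing import create_spoofer, MiniFASNetWeights

detector = RetinaFace()
spoofer = create_spoofer(model_name=MiniFASNetWeights.V2)  # scale defaults per variant

image = cv2.imread('selfie.jpg')
for face in detector.detect(image):
    label_idx, score = spoofer.predict(image, face.bbox)
    label = 'Real' if label_idx == 1 else 'Fake'
    print(f'{label}: {score:.2%}')
```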
@@ -3,7 +3,6 @@
 # GitHub: https://github.com/yakhyo
 
 from abc import ABC, abstractmethod
-from typing import List, Tuple, Union
 
 import numpy as np
 
@@ -36,7 +35,7 @@ class BaseSpoofer(ABC):
         raise NotImplementedError('Subclasses must implement the _initialize_model method.')
 
     @abstractmethod
-    def preprocess(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> np.ndarray:
+    def preprocess(self, image: np.ndarray, bbox: list | np.ndarray) -> np.ndarray:
         """
         Preprocess the input image for model inference.
 
@@ -55,7 +54,7 @@ class BaseSpoofer(ABC):
         raise NotImplementedError('Subclasses must implement the preprocess method.')
 
     @abstractmethod
-    def postprocess(self, outputs: np.ndarray) -> Tuple[int, float]:
+    def postprocess(self, outputs: np.ndarray) -> tuple[int, float]:
         """
         Postprocess raw model outputs into prediction result.
 
@@ -73,7 +72,7 @@ class BaseSpoofer(ABC):
         raise NotImplementedError('Subclasses must implement the postprocess method.')
 
     @abstractmethod
-    def predict(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> Tuple[int, float]:
+    def predict(self, image: np.ndarray, bbox: list | np.ndarray) -> tuple[int, float]:
         """
         Perform end-to-end anti-spoofing prediction on a face.
 
@@ -95,13 +94,13 @@ class BaseSpoofer(ABC):
             >>> detector = RetinaFace()
             >>> faces = detector.detect(image)
             >>> for face in faces:
-            ...     label_idx, score = spoofer.predict(image, face['bbox'])
+            ...     label_idx, score = spoofer.predict(image, face.bbox)
             ...     label = 'Real' if label_idx == 1 else 'Fake'
             ...     print(f'{label}: {score:.2%}')
         """
         raise NotImplementedError('Subclasses must implement the predict method.')
 
-    def __call__(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> Tuple[int, float]:
+    def __call__(self, image: np.ndarray, bbox: list | np.ndarray) -> tuple[int, float]:
         """
         Provides a convenient, callable shortcut for the `predict` method.
 
@@ -2,7 +2,6 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import List, Optional, Tuple, Union
 
 import cv2
 import numpy as np
@@ -59,7 +58,7 @@ class MiniFASNet(BaseSpoofer):
         >>> # Detect faces and check if they are real
         >>> faces = detector.detect(image)
         >>> for face in faces:
-        ...     label_idx, score = spoofer.predict(image, face['bbox'])
+        ...     label_idx, score = spoofer.predict(image, face.bbox)
         ...     # label_idx: 0 = Fake, 1 = Real
         ...     label = 'Real' if label_idx == 1 else 'Fake'
         ...     print(f'{label}: {score:.2%}')
@@ -68,7 +67,7 @@ class MiniFASNet(BaseSpoofer):
     def __init__(
         self,
         model_name: MiniFASNetWeights = MiniFASNetWeights.V2,
-        scale: Optional[float] = None,
+        scale: float | None = None,
     ) -> None:
         Logger.info(f'Initializing MiniFASNet with model={model_name.name}')
 
@@ -104,12 +103,12 @@ class MiniFASNet(BaseSpoofer):
             Logger.error(f"Failed to load MiniFASNet model from '{self.model_path}'", exc_info=True)
             raise RuntimeError(f'Failed to initialize MiniFASNet model: {e}') from e
 
-    def _xyxy_to_xywh(self, bbox: Union[List, np.ndarray]) -> List[int]:
+    def _xyxy_to_xywh(self, bbox: list | np.ndarray) -> list[int]:
         """Convert bounding box from [x1, y1, x2, y2] to [x, y, w, h] format."""
         x1, y1, x2, y2 = bbox[:4]
         return [int(x1), int(y1), int(x2 - x1), int(y2 - y1)]
 
-    def _crop_face(self, image: np.ndarray, bbox_xywh: List[int]) -> np.ndarray:
+    def _crop_face(self, image: np.ndarray, bbox_xywh: list[int]) -> np.ndarray:
         """
         Crop and resize face region from image using scale factor.
 
@@ -147,7 +146,7 @@ class MiniFASNet(BaseSpoofer):
 
         return resized
 
-    def preprocess(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> np.ndarray:
+    def preprocess(self, image: np.ndarray, bbox: list | np.ndarray) -> np.ndarray:
         """
         Preprocess the input image for model inference.
 
@@ -181,7 +180,7 @@ class MiniFASNet(BaseSpoofer):
         e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
         return e_x / e_x.sum(axis=1, keepdims=True)
 
-    def postprocess(self, outputs: np.ndarray) -> Tuple[int, float]:
+    def postprocess(self, outputs: np.ndarray) -> tuple[int, float]:
         """
         Postprocess raw model outputs into prediction result.
 
@@ -202,7 +201,7 @@ class MiniFASNet(BaseSpoofer):
 
         return label_idx, score
 
-    def predict(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> Tuple[int, float]:
+    def predict(self, image: np.ndarray, bbox: list | np.ndarray) -> tuple[int, float]:
         """
         Perform end-to-end anti-spoofing prediction on a face.
 
@@ -2,11 +2,26 @@
 # Author: Yakhyokhuja Valikhujaev
 # GitHub: https://github.com/yakhyo
 
-from typing import List, Tuple, Union
+"""Visualization utilities for UniFace.
+
+This module provides functions for drawing detection results, gaze directions,
+and face parsing segmentation maps on images.
+"""
+
+from __future__ import annotations
 
 import cv2
 import numpy as np
 
+__all__ = [
+    'FACE_PARSING_COLORS',
+    'FACE_PARSING_LABELS',
+    'draw_detections',
+    'draw_fancy_bbox',
+    'draw_gaze',
+    'vis_parsing_maps',
+]
+
 # Face parsing component names (19 classes)
 FACE_PARSING_LABELS = [
     'background',
@@ -57,23 +72,25 @@ FACE_PARSING_COLORS = [
 def draw_detections(
     *,
     image: np.ndarray,
-    bboxes: Union[List[np.ndarray], List[List[float]]],
-    scores: Union[np.ndarray, List[float]],
-    landmarks: Union[List[np.ndarray], List[List[List[float]]]],
+    bboxes: list[np.ndarray] | list[list[float]],
+    scores: np.ndarray | list[float],
+    landmarks: list[np.ndarray] | list[list[list[float]]],
     vis_threshold: float = 0.6,
     draw_score: bool = False,
     fancy_bbox: bool = True,
-):
-    """
-    Draws bounding boxes, landmarks, and optional scores on an image.
+) -> None:
+    """Draw bounding boxes, landmarks, and optional scores on an image.
+
+    Modifies the image in-place.
 
     Args:
-        image: Input image to draw on.
-        bboxes: List of bounding boxes [x1, y1, x2, y2].
+        image: Input image to draw on (modified in-place).
+        bboxes: List of bounding boxes as [x1, y1, x2, y2].
         scores: List of confidence scores.
         landmarks: List of landmark sets with shape (5, 2).
         vis_threshold: Confidence threshold for filtering. Defaults to 0.6.
         draw_score: Whether to draw confidence scores. Defaults to False.
+        fancy_bbox: Use corner-style bounding boxes. Defaults to True.
     """
     colors = [(0, 0, 255), (0, 255, 255), (255, 0, 255), (0, 255, 0), (255, 0, 0)]
 
@@ -134,19 +151,18 @@ def draw_detections(
 def draw_fancy_bbox(
     image: np.ndarray,
     bbox: np.ndarray,
-    color: Tuple[int, int, int] = (0, 255, 0),
+    color: tuple[int, int, int] = (0, 255, 0),
     thickness: int = 3,
     proportion: float = 0.2,
-):
-    """
-    Draws a bounding box with fancy corners on an image.
+) -> None:
+    """Draw a bounding box with fancy corners on an image.
 
     Args:
-        image: Input image to draw on.
+        image: Input image to draw on (modified in-place).
         bbox: Bounding box coordinates [x1, y1, x2, y2].
-        color: Color of the bounding box. Defaults to green.
-        thickness: Thickness of the bounding box lines. Defaults to 3.
-        proportion: Proportion of the corner length to the width/height of the bounding box. Defaults to 0.2.
+        color: Color of the bounding box in BGR. Defaults to green.
+        thickness: Thickness of the corner lines. Defaults to 3.
+        proportion: Proportion of corner length to box dimensions. Defaults to 0.2.
     """
     x1, y1, x2, y2 = map(int, bbox)
     width = x2 - x1
@@ -177,15 +193,14 @@
 def draw_gaze(
     image: np.ndarray,
     bbox: np.ndarray,
-    pitch: np.ndarray,
-    yaw: np.ndarray,
+    pitch: np.ndarray | float,
+    yaw: np.ndarray | float,
     *,
     draw_bbox: bool = True,
     fancy_bbox: bool = True,
     draw_angles: bool = True,
-):
-    """
-    Draws gaze direction with optional bounding box on an image.
+) -> None:
+    """Draw gaze direction with optional bounding box on an image.
 
     Args:
         image: Input image to draw on (modified in-place).
@@ -194,7 +209,7 @@ def draw_gaze(
         yaw: Horizontal gaze angle in radians.
         draw_bbox: Whether to draw the bounding box. Defaults to True.
         fancy_bbox: Use fancy corner-style bbox. Defaults to True.
-        draw_angles: Whether to display pitch/yaw values as text. Defaults to False.
+        draw_angles: Whether to display pitch/yaw values as text. Defaults to True.
     """
     x_min, y_min, x_max, y_max = map(int, bbox[:4])
 
@@ -275,8 +290,7 @@ def vis_parsing_maps(
     save_image: bool = False,
     save_path: str = 'result.png',
 ) -> np.ndarray:
-    """
-    Visualizes face parsing segmentation mask by overlaying colored regions on the image.
+    """Visualize face parsing segmentation mask by overlaying colored regions.
 
     Args:
         image: Input face image in RGB format with shape (H, W, 3).
@@ -286,18 +300,15 @@
         save_path: Path to save the visualization if save_image is True.
 
     Returns:
-        np.ndarray: Blended image with segmentation overlay in BGR format.
+        Blended image with segmentation overlay in BGR format.
 
     Example:
         >>> import cv2
         >>> from uniface.parsing import BiSeNet
         >>> from uniface.visualization import vis_parsing_maps
-        >>>
         >>> parser = BiSeNet()
         >>> face_image = cv2.imread('face.jpg')
         >>> mask = parser.parse(face_image)
-        >>>
-        >>> # Visualize
         >>> face_rgb = cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)
         >>> result = vis_parsing_maps(face_rgb, mask)
        >>> cv2.imwrite('parsed_face.jpg', result)
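Finally, the keyword-only `draw_detections` in use. The per-face attribute names below (`score`, `landmarks`) are assumed to mirror `bbox`:

```python
import cv2

from uniface import RetinaFace
from uniface.visualization import draw_detections

detector = RetinaFace()
image = cv2.imread('face.jpg')  # hypothetical input
faces = detector.detect(image)

draw_detections(
    image=image,  # modified in-place
    bboxes=[face.bbox for face in faces],
    scores=[face.score for face in faces],         # attribute name assumed
    landmarks=[face.landmarks for face in faces],  # attribute name assumed
    vis_threshold=0.6,
)
cv2.imwrite('result.jpg', image)
```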