17 Commits

Author SHA1 Message Date
yakhyo
39b50b62bd chore: Update the version 2025-11-26 00:15:45 +09:00
yakhyo
db7532ecf1 feat: Suppress the warning and give info about the ONNX backend 2025-11-26 00:06:39 +09:00
yakhyo
4b8dc2c0f9 feat: Update jupyter notebooks to match the latest version of UniFace 2025-11-26 00:06:13 +09:00
yakhyo
0a2a10e165 docs: Update README.md 2025-11-26 00:05:40 +09:00
yakhyo
84cda5f56c chore: Code style formatting changes 2025-11-26 00:05:24 +09:00
yakhyo
0771a7959a chore: Code style formatting changes 2025-11-25 23:45:50 +09:00
yakhyo
15947eb605 chore: Change import order and style changes by Ruff 2025-11-25 23:35:00 +09:00
yakhyo
1ccc4f6b77 chore: Update print 2025-11-25 23:28:42 +09:00
yakhyo
189755a1a6 ref: Update some refactoring files for testing 2025-11-25 23:19:45 +09:00
Yakhyokhuja Valikhujaev
11363fe0a8 Merge pull request #18 from yakhyo/feat-20251115
ref: Add comprehensive test suite and enhance model functionality
2025-11-15 21:32:10 +09:00
yakhyo
fe3e70a352 release: Update release version to v1.1.0 2025-11-15 21:31:56 +09:00
yakhyo
8e218321a4 fix: Fix test issue where landmark variable name was wrongly used 2025-11-15 21:28:21 +09:00
yakhyo
2c78f39e5d ref: Add comprehensive test suite and enhance model functionality
- Add new test files for age_gender, factory, landmark, recognition, scrfd, and utils
- Add new scripts for age_gender, landmarks, and video detection
- Update documentation in README.md, MODELS.md, QUICKSTART.md
- Improve model constants and face utilities
- Update detection models (retinaface, scrfd) with enhanced functionality
- Update project configuration in pyproject.toml
2025-11-15 21:09:37 +09:00
yakhyo
df673c4a3f remove release guide 2025-11-11 21:38:57 +09:00
yakhyo
496de7a491 Release v1.0.0: First stable release with complete feature set
- Added high-level pipeline API (FacePipeline, process_faces, compare_faces)
- Implemented face detection (RetinaFace, SCRFD)
- Implemented face recognition (ArcFace, MobileFace, SphereFace)
- Added landmark detection (106 points)
- Added attribute analysis (age, gender, emotion)
- Comprehensive examples and documentation
- Cleaned up legacy code and simplified documentation structure
2025-11-11 21:38:57 +09:00
yakhyo
89a05e4689 fix ci badge reference 2025-11-08 01:46:28 +09:00
yakhyo
d3c2d959d0 add permissions for github release creation 2025-11-08 01:43:47 +09:00
48 changed files with 2503 additions and 1213 deletions


@@ -66,6 +66,9 @@ jobs:
publish:
runs-on: ubuntu-latest
needs: [validate, test]
permissions:
contents: write
id-token: write
environment:
name: pypi
url: https://pypi.org/project/uniface/

MODELS.md

@@ -113,13 +113,15 @@ embedding = recognizer.get_normalized_embedding(image, landmarks)
Lightweight face recognition optimized for mobile devices.
| Model Name | Backbone | Params | Size | Use Case |
|-----------------|-----------------|--------|------|--------------------|
| `MNET_025` | MobileNetV1 0.25| 0.2M | 1MB | Ultra-lightweight |
| `MNET_V2` ⭐ | MobileNetV2 | 1.0M | 4MB | **Mobile/Edge** |
| `MNET_V3_SMALL` | MobileNetV3-S | 0.8M | 3MB | Mobile optimized |
| `MNET_V3_LARGE` | MobileNetV3-L | 2.5M | 10MB | Balanced mobile |
| Model Name | Backbone | Params | Size | LFW | CALFW | CPLFW | AgeDB-30 | Use Case |
|-----------------|-----------------|--------|------|-------|-------|-------|----------|--------------------|
| `MNET_025` | MobileNetV1 0.25| 0.36M | 1MB | 98.76%| 92.02%| 82.37%| 90.02% | Ultra-lightweight |
| `MNET_V2` ⭐ | MobileNetV2 | 2.29M | 4MB | 99.55%| 94.87%| 86.89%| 95.16% | **Mobile/Edge** |
| `MNET_V3_SMALL` | MobileNetV3-S | 1.25M | 3MB | 99.30%| 93.77%| 85.29%| 92.79% | Mobile optimized |
| `MNET_V3_LARGE` | MobileNetV3-L | 3.52M | 10MB | 99.53%| 94.56%| 86.79%| 95.13% | Balanced mobile |
**Dataset**: Trained on MS1M-V2 (5.8M images, 85K identities)
**Accuracy**: Evaluated on LFW, CALFW, CPLFW, and AgeDB-30 benchmarks
**Note**: These models are lightweight alternatives to ArcFace for resource-constrained environments
#### Usage
@@ -138,12 +140,14 @@ recognizer = MobileFace(model_name=MobileFaceWeights.MNET_V2)
Face recognition using angular softmax loss.
| Model Name | Backbone | Params | Size | Use Case |
|-------------|----------|--------|------|----------------------|
| `SPHERE20` | Sphere20 | 13.0M | 50MB | Research/Comparison |
| `SPHERE36` | Sphere36 | 24.2M | 92MB | Research/Comparison |
| Model Name | Backbone | Params | Size | LFW | CALFW | CPLFW | AgeDB-30 | Use Case |
|-------------|----------|--------|------|-------|-------|-------|----------|----------------------|
| `SPHERE20` | Sphere20 | 24.5M | 50MB | 99.67%| 95.61%| 88.75%| 96.58% | Research/Comparison |
| `SPHERE36` | Sphere36 | 34.6M | 92MB | 99.72%| 95.64%| 89.92%| 96.83% | Research/Comparison |
**Note**: SphereFace uses angular softmax loss, an earlier approach before ArcFace
**Dataset**: Trained on MS1M-V2 (5.8M images, 85K identities)
**Accuracy**: Evaluated on LFW, CALFW, CPLFW, and AgeDB-30 benchmarks
**Note**: SphereFace uses angular softmax loss, an earlier approach before ArcFace. These models provide good accuracy with moderate resource requirements.
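To make the benchmark tables above concrete, here is a minimal sketch (using only APIs that appear elsewhere in this changeset: `RetinaFace.detect`, `get_normalized_embedding`, and `compute_similarity`) that compares two images with MobileFace; the image paths are placeholders:

```python
import cv2
import numpy as np

from uniface.constants import MobileFaceWeights
from uniface.detection import RetinaFace
from uniface.face_utils import compute_similarity
from uniface.recognition import MobileFace

detector = RetinaFace()
recognizer = MobileFace(model_name=MobileFaceWeights.MNET_V2)

def embed(path: str) -> np.ndarray:
    image = cv2.imread(path)
    faces = detector.detect(image)
    landmarks = np.array(faces[0]["landmarks"])  # 5-point landmarks used for alignment
    return recognizer.get_normalized_embedding(image, landmarks)

# cosine similarity of L2-normalized embeddings
similarity = compute_similarity(embed("face1.jpg"), embed("face2.jpg"), normalized=True)
print(f"similarity: {similarity:.3f}")
```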
#### Usage
@@ -264,10 +268,10 @@ emotion, confidence = predictor.predict(image, landmarks)
### By Hardware
#### Apple Silicon (M1/M2/M3/M4)
**Recommended**: All models work well with CoreML acceleration
**Recommended**: All models work well with ARM64 optimizations (automatically included)
```bash
pip install uniface[silicon]
pip install uniface
```
**Recommended models**:

QUICKSTART.md

@@ -7,8 +7,8 @@ Get up and running with UniFace in 5 minutes! This guide covers the most common
## Installation
```bash
# macOS (Apple Silicon)
pip install uniface[silicon]
# macOS (Apple Silicon) - automatically includes ARM64 optimizations
pip install uniface
# Linux/Windows with NVIDIA GPU
pip install uniface[gpu]
@@ -114,9 +114,9 @@ if faces1 and faces2:
# Interpret result
if similarity > 0.6:
print(f"Same person (similarity: {similarity:.3f})")
print(f"Same person (similarity: {similarity:.3f})")
else:
print(f"Different people (similarity: {similarity:.3f})")
print(f"Different people (similarity: {similarity:.3f})")
else:
print("No faces detected")
```
@@ -264,31 +264,46 @@ print("Done!")
Choose the right model for your use case:
### Detection Models
```python
from uniface import create_detector
from uniface.detection import RetinaFace, SCRFD
from uniface.constants import RetinaFaceWeights, SCRFDWeights
# Fast detection (mobile/edge devices)
detector = create_detector(
'retinaface',
detector = RetinaFace(
model_name=RetinaFaceWeights.MNET_025,
conf_thresh=0.7
)
# Balanced (recommended)
detector = create_detector(
'retinaface',
detector = RetinaFace(
model_name=RetinaFaceWeights.MNET_V2
)
# High accuracy (server/GPU)
detector = create_detector(
'scrfd',
detector = SCRFD(
model_name=SCRFDWeights.SCRFD_10G_KPS,
conf_thresh=0.5
)
```
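If in doubt between these presets, a rough timing loop settles it; a sketch (absolute numbers vary by hardware and execution provider, and the first call may include a model download):

```python
import time

import numpy as np

from uniface.detection import SCRFD, RetinaFace

image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)  # stand-in frame

for detector in (RetinaFace(), SCRFD()):
    detector.detect(image)  # warm-up: keeps model load out of the timing
    start = time.perf_counter()
    for _ in range(20):
        detector.detect(image)
    avg_ms = (time.perf_counter() - start) / 20 * 1000
    print(f"{type(detector).__name__}: {avg_ms:.1f} ms/frame")
```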
### Recognition Models
```python
from uniface import ArcFace, MobileFace, SphereFace
from uniface.constants import MobileFaceWeights, SphereFaceWeights
# ArcFace (recommended for most use cases)
recognizer = ArcFace() # Best accuracy
# MobileFace (lightweight for mobile/edge)
recognizer = MobileFace(model_name=MobileFaceWeights.MNET_V2) # Fast, small size
# SphereFace (angular margin approach)
recognizer = SphereFace(model_name=SphereFaceWeights.SPHERE20) # Alternative method
```
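All three recognizers expose the same embedding interface, so they can be swapped behind one helper; the sketch below mirrors the `get_recognizer` helper added in `scripts/run_recognition.py` later in this diff:

```python
from uniface import ArcFace, MobileFace, SphereFace

def get_recognizer(name: str):
    # same selection logic as scripts/run_recognition.py and run_face_search.py
    if name == "arcface":
        return ArcFace()
    elif name == "mobileface":
        return MobileFace()
    return SphereFace()

recognizer = get_recognizer("mobileface")
```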
---
## Common Issues
@@ -316,20 +331,22 @@ print("Available providers:", ort.get_available_providers())
### 3. Slow Performance on Mac
Make sure you installed with CoreML support:
The standard installation includes ARM64 optimizations for Apple Silicon. If performance is slow, verify you're using the ARM64 build of Python:
```bash
pip install uniface[silicon]
python -c "import platform; print(platform.machine())"
# Should show: arm64 (not x86_64)
```
### 4. Import Errors
```python
# Correct imports
from uniface import RetinaFace, ArcFace, Landmark106
from uniface.detection import create_detector
# Correct imports
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.landmark import Landmark106
# Wrong imports
from uniface import retinaface # Module, not class
```

README.md

@@ -3,7 +3,7 @@
[![License](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
![Python](https://img.shields.io/badge/Python-3.10%2B-blue)
[![PyPI Version](https://img.shields.io/pypi/v/uniface.svg)](https://pypi.org/project/uniface/)
[![Build Status](https://github.com/yakhyo/uniface/actions/workflows/build.yml/badge.svg)](https://github.com/yakhyo/uniface/actions)
[![CI](https://github.com/yakhyo/uniface/actions/workflows/ci.yml/badge.svg)](https://github.com/yakhyo/uniface/actions)
[![Downloads](https://pepy.tech/badge/uniface)](https://pepy.tech/project/uniface)
<div align="center">
@@ -21,7 +21,7 @@
- **Face Recognition**: ArcFace, MobileFace, and SphereFace embeddings
- **Attribute Analysis**: Age, gender, and emotion detection
- **Face Alignment**: Precise alignment for downstream tasks
- **Hardware Acceleration**: CoreML (Apple Silicon), CUDA (NVIDIA), CPU fallback
- **Hardware Acceleration**: ARM64 optimizations (Apple Silicon), CUDA (NVIDIA), CPU fallback
- **Simple API**: Intuitive factory functions and clean interfaces
- **Production-Ready**: Type hints, comprehensive logging, PEP8 compliant
@@ -39,27 +39,19 @@ pip install uniface
#### macOS (Apple Silicon - M1/M2/M3/M4)
For optimal performance with **CoreML acceleration** (3-5x faster):
For Apple Silicon Macs, the standard installation automatically includes optimized ARM64 support:
```bash
# Standard installation (CPU only)
pip install uniface
# With CoreML acceleration (recommended for M-series chips)
pip install uniface[silicon]
```
**Verify CoreML is available:**
```python
import onnxruntime as ort
print(ort.get_available_providers())
# Should show: ['CoreMLExecutionProvider', 'CPUExecutionProvider']
```
The base `onnxruntime` package (included with uniface) has native Apple Silicon support with ARM64 optimizations built-in since version 1.13+.
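To see which execution providers your installed build actually exposes, the standard check is:

```python
import onnxruntime as ort

# The exact list depends on the installed onnxruntime build; the CPU provider
# is always present, and on Apple Silicon it carries the ARM64 optimizations.
print(ort.get_available_providers())
```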
#### Linux/Windows with NVIDIA GPU
For CUDA acceleration on NVIDIA GPUs:
```bash
# With CUDA acceleration
pip install uniface[gpu]
```
@@ -172,28 +164,29 @@ print(f"{gender}, {age} years old")
### Factory Functions (Recommended)
```python
from uniface import create_detector, create_recognizer, create_landmarker
from uniface.detection import RetinaFace, SCRFD
from uniface.recognition import ArcFace
from uniface.landmark import Landmark106
# Create detector with default settings
detector = create_detector('retinaface')
detector = RetinaFace()
# Create with custom config
detector = create_detector(
'scrfd',
detector = SCRFD(
model_name='scrfd_10g_kps',
conf_thresh=0.8,
input_size=(640, 640)
)
# Recognition and landmarks
recognizer = create_recognizer('arcface')
landmarker = create_landmarker('2d106det')
recognizer = ArcFace()
landmarker = Landmark106()
```
### Direct Model Instantiation
```python
from uniface import RetinaFace, SCRFD, ArcFace, MobileFace
from uniface import RetinaFace, SCRFD, ArcFace, MobileFace, SphereFace
from uniface.constants import RetinaFaceWeights
# Detection
@@ -206,6 +199,7 @@ detector = RetinaFace(
# Recognition
recognizer = ArcFace() # Uses default weights
recognizer = MobileFace() # Lightweight alternative
recognizer = SphereFace() # Angular softmax alternative
```
### High-Level Detection API
@@ -400,12 +394,28 @@ pip install -e ".[dev]"
# Run tests
pytest
# Format code
black uniface/
isort uniface/
```
### Code Formatting
This project uses [Ruff](https://docs.astral.sh/ruff/) for linting and formatting.
```bash
# Format code
ruff format .
# Check for linting errors
ruff check .
# Auto-fix linting errors
ruff check . --fix
```
Ruff configuration is in `pyproject.toml`. Key settings:
- Line length: 120
- Python target: 3.10+
- Import sorting: `uniface` as first-party
### Project Structure
```

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

pyproject.toml

@@ -1,6 +1,6 @@
[project]
name = "uniface"
version = "0.1.9"
version = "1.1.1"
description = "UniFace: A Comprehensive Library for Face Detection, Recognition, Landmark Analysis, Age, and Gender Detection"
readme = "README.md"
license = { text = "MIT" }
@@ -19,9 +19,8 @@ dependencies = [
requires-python = ">=3.10"
[project.optional-dependencies]
dev = ["pytest>=7.0.0"]
dev = ["pytest>=7.0.0", "ruff>=0.4.0"]
gpu = ["onnxruntime-gpu>=1.16.0"]
silicon = ["onnxruntime-silicon>=1.16.0"]
[project.urls]
Homepage = "https://github.com/yakhyo/uniface"
@@ -36,3 +35,13 @@ packages = { find = {} }
[tool.setuptools.package-data]
"uniface" = ["*.txt", "*.md"]
[tool.ruff]
line-length = 120
target-version = "py310"
[tool.ruff.lint]
select = ["E", "F", "I", "W"]
[tool.ruff.lint.isort]
known-first-party = ["uniface"]

scripts/README.md

@@ -1,18 +1,68 @@
### `download_model.py`
# Scripts
# Download all models
Scripts for testing UniFace features.
## Available Scripts
| Script | Description |
|--------|-------------|
| `run_detection.py` | Face detection on image or webcam |
| `run_age_gender.py` | Age and gender prediction |
| `run_landmarks.py` | 106-point facial landmark detection |
| `run_recognition.py` | Face embedding extraction and comparison |
| `run_face_search.py` | Real-time face matching against reference |
| `run_video_detection.py` | Face detection on video files |
| `batch_process.py` | Batch process folder of images |
| `download_model.py` | Download model weights |
| `sha256_generate.py` | Generate SHA256 hash for model files |
## Usage Examples
```bash
python scripts/download_model.py
# Face detection
python scripts/run_detection.py --image assets/test.jpg
python scripts/run_detection.py --webcam
# Age and gender
python scripts/run_age_gender.py --image assets/test.jpg
python scripts/run_age_gender.py --webcam
# Landmarks
python scripts/run_landmarks.py --image assets/test.jpg
python scripts/run_landmarks.py --webcam
# Face recognition (extract embedding)
python scripts/run_recognition.py --image assets/test.jpg
# Face comparison
python scripts/run_recognition.py --image1 face1.jpg --image2 face2.jpg
# Face search (match webcam against reference)
python scripts/run_face_search.py --image reference.jpg
# Video processing
python scripts/run_video_detection.py --input video.mp4 --output output.mp4
# Batch processing
python scripts/batch_process.py --input images/ --output results/
# Download models
python scripts/download_model.py --model-type retinaface
python scripts/download_model.py # downloads all
```
# Download just RESNET18
## Common Options
| Option | Description |
|--------|-------------|
| `--image` | Path to input image |
| `--webcam` | Use webcam instead of image |
| `--detector` | Choose detector: `retinaface` or `scrfd` |
| `--threshold` | Visualization confidence threshold (default: 0.6) |
| `--save_dir` | Output directory (default: `outputs`) |
## Quick Test
```bash
python scripts/download_model.py --model RESNET18
python scripts/run_detection.py --image assets/test.jpg
```
### `run_inference.py`
```bash
python scripts/run_inference.py --image assets/test.jpg --model MNET_V2 --iterations 10
```


@@ -1,389 +0,0 @@
# Testing Scripts Guide
Complete guide to testing all scripts in the `scripts/` directory.
---
## 📁 Available Scripts
1. **download_model.py** - Download and verify model weights
2. **run_detection.py** - Face detection on images
3. **run_recognition.py** - Face recognition (extract embeddings)
4. **run_face_search.py** - Real-time face matching with webcam
5. **sha256_generate.py** - Generate SHA256 checksums for models
---
## Testing Each Script
### 1. Test Model Download
```bash
# Download a specific model
python scripts/download_model.py --model MNET_V2
# Download all RetinaFace models (takes ~5 minutes, ~200MB)
python scripts/download_model.py
# Verify models are cached
ls -lh ~/.uniface/models/
```
**Expected Output:**
```
📥 Downloading model: retinaface_mnet_v2
2025-11-08 00:00:00 - INFO - Downloading model 'RetinaFaceWeights.MNET_V2' from https://...
Downloading ~/.uniface/models/retinaface_mnet_v2.onnx: 100%|████| 3.5M/3.5M
2025-11-08 00:00:05 - INFO - Successfully downloaded 'RetinaFaceWeights.MNET_V2'
✅ All requested weights are ready and verified.
```
---
### 2. Test Face Detection
```bash
# Basic detection
python scripts/run_detection.py --image assets/test.jpg
# With custom settings
python scripts/run_detection.py \
--image assets/test.jpg \
--method scrfd \
--threshold 0.7 \
--save_dir outputs
# Benchmark mode (100 iterations)
python scripts/run_detection.py \
--image assets/test.jpg \
--iterations 100
```
**Expected Output:**
```
Initializing detector: retinaface
2025-11-08 00:00:00 - INFO - Initializing RetinaFace with model=RetinaFaceWeights.MNET_V2...
2025-11-08 00:00:01 - INFO - CoreML acceleration enabled (Apple Silicon)
✅ Output saved at: outputs/test_out.jpg
[1/1] ⏱️ Inference time: 0.0234 seconds
```
**Verify Output:**
```bash
# Check output image was created
ls -lh outputs/test_out.jpg
# View the image (macOS)
open outputs/test_out.jpg
```
---
### 3. Test Face Recognition (Embedding Extraction)
```bash
# Extract embeddings from an image
python scripts/run_recognition.py --image assets/test.jpg
# With different models
python scripts/run_recognition.py \
--image assets/test.jpg \
--detector scrfd \
--recognizer mobileface
```
**Expected Output:**
```
Initializing detector: retinaface
Initializing recognizer: arcface
2025-11-08 00:00:00 - INFO - Successfully initialized face encoder from ~/.uniface/models/w600k_mbf.onnx
Detected 1 face(s). Extracting embeddings for the first face...
- Embedding shape: (1, 512)
- L2 norm of unnormalized embedding: 64.2341
- L2 norm of normalized embedding: 1.0000
```
---
### 4. Test Real-Time Face Search (Webcam)
**Prerequisites:**
- Webcam connected
- Reference image with a clear face
```bash
# Basic usage
python scripts/run_face_search.py --image assets/test.jpg
# With custom models
python scripts/run_face_search.py \
--image assets/test.jpg \
--detector scrfd \
--recognizer arcface
```
**Expected Behavior:**
1. Webcam window opens
2. Faces are detected in real-time
3. Green box = Match (similarity > 0.4)
4. Red box = Unknown (similarity < 0.4)
5. Press 'q' to quit
**Expected Output:**
```
Initializing models...
2025-11-08 00:00:00 - INFO - CoreML acceleration enabled (Apple Silicon)
Extracting reference embedding...
Webcam started. Press 'q' to quit.
```
**Troubleshooting:**
```bash
# If webcam doesn't open
python -c "import cv2; cap = cv2.VideoCapture(0); print('Webcam OK' if cap.isOpened() else 'Webcam FAIL')"
# If no faces detected
# - Ensure good lighting
# - Face should be frontal and clearly visible
# - Try lowering threshold: edit script line 29, change 0.4 to 0.3
```
---
### 5. Test SHA256 Generator (For Developers)
```bash
# Generate checksum for a model file
python scripts/sha256_generate.py ~/.uniface/models/retinaface_mnet_v2.onnx
# Generate for all models
for model in ~/.uniface/models/*.onnx; do
python scripts/sha256_generate.py "$model"
done
```
---
## 🔍 Quick Verification Tests
### Test 1: Imports Work
```bash
python -c "
from uniface.detection import create_detector
from uniface.recognition import create_recognizer
print('✅ Imports successful')
"
```
### Test 2: Models Download
```bash
python -c "
from uniface import RetinaFace
detector = RetinaFace()
print('✅ Model downloaded and loaded')
"
```
### Test 3: Detection Works
```bash
python -c "
import cv2
import numpy as np
from uniface import RetinaFace
detector = RetinaFace()
image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = detector.detect(image)
print(f'✅ Detection works, found {len(faces)} faces')
"
```
### Test 4: Recognition Works
```bash
python -c "
import cv2
import numpy as np
from uniface import RetinaFace, ArcFace
detector = RetinaFace()
recognizer = ArcFace()
image = cv2.imread('assets/test.jpg')
faces = detector.detect(image)
if faces:
landmarks = np.array(faces[0]['landmarks'])
embedding = recognizer.get_normalized_embedding(image, landmarks)
print(f'✅ Recognition works, embedding shape: {embedding.shape}')
else:
print('⚠️ No faces detected in test image')
"
```
---
## End-to-End Test Workflow
Run this complete workflow to verify everything works:
```bash
#!/bin/bash
# Save as test_all_scripts.sh
echo "=== Testing UniFace Scripts ==="
echo ""
# Test 1: Download models
echo "1⃣ Testing model download..."
python scripts/download_model.py --model MNET_V2
if [ $? -eq 0 ]; then
echo "✅ Model download: PASS"
else
echo "❌ Model download: FAIL"
exit 1
fi
echo ""
# Test 2: Face detection
echo "2⃣ Testing face detection..."
python scripts/run_detection.py --image assets/test.jpg --save_dir /tmp/uniface_test
if [ $? -eq 0 ] && [ -f /tmp/uniface_test/test_out.jpg ]; then
echo "✅ Face detection: PASS"
else
echo "❌ Face detection: FAIL"
exit 1
fi
echo ""
# Test 3: Face recognition
echo "3⃣ Testing face recognition..."
python scripts/run_recognition.py --image assets/test.jpg > /tmp/uniface_recognition.log
if [ $? -eq 0 ] && grep -q "Embedding shape" /tmp/uniface_recognition.log; then
echo "✅ Face recognition: PASS"
else
echo "❌ Face recognition: FAIL"
exit 1
fi
echo ""
echo "=== All Tests Passed! 🎉 ==="
```
**Run the test suite:**
```bash
chmod +x test_all_scripts.sh
./test_all_scripts.sh
```
---
## Performance Benchmarking
### Benchmark Detection Speed
```bash
# Test different models
for model in retinaface scrfd; do
echo "Testing $model..."
python scripts/run_detection.py \
--image assets/test.jpg \
--method $model \
--iterations 50
done
```
### Benchmark Recognition Speed
```bash
# Test different recognizers
for recognizer in arcface mobileface; do
echo "Testing $recognizer..."
time python scripts/run_recognition.py \
--image assets/test.jpg \
--recognizer $recognizer
done
```
---
## 🐛 Common Issues
### Issue: "No module named 'uniface'"
```bash
# Solution: Install in editable mode
pip install -e .
```
### Issue: "Failed to load image"
```bash
# Check image exists
ls -lh assets/test.jpg
# Try with absolute path
python scripts/run_detection.py --image $(pwd)/assets/test.jpg
```
### Issue: "No faces detected"
```bash
# Lower confidence threshold
python scripts/run_detection.py \
--image assets/test.jpg \
--threshold 0.3
```
### Issue: Models downloading slowly
```bash
# Check internet connection
curl -I https://github.com/yakhyo/uniface/releases
# Or download manually
wget https://github.com/yakhyo/uniface/releases/download/v0.1.2/retinaface_mv2.onnx \
-O ~/.uniface/models/retinaface_mnet_v2.onnx
```
### Issue: CoreML not available on Mac
```bash
# Install CoreML-enabled ONNX Runtime
pip uninstall onnxruntime
pip install onnxruntime-silicon
# Verify
python -c "import onnxruntime as ort; print(ort.get_available_providers())"
# Should show: ['CoreMLExecutionProvider', 'CPUExecutionProvider']
```
---
## ✅ Script Status Summary
| Script | Status | API Updated | Tested |
|-----------------------|--------|-------------|--------|
| download_model.py | ✅ | ✅ | ✅ |
| run_detection.py | ✅ | ✅ | ✅ |
| run_recognition.py | ✅ | ✅ | ✅ |
| run_face_search.py | ✅ | ✅ | ✅ |
| sha256_generate.py | ✅ | N/A | ✅ |
All scripts are updated and working with the new dict-based API! 🎉
---
## 📝 Notes
- All scripts now use the factory functions (`create_detector`, `create_recognizer`)
- Scripts work with the new dict-based detection API
- Model download bug is fixed (enum vs string issue)
- CoreML acceleration is automatically detected on Apple Silicon
- All scripts include proper error handling
---
Need help with a specific script? Check the main [README.md](../README.md) or [QUICKSTART.md](../QUICKSTART.md)!

scripts/batch_process.py (new file, 96 lines)

@@ -0,0 +1,96 @@
# Batch face detection on a folder of images
# Usage: python batch_process.py --input images/ --output results/
import argparse
from pathlib import Path
import cv2
from tqdm import tqdm
from uniface import SCRFD, RetinaFace
from uniface.visualization import draw_detections
def get_image_files(input_dir: Path, extensions: tuple) -> list:
files = []
for ext in extensions:
files.extend(input_dir.glob(f"*.{ext}"))
files.extend(input_dir.glob(f"*.{ext.upper()}"))
return sorted(files)
def process_image(detector, image_path: Path, output_path: Path, threshold: float) -> int:
"""Process single image. Returns face count or -1 on error."""
image = cv2.imread(str(image_path))
if image is None:
return -1
faces = detector.detect(image)
# unpack face data for visualization
bboxes = [f["bbox"] for f in faces]
scores = [f["confidence"] for f in faces]
landmarks = [f["landmarks"] for f in faces]
draw_detections(image, bboxes, scores, landmarks, vis_threshold=threshold)
cv2.putText(
image,
f"Faces: {len(faces)}",
(10, 30),
cv2.FONT_HERSHEY_SIMPLEX,
1,
(0, 255, 0),
2,
)
cv2.imwrite(str(output_path), image)
return len(faces)
def main():
parser = argparse.ArgumentParser(description="Batch process images with face detection")
parser.add_argument("--input", type=str, required=True, help="Input directory")
parser.add_argument("--output", type=str, required=True, help="Output directory")
parser.add_argument("--detector", type=str, default="retinaface", choices=["retinaface", "scrfd"])
parser.add_argument("--threshold", type=float, default=0.6, help="Visualization threshold")
parser.add_argument("--extensions", type=str, default="jpg,jpeg,png,bmp", help="Image extensions")
args = parser.parse_args()
input_path = Path(args.input)
output_path = Path(args.output)
if not input_path.exists():
print(f"Error: Input directory '{args.input}' does not exist")
return
output_path.mkdir(parents=True, exist_ok=True)
extensions = tuple(ext.strip() for ext in args.extensions.split(","))
image_files = get_image_files(input_path, extensions)
if not image_files:
print(f"No images found with extensions {extensions}")
return
print(f"Found {len(image_files)} images")
detector = RetinaFace() if args.detector == "retinaface" else SCRFD()
success, errors, total_faces = 0, 0, 0
for img_path in tqdm(image_files, desc="Processing", unit="img"):
out_path = output_path / f"{img_path.stem}_detected{img_path.suffix}"
result = process_image(detector, img_path, out_path, args.threshold)
if result >= 0:
success += 1
total_faces += result
else:
errors += 1
print(f"\nFailed: {img_path.name}")
print(f"\nDone! {success} processed, {errors} errors, {total_faces} faces total")
if __name__ == "__main__":
main()

scripts/download_model.py

@@ -1,31 +1,60 @@
import argparse
from uniface.constants import RetinaFaceWeights
from uniface.constants import (
AgeGenderWeights,
ArcFaceWeights,
DDAMFNWeights,
LandmarkWeights,
MobileFaceWeights,
RetinaFaceWeights,
SCRFDWeights,
SphereFaceWeights,
)
from uniface.model_store import verify_model_weights
MODEL_TYPES = {
"retinaface": RetinaFaceWeights,
"sphereface": SphereFaceWeights,
"mobileface": MobileFaceWeights,
"arcface": ArcFaceWeights,
"scrfd": SCRFDWeights,
"ddamfn": DDAMFNWeights,
"agegender": AgeGenderWeights,
"landmark": LandmarkWeights,
}
def download_models(model_enum):
for weight in model_enum:
print(f"Downloading: {weight.value}")
try:
verify_model_weights(weight)
print(f" Done: {weight.value}")
except Exception as e:
print(f" Failed: {e}")
def main():
parser = argparse.ArgumentParser(description="Download and verify RetinaFace model weights.")
parser = argparse.ArgumentParser(description="Download model weights")
parser.add_argument(
"--model",
"--model-type",
type=str,
choices=[m.name for m in RetinaFaceWeights],
help="Model to download (e.g. MNET_V2). If not specified, all models will be downloaded.",
choices=list(MODEL_TYPES.keys()),
help="Model type to download. If not specified, downloads all.",
)
args = parser.parse_args()
if args.model:
weight = RetinaFaceWeights[args.model]
print(f"📥 Downloading model: {weight.value}")
verify_model_weights(weight) # Pass enum, not string
if args.model_type:
print(f"Downloading {args.model_type} models...")
download_models(MODEL_TYPES[args.model_type])
else:
print("📥 Downloading all models...")
for weight in RetinaFaceWeights:
verify_model_weights(weight) # Pass enum, not string
print("Downloading all models...")
for name, model_enum in MODEL_TYPES.items():
print(f"\n{name}:")
download_models(model_enum)
print("✅ All requested weights are ready and verified.")
print("\nDone!")
if __name__ == "__main__":
main()
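The same verification is available programmatically; a minimal sketch (weights are cached under `~/.uniface/models/`, per the testing guide removed above):

```python
from uniface.constants import RetinaFaceWeights
from uniface.model_store import verify_model_weights

# Pass the enum member, not a string; downloads on first use, then verifies the cached file.
verify_model_weights(RetinaFaceWeights.MNET_V2)
```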

scripts/run_age_gender.py (new file, 124 lines)

@@ -0,0 +1,124 @@
# Age and gender prediction on detected faces
# Usage: python run_age_gender.py --image path/to/image.jpg
# python run_age_gender.py --webcam
import argparse
import os
from pathlib import Path
import cv2
from uniface import SCRFD, AgeGender, RetinaFace
from uniface.visualization import draw_detections
def draw_age_gender_label(image, bbox, gender: str, age: int):
"""Draw age/gender label above the bounding box."""
x1, y1 = int(bbox[0]), int(bbox[1])
text = f"{gender}, {age}y"
(tw, th), _ = cv2.getTextSize(text, cv2.FONT_HERSHEY_SIMPLEX, 0.6, 2)
cv2.rectangle(image, (x1, y1 - th - 10), (x1 + tw + 10, y1), (0, 255, 0), -1)
cv2.putText(image, text, (x1 + 5, y1 - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 0), 2)
def process_image(
detector,
age_gender,
image_path: str,
save_dir: str = "outputs",
threshold: float = 0.6,
):
image = cv2.imread(image_path)
if image is None:
print(f"Error: Failed to load image from '{image_path}'")
return
faces = detector.detect(image)
print(f"Detected {len(faces)} face(s)")
if not faces:
return
bboxes = [f["bbox"] for f in faces]
scores = [f["confidence"] for f in faces]
landmarks = [f["landmarks"] for f in faces]
draw_detections(image, bboxes, scores, landmarks, vis_threshold=threshold)
for i, face in enumerate(faces):
gender, age = age_gender.predict(image, face["bbox"])
print(f" Face {i + 1}: {gender}, {age} years old")
draw_age_gender_label(image, face["bbox"], gender, age)
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f"{Path(image_path).stem}_age_gender.jpg")
cv2.imwrite(output_path, image)
print(f"Output saved: {output_path}")
def run_webcam(detector, age_gender, threshold: float = 0.6):
cap = cv2.VideoCapture(0) # 0 = default webcam
if not cap.isOpened():
print("Cannot open webcam")
return
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
if not ret:
    break
frame = cv2.flip(frame, 1)  # mirror for natural interaction
faces = detector.detect(frame)
# unpack face data for visualization
bboxes = [f["bbox"] for f in faces]
scores = [f["confidence"] for f in faces]
landmarks = [f["landmarks"] for f in faces]
draw_detections(frame, bboxes, scores, landmarks, vis_threshold=threshold)
for face in faces:
gender, age = age_gender.predict(frame, face["bbox"]) # predict per face
draw_age_gender_label(frame, face["bbox"], gender, age)
cv2.putText(
frame,
f"Faces: {len(faces)}",
(10, 30),
cv2.FONT_HERSHEY_SIMPLEX,
1,
(0, 255, 0),
2,
)
cv2.imshow("Age & Gender Detection", frame)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
cap.release()
cv2.destroyAllWindows()
def main():
parser = argparse.ArgumentParser(description="Run age and gender detection")
parser.add_argument("--image", type=str, help="Path to input image")
parser.add_argument("--webcam", action="store_true", help="Use webcam")
parser.add_argument("--detector", type=str, default="retinaface", choices=["retinaface", "scrfd"])
parser.add_argument("--threshold", type=float, default=0.6, help="Visualization threshold")
parser.add_argument("--save_dir", type=str, default="outputs")
args = parser.parse_args()
if not args.image and not args.webcam:
parser.error("Either --image or --webcam must be specified")
detector = RetinaFace() if args.detector == "retinaface" else SCRFD()
age_gender = AgeGender()
if args.webcam:
run_webcam(detector, age_gender, args.threshold)
else:
process_image(detector, age_gender, args.image, args.save_dir, args.threshold)
if __name__ == "__main__":
main()
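Stripped to its core, the image path above reduces to a few lines; a sketch using the dict-based detection API:

```python
import cv2

from uniface import AgeGender, RetinaFace

image = cv2.imread("assets/test.jpg")
faces = RetinaFace().detect(image)  # list of dicts with "bbox", "confidence", "landmarks"
if faces:
    gender, age = AgeGender().predict(image, faces[0]["bbox"])  # e.g. ("Male", 32)
    print(f"{gender}, {age} years old")
```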

scripts/run_detection.py

@@ -1,86 +1,94 @@
import os
import cv2
import time
import argparse
import numpy as np
# Face detection on image or webcam
# Usage: python run_detection.py --image path/to/image.jpg
# python run_detection.py --webcam
# UPDATED: Use the factory function and import from the new location
from uniface.detection import create_detector
import argparse
import os
import cv2
from uniface.detection import SCRFD, RetinaFace
from uniface.visualization import draw_detections
def run_inference(detector, image_path: str, vis_threshold: float = 0.6, save_dir: str = "outputs"):
"""
Run face detection on a single image.
Args:
detector: Initialized face detector.
image_path (str): Path to input image.
vis_threshold (float): Threshold for drawing detections.
save_dir (str): Directory to save output image.
"""
def process_image(detector, image_path: str, threshold: float = 0.6, save_dir: str = "outputs"):
image = cv2.imread(image_path)
if image is None:
print(f"Error: Failed to load image from '{image_path}'")
print(f"Error: Failed to load image from '{image_path}'")
return
# 1. Get the list of face dictionaries from the detector
faces = detector.detect(image)
if faces:
# 2. Unpack the data into separate lists
bboxes = [face['bbox'] for face in faces]
scores = [face['confidence'] for face in faces]
landmarks = [face['landmarks'] for face in faces]
# 3. Pass the unpacked lists to the drawing function
draw_detections(image, bboxes, scores, landmarks, vis_threshold=0.6)
bboxes = [face["bbox"] for face in faces]
scores = [face["confidence"] for face in faces]
landmarks = [face["landmarks"] for face in faces]
draw_detections(image, bboxes, scores, landmarks, vis_threshold=threshold)
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f"{os.path.splitext(os.path.basename(image_path))[0]}_out.jpg")
cv2.imwrite(output_path, image)
print(f"Output saved at: {output_path}")
print(f"Output saved: {output_path}")
def run_webcam(detector, threshold: float = 0.6):
cap = cv2.VideoCapture(0) # 0 = default webcam
if not cap.isOpened():
print("Cannot open webcam")
return
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
if not ret:
    break
frame = cv2.flip(frame, 1)  # mirror for natural interaction
faces = detector.detect(frame)
# unpack face data for visualization
bboxes = [f["bbox"] for f in faces]
scores = [f["confidence"] for f in faces]
landmarks = [f["landmarks"] for f in faces]
draw_detections(frame, bboxes, scores, landmarks, vis_threshold=threshold)
cv2.putText(
frame,
f"Faces: {len(faces)}",
(10, 30),
cv2.FONT_HERSHEY_SIMPLEX,
1,
(0, 255, 0),
2,
)
cv2.imshow("Face Detection", frame)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
cap.release()
cv2.destroyAllWindows()
def main():
parser = argparse.ArgumentParser(description="Run face detection on an image.")
parser.add_argument("--image", type=str, required=True, help="Path to the input image")
parser.add_argument(
"--method",
type=str,
default="retinaface",
choices=['retinaface', 'scrfd'],
help="Detection method to use."
)
parser.add_argument("--threshold", type=float, default=0.6, help="Visualization confidence threshold")
parser.add_argument("--iterations", type=int, default=1, help="Number of inference runs for benchmarking")
parser.add_argument("--save_dir", type=str, default="outputs", help="Directory to save output images")
parser.add_argument("--verbose", action="store_true", help="Enable verbose logging")
parser = argparse.ArgumentParser(description="Run face detection")
parser.add_argument("--image", type=str, help="Path to input image")
parser.add_argument("--webcam", action="store_true", help="Use webcam")
parser.add_argument("--method", type=str, default="retinaface", choices=["retinaface", "scrfd"])
parser.add_argument("--threshold", type=float, default=0.6, help="Visualization threshold")
parser.add_argument("--save_dir", type=str, default="outputs")
args = parser.parse_args()
if args.verbose:
from uniface import enable_logging
enable_logging()
if not args.image and not args.webcam:
parser.error("Either --image or --webcam must be specified")
print(f"Initializing detector: {args.method}")
detector = create_detector(method=args.method)
detector = RetinaFace() if args.method == "retinaface" else SCRFD()
avg_time = 0
for i in range(args.iterations):
start = time.time()
run_inference(detector, args.image, args.threshold, args.save_dir)
elapsed = time.time() - start
print(f"[{i + 1}/{args.iterations}] ⏱️ Inference time: {elapsed:.4f} seconds")
if i >= 0: # Avoid counting the first run if it includes model loading time
avg_time += elapsed
if args.iterations > 1:
# Adjust average calculation to exclude potential first-run overhead
effective_iterations = max(1, args.iterations)
print(
f"\n🔥 Average inference time over {effective_iterations} runs: {avg_time / effective_iterations:.4f} seconds")
if args.webcam:
run_webcam(detector, args.threshold)
else:
process_image(detector, args.image, args.threshold, args.save_dir)
if __name__ == "__main__":

scripts/run_face_search.py

@@ -1,16 +1,26 @@
# Real-time face search: match webcam faces against a reference image
# Usage: python run_face_search.py --image reference.jpg
import argparse
import cv2
import numpy as np
# Use the new high-level factory functions
from uniface.detection import create_detector
from uniface.detection import SCRFD, RetinaFace
from uniface.face_utils import compute_similarity
from uniface.recognition import create_recognizer
from uniface.recognition import ArcFace, MobileFace, SphereFace
def get_recognizer(name: str):
if name == "arcface":
return ArcFace()
elif name == "mobileface":
return MobileFace()
else:
return SphereFace()
def extract_reference_embedding(detector, recognizer, image_path: str) -> np.ndarray:
"""Extracts a normalized embedding from the first face found in an image."""
image = cv2.imread(image_path)
if image is None:
raise RuntimeError(f"Failed to load image: {image_path}")
@@ -19,45 +29,37 @@ def extract_reference_embedding(detector, recognizer, image_path: str) -> np.nda
if not faces:
raise RuntimeError("No faces found in reference image.")
# Get landmarks from the first detected face dictionary
landmarks = np.array(faces[0]["landmarks"])
# Use normalized embedding for more reliable similarity comparison
embedding = recognizer.get_normalized_embedding(image, landmarks)
return embedding
return recognizer.get_normalized_embedding(image, landmarks)
def run_video(detector, recognizer, ref_embedding: np.ndarray, threshold: float = 0.4):
"""Run real-time face recognition from a webcam feed."""
cap = cv2.VideoCapture(0)
def run_webcam(detector, recognizer, ref_embedding: np.ndarray, threshold: float = 0.4):
cap = cv2.VideoCapture(0) # 0 = default webcam
if not cap.isOpened():
raise RuntimeError("Webcam could not be opened.")
print("Webcam started. Press 'q' to quit.")
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
if not ret:
    break
frame = cv2.flip(frame, 1)  # mirror for natural interaction
faces = detector.detect(frame)
# Loop through each detected face
for face in faces:
# Extract bbox and landmarks from the dictionary
bbox = face["bbox"]
landmarks = np.array(face["landmarks"])
x1, y1, x2, y2 = map(int, bbox)
# Get the normalized embedding for the current face
embedding = recognizer.get_normalized_embedding(frame, landmarks)
sim = compute_similarity(ref_embedding, embedding) # compare with reference
# Compare with the reference embedding
sim = compute_similarity(ref_embedding, embedding)
# Draw results
# green = match, red = unknown
label = f"Match ({sim:.2f})" if sim > threshold else f"Unknown ({sim:.2f})"
color = (0, 255, 0) if sim > threshold else (0, 0, 255)
cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)
cv2.putText(frame, label, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
@@ -70,34 +72,25 @@ def run_video(detector, recognizer, ref_embedding: np.ndarray, threshold: float
def main():
parser = argparse.ArgumentParser(description="Face recognition using a reference image.")
parser.add_argument("--image", type=str, required=True, help="Path to the reference face image.")
parser.add_argument(
"--detector", type=str, default="scrfd", choices=["retinaface", "scrfd"], help="Face detection method."
)
parser = argparse.ArgumentParser(description="Face search using a reference image")
parser.add_argument("--image", type=str, required=True, help="Reference face image")
parser.add_argument("--threshold", type=float, default=0.4, help="Match threshold")
parser.add_argument("--detector", type=str, default="scrfd", choices=["retinaface", "scrfd"])
parser.add_argument(
"--recognizer",
type=str,
default="arcface",
choices=["arcface", "mobileface", "sphereface"],
help="Face recognition method.",
)
parser.add_argument("--verbose", action="store_true", help="Enable verbose logging")
args = parser.parse_args()
if args.verbose:
from uniface import enable_logging
detector = RetinaFace() if args.detector == "retinaface" else SCRFD()
recognizer = get_recognizer(args.recognizer)
enable_logging()
print("Initializing models...")
detector = create_detector(method=args.detector)
recognizer = create_recognizer(method=args.recognizer)
print("Extracting reference embedding...")
print(f"Loading reference: {args.image}")
ref_embedding = extract_reference_embedding(detector, recognizer, args.image)
run_video(detector, recognizer, ref_embedding)
run_webcam(detector, recognizer, ref_embedding, args.threshold)
if __name__ == "__main__":

scripts/run_landmarks.py (new file, 117 lines)

@@ -0,0 +1,117 @@
# 106-point facial landmark detection
# Usage: python run_landmarks.py --image path/to/image.jpg
# python run_landmarks.py --webcam
import argparse
import os
from pathlib import Path
import cv2
from uniface import SCRFD, Landmark106, RetinaFace
def process_image(detector, landmarker, image_path: str, save_dir: str = "outputs"):
image = cv2.imread(image_path)
if image is None:
print(f"Error: Failed to load image from '{image_path}'")
return
faces = detector.detect(image)
print(f"Detected {len(faces)} face(s)")
if not faces:
return
for i, face in enumerate(faces):
bbox = face["bbox"]
x1, y1, x2, y2 = map(int, bbox)
cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
landmarks = landmarker.get_landmarks(image, bbox)
print(f" Face {i + 1}: {len(landmarks)} landmarks")
for x, y in landmarks.astype(int):
cv2.circle(image, (x, y), 1, (0, 255, 0), -1)
cv2.putText(
image,
f"Face {i + 1}",
(x1, y1 - 10),
cv2.FONT_HERSHEY_SIMPLEX,
0.5,
(0, 255, 0),
2,
)
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f"{Path(image_path).stem}_landmarks.jpg")
cv2.imwrite(output_path, image)
print(f"Output saved: {output_path}")
def run_webcam(detector, landmarker):
cap = cv2.VideoCapture(0) # 0 = default webcam
if not cap.isOpened():
print("Cannot open webcam")
return
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
if not ret:
    break
frame = cv2.flip(frame, 1)  # mirror for natural interaction
faces = detector.detect(frame)
for face in faces:
bbox = face["bbox"]
x1, y1, x2, y2 = map(int, bbox)
cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
landmarks = landmarker.get_landmarks(frame, bbox) # 106 points
for x, y in landmarks.astype(int):
cv2.circle(frame, (x, y), 1, (0, 255, 0), -1)
cv2.putText(
frame,
f"Faces: {len(faces)}",
(10, 30),
cv2.FONT_HERSHEY_SIMPLEX,
1,
(0, 255, 0),
2,
)
cv2.imshow("106-Point Landmarks", frame)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
cap.release()
cv2.destroyAllWindows()
def main():
parser = argparse.ArgumentParser(description="Run facial landmark detection")
parser.add_argument("--image", type=str, help="Path to input image")
parser.add_argument("--webcam", action="store_true", help="Use webcam")
parser.add_argument("--detector", type=str, default="retinaface", choices=["retinaface", "scrfd"])
parser.add_argument("--save_dir", type=str, default="outputs")
args = parser.parse_args()
if not args.image and not args.webcam:
parser.error("Either --image or --webcam must be specified")
detector = RetinaFace() if args.detector == "retinaface" else SCRFD()
landmarker = Landmark106()
if args.webcam:
run_webcam(detector, landmarker)
else:
process_image(detector, landmarker, args.image, args.save_dir)
if __name__ == "__main__":
main()
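For reference, the minimal equivalent call; a sketch (get_landmarks returns one (x, y) row per point, 106 in total):

```python
import cv2

from uniface import Landmark106, RetinaFace

image = cv2.imread("assets/test.jpg")
faces = RetinaFace().detect(image)
if faces:
    landmarks = Landmark106().get_landmarks(image, faces[0]["bbox"])
    print(landmarks.shape)  # expected: (106, 2)
```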

scripts/run_recognition.py

@@ -1,80 +1,102 @@
import cv2
# Face recognition: extract embeddings or compare two faces
# Usage: python run_recognition.py --image path/to/image.jpg
# python run_recognition.py --image1 face1.jpg --image2 face2.jpg
import argparse
import cv2
import numpy as np
# Use the new high-level factory functions for consistency
from uniface.detection import create_detector
from uniface.recognition import create_recognizer
from uniface.detection import SCRFD, RetinaFace
from uniface.face_utils import compute_similarity
from uniface.recognition import ArcFace, MobileFace, SphereFace
def get_recognizer(name: str):
if name == "arcface":
return ArcFace()
elif name == "mobileface":
return MobileFace()
else:
return SphereFace()
def run_inference(detector, recognizer, image_path: str):
"""
Detect faces and extract embeddings from a single image.
Args:
detector: Initialized face detector.
recognizer: Initialized face recognition model.
image_path (str): Path to the input image.
"""
image = cv2.imread(image_path)
if image is None:
print(f"Error: Failed to load image from '{image_path}'")
return
faces = detector.detect(image)
if not faces:
print("No faces detected.")
return
print(f"Detected {len(faces)} face(s). Extracting embeddings for the first face...")
print(f"Detected {len(faces)} face(s). Extracting embedding for the first face...")
# Process the first detected face
first_face = faces[0]
landmarks = np.array(first_face['landmarks']) # Convert landmarks to numpy array
# Extract embedding using the landmarks from the face dictionary
landmarks = np.array(faces[0]["landmarks"]) # 5-point landmarks for alignment
embedding = recognizer.get_embedding(image, landmarks)
norm_embedding = recognizer.get_normalized_embedding(image, landmarks)
norm_embedding = recognizer.get_normalized_embedding(image, landmarks) # L2 normalized
# Print some info about the embeddings
print(f" - Embedding shape: {embedding.shape}")
print(f" - L2 norm of unnormalized embedding: {np.linalg.norm(embedding):.4f}")
print(f" - L2 norm of normalized embedding: {np.linalg.norm(norm_embedding):.4f}")
print(f" Embedding shape: {embedding.shape}")
print(f" L2 norm (raw): {np.linalg.norm(embedding):.4f}")
print(f" L2 norm (normalized): {np.linalg.norm(norm_embedding):.4f}")
def compare_faces(detector, recognizer, image1_path: str, image2_path: str, threshold: float = 0.35):
img1 = cv2.imread(image1_path)
img2 = cv2.imread(image2_path)
if img1 is None or img2 is None:
print("Error: Failed to load one or both images")
return
faces1 = detector.detect(img1)
faces2 = detector.detect(img2)
if not faces1 or not faces2:
print("Error: No faces detected in one or both images")
return
landmarks1 = np.array(faces1[0]["landmarks"])
landmarks2 = np.array(faces2[0]["landmarks"])
embedding1 = recognizer.get_normalized_embedding(img1, landmarks1)
embedding2 = recognizer.get_normalized_embedding(img2, landmarks2)
# cosine similarity for normalized embeddings
similarity = compute_similarity(embedding1, embedding2, normalized=True)
is_match = similarity > threshold
print(f"Similarity: {similarity:.4f}")
print(f"Result: {'Same person' if is_match else 'Different person'} (threshold: {threshold})")
def main():
parser = argparse.ArgumentParser(description="Extract face embeddings from a single image.")
parser.add_argument("--image", type=str, required=True, help="Path to the input image.")
parser.add_argument(
"--detector",
type=str,
default="retinaface",
choices=['retinaface', 'scrfd'],
help="Face detection method to use."
)
parser = argparse.ArgumentParser(description="Face recognition and comparison")
parser.add_argument("--image", type=str, help="Single image for embedding extraction")
parser.add_argument("--image1", type=str, help="First image for comparison")
parser.add_argument("--image2", type=str, help="Second image for comparison")
parser.add_argument("--threshold", type=float, default=0.35, help="Similarity threshold")
parser.add_argument("--detector", type=str, default="retinaface", choices=["retinaface", "scrfd"])
parser.add_argument(
"--recognizer",
type=str,
default="arcface",
choices=['arcface', 'mobileface', 'sphereface'],
help="Face recognition method to use."
choices=["arcface", "mobileface", "sphereface"],
)
parser.add_argument("--verbose", action="store_true", help="Enable verbose logging")
args = parser.parse_args()
if args.verbose:
from uniface import enable_logging
enable_logging()
detector = RetinaFace() if args.detector == "retinaface" else SCRFD()
recognizer = get_recognizer(args.recognizer)
print(f"Initializing detector: {args.detector}")
detector = create_detector(method=args.detector)
print(f"Initializing recognizer: {args.recognizer}")
recognizer = create_recognizer(method=args.recognizer)
run_inference(detector, recognizer, args.image)
if args.image1 and args.image2:
compare_faces(detector, recognizer, args.image1, args.image2, args.threshold)
elif args.image:
run_inference(detector, recognizer, args.image)
else:
print("Error: Provide --image or both --image1 and --image2")
parser.print_help()
if __name__ == "__main__":

scripts/run_video_detection.py

@@ -0,0 +1,107 @@
# Face detection on video files
# Usage: python run_video_detection.py --input video.mp4 --output output.mp4
import argparse
from pathlib import Path
import cv2
from tqdm import tqdm
from uniface import SCRFD, RetinaFace
from uniface.visualization import draw_detections
def process_video(
detector,
input_path: str,
output_path: str,
threshold: float = 0.6,
show_preview: bool = False,
):
cap = cv2.VideoCapture(input_path)
if not cap.isOpened():
print(f"Error: Cannot open video file '{input_path}'")
return
# get video properties
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
print(f"Input: {input_path} ({width}x{height}, {fps:.1f} fps, {total_frames} frames)")
print(f"Output: {output_path}")
fourcc = cv2.VideoWriter_fourcc(*"mp4v") # codec for .mp4
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
if not out.isOpened():
print(f"Error: Cannot create output video '{output_path}'")
cap.release()
return
frame_count = 0
total_faces = 0
for _ in tqdm(range(total_frames), desc="Processing", unit="frames"):
ret, frame = cap.read()
if not ret:
break
frame_count += 1
faces = detector.detect(frame)
total_faces += len(faces)
bboxes = [f["bbox"] for f in faces]
scores = [f["confidence"] for f in faces]
landmarks = [f["landmarks"] for f in faces]
draw_detections(frame, bboxes, scores, landmarks, vis_threshold=threshold)
cv2.putText(
frame,
f"Faces: {len(faces)}",
(10, 30),
cv2.FONT_HERSHEY_SIMPLEX,
1,
(0, 255, 0),
2,
)
out.write(frame)
if show_preview:
cv2.imshow("Processing - Press 'q' to cancel", frame)
if cv2.waitKey(1) & 0xFF == ord("q"):
print("\nCancelled by user")
break
cap.release()
out.release()
if show_preview:
cv2.destroyAllWindows()
avg_faces = total_faces / frame_count if frame_count > 0 else 0
print(f"\nDone! {frame_count} frames, {total_faces} faces ({avg_faces:.1f} avg/frame)")
print(f"Saved: {output_path}")
def main():
parser = argparse.ArgumentParser(description="Process video with face detection")
parser.add_argument("--input", type=str, required=True, help="Input video path")
parser.add_argument("--output", type=str, required=True, help="Output video path")
parser.add_argument("--detector", type=str, default="retinaface", choices=["retinaface", "scrfd"])
parser.add_argument("--threshold", type=float, default=0.6, help="Visualization threshold")
parser.add_argument("--preview", action="store_true", help="Show live preview")
args = parser.parse_args()
if not Path(args.input).exists():
print(f"Error: Input file '{args.input}' does not exist")
return
Path(args.output).parent.mkdir(parents=True, exist_ok=True)
detector = RetinaFace() if args.detector == "retinaface" else SCRFD()
process_video(detector, args.input, args.output, args.threshold, args.preview)
if __name__ == "__main__":
main()

scripts/sha256_generate.py

@@ -12,15 +12,8 @@ def compute_sha256(file_path: Path, chunk_size: int = 8192) -> str:
def main():
parser = argparse.ArgumentParser(
description="Compute SHA256 hash of a model weight file."
)
parser.add_argument(
"file",
type=Path,
help="Path to the model weight file (.onnx, .pth, etc)."
)
parser = argparse.ArgumentParser(description="Compute SHA256 hash of a file")
parser.add_argument("file", type=Path, help="Path to file")
args = parser.parse_args()
if not args.file.exists() or not args.file.is_file():
@@ -28,7 +21,7 @@ def main():
return
sha256 = compute_sha256(args.file)
print(f"`SHA256 hash for '{args.file.name}':\n{sha256}")
print(f"SHA256 hash for '{args.file.name}':\n{sha256}")
if __name__ == "__main__":

tests/test_age_gender.py (new file, 116 lines)

@@ -0,0 +1,116 @@
import numpy as np
import pytest
from uniface.attribute import AgeGender
@pytest.fixture
def age_gender_model():
return AgeGender()
@pytest.fixture
def mock_image():
return np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
@pytest.fixture
def mock_bbox():
return [100, 100, 300, 300]
def test_model_initialization(age_gender_model):
assert age_gender_model is not None, "AgeGender model initialization failed."
def test_prediction_output_format(age_gender_model, mock_image, mock_bbox):
gender, age = age_gender_model.predict(mock_image, mock_bbox)
assert isinstance(gender, str), f"Gender should be string, got {type(gender)}"
assert isinstance(age, int), f"Age should be int, got {type(age)}"
def test_gender_values(age_gender_model, mock_image, mock_bbox):
gender, age = age_gender_model.predict(mock_image, mock_bbox)
assert gender in ["Male", "Female"], f"Gender should be 'Male' or 'Female', got '{gender}'"
def test_age_range(age_gender_model, mock_image, mock_bbox):
gender, age = age_gender_model.predict(mock_image, mock_bbox)
assert 0 <= age <= 120, f"Age should be between 0 and 120, got {age}"
def test_different_bbox_sizes(age_gender_model, mock_image):
test_bboxes = [
[50, 50, 150, 150],
[100, 100, 300, 300],
[50, 50, 400, 400],
]
for bbox in test_bboxes:
gender, age = age_gender_model.predict(mock_image, bbox)
assert gender in ["Male", "Female"], f"Failed for bbox {bbox}"
assert 0 <= age <= 120, f"Age out of range for bbox {bbox}"
def test_different_image_sizes(age_gender_model, mock_bbox):
test_sizes = [(480, 640, 3), (720, 1280, 3), (1080, 1920, 3)]
for size in test_sizes:
mock_image = np.random.randint(0, 255, size, dtype=np.uint8)
gender, age = age_gender_model.predict(mock_image, mock_bbox)
assert gender in ["Male", "Female"], f"Failed for image size {size}"
assert 0 <= age <= 120, f"Age out of range for image size {size}"
def test_consistency(age_gender_model, mock_image, mock_bbox):
gender1, age1 = age_gender_model.predict(mock_image, mock_bbox)
gender2, age2 = age_gender_model.predict(mock_image, mock_bbox)
assert gender1 == gender2, "Same input should produce same gender prediction"
assert age1 == age2, "Same input should produce same age prediction"
def test_bbox_list_format(age_gender_model, mock_image):
bbox_list = [100, 100, 300, 300]
gender, age = age_gender_model.predict(mock_image, bbox_list)
assert gender in ["Male", "Female"], "Should work with bbox as list"
assert 0 <= age <= 120, "Age should be in valid range"
def test_bbox_array_format(age_gender_model, mock_image):
bbox_array = np.array([100, 100, 300, 300])
gender, age = age_gender_model.predict(mock_image, bbox_array)
assert gender in ["Male", "Female"], "Should work with bbox as numpy array"
assert 0 <= age <= 120, "Age should be in valid range"
def test_multiple_predictions(age_gender_model, mock_image):
bboxes = [
[50, 50, 150, 150],
[200, 200, 350, 350],
[400, 400, 550, 550],
]
results = []
for bbox in bboxes:
gender, age = age_gender_model.predict(mock_image, bbox)
results.append((gender, age))
assert len(results) == 3, "Should have 3 predictions"
for gender, age in results:
assert gender in ["Male", "Female"]
assert 0 <= age <= 120
def test_age_is_positive(age_gender_model, mock_image, mock_bbox):
for _ in range(5):
gender, age = age_gender_model.predict(mock_image, mock_bbox)
assert age >= 0, f"Age should be non-negative, got {age}"
def test_output_format_for_visualization(age_gender_model, mock_image, mock_bbox):
gender, age = age_gender_model.predict(mock_image, mock_bbox)
text = f"{gender}, {age}y"
assert isinstance(text, str), "Should be able to format as string"
assert "Male" in text or "Female" in text, "Text should contain gender"
assert "y" in text, "Text should contain 'y' for years"

tests/test_factory.py (new file, 274 lines)

@@ -0,0 +1,274 @@
import numpy as np
import pytest
from uniface import (
create_detector,
create_landmarker,
create_recognizer,
detect_faces,
list_available_detectors,
)
from uniface.constants import RetinaFaceWeights, SCRFDWeights
# create_detector tests
def test_create_detector_retinaface():
"""
Test creating a RetinaFace detector using factory function.
"""
detector = create_detector("retinaface")
assert detector is not None, "Failed to create RetinaFace detector"
def test_create_detector_scrfd():
"""
Test creating a SCRFD detector using factory function.
"""
detector = create_detector("scrfd")
assert detector is not None, "Failed to create SCRFD detector"
def test_create_detector_with_config():
"""
Test creating detector with custom configuration.
"""
detector = create_detector(
"retinaface",
model_name=RetinaFaceWeights.MNET_V2,
conf_thresh=0.8,
nms_thresh=0.3,
)
assert detector is not None, "Failed to create detector with custom config"
def test_create_detector_invalid_method():
"""
Test that invalid detector method raises an error.
"""
with pytest.raises((ValueError, KeyError)):
create_detector("invalid_method")
def test_create_detector_scrfd_with_model():
"""
Test creating SCRFD detector with specific model.
"""
detector = create_detector("scrfd", model_name=SCRFDWeights.SCRFD_10G_KPS, conf_thresh=0.5)
assert detector is not None, "Failed to create SCRFD with specific model"
# create_recognizer tests
def test_create_recognizer_arcface():
"""
Test creating an ArcFace recognizer using factory function.
"""
recognizer = create_recognizer("arcface")
assert recognizer is not None, "Failed to create ArcFace recognizer"
def test_create_recognizer_mobileface():
"""
Test creating a MobileFace recognizer using factory function.
"""
recognizer = create_recognizer("mobileface")
assert recognizer is not None, "Failed to create MobileFace recognizer"
def test_create_recognizer_sphereface():
"""
Test creating a SphereFace recognizer using factory function.
"""
recognizer = create_recognizer("sphereface")
assert recognizer is not None, "Failed to create SphereFace recognizer"
def test_create_recognizer_invalid_method():
"""
Test that invalid recognizer method raises an error.
"""
with pytest.raises((ValueError, KeyError)):
create_recognizer("invalid_method")
# create_landmarker tests
def test_create_landmarker():
"""
Test creating a Landmark106 detector using factory function.
"""
landmarker = create_landmarker("2d106det")
assert landmarker is not None, "Failed to create Landmark106 detector"
def test_create_landmarker_default():
"""
Test creating landmarker with default parameters.
"""
landmarker = create_landmarker()
assert landmarker is not None, "Failed to create default landmarker"
def test_create_landmarker_invalid_method():
"""
Test that invalid landmarker method raises an error.
"""
with pytest.raises((ValueError, KeyError)):
create_landmarker("invalid_method")
# detect_faces tests
def test_detect_faces_retinaface():
"""
Test high-level detect_faces function with RetinaFace.
"""
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = detect_faces(mock_image, method="retinaface")
assert isinstance(faces, list), "detect_faces should return a list"
def test_detect_faces_scrfd():
"""
Test high-level detect_faces function with SCRFD.
"""
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = detect_faces(mock_image, method="scrfd")
assert isinstance(faces, list), "detect_faces should return a list"
def test_detect_faces_with_threshold():
"""
Test detect_faces with custom confidence threshold.
"""
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = detect_faces(mock_image, method="retinaface", conf_thresh=0.8)
assert isinstance(faces, list), "detect_faces should return a list"
# All detections should respect threshold
for face in faces:
assert face["confidence"] >= 0.8, "All detections should meet confidence threshold"
def test_detect_faces_default_method():
"""
Test detect_faces with default method (should use retinaface).
"""
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = detect_faces(mock_image) # No method specified
assert isinstance(faces, list), "detect_faces should return a list with default method"
def test_detect_faces_empty_image():
"""
Test detect_faces on a blank image.
"""
empty_image = np.zeros((640, 640, 3), dtype=np.uint8)
faces = detect_faces(empty_image, method="retinaface")
assert isinstance(faces, list), "Should return a list even for empty image"
assert len(faces) == 0, "Should detect no faces in blank image"
# list_available_detectors tests
def test_list_available_detectors():
"""
Test that list_available_detectors returns a dictionary.
"""
detectors = list_available_detectors()
assert isinstance(detectors, dict), "Should return a dictionary of detectors"
assert len(detectors) > 0, "Should have at least one detector available"
def test_list_available_detectors_contents():
"""
Test that list includes known detectors.
"""
detectors = list_available_detectors()
# Should include at least these detectors
assert "retinaface" in detectors, "Should include 'retinaface'"
assert "scrfd" in detectors, "Should include 'scrfd'"
# Integration tests
def test_detector_inference_from_factory():
"""
Test that detector created from factory can perform inference.
"""
detector = create_detector("retinaface")
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = detector.detect(mock_image)
assert isinstance(faces, list), "Detector should return list of faces"
def test_recognizer_inference_from_factory():
"""
Test that recognizer created from factory can perform inference.
"""
recognizer = create_recognizer("arcface")
mock_image = np.random.randint(0, 255, (112, 112, 3), dtype=np.uint8)
embedding = recognizer.get_embedding(mock_image)
assert embedding is not None, "Recognizer should return embedding"
assert embedding.shape[1] == 512, "Should return 512-dimensional embedding"
def test_landmarker_inference_from_factory():
"""
Test that landmarker created from factory can perform inference.
"""
landmarker = create_landmarker("2d106det")
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
mock_bbox = [100, 100, 300, 300]
landmarks = landmarker.get_landmarks(mock_image, mock_bbox)
assert landmarks is not None, "Landmarker should return landmarks"
assert landmarks.shape == (106, 2), "Should return 106 landmarks"
def test_multiple_detector_creation():
"""
Test that multiple detectors can be created independently.
"""
detector1 = create_detector("retinaface")
detector2 = create_detector("scrfd")
assert detector1 is not None
assert detector2 is not None
assert detector1 is not detector2, "Should create separate instances"
def test_detector_with_different_configs():
"""
Test creating multiple detectors with different configurations.
"""
detector_high_thresh = create_detector("retinaface", conf_thresh=0.9)
detector_low_thresh = create_detector("retinaface", conf_thresh=0.3)
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces_high = detector_high_thresh.detect(mock_image)
faces_low = detector_low_thresh.detect(mock_image)
# Both should work
assert isinstance(faces_high, list)
assert isinstance(faces_low, list)
def test_factory_returns_correct_types():
"""
Test that factory functions return instances of the correct types.
"""
from uniface import RetinaFace, ArcFace, Landmark106
detector = create_detector("retinaface")
recognizer = create_recognizer("arcface")
landmarker = create_landmarker("2d106det")
assert isinstance(detector, RetinaFace), "Should return RetinaFace instance"
assert isinstance(recognizer, ArcFace), "Should return ArcFace instance"
assert isinstance(landmarker, Landmark106), "Should return Landmark106 instance"
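
Taken together, the tests above outline the factory surface; a hedged end-to-end sketch using only the calls and names they exercise:

import numpy as np

from uniface import (
    create_detector,
    create_landmarker,
    create_recognizer,
    detect_faces,
    list_available_detectors,
)

print(sorted(list_available_detectors()))  # includes 'retinaface' and 'scrfd'
image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)  # placeholder frame
faces = detect_faces(image, method="scrfd", conf_thresh=0.8)  # list of detection dicts
detector = create_detector("retinaface", conf_thresh=0.5, nms_thresh=0.4)
recognizer = create_recognizer("arcface")    # 512-dim embeddings, per the tests
landmarker = create_landmarker("2d106det")   # 106-point landmarks, per the tests
for face in detector.detect(image):
    points = landmarker.get_landmarks(image, face["bbox"])  # shape (106, 2)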

107
tests/test_landmark.py Normal file
View File

@@ -0,0 +1,107 @@
import numpy as np
import pytest
from uniface.landmark import Landmark106
@pytest.fixture
def landmark_model():
return Landmark106()
@pytest.fixture
def mock_image():
return np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
@pytest.fixture
def mock_bbox():
return [100, 100, 300, 300]
def test_model_initialization(landmark_model):
assert landmark_model is not None, "Landmark106 model initialization failed."
def test_landmark_detection(landmark_model, mock_image, mock_bbox):
landmarks = landmark_model.get_landmarks(mock_image, mock_bbox)
assert landmarks.shape == (106, 2), f"Expected shape (106, 2), got {landmarks.shape}"
def test_landmark_dtype(landmark_model, mock_image, mock_bbox):
landmarks = landmark_model.get_landmarks(mock_image, mock_bbox)
assert landmarks.dtype == np.float32, f"Expected float32, got {landmarks.dtype}"
def test_landmark_coordinates_within_image(landmark_model, mock_image, mock_bbox):
landmarks = landmark_model.get_landmarks(mock_image, mock_bbox)
x_coords = landmarks[:, 0]
y_coords = landmarks[:, 1]
x1, y1, x2, y2 = mock_bbox
margin = 50
x_in_bounds = np.sum((x_coords >= x1 - margin) & (x_coords <= x2 + margin))
y_in_bounds = np.sum((y_coords >= y1 - margin) & (y_coords <= y2 + margin))
assert x_in_bounds >= 95, f"Only {x_in_bounds}/106 x-coordinates within bounds"
assert y_in_bounds >= 95, f"Only {y_in_bounds}/106 y-coordinates within bounds"
def test_different_bbox_sizes(landmark_model, mock_image):
test_bboxes = [
[50, 50, 150, 150],
[100, 100, 300, 300],
[50, 50, 400, 400],
]
for bbox in test_bboxes:
landmarks = landmark_model.get_landmarks(mock_image, bbox)
assert landmarks.shape == (106, 2), f"Failed for bbox {bbox}"
def test_landmark_array_format(landmark_model, mock_image, mock_bbox):
landmarks = landmark_model.get_landmarks(mock_image, mock_bbox)
landmarks_int = landmarks.astype(int)
assert landmarks_int.shape == (106, 2), "Integer conversion should preserve shape"
assert landmarks_int.dtype in [np.int32, np.int64], "Should convert to integer type"
def test_consistency(landmark_model, mock_image, mock_bbox):
landmarks1 = landmark_model.get_landmarks(mock_image, mock_bbox)
landmarks2 = landmark_model.get_landmarks(mock_image, mock_bbox)
assert np.allclose(landmarks1, landmarks2), "Same input should produce same landmarks"
def test_different_image_sizes(landmark_model, mock_bbox):
test_sizes = [(480, 640, 3), (720, 1280, 3), (1080, 1920, 3)]
for size in test_sizes:
mock_image = np.random.randint(0, 255, size, dtype=np.uint8)
landmarks = landmark_model.get_landmarks(mock_image, mock_bbox)
assert landmarks.shape == (106, 2), f"Failed for image size {size}"
def test_bbox_list_format(landmark_model, mock_image):
bbox_list = [100, 100, 300, 300]
landmarks = landmark_model.get_landmarks(mock_image, bbox_list)
assert landmarks.shape == (106, 2), "Should work with bbox as list"
def test_bbox_array_format(landmark_model, mock_image):
bbox_array = np.array([100, 100, 300, 300])
landmarks = landmark_model.get_landmarks(mock_image, bbox_array)
assert landmarks.shape == (106, 2), "Should work with bbox as numpy array"
def test_landmark_distribution(landmark_model, mock_image, mock_bbox):
landmarks = landmark_model.get_landmarks(mock_image, mock_bbox)
x_variance = np.var(landmarks[:, 0])
y_variance = np.var(landmarks[:, 1])
assert x_variance > 0, "Landmarks should have variation in x-coordinates"
assert y_variance > 0, "Landmarks should have variation in y-coordinates"
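
The call pattern under test, as a brief sketch (Landmark106() with default weights, per the fixture):

import numpy as np

from uniface.landmark import Landmark106

model = Landmark106()
image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)  # placeholder frame
points = model.get_landmarks(image, [100, 100, 300, 300])  # float32, shape (106, 2)
ints = points.astype(int)  # integer pixel coordinates, e.g. for drawing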

215
tests/test_recognition.py Normal file
View File

@@ -0,0 +1,215 @@
import numpy as np
import pytest
from uniface.recognition import ArcFace, MobileFace, SphereFace
@pytest.fixture
def arcface_model():
"""
Fixture to initialize the ArcFace model for testing.
"""
return ArcFace()
@pytest.fixture
def mobileface_model():
"""
Fixture to initialize the MobileFace model for testing.
"""
return MobileFace()
@pytest.fixture
def sphereface_model():
"""
Fixture to initialize the SphereFace model for testing.
"""
return SphereFace()
@pytest.fixture
def mock_aligned_face():
"""
Create a mock 112x112 aligned face image.
"""
return np.random.randint(0, 255, (112, 112, 3), dtype=np.uint8)
@pytest.fixture
def mock_landmarks():
"""
Create mock 5-point facial landmarks.
"""
return np.array(
[
[38.2946, 51.6963],
[73.5318, 51.5014],
[56.0252, 71.7366],
[41.5493, 92.3655],
[70.7299, 92.2041],
],
dtype=np.float32,
)
# ArcFace Tests
def test_arcface_initialization(arcface_model):
"""
Test that the ArcFace model initializes correctly.
"""
assert arcface_model is not None, "ArcFace model initialization failed."
def test_arcface_embedding_shape(arcface_model, mock_aligned_face):
"""
Test that ArcFace produces embeddings with the correct shape.
"""
embedding = arcface_model.get_embedding(mock_aligned_face)
# ArcFace typically produces 512-dimensional embeddings
assert embedding.shape[1] == 512, f"Expected 512-dim embedding, got {embedding.shape[1]}"
assert embedding.shape[0] == 1, "Embedding should have batch dimension of 1"
def test_arcface_normalized_embedding(arcface_model, mock_landmarks):
"""
Test that normalized embeddings have unit length.
"""
# Create a larger mock image for alignment
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
embedding = arcface_model.get_normalized_embedding(mock_image, mock_landmarks)
# Check that embedding is normalized (L2 norm ≈ 1.0)
norm = np.linalg.norm(embedding)
assert np.isclose(norm, 1.0, atol=1e-5), f"Normalized embedding should have norm 1.0, got {norm}"
def test_arcface_embedding_dtype(arcface_model, mock_aligned_face):
"""
Test that embeddings have the correct data type.
"""
embedding = arcface_model.get_embedding(mock_aligned_face)
assert embedding.dtype == np.float32, f"Expected float32, got {embedding.dtype}"
def test_arcface_consistency(arcface_model, mock_aligned_face):
"""
Test that the same input produces the same embedding.
"""
embedding1 = arcface_model.get_embedding(mock_aligned_face)
embedding2 = arcface_model.get_embedding(mock_aligned_face)
assert np.allclose(embedding1, embedding2), "Same input should produce same embedding"
# MobileFace Tests
def test_mobileface_initialization(mobileface_model):
"""
Test that the MobileFace model initializes correctly.
"""
assert mobileface_model is not None, "MobileFace model initialization failed."
def test_mobileface_embedding_shape(mobileface_model, mock_aligned_face):
"""
Test that MobileFace produces embeddings with the correct shape.
"""
embedding = mobileface_model.get_embedding(mock_aligned_face)
# MobileFace typically produces 512-dimensional embeddings
assert embedding.shape[1] == 512, f"Expected 512-dim embedding, got {embedding.shape[1]}"
assert embedding.shape[0] == 1, "Embedding should have batch dimension of 1"
def test_mobileface_normalized_embedding(mobileface_model, mock_landmarks):
"""
Test that MobileFace normalized embeddings have unit length.
"""
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
embedding = mobileface_model.get_normalized_embedding(mock_image, mock_landmarks)
norm = np.linalg.norm(embedding)
assert np.isclose(norm, 1.0, atol=1e-5), f"Normalized embedding should have norm 1.0, got {norm}"
# SphereFace Tests
def test_sphereface_initialization(sphereface_model):
"""
Test that the SphereFace model initializes correctly.
"""
assert sphereface_model is not None, "SphereFace model initialization failed."
def test_sphereface_embedding_shape(sphereface_model, mock_aligned_face):
"""
Test that SphereFace produces embeddings with the correct shape.
"""
embedding = sphereface_model.get_embedding(mock_aligned_face)
# SphereFace typically produces 512-dimensional embeddings
assert embedding.shape[1] == 512, f"Expected 512-dim embedding, got {embedding.shape[1]}"
assert embedding.shape[0] == 1, "Embedding should have batch dimension of 1"
def test_sphereface_normalized_embedding(sphereface_model, mock_landmarks):
"""
Test that SphereFace normalized embeddings have unit length.
"""
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
embedding = sphereface_model.get_normalized_embedding(mock_image, mock_landmarks)
norm = np.linalg.norm(embedding)
assert np.isclose(norm, 1.0, atol=1e-5), f"Normalized embedding should have norm 1.0, got {norm}"
# Cross-model comparison tests
def test_different_models_different_embeddings(arcface_model, mobileface_model, mock_aligned_face):
"""
Test that different models produce different embeddings for the same input.
"""
arcface_emb = arcface_model.get_embedding(mock_aligned_face)
mobileface_emb = mobileface_model.get_embedding(mock_aligned_face)
# Embeddings should be different (with high probability for random input)
# We check that they're not identical
assert not np.allclose(arcface_emb, mobileface_emb), "Different models should produce different embeddings"
def test_embedding_similarity_computation(arcface_model, mock_aligned_face):
"""
Test computing similarity between embeddings.
"""
# Get two embeddings
emb1 = arcface_model.get_embedding(mock_aligned_face)
# Create a slightly different image
mock_aligned_face2 = mock_aligned_face.copy()
mock_aligned_face2[:10, :10] = 0 # Modify a small region
emb2 = arcface_model.get_embedding(mock_aligned_face2)
# Compute cosine similarity
from uniface import compute_similarity
similarity = compute_similarity(emb1, emb2)
# Similarity should be between -1 and 1
assert -1.0 <= similarity <= 1.0, f"Similarity should be in [-1, 1], got {similarity}"
def test_same_face_high_similarity(arcface_model, mock_aligned_face):
"""
Test that the same face produces high similarity.
"""
emb1 = arcface_model.get_embedding(mock_aligned_face)
emb2 = arcface_model.get_embedding(mock_aligned_face)
from uniface import compute_similarity
similarity = compute_similarity(emb1, emb2)
# Same image should have similarity close to 1.0
assert similarity > 0.99, f"Same face should have similarity > 0.99, got {similarity}"
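
A hedged sketch of the verification flow these tests imply, wiring detector landmarks into get_normalized_embedding and scoring with compute_similarity:

import numpy as np

from uniface import compute_similarity, detect_faces
from uniface.recognition import ArcFace

recognizer = ArcFace()
img1 = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)  # placeholder for photo 1
img2 = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)  # placeholder for photo 2
faces1, faces2 = detect_faces(img1), detect_faces(img2)
if faces1 and faces2:  # real photos would yield detections; random noise will not
    emb1 = recognizer.get_normalized_embedding(img1, np.asarray(faces1[0]["landmarks"], dtype=np.float32))
    emb2 = recognizer.get_normalized_embedding(img2, np.asarray(faces2[0]["landmarks"], dtype=np.float32))
    print(compute_similarity(emb1, emb2))  # identical inputs score above 0.99, per the tests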

View File

@@ -7,9 +7,6 @@ from uniface.detection import RetinaFace
@pytest.fixture
def retinaface_model():
"""
Fixture to initialize the RetinaFace model for testing.
"""
return RetinaFace(
model_name=RetinaFaceWeights.MNET_V2,
conf_thresh=0.5,
@@ -20,67 +17,39 @@ def retinaface_model():
def test_model_initialization(retinaface_model):
"""
Test that the RetinaFace model initializes correctly.
"""
assert retinaface_model is not None, "Model initialization failed."
def test_inference_on_640x640_image(retinaface_model):
"""
Test inference on a 640x640 BGR image.
"""
# Generate a mock 640x640 BGR image
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
# Run inference - returns list of dictionaries
faces = retinaface_model.detect(mock_image)
# Check output type
assert isinstance(faces, list), "Detections should be a list."
# Check that each face has the expected structure
for face in faces:
assert isinstance(face, dict), "Each detection should be a dictionary."
assert "bbox" in face, "Each detection should have a 'bbox' key."
assert "confidence" in face, "Each detection should have a 'confidence' key."
assert "landmarks" in face, "Each detection should have a 'landmarks' key."
# Check bbox format
bbox = face["bbox"]
assert len(bbox) == 4, "BBox should have 4 values (x1, y1, x2, y2)."
# Check landmarks format
landmarks = face["landmarks"]
assert len(landmarks) == 5, "Should have 5 landmark points."
assert all(len(pt) == 2 for pt in landmarks), "Each landmark should be (x, y)."
def test_confidence_threshold(retinaface_model):
"""
Test that detections respect the confidence threshold.
"""
# Generate a mock 640x640 BGR image
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
# Run inference
faces = retinaface_model.detect(mock_image)
# Ensure all detections have confidence scores above the threshold
for face in faces:
confidence = face["confidence"]
assert confidence >= 0.5, f"Detection has confidence {confidence} below threshold 0.5"
def test_no_faces_detected(retinaface_model):
"""
Test inference on an image without detectable faces.
"""
# Generate an empty (black) 640x640 image
empty_image = np.zeros((640, 640, 3), dtype=np.uint8)
# Run inference
faces = retinaface_model.detect(empty_image)
# Ensure no detections are found
assert len(faces) == 0, "Should detect no faces in a blank image."

71
tests/test_scrfd.py Normal file
View File

@@ -0,0 +1,71 @@
import numpy as np
import pytest
from uniface.constants import SCRFDWeights
from uniface.detection import SCRFD
@pytest.fixture
def scrfd_model():
return SCRFD(
model_name=SCRFDWeights.SCRFD_500M_KPS,
conf_thresh=0.5,
nms_thresh=0.4,
)
def test_model_initialization(scrfd_model):
assert scrfd_model is not None, "Model initialization failed."
def test_inference_on_640x640_image(scrfd_model):
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = scrfd_model.detect(mock_image)
assert isinstance(faces, list), "Detections should be a list."
for face in faces:
assert isinstance(face, dict), "Each detection should be a dictionary."
assert "bbox" in face, "Each detection should have a 'bbox' key."
assert "confidence" in face, "Each detection should have a 'confidence' key."
assert "landmarks" in face, "Each detection should have a 'landmarks' key."
bbox = face["bbox"]
assert len(bbox) == 4, "BBox should have 4 values (x1, y1, x2, y2)."
landmarks = face["landmarks"]
assert len(landmarks) == 5, "Should have 5 landmark points."
assert all(len(pt) == 2 for pt in landmarks), "Each landmark should be (x, y)."
def test_confidence_threshold(scrfd_model):
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = scrfd_model.detect(mock_image)
for face in faces:
confidence = face["confidence"]
assert confidence >= 0.5, f"Detection has confidence {confidence} below threshold 0.5"
def test_no_faces_detected(scrfd_model):
empty_image = np.zeros((640, 640, 3), dtype=np.uint8)
faces = scrfd_model.detect(empty_image)
assert len(faces) == 0, "Should detect no faces in a blank image."
def test_different_input_sizes(scrfd_model):
test_sizes = [(480, 640, 3), (720, 1280, 3), (1080, 1920, 3)]
for size in test_sizes:
mock_image = np.random.randint(0, 255, size, dtype=np.uint8)
faces = scrfd_model.detect(mock_image)
assert isinstance(faces, list), f"Should return list for size {size}"
def test_scrfd_10g_model():
model = SCRFD(model_name=SCRFDWeights.SCRFD_10G_KPS, conf_thresh=0.5)
assert model is not None, "SCRFD 10G model initialization failed."
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
faces = model.detect(mock_image)
assert isinstance(faces, list), "SCRFD 10G should return list of detections."

262
tests/test_utils.py Normal file
View File

@@ -0,0 +1,262 @@
import numpy as np
import pytest
from uniface import compute_similarity, face_alignment
@pytest.fixture
def mock_image():
"""
Create a mock 640x640 BGR image.
"""
return np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
@pytest.fixture
def mock_landmarks():
"""
Create mock 5-point facial landmarks.
Standard positions for a face roughly centered at (112/2, 112/2).
"""
return np.array(
[
[38.2946, 51.6963], # Left eye
[73.5318, 51.5014], # Right eye
[56.0252, 71.7366], # Nose
[41.5493, 92.3655], # Left mouth corner
[70.7299, 92.2041], # Right mouth corner
],
dtype=np.float32,
)
# compute_similarity tests
def test_compute_similarity_same_embedding():
"""
Test that similarity of an embedding with itself is 1.0.
"""
embedding = np.random.randn(1, 512).astype(np.float32)
embedding = embedding / np.linalg.norm(embedding) # Normalize
similarity = compute_similarity(embedding, embedding)
assert np.isclose(similarity, 1.0, atol=1e-5), f"Self-similarity should be 1.0, got {similarity}"
def test_compute_similarity_range():
"""
Test that similarity is always in the range [-1, 1].
"""
# Test with multiple random embeddings
for _ in range(10):
emb1 = np.random.randn(1, 512).astype(np.float32)
emb2 = np.random.randn(1, 512).astype(np.float32)
# Normalize
emb1 = emb1 / np.linalg.norm(emb1)
emb2 = emb2 / np.linalg.norm(emb2)
similarity = compute_similarity(emb1, emb2)
assert -1.0 <= similarity <= 1.0, f"Similarity should be in [-1, 1], got {similarity}"
def test_compute_similarity_orthogonal():
"""
Test that orthogonal embeddings have similarity close to 0.
"""
# Create orthogonal embeddings
emb1 = np.zeros((1, 512), dtype=np.float32)
emb1[0, 0] = 1.0 # [1, 0, 0, ..., 0]
emb2 = np.zeros((1, 512), dtype=np.float32)
emb2[0, 1] = 1.0 # [0, 1, 0, ..., 0]
similarity = compute_similarity(emb1, emb2)
assert np.isclose(similarity, 0.0, atol=1e-5), f"Orthogonal embeddings should have similarity 0.0, got {similarity}"
def test_compute_similarity_opposite():
"""
Test that opposite embeddings have similarity close to -1.
"""
emb1 = np.ones((1, 512), dtype=np.float32)
emb1 = emb1 / np.linalg.norm(emb1)
emb2 = -emb1 # Opposite direction
similarity = compute_similarity(emb1, emb2)
assert np.isclose(similarity, -1.0, atol=1e-5), f"Opposite embeddings should have similarity -1.0, got {similarity}"
def test_compute_similarity_symmetry():
"""
Test that similarity(A, B) == similarity(B, A).
"""
emb1 = np.random.randn(1, 512).astype(np.float32)
emb2 = np.random.randn(1, 512).astype(np.float32)
# Normalize
emb1 = emb1 / np.linalg.norm(emb1)
emb2 = emb2 / np.linalg.norm(emb2)
sim_12 = compute_similarity(emb1, emb2)
sim_21 = compute_similarity(emb2, emb1)
assert np.isclose(sim_12, sim_21), "Similarity should be symmetric"
def test_compute_similarity_dtype():
"""
Test that compute_similarity returns a float.
"""
emb1 = np.random.randn(1, 512).astype(np.float32)
emb2 = np.random.randn(1, 512).astype(np.float32)
# Normalize
emb1 = emb1 / np.linalg.norm(emb1)
emb2 = emb2 / np.linalg.norm(emb2)
similarity = compute_similarity(emb1, emb2)
assert isinstance(similarity, (float, np.floating)), f"Similarity should be float, got {type(similarity)}"
# face_alignment tests
def test_face_alignment_output_shape(mock_image, mock_landmarks):
"""
Test that face_alignment produces output with the correct shape.
"""
aligned, _ = face_alignment(mock_image, mock_landmarks, image_size=(112, 112))
assert aligned.shape == (112, 112, 3), f"Expected shape (112, 112, 3), got {aligned.shape}"
def test_face_alignment_dtype(mock_image, mock_landmarks):
"""
Test that aligned face has the correct data type.
"""
aligned, _ = face_alignment(mock_image, mock_landmarks, image_size=(112, 112))
assert aligned.dtype == np.uint8, f"Expected uint8, got {aligned.dtype}"
def test_face_alignment_different_sizes(mock_image, mock_landmarks):
"""
Test face alignment with different output sizes.
"""
# Only test sizes that are multiples of 112 or 128 as required by the function
test_sizes = [(112, 112), (128, 128), (224, 224)]
for size in test_sizes:
aligned, _ = face_alignment(mock_image, mock_landmarks, image_size=size)
assert aligned.shape == (*size, 3), f"Failed for size {size}"
def test_face_alignment_consistency(mock_image, mock_landmarks):
"""
Test that the same input produces the same aligned face.
"""
aligned1, _ = face_alignment(mock_image, mock_landmarks, image_size=(112, 112))
aligned2, _ = face_alignment(mock_image, mock_landmarks, image_size=(112, 112))
assert np.allclose(aligned1, aligned2), "Same input should produce same aligned face"
def test_face_alignment_landmarks_as_list(mock_image):
"""
Test that landmarks can be passed as a list of lists (converted to array).
"""
landmarks_list = [
[38.2946, 51.6963],
[73.5318, 51.5014],
[56.0252, 71.7366],
[41.5493, 92.3655],
[70.7299, 92.2041],
]
# Convert list to numpy array before passing to face_alignment
landmarks_array = np.array(landmarks_list, dtype=np.float32)
aligned, _ = face_alignment(mock_image, landmarks_array, image_size=(112, 112))
assert aligned.shape == (112, 112, 3), "Should work with landmarks as array"
def test_face_alignment_value_range(mock_image, mock_landmarks):
"""
Test that aligned face pixel values are in valid range [0, 255].
"""
aligned, _ = face_alignment(mock_image, mock_landmarks, image_size=(112, 112))
assert np.all(aligned >= 0), "Pixel values should be >= 0"
assert np.all(aligned <= 255), "Pixel values should be <= 255"
def test_face_alignment_not_all_zeros(mock_image, mock_landmarks):
"""
Test that aligned face is not all zeros (actual transformation occurred).
"""
aligned, _ = face_alignment(mock_image, mock_landmarks, image_size=(112, 112))
# At least some pixels should be non-zero
assert np.any(aligned > 0), "Aligned face should have some non-zero pixels"
def test_face_alignment_from_different_positions(mock_image):
"""
Test alignment with landmarks at different positions in the image.
"""
# Landmarks at different positions
positions = [
np.array(
[[100, 100], [150, 100], [125, 130], [110, 150], [140, 150]],
dtype=np.float32,
),
np.array(
[[300, 200], [350, 200], [325, 230], [310, 250], [340, 250]],
dtype=np.float32,
),
np.array(
[[500, 400], [550, 400], [525, 430], [510, 450], [540, 450]],
dtype=np.float32,
),
]
for landmarks in positions:
aligned, _ = face_alignment(mock_image, landmarks, image_size=(112, 112))
assert aligned.shape == (112, 112, 3), f"Failed for landmarks at {landmarks[0]}"
def test_face_alignment_landmark_count(mock_image):
"""
Test that face_alignment works specifically with 5-point landmarks.
"""
# Standard 5-point landmarks
landmarks_5pt = np.array(
[
[38.2946, 51.6963],
[73.5318, 51.5014],
[56.0252, 71.7366],
[41.5493, 92.3655],
[70.7299, 92.2041],
],
dtype=np.float32,
)
aligned, _ = face_alignment(mock_image, landmarks_5pt, image_size=(112, 112))
assert aligned.shape == (112, 112, 3), "Should work with 5-point landmarks"
def test_compute_similarity_with_recognition_embeddings():
"""
Test compute_similarity with realistic embedding dimensions.
"""
# Simulate ArcFace/MobileFace/SphereFace embeddings (512-dim)
emb1 = np.random.randn(1, 512).astype(np.float32)
emb2 = np.random.randn(1, 512).astype(np.float32)
# Normalize (as done in get_normalized_embedding)
emb1 = emb1 / np.linalg.norm(emb1)
emb2 = emb2 / np.linalg.norm(emb2)
similarity = compute_similarity(emb1, emb2)
# Should be a valid similarity score
assert -1.0 <= similarity <= 1.0
assert isinstance(similarity, (float, np.floating))
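
The properties asserted here (self-similarity 1.0, opposite -1.0, orthogonal 0.0, symmetry) are exactly those of cosine similarity; a NumPy equivalent is sketched below as an illustration, not as the library's implementation:

import numpy as np

def cosine_similarity(emb1: np.ndarray, emb2: np.ndarray) -> float:
    # Cosine of the angle between two (1, D) embedding rows.
    a, b = emb1.ravel(), emb2.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

e = np.random.randn(1, 512).astype(np.float32)
assert np.isclose(cosine_similarity(e, e), 1.0)   # self-similarity
assert np.isclose(cosine_similarity(e, -e), -1.0)  # opposite direction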

View File

@@ -13,7 +13,7 @@
__license__ = "MIT"
__author__ = "Yakhyokhuja Valikhujaev"
__version__ = "0.1.9"
__version__ = "1.1.1"
from uniface.face_utils import compute_similarity, face_alignment
@@ -22,11 +22,18 @@ from uniface.model_store import verify_model_weights
from uniface.visualization import draw_detections
from .attribute import AgeGender
try:
from .attribute import Emotion
except ImportError:
Emotion = None # PyTorch not installed
from .detection import SCRFD, RetinaFace, create_detector, detect_faces, list_available_detectors
from .detection import (
SCRFD,
RetinaFace,
create_detector,
detect_faces,
list_available_detectors,
)
from .landmark import Landmark106, create_landmarker
from .recognition import ArcFace, MobileFace, SphereFace, create_recognizer

View File

@@ -2,7 +2,8 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Dict, Any, List, Union
from typing import Any, Dict, List, Union
import numpy as np
from uniface.attribute.age_gender import AgeGender
@@ -12,18 +13,14 @@ from uniface.constants import AgeGenderWeights, DDAMFNWeights
# Emotion requires PyTorch - make it optional
try:
from uniface.attribute.emotion import Emotion
_EMOTION_AVAILABLE = True
except ImportError:
Emotion = None
_EMOTION_AVAILABLE = False
# Public API for the attribute module
__all__ = [
"AgeGender",
"Emotion",
"create_attribute_predictor",
"predict_attributes"
]
__all__ = ["AgeGender", "Emotion", "create_attribute_predictor", "predict_attributes"]
# A mapping from model enums to their corresponding attribute classes
_ATTRIBUTE_MODELS = {
@@ -35,10 +32,7 @@ if _EMOTION_AVAILABLE:
_ATTRIBUTE_MODELS.update({model: Emotion for model in DDAMFNWeights})
def create_attribute_predictor(
model_name: Union[AgeGenderWeights, DDAMFNWeights],
**kwargs: Any
) -> Attribute:
def create_attribute_predictor(model_name: Union[AgeGenderWeights, DDAMFNWeights], **kwargs: Any) -> Attribute:
"""
Factory function to create an attribute predictor instance.
@@ -59,17 +53,16 @@ def create_attribute_predictor(
model_class = _ATTRIBUTE_MODELS.get(model_name)
if model_class is None:
raise ValueError(f"Unsupported attribute model: {model_name}. "
f"Please choose from AgeGenderWeights or DDAMFNWeights.")
raise ValueError(
f"Unsupported attribute model: {model_name}. Please choose from AgeGenderWeights or DDAMFNWeights."
)
# Pass model_name to the constructor, as some classes might need it
return model_class(model_name=model_name, **kwargs)
def predict_attributes(
image: np.ndarray,
detections: List[Dict[str, np.ndarray]],
predictor: Attribute
image: np.ndarray, detections: List[Dict[str, np.ndarray]], predictor: Attribute
) -> List[Dict[str, Any]]:
"""
High-level API to predict attributes for multiple detected faces.
@@ -91,16 +84,16 @@ def predict_attributes(
"""
for face in detections:
# Initialize attributes dict if it doesn't exist
if 'attributes' not in face:
face['attributes'] = {}
if "attributes" not in face:
face["attributes"] = {}
if isinstance(predictor, AgeGender):
gender, age = predictor(image, face['bbox'])
face['attributes']['gender'] = gender
face['attributes']['age'] = age
gender, age = predictor(image, face["bbox"])
face["attributes"]["gender"] = gender
face["attributes"]["age"] = age
elif isinstance(predictor, Emotion):
emotion, confidence = predictor(image, face['landmark'])
face['attributes']['emotion'] = emotion
face['attributes']['confidence'] = confidence
emotion, confidence = predictor(image, face["landmark"])
face["attributes"]["emotion"] = emotion
face["attributes"]["confidence"] = confidence
return detections
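
A hypothetical end-to-end use of this module's two public helpers, built only from names visible in the code above (the placeholder image stands in for a real photo):

import numpy as np

from uniface import detect_faces
from uniface.attribute import create_attribute_predictor, predict_attributes
from uniface.constants import AgeGenderWeights

image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)  # placeholder photo
predictor = create_attribute_predictor(AgeGenderWeights.DEFAULT)
detections = detect_faces(image)  # each dict gains an 'attributes' key below
for face in predict_attributes(image, detections, predictor):
    print(face["attributes"])  # {'gender': ..., 'age': ...}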

View File

@@ -51,8 +51,11 @@ class AgeGender(Attribute):
self.output_names = [output.name for output in self.session.get_outputs()]
Logger.info(f"Successfully initialized AgeGender model with input size {self.input_size}")
except Exception as e:
Logger.error(f"Failed to load AgeGender model from '{self.model_path}'", exc_info=True)
raise RuntimeError(f"Failed to initialize AgeGender model: {e}")
Logger.error(
f"Failed to load AgeGender model from '{self.model_path}'",
exc_info=True,
)
raise RuntimeError(f"Failed to initialize AgeGender model: {e}") from e
def preprocess(self, image: np.ndarray, bbox: Union[List, np.ndarray]) -> np.ndarray:
"""
@@ -76,7 +79,11 @@ class AgeGender(Attribute):
aligned_face, _ = bbox_center_alignment(image, center, self.input_size[1], scale, rotation)
blob = cv2.dnn.blobFromImage(
aligned_face, scalefactor=1.0, size=self.input_size[::-1], mean=(0.0, 0.0, 0.0), swapRB=True
aligned_face,
scalefactor=1.0,
size=self.input_size[::-1],
mean=(0.0, 0.0, 0.0),
swapRB=True,
)
return blob
@@ -157,7 +164,15 @@ if __name__ == "__main__":
# Prepare text and draw on the frame
label = f"{gender}, {age}"
cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
cv2.putText(frame, label, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
cv2.putText(
frame,
label,
(x1, y1 - 10),
cv2.FONT_HERSHEY_SIMPLEX,
0.8,
(0, 255, 0),
2,
)
# Display the resulting frame
cv2.imshow("Age and Gender Inference (Press 'q' to quit)", frame)

View File

@@ -4,6 +4,7 @@
from abc import ABC, abstractmethod
from typing import Any
import numpy as np

View File

@@ -2,15 +2,16 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import List, Tuple, Union
import cv2
import torch
import numpy as np
from typing import Tuple, Union, List
import torch
from uniface.attribute.base import Attribute
from uniface.log import Logger
from uniface.constants import DDAMFNWeights
from uniface.face_utils import face_alignment
from uniface.log import Logger
from uniface.model_store import verify_model_weights
__all__ = ["Emotion"]
@@ -43,7 +44,15 @@ class Emotion(Attribute):
self.model_path = verify_model_weights(model_weights)
# Define emotion labels based on the selected model
self.emotion_labels = ["Neutral", "Happy", "Sad", "Surprise", "Fear", "Disgust", "Angry"]
self.emotion_labels = [
"Neutral",
"Happy",
"Sad",
"Surprise",
"Fear",
"Disgust",
"Angry",
]
if model_weights == DDAMFNWeights.AFFECNET8:
self.emotion_labels.append("Contempt")
@@ -63,7 +72,7 @@ class Emotion(Attribute):
Logger.info(f"Successfully initialized Emotion model on {self.device}")
except Exception as e:
Logger.error(f"Failed to load Emotion model from '{self.model_path}'", exc_info=True)
raise RuntimeError(f"Failed to initialize Emotion model: {e}")
raise RuntimeError(f"Failed to initialize Emotion model: {e}") from e
def preprocess(self, image: np.ndarray, landmark: Union[List, np.ndarray]) -> torch.Tensor:
"""
@@ -77,7 +86,7 @@ class Emotion(Attribute):
torch.Tensor: The preprocessed image tensor ready for inference.
"""
landmark = np.asarray(landmark)
aligned_image, _ = face_alignment(image, landmark)
# Convert BGR to RGB, resize, normalize, and convert to a CHW tensor
@@ -115,8 +124,8 @@ class Emotion(Attribute):
# TODO: below is only for testing, remove it later
if __name__ == "__main__":
from uniface.detection import create_detector
from uniface.constants import RetinaFaceWeights
from uniface.detection import create_detector
print("Initializing models for live inference...")
# 1. Initialize the face detector
@@ -145,26 +154,34 @@ if __name__ == "__main__":
# For each detected face, predict the emotion
for detection in detections:
box = detection['bbox']
landmark = detection['landmarks']
box = detection["bbox"]
landmark = detection["landmarks"]
x1, y1, x2, y2 = map(int, box)
# Predict attributes using the landmark
emotion, confidence = emotion_predictor.predict(frame, landmark)
# Prepare text and draw on the frame
label = f"{emotion} ({confidence:.2f})"
cv2.rectangle(frame, (x1, y1), (x2, y2), (255, 0, 0), 2)
cv2.putText(frame, label, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (255, 0, 0), 2)
cv2.putText(
frame,
label,
(x1, y1 - 10),
cv2.FONT_HERSHEY_SIMPLEX,
0.8,
(255, 0, 0),
2,
)
# Display the resulting frame
cv2.imshow("Emotion Inference (Press 'q' to quit)", frame)
# Break the loop if 'q' is pressed
if cv2.waitKey(1) & 0xFF == ord('q'):
if cv2.waitKey(1) & 0xFF == ord("q"):
break
# Release resources
cap.release()
cv2.destroyAllWindows()
print("Inference stopped.")
print("Inference stopped.")

View File

@@ -2,12 +2,12 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
import cv2
import math
import itertools
import numpy as np
import math
from typing import List, Tuple
from typing import Tuple, List
import cv2
import numpy as np
def resize_image(frame, target_shape: Tuple[int, int] = (640, 640)) -> Tuple[np.ndarray, float]:
@@ -59,12 +59,7 @@ def generate_anchors(image_size: Tuple[int, int] = (640, 640)) -> np.ndarray:
min_sizes = [[16, 32], [64, 128], [256, 512]]
anchors = []
feature_maps = [
[
math.ceil(image_size[0] / step),
math.ceil(image_size[1] / step)
] for step in steps
]
feature_maps = [[math.ceil(image_size[0] / step), math.ceil(image_size[1] / step)] for step in steps]
for k, (map_height, map_width) in enumerate(feature_maps):
step = steps[k]

View File

@@ -5,6 +5,7 @@
from enum import Enum
from typing import Dict
# fmt: off
class SphereFaceWeights(str, Enum):
"""
@@ -29,7 +30,7 @@ class ArcFaceWeights(str, Enum):
Pretrained weights from ArcFace model (insightface).
https://github.com/deepinsight/insightface
"""
MNET = "arcface_mnet"
MNET = "arcface_mnet"
RESNET = "arcface_resnet"
class RetinaFaceWeights(str, Enum):
@@ -82,83 +83,65 @@ class LandmarkWeights(str, Enum):
MODEL_URLS: Dict[Enum, str] = {
# RetinaFace
RetinaFaceWeights.MNET_025: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/retinaface_mv1_0.25.onnx',
RetinaFaceWeights.MNET_050: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/retinaface_mv1_0.50.onnx',
RetinaFaceWeights.MNET_V1: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/retinaface_mv1.onnx',
RetinaFaceWeights.MNET_V2: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/retinaface_mv2.onnx',
RetinaFaceWeights.RESNET18: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/retinaface_r18.onnx',
RetinaFaceWeights.RESNET34: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/retinaface_r34.onnx',
RetinaFaceWeights.MNET_025: "https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv1_0.25.onnx",
RetinaFaceWeights.MNET_050: "https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv1_0.50.onnx",
RetinaFaceWeights.MNET_V1: "https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv1.onnx",
RetinaFaceWeights.MNET_V2: "https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv2.onnx",
RetinaFaceWeights.RESNET18: "https://github.com/yakhyo/uniface/releases/download/weights/retinaface_r18.onnx",
RetinaFaceWeights.RESNET34: "https://github.com/yakhyo/uniface/releases/download/weights/retinaface_r34.onnx",
# MobileFace
MobileFaceWeights.MNET_025: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/###',
MobileFaceWeights.MNET_V2: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/###',
MobileFaceWeights.MNET_V3_SMALL: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/###',
MobileFaceWeights.MNET_V3_LARGE: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/###',
MobileFaceWeights.MNET_025: "https://github.com/yakhyo/uniface/releases/download/weights/mobilenetv1_0.25.onnx",
MobileFaceWeights.MNET_V2: "https://github.com/yakhyo/uniface/releases/download/weights/mobilenetv2.onnx",
MobileFaceWeights.MNET_V3_SMALL: "https://github.com/yakhyo/uniface/releases/download/weights/mobilenetv3_small.onnx",
MobileFaceWeights.MNET_V3_LARGE: "https://github.com/yakhyo/uniface/releases/download/weights/mobilenetv3_large.onnx",
# SphereFace
SphereFaceWeights.SPHERE20: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/###',
SphereFaceWeights.SPHERE36: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/###',
SphereFaceWeights.SPHERE20: "https://github.com/yakhyo/uniface/releases/download/weights/sphere20.onnx",
SphereFaceWeights.SPHERE36: "https://github.com/yakhyo/uniface/releases/download/weights/sphere36.onnx",
# ArcFace
ArcFaceWeights.MNET: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/w600k_mbf.onnx',
ArcFaceWeights.RESNET: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/w600k_r50.onnx',
ArcFaceWeights.MNET: "https://github.com/yakhyo/uniface/releases/download/weights/w600k_mbf.onnx",
ArcFaceWeights.RESNET: "https://github.com/yakhyo/uniface/releases/download/weights/w600k_r50.onnx",
# SCRFD
SCRFDWeights.SCRFD_10G_KPS: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/scrfd_10g_kps.onnx',
SCRFDWeights.SCRFD_500M_KPS: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/scrfd_500m_kps.onnx',
SCRFDWeights.SCRFD_10G_KPS: "https://github.com/yakhyo/uniface/releases/download/weights/scrfd_10g_kps.onnx",
SCRFDWeights.SCRFD_500M_KPS: "https://github.com/yakhyo/uniface/releases/download/weights/scrfd_500m_kps.onnx",
# DDAMFN
DDAMFNWeights.AFFECNET7: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/affecnet7.script',
DDAMFNWeights.AFFECNET8: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/affecnet8.script',
DDAMFNWeights.AFFECNET7: "https://github.com/yakhyo/uniface/releases/download/weights/affecnet7.script",
DDAMFNWeights.AFFECNET8: "https://github.com/yakhyo/uniface/releases/download/weights/affecnet8.script",
# AgeGender
AgeGenderWeights.DEFAULT: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/genderage.onnx',
AgeGenderWeights.DEFAULT: "https://github.com/yakhyo/uniface/releases/download/weights/genderage.onnx",
# Landmarks
LandmarkWeights.DEFAULT: 'https://github.com/yakhyo/uniface/releases/download/v0.1.2/2d106det.onnx',
LandmarkWeights.DEFAULT: "https://github.com/yakhyo/uniface/releases/download/weights/2d106det.onnx",
}
MODEL_SHA256: Dict[Enum, str] = {
# RetinaFace
RetinaFaceWeights.MNET_025: 'b7a7acab55e104dce6f32cdfff929bd83946da5cd869b9e2e9bdffafd1b7e4a5',
RetinaFaceWeights.MNET_050: 'd8977186f6037999af5b4113d42ba77a84a6ab0c996b17c713cc3d53b88bfc37',
RetinaFaceWeights.MNET_V1: '75c961aaf0aff03d13c074e9ec656e5510e174454dd4964a161aab4fe5f04153',
RetinaFaceWeights.MNET_V2: '3ca44c045651cabeed1193a1fae8946ad1f3a55da8fa74b341feab5a8319f757',
RetinaFaceWeights.RESNET18: 'e8b5ddd7d2c3c8f7c942f9f10cec09d8e319f78f09725d3f709631de34fb649d',
RetinaFaceWeights.RESNET34: 'bd0263dc2a465d32859555cb1741f2d98991eb0053696e8ee33fec583d30e630',
RetinaFaceWeights.MNET_025: "b7a7acab55e104dce6f32cdfff929bd83946da5cd869b9e2e9bdffafd1b7e4a5",
RetinaFaceWeights.MNET_050: "d8977186f6037999af5b4113d42ba77a84a6ab0c996b17c713cc3d53b88bfc37",
RetinaFaceWeights.MNET_V1: "75c961aaf0aff03d13c074e9ec656e5510e174454dd4964a161aab4fe5f04153",
RetinaFaceWeights.MNET_V2: "3ca44c045651cabeed1193a1fae8946ad1f3a55da8fa74b341feab5a8319f757",
RetinaFaceWeights.RESNET18: "e8b5ddd7d2c3c8f7c942f9f10cec09d8e319f78f09725d3f709631de34fb649d",
RetinaFaceWeights.RESNET34: "bd0263dc2a465d32859555cb1741f2d98991eb0053696e8ee33fec583d30e630",
# MobileFace
MobileFaceWeights.MNET_025: '#',
MobileFaceWeights.MNET_V2: '#',
MobileFaceWeights.MNET_V3_SMALL: '#',
MobileFaceWeights.MNET_V3_LARGE: '#',
MobileFaceWeights.MNET_025: "eeda7d23d9c2b40cf77fa8da8e895b5697465192648852216074679657f8ee8b",
MobileFaceWeights.MNET_V2: "38b148284dd48cc898d5d4453104252fbdcbacc105fe3f0b80e78954d9d20d89",
MobileFaceWeights.MNET_V3_SMALL: "d4acafa1039a82957aa8a9a1dac278a401c353a749c39df43de0e29cc1c127c3",
MobileFaceWeights.MNET_V3_LARGE: "0e48f8e11f070211716d03e5c65a3db35a5e917cfb5bc30552358629775a142a",
# SphereFace
SphereFaceWeights.SPHERE20: '#',
SphereFaceWeights.SPHERE36: '#',
SphereFaceWeights.SPHERE20: "c02878cf658eb1861f580b7e7144b0d27cc29c440bcaa6a99d466d2854f14c9d",
SphereFaceWeights.SPHERE36: "13b3890cd5d7dec2b63f7c36fd7ce07403e5a0bbb701d9647c0289e6cbe7bb20",
# ArcFace
ArcFaceWeights.MNET: '9cc6e4a75f0e2bf0b1aed94578f144d15175f357bdc05e815e5c4a02b319eb4f',
ArcFaceWeights.RESNET: '4c06341c33c2ca1f86781dab0e829f88ad5b64be9fba56e56bc9ebdefc619e43',
ArcFaceWeights.MNET: "9cc6e4a75f0e2bf0b1aed94578f144d15175f357bdc05e815e5c4a02b319eb4f",
ArcFaceWeights.RESNET: "4c06341c33c2ca1f86781dab0e829f88ad5b64be9fba56e56bc9ebdefc619e43",
# SCRFD
SCRFDWeights.SCRFD_10G_KPS: '5838f7fe053675b1c7a08b633df49e7af5495cee0493c7dcf6697200b85b5b91',
SCRFDWeights.SCRFD_500M_KPS: '5e4447f50245bbd7966bd6c0fa52938c61474a04ec7def48753668a9d8b4ea3a',
SCRFDWeights.SCRFD_10G_KPS: "5838f7fe053675b1c7a08b633df49e7af5495cee0493c7dcf6697200b85b5b91",
SCRFDWeights.SCRFD_500M_KPS: "5e4447f50245bbd7966bd6c0fa52938c61474a04ec7def48753668a9d8b4ea3a",
# DDAMFN
DDAMFNWeights.AFFECNET7: '10535bf8b6afe8e9d6ae26cea6c3add9a93036e9addb6adebfd4a972171d015d',
DDAMFNWeights.AFFECNET8: '8c66963bc71db42796a14dfcbfcd181b268b65a3fc16e87147d6a3a3d7e0f487',
DDAMFNWeights.AFFECNET7: "10535bf8b6afe8e9d6ae26cea6c3add9a93036e9addb6adebfd4a972171d015d",
DDAMFNWeights.AFFECNET8: "8c66963bc71db42796a14dfcbfcd181b268b65a3fc16e87147d6a3a3d7e0f487",
# AgeGender
AgeGenderWeights.DEFAULT: '4fde69b1c810857b88c64a335084f1c3fe8f01246c9a191b48c7bb756d6652fb',
AgeGenderWeights.DEFAULT: "4fde69b1c810857b88c64a335084f1c3fe8f01246c9a191b48c7bb756d6652fb",
# Landmark
LandmarkWeights.DEFAULT: 'f001b856447c413801ef5c42091ed0cd516fcd21f2d6b79635b1e733a7109dbf',
LandmarkWeights.DEFAULT: "f001b856447c413801ef5c42091ed0cd516fcd21f2d6b79635b1e733a7109dbf",
}
CHUNK_SIZE = 8192
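
CHUNK_SIZE presumably sizes the streaming loop in uniface.model_store.verify_model_weights; that function's body is not part of this diff, so the checksum sketch below is an assumption about how MODEL_SHA256 is consumed, not the library's code:

import hashlib

def sha256_of(path: str, chunk_size: int = 8192) -> str:
    # Stream the weight file so large models never load fully into memory.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()  # compared against the expected MODEL_SHA256 entry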

View File

@@ -3,18 +3,19 @@
# GitHub: https://github.com/yakhyo
import numpy as np
from typing import Tuple, Dict, Any, List
from typing import Any, Dict, List
import numpy as np
from .scrfd import SCRFD
from .base import BaseDetector
from .retinaface import RetinaFace
from .scrfd import SCRFD
# Global cache for detector instances
_detector_cache: Dict[str, BaseDetector] = {}
def detect_faces(image: np.ndarray, method: str = 'retinaface', **kwargs) -> List[Dict[str, Any]]:
def detect_faces(image: np.ndarray, method: str = "retinaface", **kwargs) -> List[Dict[str, Any]]:
"""
High-level face detection function.
@@ -38,7 +39,7 @@ def detect_faces(image: np.ndarray, method: str = 'retinaface', **kwargs) -> Lis
... print(f"BBox: {face['bbox']}")
"""
method_name = method.lower()
sorted_kwargs = sorted(kwargs.items())
cache_key = f"{method_name}_{str(sorted_kwargs)}"
@@ -50,7 +51,7 @@ def detect_faces(image: np.ndarray, method: str = 'retinaface', **kwargs) -> Lis
return detector.detect(image)
def create_detector(method: str = 'retinaface', **kwargs) -> BaseDetector:
def create_detector(method: str = "retinaface", **kwargs) -> BaseDetector:
"""
Factory function to create face detectors.
@@ -88,18 +89,15 @@ def create_detector(method: str = 'retinaface', **kwargs) -> BaseDetector:
"""
method = method.lower()
if method == 'retinaface':
if method == "retinaface":
return RetinaFace(**kwargs)
elif method == 'scrfd':
elif method == "scrfd":
return SCRFD(**kwargs)
else:
available_methods = ['retinaface', 'scrfd']
raise ValueError(
f"Unsupported detection method: '{method}'. "
f"Available methods: {available_methods}"
)
available_methods = ["retinaface", "scrfd"]
raise ValueError(f"Unsupported detection method: '{method}'. Available methods: {available_methods}")
def list_available_detectors() -> Dict[str, Dict[str, Any]]:
@@ -110,36 +108,36 @@ def list_available_detectors() -> Dict[str, Dict[str, Any]]:
Dict[str, Dict[str, Any]]: Dictionary of detector information
"""
return {
'retinaface': {
'description': 'RetinaFace detector with high accuracy',
'supports_landmarks': True,
'paper': 'https://arxiv.org/abs/1905.00641',
'default_params': {
'model_name': 'mnet_v2',
'conf_thresh': 0.5,
'nms_thresh': 0.4,
'input_size': (640, 640)
}
"retinaface": {
"description": "RetinaFace detector with high accuracy",
"supports_landmarks": True,
"paper": "https://arxiv.org/abs/1905.00641",
"default_params": {
"model_name": "mnet_v2",
"conf_thresh": 0.5,
"nms_thresh": 0.4,
"input_size": (640, 640),
},
},
"scrfd": {
"description": "SCRFD detector - fast and accurate with efficient architecture",
"supports_landmarks": True,
"paper": "https://arxiv.org/abs/2105.04714",
"default_params": {
"model_name": "scrfd_10g_kps",
"conf_thresh": 0.5,
"nms_thresh": 0.4,
"input_size": (640, 640),
},
},
'scrfd': {
'description': 'SCRFD detector - fast and accurate with efficient architecture',
'supports_landmarks': True,
'paper': 'https://arxiv.org/abs/2105.04714',
'default_params': {
'model_name': 'scrfd_10g_kps',
'conf_thresh': 0.5,
'nms_thresh': 0.4,
'input_size': (640, 640)
}
}
}
__all__ = [
'detect_faces',
'create_detector',
'list_available_detectors',
'SCRFD',
'RetinaFace',
'BaseDetector',
"detect_faces",
"create_detector",
"list_available_detectors",
"SCRFD",
"RetinaFace",
"BaseDetector",
]
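
One consequence of the sorted-kwargs cache key built in detect_faces above: keyword order cannot defeat the cache. A short illustration:

import numpy as np

from uniface import detect_faces

image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
detect_faces(image, method="retinaface", conf_thresh=0.8, nms_thresh=0.4)
detect_faces(image, method="retinaface", nms_thresh=0.4, conf_thresh=0.8)
# Both calls sort kwargs into the same cache key,
# "retinaface_[('conf_thresh', 0.8), ('nms_thresh', 0.4)]",
# so the second call reuses the RetinaFace instance built by the first.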

View File

@@ -6,9 +6,10 @@
Base classes for face detection.
"""
import numpy as np
from abc import ABC, abstractmethod
from typing import Tuple, Dict, Any
from typing import Any, Dict, Tuple
import numpy as np
class BaseDetector(ABC):
@@ -84,7 +85,7 @@ class BaseDetector(ABC):
Returns:
bool: True if landmarks are supported, False otherwise
"""
return hasattr(self, '_supports_landmarks') and self._supports_landmarks
return hasattr(self, "_supports_landmarks") and self._supports_landmarks
def get_info(self) -> Dict[str, Any]:
"""
@@ -94,7 +95,7 @@ class BaseDetector(ABC):
Dict[str, Any]: Detector information
"""
return {
'name': self.__class__.__name__,
'supports_landmarks': self._supports_landmarks,
'config': self.config
"name": self.__class__.__name__,
"supports_landmarks": self._supports_landmarks,
"config": self.config,
}

View File

@@ -2,22 +2,22 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Any, Dict, List, Literal, Tuple
import numpy as np
from typing import Tuple, List, Literal, Dict, Any
from uniface.constants import RetinaFaceWeights
from uniface.log import Logger
from uniface.model_store import verify_model_weights
from uniface.constants import RetinaFaceWeights
from uniface.onnx_utils import create_onnx_session
from .base import BaseDetector
from .utils import (
decode_boxes,
decode_landmarks,
generate_anchors,
non_max_supression,
resize_image,
decode_boxes,
generate_anchors,
decode_landmarks
)
@@ -59,13 +59,13 @@ class RetinaFace(BaseDetector):
super().__init__(**kwargs)
self._supports_landmarks = True # RetinaFace supports landmarks
self.model_name = kwargs.get('model_name', RetinaFaceWeights.MNET_V2)
self.conf_thresh = kwargs.get('conf_thresh', 0.5)
self.nms_thresh = kwargs.get('nms_thresh', 0.4)
self.pre_nms_topk = kwargs.get('pre_nms_topk', 5000)
self.post_nms_topk = kwargs.get('post_nms_topk', 750)
self.dynamic_size = kwargs.get('dynamic_size', False)
self.input_size = kwargs.get('input_size', (640, 640))
self.model_name = kwargs.get("model_name", RetinaFaceWeights.MNET_V2)
self.conf_thresh = kwargs.get("conf_thresh", 0.5)
self.nms_thresh = kwargs.get("nms_thresh", 0.4)
self.pre_nms_topk = kwargs.get("pre_nms_topk", 5000)
self.post_nms_topk = kwargs.get("post_nms_topk", 750)
self.dynamic_size = kwargs.get("dynamic_size", False)
self.input_size = kwargs.get("input_size", (640, 640))
Logger.info(
f"Initializing RetinaFace with model={self.model_name}, conf_thresh={self.conf_thresh}, nms_thresh={self.nms_thresh}, "
@@ -133,7 +133,7 @@ class RetinaFace(BaseDetector):
image: np.ndarray,
max_num: int = 0,
metric: Literal["default", "max"] = "max",
center_weight: float = 2.0
center_weight: float = 2.0,
) -> List[Dict[str, Any]]:
"""
Perform face detection on an input image and return bounding boxes and facial landmarks.
@@ -178,14 +178,16 @@ class RetinaFace(BaseDetector):
# Calculate offsets from image center
center = (original_height // 2, original_width // 2)
offsets = np.vstack([
(detections[:, 0] + detections[:, 2]) / 2 - center[1],
(detections[:, 1] + detections[:, 3]) / 2 - center[0]
])
offsets = np.vstack(
[
(detections[:, 0] + detections[:, 2]) / 2 - center[1],
(detections[:, 1] + detections[:, 3]) / 2 - center[0],
]
)
offset_dist_squared = np.sum(np.power(offsets, 2.0), axis=0)
# Calculate scores based on the chosen metric
if metric == 'max':
if metric == "max":
scores = areas
else:
scores = areas - offset_dist_squared * center_weight
@@ -199,15 +201,17 @@ class RetinaFace(BaseDetector):
faces = []
for i in range(detections.shape[0]):
face_dict = {
'bbox': detections[i, :4].astype(float).tolist(),
'confidence': detections[i, 4].item(),
'landmarks': landmarks[i].astype(float).tolist()
"bbox": detections[i, :4].astype(float).tolist(),
"confidence": detections[i, 4].item(),
"landmarks": landmarks[i].astype(float).tolist(),
}
faces.append(face_dict)
return faces
def postprocess(self, outputs: List[np.ndarray], resize_factor: float, shape: Tuple[int, int]) -> Tuple[np.ndarray, np.ndarray]:
def postprocess(
self, outputs: List[np.ndarray], resize_factor: float, shape: Tuple[int, int]
) -> Tuple[np.ndarray, np.ndarray]:
"""
Process the model outputs into final detection results.
@@ -226,7 +230,11 @@ class RetinaFace(BaseDetector):
- landmarks (np.ndarray): Array of detected facial landmarks.
Shape: (num_detections, 5, 2), where each row contains 5 landmark points (x, y).
"""
loc, conf, landmarks = outputs[0].squeeze(0), outputs[1].squeeze(0), outputs[2].squeeze(0)
loc, conf, landmarks = (
outputs[0].squeeze(0),
outputs[1].squeeze(0),
outputs[2].squeeze(0),
)
# Decode boxes and landmarks
boxes = decode_boxes(loc, self._priors)
@@ -242,7 +250,7 @@ class RetinaFace(BaseDetector):
boxes, landmarks, scores = boxes[mask], landmarks[mask], scores[mask]
# Sort by scores
order = scores.argsort()[::-1][:self.pre_nms_topk]
order = scores.argsort()[::-1][: self.pre_nms_topk]
boxes, landmarks, scores = boxes[order], landmarks[order], scores[order]
# Apply NMS
@@ -251,13 +259,22 @@ class RetinaFace(BaseDetector):
detections, landmarks = detections[keep], landmarks[keep]
# Keep top-k detections
detections, landmarks = detections[:self.post_nms_topk], landmarks[:self.post_nms_topk]
detections, landmarks = (
detections[: self.post_nms_topk],
landmarks[: self.post_nms_topk],
)
landmarks = landmarks.reshape(-1, 5, 2).astype(np.int32)
return detections, landmarks
def _scale_detections(self, boxes: np.ndarray, landmarks: np.ndarray, resize_factor: float, shape: Tuple[int, int]) -> Tuple[np.ndarray, np.ndarray]:
def _scale_detections(
self,
boxes: np.ndarray,
landmarks: np.ndarray,
resize_factor: float,
shape: Tuple[int, int],
) -> Tuple[np.ndarray, np.ndarray]:
# Scale bounding boxes and landmarks to the original image size.
bbox_scale = np.array([shape[0], shape[1]] * 2)
boxes = boxes * bbox_scale / resize_factor
@@ -276,26 +293,27 @@ def draw_bbox(frame, bbox, score, color=(0, 255, 0), thickness=2):
def draw_keypoints(frame, points, color=(0, 0, 255), radius=2):
for (x, y) in points.astype(np.int32):
for x, y in points.astype(np.int32):
cv2.circle(frame, (int(x), int(y)), radius, color, -1)
if __name__ == "__main__":
import cv2
detector = RetinaFace(model_name=RetinaFaceWeights.MNET_050)
print(detector.get_info())
cap = cv2.VideoCapture(0)
if not cap.isOpened():
print("Failed to open webcam.")
print("Failed to open webcam.")
exit()
print("📷 Webcam started. Press 'q' to exit.")
print("Webcam started. Press 'q' to exit.")
while True:
ret, frame = cap.read()
if not ret:
print("Failed to read frame.")
print("Failed to read frame.")
break
# Get face detections as list of dictionaries
@@ -304,9 +322,9 @@ if __name__ == "__main__":
# Process each detected face
for face in faces:
# Extract bbox and landmarks from dictionary
bbox = face['bbox'] # [x1, y1, x2, y2]
landmarks = face['landmarks'] # [[x1, y1], [x2, y2], ...]
confidence = face['confidence']
bbox = face["bbox"] # [x1, y1, x2, y2]
landmarks = face["landmarks"] # [[x1, y1], [x2, y2], ...]
confidence = face["confidence"]
# Pass bbox and confidence separately
draw_bbox(frame, bbox, confidence)
@@ -318,8 +336,15 @@ if __name__ == "__main__":
draw_keypoints(frame, points)
# Display face count
cv2.putText(frame, f"Faces: {len(faces)}", (10, 30),
cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 255), 2)
cv2.putText(
frame,
f"Faces: {len(faces)}",
(10, 30),
cv2.FONT_HERSHEY_SIMPLEX,
0.7,
(255, 255, 255),
2,
)
cv2.imshow("FaceDetection", frame)
if cv2.waitKey(1) & 0xFF == ord("q"):
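For reference, a minimal still-image sketch of the dict-based detect() API used in this file — not part of the diff, and it assumes a local face.jpg:

import cv2
from uniface.detection import RetinaFace

detector = RetinaFace()                      # default weights
image = cv2.imread("face.jpg")               # assumed test image
for face in detector.detect(image):
    x1, y1, x2, y2 = face["bbox"]            # [x1, y1, x2, y2]
    print(face["confidence"], len(face["landmarks"]))  # score and 5 (x, y) points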

View File

@@ -173,7 +173,11 @@ class SCRFD(BaseDetector):
return scores_list, bboxes_list, kpss_list
def detect(
self, image: np.ndarray, max_num: int = 0, metric: Literal["default", "max"] = "max", center_weight: float = 2
self,
image: np.ndarray,
max_num: int = 0,
metric: Literal["default", "max"] = "max",
center_weight: float = 2,
) -> List[Dict[str, Any]]:
"""
Perform face detection on an input image and return bounding boxes and facial landmarks.
@@ -280,15 +284,15 @@ if __name__ == "__main__":
cap = cv2.VideoCapture(0)
if not cap.isOpened():
print("Failed to open webcam.")
print("Failed to open webcam.")
exit()
print("📷 Webcam started. Press 'q' to exit.")
print("Webcam started. Press 'q' to exit.")
while True:
ret, frame = cap.read()
if not ret:
print("Failed to read frame.")
print("Failed to read frame.")
break
# Get face detections as list of dictionaries
@@ -311,7 +315,15 @@ if __name__ == "__main__":
draw_keypoints(frame, points)
# Display face count
cv2.putText(frame, f"Faces: {len(faces)}", (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 255), 2)
cv2.putText(
frame,
f"Faces: {len(faces)}",
(10, 30),
cv2.FONT_HERSHEY_SIMPLEX,
0.7,
(255, 255, 255),
2,
)
cv2.imshow("FaceDetection", frame)
if cv2.waitKey(1) & 0xFF == ord("q"):

View File

@@ -2,12 +2,12 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
import cv2
import math
import itertools
import numpy as np
import math
from typing import List, Tuple
from typing import Tuple, List
import cv2
import numpy as np
def resize_image(frame, target_shape: Tuple[int, int] = (640, 640)) -> Tuple[np.ndarray, float]:
@@ -59,12 +59,7 @@ def generate_anchors(image_size: Tuple[int, int] = (640, 640)) -> np.ndarray:
min_sizes = [[16, 32], [64, 128], [256, 512]]
anchors = []
feature_maps = [
[
math.ceil(image_size[0] / step),
math.ceil(image_size[1] / step)
] for step in steps
]
feature_maps = [[math.ceil(image_size[0] / step), math.ceil(image_size[1] / step)] for step in steps]
for k, (map_height, map_width) in enumerate(feature_maps):
step = steps[k]
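For intuition, a worked sketch of the feature-map arithmetic above, assuming the usual RetinaFace strides of 8, 16, and 32 (one per min_sizes group):

import math

steps = [8, 16, 32]                 # assumed strides, one per min_sizes group
image_size = (640, 640)
feature_maps = [[math.ceil(image_size[0] / s), math.ceil(image_size[1] / s)] for s in steps]
print(feature_maps)                 # [[80, 80], [40, 40], [20, 20]]
# Two anchors per cell: (80*80 + 40*40 + 20*20) * 2 = 16800 priors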

View File

@@ -2,13 +2,18 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Tuple, Union
import cv2
import numpy as np
from skimage.transform import SimilarityTransform
from typing import Tuple
__all__ = ["face_alignment", "compute_similarity", "bbox_center_alignment", "transform_points_2d"]
__all__ = [
"face_alignment",
"compute_similarity",
"bbox_center_alignment",
"transform_points_2d",
]
# Reference alignment for facial landmarks (ArcFace)
@@ -18,19 +23,20 @@ reference_alignment: np.ndarray = np.array(
[73.5318, 51.5014],
[56.0252, 71.7366],
[41.5493, 92.3655],
[70.7299, 92.2041]
[70.7299, 92.2041],
],
dtype=np.float32
dtype=np.float32,
)
def estimate_norm(landmark: np.ndarray, image_size: int = 112) -> Tuple[np.ndarray, np.ndarray]:
def estimate_norm(landmark: np.ndarray, image_size: Union[int, Tuple[int, int]] = 112) -> Tuple[np.ndarray, np.ndarray]:
"""
Estimate the normalization transformation matrix for facial landmarks.
Args:
landmark (np.ndarray): Array of shape (5, 2) representing the coordinates of the facial landmarks.
image_size (int, optional): The size of the output image. Default is 112.
image_size (Union[int, Tuple[int, int]], optional): The size of the output image.
Can be an integer (for square images) or a tuple (width, height). Default is 112.
Returns:
Tuple[np.ndarray, np.ndarray]: The 2x3 transformation matrix for aligning the landmarks, and its inverse.
@@ -41,13 +47,20 @@ def estimate_norm(landmark: np.ndarray, image_size: int = 112) -> Tuple[np.ndarr
or if image_size is not a multiple of 112 or 128.
"""
assert landmark.shape == (5, 2), "Landmark array must have shape (5, 2)."
assert image_size % 112 == 0 or image_size % 128 == 0, "Image size must be a multiple of 112 or 128."
if image_size % 112 == 0:
ratio = float(image_size) / 112.0
# Handle both int and tuple inputs
if isinstance(image_size, tuple):
size = image_size[0] # Use width for ratio calculation
else:
size = image_size
assert size % 112 == 0 or size % 128 == 0, "Image size must be a multiple of 112 or 128."
if size % 112 == 0:
ratio = float(size) / 112.0
diff_x = 0.0
else:
ratio = float(image_size) / 128.0
ratio = float(size) / 128.0
diff_x = 8.0 * ratio
# Adjust reference alignment based on ratio and diff_x
@@ -64,14 +77,19 @@ def estimate_norm(landmark: np.ndarray, image_size: int = 112) -> Tuple[np.ndarr
return matrix, inverse_matrix
def face_alignment(image: np.ndarray, landmark: np.ndarray, image_size: int = 112) -> Tuple[np.ndarray, np.ndarray]:
def face_alignment(
image: np.ndarray,
landmark: np.ndarray,
image_size: Union[int, Tuple[int, int]] = 112,
) -> Tuple[np.ndarray, np.ndarray]:
"""
Align the face in the input image based on the given facial landmarks.
Args:
image (np.ndarray): Input image as a NumPy array.
landmark (np.ndarray): Array of shape (5, 2) representing the coordinates of the facial landmarks.
image_size (int, optional): The size of the aligned output image. Default is 112.
image_size (Union[int, Tuple[int, int]], optional): The size of the aligned output image.
Can be an integer (for square images) or a tuple (width, height). Default is 112.
Returns:
Tuple[np.ndarray, np.ndarray]: The aligned face as a NumPy array, and the inverse 2x3 transformation matrix.
@@ -80,8 +98,14 @@ def face_alignment(image: np.ndarray, landmark: np.ndarray, image_size: int = 11
# Get the transformation matrix
M, M_inv = estimate_norm(landmark, image_size)
# Handle both int and tuple for warpAffine output size
if isinstance(image_size, int):
output_size = (image_size, image_size)
else:
output_size = image_size
# Warp the input image to align the face
warped = cv2.warpAffine(image, M, (image_size, image_size), borderValue=0.0)
warped = cv2.warpAffine(image, M, output_size, borderValue=0.0)
return warped, M_inv
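A short sketch of the new int-or-tuple image_size handling in face_alignment; the inputs are dummies and the landmark coordinates are made up:

import numpy as np
from uniface.face_utils import face_alignment

image = np.zeros((480, 640, 3), dtype=np.uint8)                   # dummy BGR frame
landmarks = np.array([[230, 200], [310, 200], [270, 250],
                      [240, 300], [300, 300]], dtype=np.float32)  # 5-point layout

aligned, M_inv = face_alignment(image, landmarks)                        # int: 112x112 square
aligned_t, _ = face_alignment(image, landmarks, image_size=(112, 112))   # tuple: (width, height)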

View File

@@ -2,11 +2,11 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from .models import Landmark106
from .base import BaseLandmarker
from .models import Landmark106
def create_landmarker(method: str = '2d106det', **kwargs) -> BaseLandmarker:
def create_landmarker(method: str = "2d106det", **kwargs) -> BaseLandmarker:
"""
Factory function to create facial landmark predictors.
@@ -18,15 +18,11 @@ def create_landmarker(method: str = '2d106det', **kwargs) -> BaseLandmarker:
Initialized landmarker instance.
"""
method = method.lower()
if method == '2d106det':
if method == "2d106det":
return Landmark106(**kwargs)
else:
available = ['2d106det']
available = ["2d106det"]
raise ValueError(f"Unsupported method: '{method}'. Available: {available}")
__all__ = [
"create_landmarker",
"Landmark106",
"BaseLandmarker"
]
__all__ = ["create_landmarker", "Landmark106", "BaseLandmarker"]

View File

@@ -3,6 +3,7 @@
# GitHub: https://github.com/yakhyo
from abc import ABC, abstractmethod
import numpy as np
@@ -10,6 +11,7 @@ class BaseLandmarker(ABC):
"""
Abstract Base Class for all facial landmark models.
"""
@abstractmethod
def get_landmarks(self, image: np.ndarray, bbox: np.ndarray) -> np.ndarray:
"""

View File

@@ -2,18 +2,20 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
import cv2
import numpy as np
from typing import Tuple
from uniface.log import Logger
import cv2
import numpy as np
from uniface.constants import LandmarkWeights
from uniface.model_store import verify_model_weights
from uniface.face_utils import bbox_center_alignment, transform_points_2d
from uniface.log import Logger
from uniface.model_store import verify_model_weights
from uniface.onnx_utils import create_onnx_session
from .base import BaseLandmarker
__all__ = ['Landmark']
__all__ = ["Landmark"]
class Landmark106(BaseLandmarker):
@@ -40,15 +42,13 @@ class Landmark106(BaseLandmarker):
>>> print(landmarks.shape)
(106, 2)
"""
def __init__(
self,
model_name: LandmarkWeights = LandmarkWeights.DEFAULT,
input_size: Tuple[int, int] = (192, 192)
input_size: Tuple[int, int] = (192, 192),
) -> None:
Logger.info(
f"Initializing Facial Landmark with model={model_name}, "
f"input_size={input_size}"
)
Logger.info(f"Initializing Facial Landmark with model={model_name}, input_size={input_size}")
self.input_size = input_size
self.input_std = 1.0
self.input_mean = 0.0
@@ -83,7 +83,7 @@ class Landmark106(BaseLandmarker):
except Exception as e:
Logger.error(f"Failed to load landmark model from '{self.model_path}'", exc_info=True)
raise RuntimeError(f"Failed to initialize landmark model: {e}")
raise RuntimeError(f"Failed to initialize landmark model: {e}") from e
def preprocess(self, image: np.ndarray, bbox: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
"""Prepares a face crop for inference.
@@ -108,8 +108,11 @@ class Landmark106(BaseLandmarker):
aligned_face, transform_matrix = bbox_center_alignment(image, center, self.input_size[0], scale, 0.0)
face_blob = cv2.dnn.blobFromImage(
aligned_face, 1.0 / self.input_std, self.input_size,
(self.input_mean, self.input_mean, self.input_mean), swapRB=True
aligned_face,
1.0 / self.input_std,
self.input_size,
(self.input_mean, self.input_mean, self.input_mean),
swapRB=True,
)
return face_blob, transform_matrix
@@ -129,7 +132,7 @@ class Landmark106(BaseLandmarker):
"""
landmarks = predictions.reshape((-1, 2))
landmarks[:, 0:2] += 1
landmarks[:, 0:2] *= (self.input_size[0] // 2)
landmarks[:, 0:2] *= self.input_size[0] // 2
inverse_matrix = cv2.invertAffineTransform(transform_matrix)
landmarks = transform_points_2d(landmarks, inverse_matrix)
@@ -149,23 +152,18 @@ class Landmark106(BaseLandmarker):
np.ndarray: An array of predicted landmark points with shape (106, 2).
"""
face_blob, transform_matrix = self.preprocess(image, bbox)
raw_predictions = self.session.run(
self.output_names, {self.input_names[0]: face_blob}
)[0][0]
raw_predictions = self.session.run(self.output_names, {self.input_names[0]: face_blob})[0][0]
landmarks = self.postprocess(raw_predictions, transform_matrix)
return landmarks
# TODO: For testing purposes only, remove later
# Testing code
if __name__ == "__main__":
# UPDATED: Use the high-level factory functions
from uniface.detection import create_detector
from uniface.landmark import create_landmarker
from uniface.detection import RetinaFace
from uniface.landmark import Landmark106
# 1. Create the detector and landmarker using the new API
face_detector = create_detector('retinaface')
landmarker = create_landmarker() # Uses the default '2d106det' method
face_detector = RetinaFace()
landmarker = Landmark106()
cap = cv2.VideoCapture(0)
if not cap.isOpened():
@@ -185,21 +183,21 @@ if __name__ == "__main__":
if not faces:
cv2.imshow("Facial Landmark Detection", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
if cv2.waitKey(1) & 0xFF == ord("q"):
break
continue
# 3. Loop through the list of face dictionaries
for face in faces:
# Extract the bounding box
bbox = face['bbox']
bbox = face["bbox"]
# 4. Get landmarks for the current face using its bounding box
landmarks = landmarker.get_landmarks(frame, bbox)
# --- Drawing Logic ---
# Draw the landmarks
for (x, y) in landmarks.astype(int):
for x, y in landmarks.astype(int):
cv2.circle(frame, (x, y), 2, (0, 255, 0), -1)
# Draw the bounding box
@@ -207,7 +205,7 @@ if __name__ == "__main__":
cv2.rectangle(frame, (x1, y1), (x2, y2), (255, 0, 0), 2)
cv2.imshow("Facial Landmark Detection", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
if cv2.waitKey(1) & 0xFF == ord("q"):
break
cap.release()

View File

@@ -19,10 +19,7 @@ def enable_logging(level=logging.INFO):
"""
Logger.handlers.clear()
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter(
"%(asctime)s - %(levelname)s - %(message)s",
datefmt="%Y-%m-%d %H:%M:%S"
))
handler.setFormatter(logging.Formatter("%(asctime)s - %(levelname)s - %(message)s", datefmt="%Y-%m-%d %H:%M:%S"))
Logger.addHandler(handler)
Logger.setLevel(level)
Logger.propagate = False
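A minimal opt-in sketch; the output format follows the formatter configured above:

import logging

from uniface.log import enable_logging

enable_logging(logging.DEBUG)
# Subsequent uniface calls emit lines like:
# 2025-11-25 23:35:00 - INFO - Initializing Facial Landmark with model=..., input_size=(192, 192)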

View File

@@ -2,19 +2,19 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
import os
import hashlib
import os
import requests
from tqdm import tqdm
from uniface.log import Logger
import uniface.constants as const
from uniface.log import Logger
__all__ = ["verify_model_weights"]
__all__ = ['verify_model_weights']
def verify_model_weights(model_name: str, root: str = '~/.uniface/models') -> str:
def verify_model_weights(model_name: str, root: str = "~/.uniface/models") -> str:
"""
Ensure model weights are present, downloading and verifying them using SHA-256 if necessary.
@@ -53,7 +53,7 @@ def verify_model_weights(model_name: str, root: str = '~/.uniface/models') -> st
raise ValueError(f"No URL found for model '{model_name}'")
file_ext = os.path.splitext(url)[1]
model_path = os.path.normpath(os.path.join(root, f'{model_name.value}{file_ext}'))
model_path = os.path.normpath(os.path.join(root, f"{model_name.value}{file_ext}"))
if not os.path.exists(model_path):
Logger.info(f"Downloading model '{model_name}' from {url}")
@@ -62,7 +62,7 @@ def verify_model_weights(model_name: str, root: str = '~/.uniface/models') -> st
Logger.info(f"Successfully downloaded '{model_name}' to {model_path}")
except Exception as e:
Logger.error(f"Failed to download model '{model_name}': {e}")
raise ConnectionError(f"Download failed for '{model_name}'")
raise ConnectionError(f"Download failed for '{model_name}'") from e
expected_hash = const.MODEL_SHA256.get(model_name)
if expected_hash and not verify_file_hash(model_path, expected_hash):
@@ -78,18 +78,21 @@ def download_file(url: str, dest_path: str) -> None:
try:
response = requests.get(url, stream=True)
response.raise_for_status()
with open(dest_path, "wb") as file, tqdm(
desc=f"Downloading {dest_path}",
unit='B',
unit_scale=True,
unit_divisor=1024
) as progress:
with (
open(dest_path, "wb") as file,
tqdm(
desc=f"Downloading {dest_path}",
unit="B",
unit_scale=True,
unit_divisor=1024,
) as progress,
):
for chunk in response.iter_content(chunk_size=const.CHUNK_SIZE):
if chunk:
file.write(chunk)
progress.update(len(chunk))
except requests.RequestException as e:
raise ConnectionError(f"Failed to download file from {url}. Error: {e}")
raise ConnectionError(f"Failed to download file from {url}. Error: {e}") from e
def verify_file_hash(file_path: str, expected_hash: str) -> bool:
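Usage sketch for the download-and-verify flow (the first call downloads; later calls hit the SHA-256-verified cache):

from uniface.constants import LandmarkWeights
from uniface.model_store import verify_model_weights

model_path = verify_model_weights(LandmarkWeights.DEFAULT)
print(model_path)   # e.g. ~/.uniface/models/<weights>.onnx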

View File

@@ -77,10 +77,25 @@ def create_onnx_session(model_path: str, providers: List[str] = None) -> ort.Inf
if providers is None:
providers = get_available_providers()
# Suppress ONNX Runtime warnings (e.g., CoreML partition warnings)
# Log levels: 0=VERBOSE, 1=INFO, 2=WARNING, 3=ERROR, 4=FATAL
sess_options = ort.SessionOptions()
sess_options.log_severity_level = 3 # Only show ERROR and FATAL
try:
session = ort.InferenceSession(model_path, providers=providers)
session = ort.InferenceSession(model_path, sess_options=sess_options, providers=providers)
active_provider = session.get_providers()[0]
Logger.debug(f"Session created with provider: {active_provider}")
# Show user-friendly message about which provider is being used
provider_names = {
"CoreMLExecutionProvider": "CoreML (Apple Silicon)",
"CUDAExecutionProvider": "CUDA (NVIDIA GPU)",
"CPUExecutionProvider": "CPU",
}
provider_display = provider_names.get(active_provider, active_provider)
print(f"Model loaded ({provider_display})")
return session
except Exception as e:
Logger.error(f"Failed to create ONNX session: {e}", exc_info=True)

View File

@@ -2,12 +2,12 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Dict
from .models import ArcFace, MobileFace, SphereFace
from .base import BaseRecognizer
from uniface.constants import ArcFaceWeights, MobileFaceWeights, SphereFaceWeights
def create_recognizer(method: str = 'arcface', **kwargs) -> BaseRecognizer:
from .base import BaseRecognizer
from .models import ArcFace, MobileFace, SphereFace
def create_recognizer(method: str = "arcface", **kwargs) -> BaseRecognizer:
"""
Factory function to create face recognizers.
@@ -44,20 +44,21 @@ def create_recognizer(method: str = 'arcface', **kwargs) -> BaseRecognizer:
"""
method = method.lower()
if method == 'arcface':
if method == "arcface":
return ArcFace(**kwargs)
elif method == 'mobileface':
elif method == "mobileface":
return MobileFace(**kwargs)
elif method == 'sphereface':
elif method == "sphereface":
return SphereFace(**kwargs)
else:
available = ['arcface', 'mobileface', 'sphereface']
available = ["arcface", "mobileface", "sphereface"]
raise ValueError(f"Unsupported method: '{method}'. Available: {available}")
__all__ = [
"create_recognizer",
"ArcFace",
"MobileFace",
"SphereFace",
"BaseRecognizer",
]
]
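Usage sketch for the factory:

from uniface.recognition import create_recognizer

recognizer = create_recognizer("arcface")    # or "mobileface" / "sphereface"
print(type(recognizer).__name__)             # ArcFace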

View File

@@ -3,13 +3,14 @@
# GitHub: https://github.com/yakhyo
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union
import cv2
import numpy as np
from dataclasses import dataclass
from typing import Tuple, Union, List
from uniface.log import Logger
from uniface.face_utils import face_alignment
from uniface.log import Logger
from uniface.onnx_utils import create_onnx_session
@@ -18,6 +19,7 @@ class PreprocessConfig:
"""
Configuration for preprocessing images before feeding them into the model.
"""
input_mean: Union[float, List[float]] = 127.5
input_std: Union[float, List[float]] = 127.5
input_size: Tuple[int, int] = (112, 112)
@@ -28,6 +30,7 @@ class BaseRecognizer(ABC):
Abstract Base Class for all face recognition models.
It provides the core functionality for preprocessing, inference, and embedding extraction.
"""
@abstractmethod
def __init__(self, model_path: str, preprocessing: PreprocessConfig) -> None:
"""
@@ -73,7 +76,10 @@ class BaseRecognizer(ABC):
Logger.info(f"Successfully initialized face encoder from {self.model_path}")
except Exception as e:
Logger.error(f"Failed to load face encoder model from '{self.model_path}'", exc_info=True)
Logger.error(
f"Failed to load face encoder model from '{self.model_path}'",
exc_info=True,
)
raise RuntimeError(f"Failed to initialize model session for '{self.model_path}'") from e
def preprocess(self, face_img: np.ndarray) -> np.ndarray:
@@ -91,8 +97,9 @@ class BaseRecognizer(ABC):
if isinstance(self.input_std, (list, tuple)):
# Per-channel normalization
rgb_img = cv2.cvtColor(resized_img, cv2.COLOR_BGR2RGB).astype(np.float32)
normalized_img = (rgb_img - np.array(self.input_mean, dtype=np.float32)) / \
np.array(self.input_std, dtype=np.float32)
normalized_img = (rgb_img - np.array(self.input_mean, dtype=np.float32)) / np.array(
self.input_std, dtype=np.float32
)
# Change to NCHW (batch, channels, height, width)
blob = np.transpose(normalized_img, (2, 0, 1)) # CHW
@@ -104,24 +111,28 @@ class BaseRecognizer(ABC):
scalefactor=1.0 / self.input_std,
size=self.input_size,
mean=(self.input_mean, self.input_mean, self.input_mean),
swapRB=True # Convert BGR to RGB
swapRB=True, # Convert BGR to RGB
)
return blob
def get_embedding(self, image: np.ndarray, landmarks: np.ndarray) -> np.ndarray:
def get_embedding(self, image: np.ndarray, landmarks: Optional[np.ndarray] = None) -> np.ndarray:
"""
Extracts face embedding from an image.
Args:
image: Input face image (BGR format).
landmarks: Facial landmarks (5 points for alignment).
image: Input face image (BGR format). If already aligned (112x112), landmarks can be None.
landmarks: Facial landmarks (5 points for alignment). Optional if image is already aligned.
Returns:
Face embedding vector (typically 512-dimensional).
"""
# Align face using landmarks
aligned_face, _ = face_alignment(image, landmarks)
# If landmarks are provided, align the face first
if landmarks is not None:
aligned_face, _ = face_alignment(image, landmarks, image_size=self.input_size)
else:
# Assume image is already aligned
aligned_face = image
# Generate embedding from aligned face
face_blob = self.preprocess(aligned_face)
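A sketch of the two calling conventions get_embedding now supports; a dummy 112x112 crop stands in for an already-aligned face:

import numpy as np
from uniface.recognition import create_recognizer

recognizer = create_recognizer("arcface")
aligned = np.zeros((112, 112, 3), dtype=np.uint8)   # stand-in for a pre-aligned crop

embedding = recognizer.get_embedding(aligned)       # no landmarks: alignment is skipped
# recognizer.get_embedding(frame, landmarks)        # unaligned image + 5-point landmarks
print(embedding.shape)                              # typically (512,)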

View File

@@ -6,6 +6,7 @@ from typing import Optional
from uniface.constants import ArcFaceWeights, MobileFaceWeights, SphereFaceWeights
from uniface.model_store import verify_model_weights
from .base import BaseRecognizer, PreprocessConfig
__all__ = ["ArcFace", "MobileFace", "SphereFace"]
@@ -33,14 +34,10 @@ class ArcFace(BaseRecognizer):
def __init__(
self,
model_name: ArcFaceWeights = ArcFaceWeights.MNET,
preprocessing: Optional[PreprocessConfig] = None
preprocessing: Optional[PreprocessConfig] = None,
) -> None:
if preprocessing is None:
preprocessing = PreprocessConfig(
input_mean=127.5,
input_std=127.5,
input_size=(112, 112)
)
preprocessing = PreprocessConfig(input_mean=127.5, input_std=127.5, input_size=(112, 112))
model_path = verify_model_weights(model_name)
super().__init__(model_path=model_path, preprocessing=preprocessing)
@@ -67,14 +64,10 @@ class MobileFace(BaseRecognizer):
def __init__(
self,
model_name: MobileFaceWeights = MobileFaceWeights.MNET_V2,
preprocessing: Optional[PreprocessConfig] = None
preprocessing: Optional[PreprocessConfig] = None,
) -> None:
if preprocessing is None:
preprocessing = PreprocessConfig(
input_mean=127.5,
input_std=127.5,
input_size=(112, 112)
)
preprocessing = PreprocessConfig(input_mean=127.5, input_std=127.5, input_size=(112, 112))
model_path = verify_model_weights(model_name)
super().__init__(model_path=model_path, preprocessing=preprocessing)
@@ -101,14 +94,10 @@ class SphereFace(BaseRecognizer):
def __init__(
self,
model_name: SphereFaceWeights = SphereFaceWeights.SPHERE20,
preprocessing: Optional[PreprocessConfig] = None
preprocessing: Optional[PreprocessConfig] = None,
) -> None:
if preprocessing is None:
preprocessing = PreprocessConfig(
input_mean=127.5,
input_std=127.5,
input_size=(112, 112)
)
preprocessing = PreprocessConfig(input_mean=127.5, input_std=127.5, input_size=(112, 112))
model_path = verify_model_weights(model_name)
super().__init__(model_path=model_path, preprocessing=preprocessing)
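All three wrappers accept an optional PreprocessConfig; a sketch overriding it explicitly (the dataclass import path is assumed from the .base module shown earlier):

from uniface.recognition import ArcFace
from uniface.recognition.base import PreprocessConfig   # assumed module path

# Same values as the built-in default, shown only to illustrate the override
config = PreprocessConfig(input_mean=127.5, input_std=127.5, input_size=(112, 112))
recognizer = ArcFace(preprocessing=config)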

View File

@@ -2,9 +2,10 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import List, Union
import cv2
import numpy as np
from typing import List, Union
def draw_detections(
@@ -12,7 +13,7 @@ def draw_detections(
bboxes: Union[np.ndarray, List[List[float]]],
scores: Union[np.ndarray, List[float]],
landmarks: Union[np.ndarray, List[List[List[float]]]],
vis_threshold: float = 0.6
vis_threshold: float = 0.6,
):
"""
Draws bounding boxes, scores, and landmarks from separate lists onto an image.
@@ -42,8 +43,15 @@ def draw_detections(
cv2.rectangle(image, tuple(bbox[:2]), tuple(bbox[2:]), (0, 0, 255), thickness)
# Draw score
cv2.putText(image, f"{score:.2f}", (bbox[0], bbox[1] - 10),
cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), thickness)
cv2.putText(
image,
f"{score:.2f}",
(bbox[0], bbox[1] - 10),
cv2.FONT_HERSHEY_SIMPLEX,
0.5,
(255, 255, 255),
thickness,
)
# Draw landmarks
for j, point in enumerate(landmark_set):