feat: Update examples and some minor changes to UniFace API (#28)

* chore: Style changes and create jupyter notebook template

* docs: Update docstring for detection

* feat: Keyword only for common parameters: model_name, conf_thresh, nms_thresh, input_size

* chore: Update drawing and let the conf text optional for drawing

* feat: add fancy bbox draw

* docs: Add examples of using UniFace

* feat: Add version to all examples
Author: Yakhyokhuja Valikhujaev
Date: 2025-12-07 19:51:08 +09:00 (committed by GitHub)
Parent: 6b1d2a1ce6
Commit: 637316f077

20 changed files with 1158 additions and 252 deletions

CONTRIBUTING.md (new file, +58 lines)

@@ -0,0 +1,58 @@
# Contributing to UniFace
Thank you for considering contributing to UniFace! We welcome contributions of all kinds.
## How to Contribute
### Reporting Issues
- Use GitHub Issues to report bugs or suggest features
- Include clear descriptions and reproducible examples
- Check existing issues before creating new ones
### Pull Requests
1. Fork the repository
2. Create a new branch for your feature
3. Write clear, documented code with type hints
4. Add tests for new functionality
5. Ensure all tests pass
6. Submit a pull request with a clear description
### Code Style
- Follow PEP8 guidelines
- Use type hints (Python 3.10+)
- Write docstrings for public APIs
- Keep code simple and readable
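
For example, a small utility in this style (illustrative only, not part of the library):

```python
def clip_box(box: tuple[int, int, int, int], width: int, height: int) -> tuple[int, int, int, int]:
    """Clip a bounding box to image bounds.

    Args:
        box: Box coordinates as (x1, y1, x2, y2).
        width: Image width in pixels.
        height: Image height in pixels.

    Returns:
        The box clipped to [0, width] x [0, height].
    """
    x1, y1, x2, y2 = box
    return (max(0, x1), max(0, y1), min(width, x2), min(height, y2))
```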
## Development Setup
```bash
git clone https://github.com/yakhyo/uniface.git
cd uniface
pip install -e ".[dev]"
```
## Running Tests
```bash
pytest tests/
```
## Examples
Example notebooks demonstrating library usage:
| Example | Notebook |
|---------|----------|
| Face Detection | [face_detection.ipynb](examples/face_detection.ipynb) |
| Face Alignment | [face_alignment.ipynb](examples/face_alignment.ipynb) |
| Face Recognition | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
| Face Verification | [face_verification.ipynb](examples/face_verification.ipynb) |
| Face Search | [face_search.ipynb](examples/face_search.ipynb) |
## Questions?
Open an issue or start a discussion on GitHub.


@@ -75,7 +75,13 @@ scores = [f['confidence'] for f in faces]
  landmarks = [f['landmarks'] for f in faces]

  # Draw on image
- draw_detections(image, bboxes, scores, landmarks, vis_threshold=0.6)
+ draw_detections(
+     image=image,
+     bboxes=bboxes,
+     scores=scores,
+     landmarks=landmarks,
+     vis_threshold=0.6,
+ )

  # Save result
  cv2.imwrite("output.jpg", image)
@@ -156,7 +162,12 @@ while True:
      bboxes = [f['bbox'] for f in faces]
      scores = [f['confidence'] for f in faces]
      landmarks = [f['landmarks'] for f in faces]
-     draw_detections(frame, bboxes, scores, landmarks)
+     draw_detections(
+         image=frame,
+         bboxes=bboxes,
+         scores=scores,
+         landmarks=landmarks,
+     )

      # Show frame
      cv2.imshow("UniFace - Press 'q' to quit", frame)
@@ -365,7 +376,20 @@ from uniface import retinaface  # Module, not class
  ## Next Steps

- - **Detailed Examples**: Check the [examples/](examples/) folder for Jupyter notebooks
+ ### Jupyter Notebook Examples
+
+ Explore interactive examples for common tasks:
+
+ | Example | Description | Notebook |
+ |---------|-------------|----------|
+ | **Face Detection** | Detect faces and facial landmarks | [face_detection.ipynb](examples/face_detection.ipynb) |
+ | **Face Alignment** | Align and crop faces for recognition | [face_alignment.ipynb](examples/face_alignment.ipynb) |
+ | **Face Recognition** | Extract face embeddings and compare faces | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
+ | **Face Verification** | Compare two faces to verify identity | [face_verification.ipynb](examples/face_verification.ipynb) |
+ | **Face Search** | Find a person in a group photo | [face_search.ipynb](examples/face_search.ipynb) |
+
+ ### Additional Resources

  - **Model Benchmarks**: See [MODELS.md](MODELS.md) for performance comparisons
  - **Full Documentation**: Read [README.md](README.md) for complete API reference
@@ -374,7 +398,6 @@ from uniface import retinaface  # Module, not class
  ## References

  - **RetinaFace Training**: [yakhyo/retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch)
- - **YOLOv5-Face Original**: [deepcam-cn/yolov5-face](https://github.com/deepcam-cn/yolov5-face)
  - **YOLOv5-Face ONNX**: [yakhyo/yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference)
  - **Face Recognition Training**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition)
  - **InsightFace**: [deepinsight/insightface](https://github.com/deepinsight/insightface)

README.md

@@ -17,7 +17,7 @@
  ## Features

- - **High-Speed Face Detection**: ONNX-optimized RetinaFace and SCRFD models
+ - **High-Speed Face Detection**: ONNX-optimized RetinaFace, SCRFD, and YOLOv5-Face models
  - **Facial Landmark Detection**: Accurate 106-point landmark localization
  - **Face Recognition**: ArcFace, MobileFace, and SphereFace embeddings
  - **Attribute Analysis**: Age, gender, and emotion detection
@@ -218,9 +218,35 @@ recognizer = SphereFace()  # Angular softmax alternative
  from uniface import detect_faces

  # One-line face detection
- faces = detect_faces(image, method='retinaface', conf_thresh=0.8)
+ faces = detect_faces(image, method='retinaface', conf_thresh=0.8)  # methods: retinaface, scrfd, yolov5face
  ```
+
+ ### Key Parameters (quick reference)
+
+ **Detection**
+
+ | Class | Key params (defaults) | Notes |
+ |-------|-----------------------|-------|
+ | `RetinaFace` | `model_name=RetinaFaceWeights.MNET_V2`, `conf_thresh=0.5`, `nms_thresh=0.4`, `input_size=(640, 640)`, `dynamic_size=False` | Supports 5-point landmarks |
+ | `SCRFD` | `model_name=SCRFDWeights.SCRFD_10G_KPS`, `conf_thresh=0.5`, `nms_thresh=0.4`, `input_size=(640, 640)` | Supports 5-point landmarks |
+ | `YOLOv5Face` | `model_name=YOLOv5FaceWeights.YOLOV5S`, `conf_thresh=0.6`, `nms_thresh=0.5`, `input_size=640` (fixed) | Landmarks supported; `input_size` must be 640 |
+
+ **Recognition**
+
+ | Class | Key params (defaults) | Notes |
+ |-------|-----------------------|-------|
+ | `ArcFace` | `model_name=ArcFaceWeights.MNET` | Returns 512-dim normalized embeddings |
+ | `MobileFace` | `model_name=MobileFaceWeights.MNET_V2` | Lightweight embeddings |
+ | `SphereFace` | `model_name=SphereFaceWeights.SPHERE20` | Angular softmax variant |
+
+ **Landmark & Attributes**
+
+ | Class | Key params (defaults) | Notes |
+ |-------|-----------------------|-------|
+ | `Landmark106` | No required params | 106-point landmarks |
+ | `AgeGender` | `model_name=AgeGenderWeights.DEFAULT`; `input_size` auto-detected | Requires bbox; ONNXRuntime |
+ | `Emotion` | `model_weights=DDAMFNWeights.AFFECNET7`, `input_size=(112, 112)` | Requires 5-point landmarks; TorchScript |
  ---

  ## Model Performance
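
A quick sketch of how the documented defaults above are used in practice (assuming these classes are importable from the package root, as in the README snippets):

```python
import cv2

from uniface import AgeGender, ArcFace, RetinaFace

image = cv2.imread("input.jpg")

detector = RetinaFace(conf_thresh=0.5, nms_thresh=0.4, input_size=(640, 640))
recognizer = ArcFace()      # returns 512-dim normalized embeddings
age_gender = AgeGender()    # input_size auto-detected from model metadata

faces = detector.detect(image)
```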
@@ -255,6 +281,18 @@ See [MODELS.md](MODELS.md) for detailed model information and selection guide.
  ## Examples

+ ### Jupyter Notebooks
+
+ Interactive examples covering common face analysis tasks:
+
+ | Example | Description | Notebook |
+ |---------|-------------|----------|
+ | **Face Detection** | Detect faces and facial landmarks | [face_detection.ipynb](examples/face_detection.ipynb) |
+ | **Face Alignment** | Align and crop faces for recognition | [face_alignment.ipynb](examples/face_alignment.ipynb) |
+ | **Face Recognition** | Extract face embeddings and compare faces | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
+ | **Face Verification** | Compare two faces to verify identity | [face_verification.ipynb](examples/face_verification.ipynb) |
+ | **Face Search** | Find a person in a group photo | [face_search.ipynb](examples/face_search.ipynb) |
+
  ### Webcam Face Detection

  ```python
@@ -277,7 +315,13 @@ while True:
      scores = [f['confidence'] for f in faces]
      landmarks = [f['landmarks'] for f in faces]
-     draw_detections(frame, bboxes, scores, landmarks, vis_threshold=0.6)
+     draw_detections(
+         image=frame,
+         bboxes=bboxes,
+         scores=scores,
+         landmarks=landmarks,
+         vis_threshold=0.6,
+     )

      cv2.imshow("Face Detection", frame)
      if cv2.waitKey(1) & 0xFF == ord('q'):
@@ -452,7 +496,6 @@ uniface/
  ## References

  - **RetinaFace Training**: [yakhyo/retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) - PyTorch implementation and training code
- - **YOLOv5-Face Original**: [deepcam-cn/yolov5-face](https://github.com/deepcam-cn/yolov5-face) - Original PyTorch implementation
  - **YOLOv5-Face ONNX**: [yakhyo/yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference) - ONNX inference implementation
  - **Face Recognition Training**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) - ArcFace, MobileFace, SphereFace training code
  - **InsightFace**: [deepinsight/insightface](https://github.com/deepinsight/insightface) - Model architectures and pretrained weights

assets/einstien.png (binary, new file, 1.3 MiB)

assets/scientists.png (binary, new file, 1.9 MiB)

examples/face_search.ipynb (new file, +360 lines)

Multiple file diffs suppressed because one or more lines are too long.

pyproject.toml

@@ -1,6 +1,6 @@
  [project]
  name = "uniface"
- version = "1.2.0"
+ version = "1.3.0"
  description = "UniFace: A Comprehensive Library for Face Detection, Recognition, Landmark Analysis, Age, and Gender Detection"
  readme = "README.md"
  license = { text = "MIT" }


@@ -63,8 +63,8 @@ python scripts/download_model.py  # downloads all
  |--------|-------------|
  | `--image` | Path to input image |
  | `--webcam` | Use webcam instead of image |
- | `--detector` | Choose detector: `retinaface` or `scrfd` |
- | `--threshold` | Visualization confidence threshold (default: 0.6) |
+ | `--method` | Choose detector: `retinaface`, `scrfd`, `yolov5face` |
+ | `--threshold` | Visualization confidence threshold (default: 0.25) |
  | `--save_dir` | Output directory (default: `outputs`) |

  ## Quick Test


@@ -51,7 +51,7 @@ def run_webcam(detector, threshold: float = 0.6):
          bboxes = [f['bbox'] for f in faces]
          scores = [f['confidence'] for f in faces]
          landmarks = [f['landmarks'] for f in faces]
-         draw_detections(frame, bboxes, scores, landmarks, vis_threshold=threshold)
+         draw_detections(image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, draw_score=True, fancy_bbox=True)

          cv2.putText(
              frame,
@@ -89,6 +89,7 @@ def main():
          detector = SCRFD()
      else:
+         from uniface.constants import YOLOv5FaceWeights
          detector = YOLOv5Face(model_name=YOLOv5FaceWeights.YOLOV5M)

      if args.webcam:

uniface/__init__.py

@@ -13,7 +13,7 @@
  __license__ = 'MIT'
  __author__ = 'Yakhyokhuja Valikhujaev'
- __version__ = '1.2.0'
+ __version__ = '1.3.0'

  from uniface.face_utils import compute_similarity, face_alignment
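
A sketch of a verification flow built on these exports; `compute_similarity` is assumed to take two embeddings and return a similarity score, and the alignment and embedding calls below use illustrative names not confirmed by this diff:

```python
import cv2

from uniface import ArcFace, RetinaFace
from uniface.face_utils import compute_similarity, face_alignment

detector = RetinaFace()
recognizer = ArcFace()

embeddings = []
for path in ("person_a.jpg", "person_b.jpg"):
    image = cv2.imread(path)
    face = detector.detect(image)[0]                      # first detected face
    aligned = face_alignment(image, face['landmarks'])    # assumed signature
    embeddings.append(recognizer.get_embedding(aligned))  # assumed method name

similarity = compute_similarity(embeddings[0], embeddings[1])
print(f"similarity: {similarity:.3f}")  # compare against a task-specific threshold
```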


@@ -2,7 +2,7 @@
  # Author: Yakhyokhuja Valikhujaev
  # GitHub: https://github.com/yakhyo

- from typing import List, Tuple, Union
+ from typing import List, Optional, Tuple, Union

  import cv2
  import numpy as np
@@ -24,18 +24,30 @@ class AgeGender(Attribute):
      This class inherits from the base `Attribute` class and implements the
      functionality for predicting age (in years) and gender ID (0 for Female,
      1 for Male) from a face image. It requires a bounding box to locate the face.
+
+     Args:
+         model_name (AgeGenderWeights): The enum specifying the model weights to load.
+             Defaults to `AgeGenderWeights.DEFAULT`.
+         input_size (Optional[Tuple[int, int]]): Input size (height, width).
+             If None, automatically detected from model metadata. Defaults to None.
      """

-     def __init__(self, model_name: AgeGenderWeights = AgeGenderWeights.DEFAULT) -> None:
+     def __init__(
+         self,
+         model_name: AgeGenderWeights = AgeGenderWeights.DEFAULT,
+         input_size: Optional[Tuple[int, int]] = None,
+     ) -> None:
          """
          Initializes the AgeGender prediction model.

          Args:
-             model_name (AgeGenderWeights): The enum specifying the model weights
-                 to load.
+             model_name (AgeGenderWeights): The enum specifying the model weights to load.
+             input_size (Optional[Tuple[int, int]]): Input size (height, width).
+                 If None, automatically detected from model metadata. Defaults to None.
          """
          Logger.info(f'Initializing AgeGender with model={model_name.name}')
          self.model_path = verify_model_weights(model_name)
+         self._user_input_size = input_size  # Store user preference
          self._initialize_model()

      def _initialize_model(self) -> None:
@@ -47,7 +59,19 @@ class AgeGender(Attribute):
          # Get model input details from the loaded model
          input_meta = self.session.get_inputs()[0]
          self.input_name = input_meta.name
-         self.input_size = tuple(input_meta.shape[2:4])  # (height, width)
+
+         # Use user-provided size if given, otherwise auto-detect from model
+         model_input_size = tuple(input_meta.shape[2:4])  # (height, width)
+         if self._user_input_size is not None:
+             self.input_size = self._user_input_size
+             if self._user_input_size != model_input_size:
+                 Logger.warning(
+                     f'Using custom input_size {self.input_size}, '
+                     f'but model expects {model_input_size}. This may affect accuracy.'
+                 )
+         else:
+             self.input_size = model_input_size
+
          self.output_names = [output.name for output in self.session.get_outputs()]
          Logger.info(f'Successfully initialized AgeGender model with input size {self.input_size}')
      except Exception as e:
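
A usage sketch for the new override (constructor signature taken from the hunk above; the package-root import is assumed):

```python
from uniface import AgeGender

# Default: input size auto-detected from the ONNX model metadata
age_gender = AgeGender()

# Override: forces a custom (height, width); a warning is logged when it
# differs from the model's expected size, since accuracy may degrade
age_gender_custom = AgeGender(input_size=(96, 96))
```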


@@ -22,7 +22,7 @@ def detect_faces(image: np.ndarray, method: str = 'retinaface', **kwargs) -> Lis
      Args:
          image (np.ndarray): Input image as numpy array.
-         method (str): Detection method to use. Options: 'retinaface', 'scrfd'.
+         method (str): Detection method to use. Options: 'retinaface', 'scrfd', 'yolov5face'.
          **kwargs: Additional arguments passed to the detector.

      Returns:
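
A one-liner exercising the new option (a sketch; `conf_thresh` is forwarded to the detector as documented):

```python
import cv2

from uniface import detect_faces

image = cv2.imread("group.jpg")

# 'yolov5face' now joins 'retinaface' and 'scrfd' as a valid method
faces = detect_faces(image, method='yolov5face', conf_thresh=0.6)
for face in faces:
    print(face['bbox'], face['confidence'])
```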


@@ -27,18 +27,19 @@ class RetinaFace(BaseDetector):
      Title: "RetinaFace: Single-stage Dense Face Localisation in the Wild"
      Paper: https://arxiv.org/abs/1905.00641
+     Code: https://github.com/yakhyo/retinaface-pytorch

      Args:
-         **kwargs: Keyword arguments passed to BaseDetector and RetinaFace. Supported keys include:
-             model_name (RetinaFaceWeights, optional): Model weights to use. Defaults to `RetinaFaceWeights.MNET_V2`.
-             conf_thresh (float, optional): Confidence threshold for filtering detections. Defaults to 0.5.
-             nms_thresh (float, optional): Non-maximum suppression (NMS) IoU threshold. Defaults to 0.4.
-             pre_nms_topk (int, optional): Number of top-scoring boxes considered before NMS. Defaults to 5000.
-             post_nms_topk (int, optional): Max number of detections kept after NMS. Defaults to 750.
-             dynamic_size (bool, optional): If True, generate anchors dynamically per input image. Defaults to False.
-             input_size (Tuple[int, int], optional): Fixed input size (width, height) if `dynamic_size=False`.
-                 Defaults to (640, 640).
-                 Note: Non-default sizes may cause slower inference and CoreML compatibility issues.
+         model_name (RetinaFaceWeights): Model weights to use. Defaults to `RetinaFaceWeights.MNET_V2`.
+         conf_thresh (float): Confidence threshold for filtering detections. Defaults to 0.5.
+         nms_thresh (float): Non-maximum suppression (NMS) IoU threshold. Defaults to 0.4.
+         input_size (Tuple[int, int]): Fixed input size (width, height) if `dynamic_size=False`.
+             Defaults to (640, 640).
+             Note: Non-default sizes may cause slower inference and CoreML compatibility issues.
+         **kwargs: Advanced options:
+             pre_nms_topk (int): Number of top-scoring boxes considered before NMS. Defaults to 5000.
+             post_nms_topk (int): Max number of detections kept after NMS. Defaults to 750.
+             dynamic_size (bool): If True, generate anchors dynamically per input image. Defaults to False.

      Attributes:
          model_name (RetinaFaceWeights): Selected model variant.
@@ -57,17 +58,33 @@ class RetinaFace(BaseDetector):
          RuntimeError: If the ONNX model fails to load or initialize.
      """

-     def __init__(self, **kwargs) -> None:
-         super().__init__(**kwargs)
+     def __init__(
+         self,
+         *,
+         model_name: RetinaFaceWeights = RetinaFaceWeights.MNET_V2,
+         conf_thresh: float = 0.5,
+         nms_thresh: float = 0.4,
+         input_size: Tuple[int, int] = (640, 640),
+         **kwargs: Any,
+     ) -> None:
+         super().__init__(
+             model_name=model_name,
+             conf_thresh=conf_thresh,
+             nms_thresh=nms_thresh,
+             input_size=input_size,
+             **kwargs,
+         )

          self._supports_landmarks = True  # RetinaFace supports landmarks

-         self.model_name = kwargs.get('model_name', RetinaFaceWeights.MNET_V2)
-         self.conf_thresh = kwargs.get('conf_thresh', 0.5)
-         self.nms_thresh = kwargs.get('nms_thresh', 0.4)
+         self.model_name = model_name
+         self.conf_thresh = conf_thresh
+         self.nms_thresh = nms_thresh
+         self.input_size = input_size
+
+         # Advanced options from kwargs
          self.pre_nms_topk = kwargs.get('pre_nms_topk', 5000)
          self.post_nms_topk = kwargs.get('post_nms_topk', 750)
          self.dynamic_size = kwargs.get('dynamic_size', False)
-         self.input_size = kwargs.get('input_size', (640, 640))

          Logger.info(
              f'Initializing RetinaFace with model={self.model_name}, conf_thresh={self.conf_thresh}, '
@@ -133,6 +150,7 @@ class RetinaFace(BaseDetector):
      def detect(
          self,
          image: np.ndarray,
+         *,
          max_num: int = 0,
          metric: Literal['default', 'max'] = 'max',
          center_weight: float = 2.0,
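
The practical effect of the `*` markers, sketched below (imports assumed as elsewhere in this PR):

```python
import cv2

from uniface import RetinaFace
from uniface.constants import RetinaFaceWeights

image = cv2.imread("input.jpg")

detector = RetinaFace(
    model_name=RetinaFaceWeights.MNET_V2,
    conf_thresh=0.5,
    nms_thresh=0.4,
    dynamic_size=False,  # advanced option, still accepted via **kwargs
)

faces = detector.detect(image, max_num=5, metric='max')  # OK: keyword arguments
# detector.detect(image, 5)  # TypeError: positional args after image are rejected
```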


@@ -24,18 +24,20 @@ class SCRFD(BaseDetector):
      Title: "Sample and Computation Redistribution for Efficient Face Detection"
      Paper: https://arxiv.org/abs/2105.04714
+     Code: https://github.com/insightface/insightface

      Args:
-         **kwargs: Keyword arguments passed to BaseDetector and SCRFD. Supported keys include:
-             model_name (SCRFDWeights, optional): Predefined model enum (e.g., `SCRFD_10G_KPS`).
-                 Specifies the SCRFD variant to load. Defaults to SCRFD_10G_KPS.
-             conf_thresh (float, optional): Confidence threshold for filtering detections. Defaults to 0.5.
-             nms_thresh (float, optional): Non-Maximum Suppression threshold. Defaults to 0.4.
-             input_size (Tuple[int, int], optional): Input image size (width, height).
-                 Defaults to (640, 640).
-                 Note: Non-default sizes may cause slower inference and CoreML compatibility issues.
+         model_name (SCRFDWeights): Predefined model enum (e.g., `SCRFD_10G_KPS`).
+             Specifies the SCRFD variant to load. Defaults to SCRFD_10G_KPS.
+         conf_thresh (float): Confidence threshold for filtering detections. Defaults to 0.5.
+         nms_thresh (float): Non-Maximum Suppression threshold. Defaults to 0.4.
+         input_size (Tuple[int, int]): Input image size (width, height).
+             Defaults to (640, 640).
+             Note: Non-default sizes may cause slower inference and CoreML compatibility issues.
+         **kwargs: Reserved for future advanced options.

      Attributes:
+         model_name (SCRFDWeights): Selected model variant.
          conf_thresh (float): Threshold used to filter low-confidence detections.
          nms_thresh (float): Threshold used during NMS to suppress overlapping boxes.
          input_size (Tuple[int, int]): Image size to which inputs are resized before inference.
@@ -50,15 +52,25 @@ class SCRFD(BaseDetector):
          RuntimeError: If the ONNX model fails to load or initialize.
      """

-     def __init__(self, **kwargs) -> None:
-         super().__init__(**kwargs)
+     def __init__(
+         self,
+         *,
+         model_name: SCRFDWeights = SCRFDWeights.SCRFD_10G_KPS,
+         conf_thresh: float = 0.5,
+         nms_thresh: float = 0.4,
+         input_size: Tuple[int, int] = (640, 640),
+         **kwargs: Any,
+     ) -> None:
+         super().__init__(
+             model_name=model_name,
+             conf_thresh=conf_thresh,
+             nms_thresh=nms_thresh,
+             input_size=input_size,
+             **kwargs,
+         )

          self._supports_landmarks = True  # SCRFD supports landmarks

-         model_name = kwargs.get('model_name', SCRFDWeights.SCRFD_10G_KPS)
-         conf_thresh = kwargs.get('conf_thresh', 0.5)
-         nms_thresh = kwargs.get('nms_thresh', 0.4)
-         input_size = kwargs.get('input_size', (640, 640))
-
+         self.model_name = model_name
          self.conf_thresh = conf_thresh
          self.nms_thresh = nms_thresh
          self.input_size = input_size
@@ -71,12 +83,12 @@ class SCRFD(BaseDetector):
          # ---------------------------------
          Logger.info(
-             f'Initializing SCRFD with model={model_name}, conf_thresh={conf_thresh}, nms_thresh={nms_thresh}, '
-             f'input_size={input_size}'
+             f'Initializing SCRFD with model={self.model_name}, conf_thresh={self.conf_thresh}, '
+             f'nms_thresh={self.nms_thresh}, input_size={self.input_size}'
          )

          # Get path to model weights
-         self._model_path = verify_model_weights(model_name)
+         self._model_path = verify_model_weights(self.model_name)
          Logger.info(f'Verified model weights located at: {self._model_path}')

          # Initialize model
@@ -177,9 +189,10 @@ class SCRFD(BaseDetector):
      def detect(
          self,
          image: np.ndarray,
+         *,
          max_num: int = 0,
          metric: Literal['default', 'max'] = 'max',
-         center_weight: float = 2,
+         center_weight: float = 2.0,
      ) -> List[Dict[str, Any]]:
          """
          Perform face detection on an input image and return bounding boxes and facial landmarks.
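
The same keyword-only pattern for SCRFD (a sketch; the package-root import is assumed):

```python
import cv2

from uniface import SCRFD
from uniface.constants import SCRFDWeights

image = cv2.imread("input.jpg")

detector = SCRFD(
    model_name=SCRFDWeights.SCRFD_10G_KPS,
    conf_thresh=0.5,
    nms_thresh=0.4,
)
faces = detector.detect(image, max_num=0, center_weight=2.0)
```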


@@ -22,20 +22,22 @@ class YOLOv5Face(BaseDetector):
      """
      Face detector based on the YOLOv5-Face architecture.

+     Title: "YOLO5Face: Why Reinventing a Face Detector"
      Paper: https://arxiv.org/abs/2105.12931
-     Original Implementation: https://github.com/deepcam-cn/yolov5-face
+     Code: https://github.com/yakhyo/yolov5-face-onnx-inference (ONNX inference implementation)

      Args:
-         **kwargs: Keyword arguments passed to BaseDetector and YOLOv5Face. Supported keys include:
-             model_name (YOLOv5FaceWeights, optional): Predefined model enum (e.g., `YOLOV5S`).
-                 Specifies the YOLOv5-Face variant to load. Defaults to YOLOV5S.
-             conf_thresh (float, optional): Confidence threshold for filtering detections. Defaults to 0.25.
-             nms_thresh (float, optional): Non-Maximum Suppression threshold. Defaults to 0.45.
-             input_size (int, optional): Input image size. Defaults to 640.
-                 Note: ONNX model is fixed at 640. Changing this will cause inference errors.
-             max_det (int, optional): Maximum number of detections to return. Defaults to 750.
+         model_name (YOLOv5FaceWeights): Predefined model enum (e.g., `YOLOV5S`).
+             Specifies the YOLOv5-Face variant to load. Defaults to YOLOV5S.
+         conf_thresh (float): Confidence threshold for filtering detections. Defaults to 0.6.
+         nms_thresh (float): Non-Maximum Suppression threshold. Defaults to 0.5.
+         input_size (int): Input image size. Defaults to 640.
+             Note: ONNX model is fixed at 640. Changing this will cause inference errors.
+         **kwargs: Advanced options:
+             max_det (int): Maximum number of detections to return. Defaults to 750.

      Attributes:
+         model_name (YOLOv5FaceWeights): Selected model variant.
          conf_thresh (float): Threshold used to filter low-confidence detections.
          nms_thresh (float): Threshold used during NMS to suppress overlapping boxes.
          input_size (int): Image size to which inputs are resized before inference.
@@ -47,34 +49,45 @@ class YOLOv5Face(BaseDetector):
          RuntimeError: If the ONNX model fails to load or initialize.
      """

-     def __init__(self, **kwargs) -> None:
-         super().__init__(**kwargs)
+     def __init__(
+         self,
+         *,
+         model_name: YOLOv5FaceWeights = YOLOv5FaceWeights.YOLOV5S,
+         conf_thresh: float = 0.6,
+         nms_thresh: float = 0.5,
+         input_size: int = 640,
+         **kwargs: Any,
+     ) -> None:
+         super().__init__(
+             model_name=model_name,
+             conf_thresh=conf_thresh,
+             nms_thresh=nms_thresh,
+             input_size=input_size,
+             **kwargs,
+         )

          self._supports_landmarks = True  # YOLOv5-Face supports landmarks

-         model_name = kwargs.get('model_name', YOLOv5FaceWeights.YOLOV5S)
-         conf_thresh = kwargs.get('conf_thresh', 0.6)  # 0.6 is default from original YOLOv5-Face repository
-         nms_thresh = kwargs.get('nms_thresh', 0.5)  # 0.5 is default from original YOLOv5-Face repository
-         input_size = kwargs.get('input_size', 640)
-         max_det = kwargs.get('max_det', 750)
-
          # Validate input size
          if input_size != 640:
              raise ValueError(
                  f'YOLOv5Face only supports input_size=640 (got {input_size}). The ONNX model has a fixed input shape.'
              )

+         self.model_name = model_name
          self.conf_thresh = conf_thresh
          self.nms_thresh = nms_thresh
          self.input_size = input_size
-         self.max_det = max_det
+
+         # Advanced options from kwargs
+         self.max_det = kwargs.get('max_det', 750)

          Logger.info(
-             f'Initializing YOLOv5Face with model={model_name}, conf_thresh={conf_thresh}, '
-             f'nms_thresh={nms_thresh}, input_size={input_size}'
+             f'Initializing YOLOv5Face with model={self.model_name}, conf_thresh={self.conf_thresh}, '
+             f'nms_thresh={self.nms_thresh}, input_size={self.input_size}'
          )

          # Get path to model weights
-         self._model_path = verify_model_weights(model_name)
+         self._model_path = verify_model_weights(self.model_name)
          Logger.info(f'Verified model weights located at: {self._model_path}')

          # Initialize model
@@ -242,6 +255,7 @@ class YOLOv5Face(BaseDetector):
      def detect(
          self,
          image: np.ndarray,
+         *,
          max_num: int = 0,
          metric: Literal['default', 'max'] = 'max',
          center_weight: float = 2.0,
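
A sketch of the stricter construction path, including the fixed-size check:

```python
from uniface import YOLOv5Face
from uniface.constants import YOLOv5FaceWeights

detector = YOLOv5Face(
    model_name=YOLOv5FaceWeights.YOLOV5S,
    conf_thresh=0.6,
    nms_thresh=0.5,
)

# The ONNX graph has a fixed input shape, so any other size raises:
try:
    YOLOv5Face(input_size=512)
except ValueError as err:
    print(err)  # YOLOv5Face only supports input_size=640 (got 512). ...
```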


@@ -2,59 +2,127 @@
  # Author: Yakhyokhuja Valikhujaev
  # GitHub: https://github.com/yakhyo

- from typing import List, Union
+ from typing import List, Tuple, Union

  import cv2
  import numpy as np


  def draw_detections(
+     *,
      image: np.ndarray,
      bboxes: Union[List[np.ndarray], List[List[float]]],
      scores: Union[np.ndarray, List[float]],
      landmarks: Union[List[np.ndarray], List[List[List[float]]]],
      vis_threshold: float = 0.6,
+     draw_score: bool = False,
+     fancy_bbox: bool = True,
  ):
      """
-     Draws bounding boxes, scores, and landmarks from separate lists onto an image.
+     Draws bounding boxes, landmarks, and optional scores on an image.

      Args:
-         image (np.ndarray): The image to draw on.
-         bboxes (List[np.ndarray] or List[List[float]]): List of bounding boxes. Each bbox can be
-             np.ndarray with shape (4,) or list [x1, y1, x2, y2].
-         scores (List[float] or np.ndarray): List or array of confidence scores.
-         landmarks (List[np.ndarray] or List[List[List[float]]]): List of landmark sets. Each landmark
-             set can be np.ndarray with shape (5, 2) or nested list [[[x,y],...],...].
-         vis_threshold (float): Confidence threshold for filtering which detections to draw.
+         image: Input image to draw on.
+         bboxes: List of bounding boxes [x1, y1, x2, y2].
+         scores: List of confidence scores.
+         landmarks: List of landmark sets with shape (5, 2).
+         vis_threshold: Confidence threshold for filtering. Defaults to 0.6.
+         draw_score: Whether to draw confidence scores. Defaults to False.
+         fancy_bbox: Whether to draw corner-style boxes. Defaults to True.
      """
-     _colors = [(0, 0, 255), (0, 255, 255), (255, 0, 255), (0, 255, 0), (255, 0, 0)]
+     colors = [(0, 0, 255), (0, 255, 255), (255, 0, 255), (0, 255, 0), (255, 0, 0)]

-     # Filter detections by score
+     # Calculate line thickness based on image size
+     line_thickness = max(round(sum(image.shape[:2]) / 2 * 0.003), 2)
+
+     # Filter detections by confidence threshold
      keep_indices = [i for i, score in enumerate(scores) if score >= vis_threshold]

-     # Draw the filtered detections
      for i in keep_indices:
          bbox = np.array(bboxes[i], dtype=np.int32)
          score = scores[i]
          landmark_set = np.array(landmarks[i], dtype=np.int32)

-         # Calculate adaptive thickness
-         thickness = max(1, int(min(bbox[2] - bbox[0], bbox[3] - bbox[1]) / 100))
+         # Calculate dynamic font scale based on bbox height
+         bbox_h = bbox[3] - bbox[1]
+         font_scale = max(0.4, min(0.7, bbox_h / 200))
+         font_thickness = 2

          # Draw bounding box
-         cv2.rectangle(image, tuple(bbox[:2]), tuple(bbox[2:]), (0, 0, 255), thickness)
+         if fancy_bbox:
+             draw_fancy_bbox(image, bbox, color=(0, 255, 0), thickness=line_thickness, proportion=0.2)
+         else:
+             cv2.rectangle(image, tuple(bbox[:2]), tuple(bbox[2:]), (0, 255, 0), line_thickness)

-         # Draw score
-         cv2.putText(
-             image,
-             f'{score:.2f}',
-             (bbox[0], bbox[1] - 10),
-             cv2.FONT_HERSHEY_SIMPLEX,
-             0.5,
-             (255, 255, 255),
-             thickness,
-         )
+         # Draw confidence score with background
+         if draw_score:
+             text = f'{score:.2f}'
+             (text_width, text_height), baseline = cv2.getTextSize(
+                 text, cv2.FONT_HERSHEY_SIMPLEX, font_scale, font_thickness
+             )
+
+             # Draw background rectangle
+             cv2.rectangle(
+                 image,
+                 (bbox[0], bbox[1] - text_height - baseline - 10),
+                 (bbox[0] + text_width + 10, bbox[1]),
+                 (0, 255, 0),
+                 -1,
+             )
+
+             # Draw text
+             cv2.putText(
+                 image,
+                 text,
+                 (bbox[0] + 5, bbox[1] - 5),
+                 cv2.FONT_HERSHEY_SIMPLEX,
+                 font_scale,
+                 (0, 0, 0),
+                 font_thickness,
+             )

          # Draw landmarks
          for j, point in enumerate(landmark_set):
-             cv2.circle(image, tuple(point), thickness + 1, _colors[j], -1)
+             cv2.circle(image, tuple(point), line_thickness + 1, colors[j], -1)
+
+
+ def draw_fancy_bbox(
+     image: np.ndarray,
+     bbox: np.ndarray,
+     color: Tuple[int, int, int] = (0, 255, 0),
+     thickness: int = 3,
+     proportion: float = 0.2,
+ ):
+     """
+     Draws a bounding box with fancy corners on an image.
+
+     Args:
+         image: Input image to draw on.
+         bbox: Bounding box coordinates [x1, y1, x2, y2].
+         color: Color of the bounding box. Defaults to green.
+         thickness: Thickness of the bounding box lines. Defaults to 3.
+         proportion: Proportion of the corner length to the width/height of the bounding box. Defaults to 0.2.
+     """
+     x1, y1, x2, y2 = map(int, bbox)
+     width = x2 - x1
+     height = y2 - y1
+     corner_length = int(proportion * min(width, height))
+
+     # Draw the base rectangle
+     cv2.rectangle(image, (x1, y1), (x2, y2), color, 1)
+
+     # Top-left corner
+     cv2.line(image, (x1, y1), (x1 + corner_length, y1), color, thickness)
+     cv2.line(image, (x1, y1), (x1, y1 + corner_length), color, thickness)
+
+     # Top-right corner
+     cv2.line(image, (x2, y1), (x2 - corner_length, y1), color, thickness)
+     cv2.line(image, (x2, y1), (x2, y1 + corner_length), color, thickness)
+
+     # Bottom-left corner
+     cv2.line(image, (x1, y2), (x1, y2 - corner_length), color, thickness)
+     cv2.line(image, (x1, y2), (x1 + corner_length, y2), color, thickness)
+
+     # Bottom-right corner
+     cv2.line(image, (x2, y2), (x2, y2 - corner_length), color, thickness)
+     cv2.line(image, (x2, y2), (x2 - corner_length, y2), color, thickness)
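
Putting the new flags together (a sketch; `draw_detections` is assumed to be re-exported at the package root, as in the README snippets):

```python
import cv2

from uniface import RetinaFace, draw_detections

image = cv2.imread("input.jpg")
faces = RetinaFace().detect(image)

draw_detections(
    image=image,  # all arguments are keyword-only now
    bboxes=[f['bbox'] for f in faces],
    scores=[f['confidence'] for f in faces],
    landmarks=[f['landmarks'] for f in faces],
    vis_threshold=0.6,
    draw_score=True,   # score text over a filled background
    fancy_bbox=True,   # corner-style boxes via draw_fancy_bbox
)
cv2.imwrite("output.jpg", image)
```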