feat: Update examples and some minor changes to UniFace API (#28)

* chore: Style changes and create jupyter notebook template

* docs: Update docstring for detection

* feat: Keyword only for common parameters: model_name, conf_thresh, nms_thresh, input_size

* chore: Update drawing and let the conf text optional for drawing

* feat: add fancy bbox draw

* docs: Add examples of using UniFace

* feat: Add version to all examples
This commit is contained in:
Yakhyokhuja Valikhujaev
2025-12-07 19:51:08 +09:00
committed by GitHub
parent 6b1d2a1ce6
commit 637316f077
20 changed files with 1158 additions and 252 deletions

58
CONTRIBUTING.md Normal file
View File

@@ -0,0 +1,58 @@
# Contributing to UniFace
Thank you for considering contributing to UniFace! We welcome contributions of all kinds.
## How to Contribute
### Reporting Issues
- Use GitHub Issues to report bugs or suggest features
- Include clear descriptions and reproducible examples
- Check existing issues before creating new ones
### Pull Requests
1. Fork the repository
2. Create a new branch for your feature
3. Write clear, documented code with type hints
4. Add tests for new functionality
5. Ensure all tests pass
6. Submit a pull request with a clear description
### Code Style
- Follow PEP8 guidelines
- Use type hints (Python 3.10+)
- Write docstrings for public APIs
- Keep code simple and readable
## Development Setup
```bash
git clone https://github.com/yakhyo/uniface.git
cd uniface
pip install -e ".[dev]"
```
## Running Tests
```bash
pytest tests/
```
## Examples
Example notebooks demonstrating library usage:
| Example | Notebook |
|---------|----------|
| Face Detection | [face_detection.ipynb](examples/face_detection.ipynb) |
| Face Alignment | [face_alignment.ipynb](examples/face_alignment.ipynb) |
| Face Recognition | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
| Face Verification | [face_verification.ipynb](examples/face_verification.ipynb) |
| Face Search | [face_search.ipynb](examples/face_search.ipynb) |
## Questions?
Open an issue or start a discussion on GitHub.

View File

@@ -75,7 +75,13 @@ scores = [f['confidence'] for f in faces]
landmarks = [f['landmarks'] for f in faces]
# Draw on image
draw_detections(image, bboxes, scores, landmarks, vis_threshold=0.6)
draw_detections(
image=image,
bboxes=bboxes,
scores=scores,
landmarks=landmarks,
vis_threshold=0.6,
)
# Save result
cv2.imwrite("output.jpg", image)
@@ -156,7 +162,12 @@ while True:
bboxes = [f['bbox'] for f in faces]
scores = [f['confidence'] for f in faces]
landmarks = [f['landmarks'] for f in faces]
draw_detections(frame, bboxes, scores, landmarks)
draw_detections(
image=frame,
bboxes=bboxes,
scores=scores,
landmarks=landmarks,
)
# Show frame
cv2.imshow("UniFace - Press 'q' to quit", frame)
@@ -365,7 +376,20 @@ from uniface import retinaface # Module, not class
## Next Steps
- **Detailed Examples**: Check the [examples/](examples/) folder for Jupyter notebooks
### Jupyter Notebook Examples
Explore interactive examples for common tasks:
| Example | Description | Notebook |
|---------|-------------|----------|
| **Face Detection** | Detect faces and facial landmarks | [face_detection.ipynb](examples/face_detection.ipynb) |
| **Face Alignment** | Align and crop faces for recognition | [face_alignment.ipynb](examples/face_alignment.ipynb) |
| **Face Recognition** | Extract face embeddings and compare faces | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
| **Face Verification** | Compare two faces to verify identity | [face_verification.ipynb](examples/face_verification.ipynb) |
| **Face Search** | Find a person in a group photo | [face_search.ipynb](examples/face_search.ipynb) |
### Additional Resources
- **Model Benchmarks**: See [MODELS.md](MODELS.md) for performance comparisons
- **Full Documentation**: Read [README.md](README.md) for complete API reference
@@ -374,7 +398,6 @@ from uniface import retinaface # Module, not class
## References
- **RetinaFace Training**: [yakhyo/retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch)
- **YOLOv5-Face Original**: [deepcam-cn/yolov5-face](https://github.com/deepcam-cn/yolov5-face)
- **YOLOv5-Face ONNX**: [yakhyo/yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference)
- **Face Recognition Training**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition)
- **InsightFace**: [deepinsight/insightface](https://github.com/deepinsight/insightface)

View File

@@ -17,7 +17,7 @@
## Features
- **High-Speed Face Detection**: ONNX-optimized RetinaFace and SCRFD models
- **High-Speed Face Detection**: ONNX-optimized RetinaFace, SCRFD, and YOLOv5-Face models
- **Facial Landmark Detection**: Accurate 106-point landmark localization
- **Face Recognition**: ArcFace, MobileFace, and SphereFace embeddings
- **Attribute Analysis**: Age, gender, and emotion detection
@@ -218,9 +218,35 @@ recognizer = SphereFace() # Angular softmax alternative
from uniface import detect_faces
# One-line face detection
faces = detect_faces(image, method='retinaface', conf_thresh=0.8)
faces = detect_faces(image, method='retinaface', conf_thresh=0.8) # methods: retinaface, scrfd, yolov5face
```
### Key Parameters (quick reference)
**Detection**
| Class | Key params (defaults) | Notes |
| -------------- | ------------------------------------------------------------------------------------------------------------------------------------ | ---------------------------------------------- |
| `RetinaFace` | `model_name=RetinaFaceWeights.MNET_V2`, `conf_thresh=0.5`, `nms_thresh=0.4`, `input_size=(640, 640)`, `dynamic_size=False` | Supports 5-point landmarks |
| `SCRFD` | `model_name=SCRFDWeights.SCRFD_10G_KPS`, `conf_thresh=0.5`, `nms_thresh=0.4`, `input_size=(640, 640)` | Supports 5-point landmarks |
| `YOLOv5Face` | `model_name=YOLOv5FaceWeights.YOLOV5S`, `conf_thresh=0.6`, `nms_thresh=0.5`, `input_size=640` (fixed) | Landmarks supported;`input_size` must be 640 |
**Recognition**
| Class | Key params (defaults) | Notes |
| -------------- | ----------------------------------------- | ------------------------------------- |
| `ArcFace` | `model_name=ArcFaceWeights.MNET` | Returns 512-dim normalized embeddings |
| `MobileFace` | `model_name=MobileFaceWeights.MNET_V2` | Lightweight embeddings |
| `SphereFace` | `model_name=SphereFaceWeights.SPHERE20` | Angular softmax variant |
**Landmark & Attributes**
| Class | Key params (defaults) | Notes |
| --------------- | --------------------------------------------------------------------- | --------------------------------------- |
| `Landmark106` | No required params | 106-point landmarks |
| `AgeGender` | `model_name=AgeGenderWeights.DEFAULT`; `input_size` auto-detected | Requires bbox; ONNXRuntime |
| `Emotion` | `model_weights=DDAMFNWeights.AFFECNET7`, `input_size=(112, 112)` | Requires 5-point landmarks; TorchScript |
---
## Model Performance
@@ -255,6 +281,18 @@ See [MODELS.md](MODELS.md) for detailed model information and selection guide.
## Examples
### Jupyter Notebooks
Interactive examples covering common face analysis tasks:
| Example | Description | Notebook |
|---------|-------------|----------|
| **Face Detection** | Detect faces and facial landmarks | [face_detection.ipynb](examples/face_detection.ipynb) |
| **Face Alignment** | Align and crop faces for recognition | [face_alignment.ipynb](examples/face_alignment.ipynb) |
| **Face Recognition** | Extract face embeddings and compare faces | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
| **Face Verification** | Compare two faces to verify identity | [face_verification.ipynb](examples/face_verification.ipynb) |
| **Face Search** | Find a person in a group photo | [face_search.ipynb](examples/face_search.ipynb) |
### Webcam Face Detection
```python
@@ -277,7 +315,13 @@ while True:
scores = [f['confidence'] for f in faces]
landmarks = [f['landmarks'] for f in faces]
draw_detections(frame, bboxes, scores, landmarks, vis_threshold=0.6)
draw_detections(
image=frame,
bboxes=bboxes,
scores=scores,
landmarks=landmarks,
vis_threshold=0.6,
)
cv2.imshow("Face Detection", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
@@ -452,7 +496,6 @@ uniface/
## References
- **RetinaFace Training**: [yakhyo/retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) - PyTorch implementation and training code
- **YOLOv5-Face Original**: [deepcam-cn/yolov5-face](https://github.com/deepcam-cn/yolov5-face) - Original PyTorch implementation
- **YOLOv5-Face ONNX**: [yakhyo/yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference) - ONNX inference implementation
- **Face Recognition Training**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) - ArcFace, MobileFace, SphereFace training code
- **InsightFace**: [deepinsight/insightface](https://github.com/deepinsight/insightface) - Model architectures and pretrained weights

BIN
assets/einstien.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.3 MiB

BIN
assets/scientists.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.9 MiB

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

360
examples/face_search.ipynb Normal file

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -1,6 +1,6 @@
[project]
name = "uniface"
version = "1.2.0"
version = "1.3.0"
description = "UniFace: A Comprehensive Library for Face Detection, Recognition, Landmark Analysis, Age, and Gender Detection"
readme = "README.md"
license = { text = "MIT" }

View File

@@ -63,8 +63,8 @@ python scripts/download_model.py # downloads all
|--------|-------------|
| `--image` | Path to input image |
| `--webcam` | Use webcam instead of image |
| `--detector` | Choose detector: `retinaface` or `scrfd` |
| `--threshold` | Visualization confidence threshold (default: 0.6) |
| `--method` | Choose detector: `retinaface`, `scrfd`, `yolov5face` |
| `--threshold` | Visualization confidence threshold (default: 0.25) |
| `--save_dir` | Output directory (default: `outputs`) |
## Quick Test

View File

@@ -51,7 +51,7 @@ def run_webcam(detector, threshold: float = 0.6):
bboxes = [f['bbox'] for f in faces]
scores = [f['confidence'] for f in faces]
landmarks = [f['landmarks'] for f in faces]
draw_detections(frame, bboxes, scores, landmarks, vis_threshold=threshold)
draw_detections(frame, bboxes, scores, landmarks, vis_threshold=threshold, draw_score=True, fancy_bbox=True)
cv2.putText(
frame,
@@ -89,6 +89,7 @@ def main():
detector = SCRFD()
else:
from uniface.constants import YOLOv5FaceWeights
detector = YOLOv5Face(model_name=YOLOv5FaceWeights.YOLOV5M)
if args.webcam:

View File

@@ -13,7 +13,7 @@
__license__ = 'MIT'
__author__ = 'Yakhyokhuja Valikhujaev'
__version__ = '1.2.0'
__version__ = '1.3.0'
from uniface.face_utils import compute_similarity, face_alignment

View File

@@ -2,7 +2,7 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import List, Tuple, Union
from typing import List, Optional, Tuple, Union
import cv2
import numpy as np
@@ -24,18 +24,30 @@ class AgeGender(Attribute):
This class inherits from the base `Attribute` class and implements the
functionality for predicting age (in years) and gender ID (0 for Female,
1 for Male) from a face image. It requires a bounding box to locate the face.
Args:
model_name (AgeGenderWeights): The enum specifying the model weights to load.
Defaults to `AgeGenderWeights.DEFAULT`.
input_size (Optional[Tuple[int, int]]): Input size (height, width).
If None, automatically detected from model metadata. Defaults to None.
"""
def __init__(self, model_name: AgeGenderWeights = AgeGenderWeights.DEFAULT) -> None:
def __init__(
self,
model_name: AgeGenderWeights = AgeGenderWeights.DEFAULT,
input_size: Optional[Tuple[int, int]] = None,
) -> None:
"""
Initializes the AgeGender prediction model.
Args:
model_name (AgeGenderWeights): The enum specifying the model weights
to load.
model_name (AgeGenderWeights): The enum specifying the model weights to load.
input_size (Optional[Tuple[int, int]]): Input size (height, width).
If None, automatically detected from model metadata. Defaults to None.
"""
Logger.info(f'Initializing AgeGender with model={model_name.name}')
self.model_path = verify_model_weights(model_name)
self._user_input_size = input_size # Store user preference
self._initialize_model()
def _initialize_model(self) -> None:
@@ -47,7 +59,19 @@ class AgeGender(Attribute):
# Get model input details from the loaded model
input_meta = self.session.get_inputs()[0]
self.input_name = input_meta.name
self.input_size = tuple(input_meta.shape[2:4]) # (height, width)
# Use user-provided size if given, otherwise auto-detect from model
model_input_size = tuple(input_meta.shape[2:4]) # (height, width)
if self._user_input_size is not None:
self.input_size = self._user_input_size
if self._user_input_size != model_input_size:
Logger.warning(
f'Using custom input_size {self.input_size}, '
f'but model expects {model_input_size}. This may affect accuracy.'
)
else:
self.input_size = model_input_size
self.output_names = [output.name for output in self.session.get_outputs()]
Logger.info(f'Successfully initialized AgeGender model with input size {self.input_size}')
except Exception as e:

View File

@@ -22,7 +22,7 @@ def detect_faces(image: np.ndarray, method: str = 'retinaface', **kwargs) -> Lis
Args:
image (np.ndarray): Input image as numpy array.
method (str): Detection method to use. Options: 'retinaface', 'scrfd'.
method (str): Detection method to use. Options: 'retinaface', 'scrfd', 'yolov5face'.
**kwargs: Additional arguments passed to the detector.
Returns:

View File

@@ -27,18 +27,19 @@ class RetinaFace(BaseDetector):
Title: "RetinaFace: Single-stage Dense Face Localisation in the Wild"
Paper: https://arxiv.org/abs/1905.00641
Code: https://github.com/yakhyo/retinaface-pytorch
Args:
**kwargs: Keyword arguments passed to BaseDetector and RetinaFace. Supported keys include:
model_name (RetinaFaceWeights, optional): Model weights to use. Defaults to `RetinaFaceWeights.MNET_V2`.
conf_thresh (float, optional): Confidence threshold for filtering detections. Defaults to 0.5.
nms_thresh (float, optional): Non-maximum suppression (NMS) IoU threshold. Defaults to 0.4.
pre_nms_topk (int, optional): Number of top-scoring boxes considered before NMS. Defaults to 5000.
post_nms_topk (int, optional): Max number of detections kept after NMS. Defaults to 750.
dynamic_size (bool, optional): If True, generate anchors dynamically per input image. Defaults to False.
input_size (Tuple[int, int], optional): Fixed input size (width, height) if `dynamic_size=False`.
model_name (RetinaFaceWeights): Model weights to use. Defaults to `RetinaFaceWeights.MNET_V2`.
conf_thresh (float): Confidence threshold for filtering detections. Defaults to 0.5.
nms_thresh (float): Non-maximum suppression (NMS) IoU threshold. Defaults to 0.4.
input_size (Tuple[int, int]): Fixed input size (width, height) if `dynamic_size=False`.
Defaults to (640, 640).
Note: Non-default sizes may cause slower inference and CoreML compatibility issues.
**kwargs: Advanced options:
pre_nms_topk (int): Number of top-scoring boxes considered before NMS. Defaults to 5000.
post_nms_topk (int): Max number of detections kept after NMS. Defaults to 750.
dynamic_size (bool): If True, generate anchors dynamically per input image. Defaults to False.
Attributes:
model_name (RetinaFaceWeights): Selected model variant.
@@ -57,17 +58,33 @@ class RetinaFace(BaseDetector):
RuntimeError: If the ONNX model fails to load or initialize.
"""
def __init__(self, **kwargs) -> None:
super().__init__(**kwargs)
def __init__(
self,
*,
model_name: RetinaFaceWeights = RetinaFaceWeights.MNET_V2,
conf_thresh: float = 0.5,
nms_thresh: float = 0.4,
input_size: Tuple[int, int] = (640, 640),
**kwargs: Any,
) -> None:
super().__init__(
model_name=model_name,
conf_thresh=conf_thresh,
nms_thresh=nms_thresh,
input_size=input_size,
**kwargs,
)
self._supports_landmarks = True # RetinaFace supports landmarks
self.model_name = kwargs.get('model_name', RetinaFaceWeights.MNET_V2)
self.conf_thresh = kwargs.get('conf_thresh', 0.5)
self.nms_thresh = kwargs.get('nms_thresh', 0.4)
self.model_name = model_name
self.conf_thresh = conf_thresh
self.nms_thresh = nms_thresh
self.input_size = input_size
# Advanced options from kwargs
self.pre_nms_topk = kwargs.get('pre_nms_topk', 5000)
self.post_nms_topk = kwargs.get('post_nms_topk', 750)
self.dynamic_size = kwargs.get('dynamic_size', False)
self.input_size = kwargs.get('input_size', (640, 640))
Logger.info(
f'Initializing RetinaFace with model={self.model_name}, conf_thresh={self.conf_thresh}, '
@@ -133,6 +150,7 @@ class RetinaFace(BaseDetector):
def detect(
self,
image: np.ndarray,
*,
max_num: int = 0,
metric: Literal['default', 'max'] = 'max',
center_weight: float = 2.0,

View File

@@ -24,18 +24,20 @@ class SCRFD(BaseDetector):
Title: "Sample and Computation Redistribution for Efficient Face Detection"
Paper: https://arxiv.org/abs/2105.04714
Code: https://github.com/insightface/insightface
Args:
**kwargs: Keyword arguments passed to BaseDetector and SCRFD. Supported keys include:
model_name (SCRFDWeights, optional): Predefined model enum (e.g., `SCRFD_10G_KPS`).
model_name (SCRFDWeights): Predefined model enum (e.g., `SCRFD_10G_KPS`).
Specifies the SCRFD variant to load. Defaults to SCRFD_10G_KPS.
conf_thresh (float, optional): Confidence threshold for filtering detections. Defaults to 0.5.
nms_thresh (float, optional): Non-Maximum Suppression threshold. Defaults to 0.4.
input_size (Tuple[int, int], optional): Input image size (width, height).
conf_thresh (float): Confidence threshold for filtering detections. Defaults to 0.5.
nms_thresh (float): Non-Maximum Suppression threshold. Defaults to 0.4.
input_size (Tuple[int, int]): Input image size (width, height).
Defaults to (640, 640).
Note: Non-default sizes may cause slower inference and CoreML compatibility issues.
**kwargs: Reserved for future advanced options.
Attributes:
model_name (SCRFDWeights): Selected model variant.
conf_thresh (float): Threshold used to filter low-confidence detections.
nms_thresh (float): Threshold used during NMS to suppress overlapping boxes.
input_size (Tuple[int, int]): Image size to which inputs are resized before inference.
@@ -50,15 +52,25 @@ class SCRFD(BaseDetector):
RuntimeError: If the ONNX model fails to load or initialize.
"""
def __init__(self, **kwargs) -> None:
super().__init__(**kwargs)
def __init__(
self,
*,
model_name: SCRFDWeights = SCRFDWeights.SCRFD_10G_KPS,
conf_thresh: float = 0.5,
nms_thresh: float = 0.4,
input_size: Tuple[int, int] = (640, 640),
**kwargs: Any,
) -> None:
super().__init__(
model_name=model_name,
conf_thresh=conf_thresh,
nms_thresh=nms_thresh,
input_size=input_size,
**kwargs,
)
self._supports_landmarks = True # SCRFD supports landmarks
model_name = kwargs.get('model_name', SCRFDWeights.SCRFD_10G_KPS)
conf_thresh = kwargs.get('conf_thresh', 0.5)
nms_thresh = kwargs.get('nms_thresh', 0.4)
input_size = kwargs.get('input_size', (640, 640))
self.model_name = model_name
self.conf_thresh = conf_thresh
self.nms_thresh = nms_thresh
self.input_size = input_size
@@ -71,12 +83,12 @@ class SCRFD(BaseDetector):
# ---------------------------------
Logger.info(
f'Initializing SCRFD with model={model_name}, conf_thresh={conf_thresh}, nms_thresh={nms_thresh}, '
f'input_size={input_size}'
f'Initializing SCRFD with model={self.model_name}, conf_thresh={self.conf_thresh}, '
f'nms_thresh={self.nms_thresh}, input_size={self.input_size}'
)
# Get path to model weights
self._model_path = verify_model_weights(model_name)
self._model_path = verify_model_weights(self.model_name)
Logger.info(f'Verified model weights located at: {self._model_path}')
# Initialize model
@@ -177,9 +189,10 @@ class SCRFD(BaseDetector):
def detect(
self,
image: np.ndarray,
*,
max_num: int = 0,
metric: Literal['default', 'max'] = 'max',
center_weight: float = 2,
center_weight: float = 2.0,
) -> List[Dict[str, Any]]:
"""
Perform face detection on an input image and return bounding boxes and facial landmarks.

View File

@@ -22,20 +22,22 @@ class YOLOv5Face(BaseDetector):
"""
Face detector based on the YOLOv5-Face architecture.
Title: "YOLO5Face: Why Reinventing a Face Detector"
Paper: https://arxiv.org/abs/2105.12931
Original Implementation: https://github.com/deepcam-cn/yolov5-face
Code: https://github.com/yakhyo/yolov5-face-onnx-inference (ONNX inference implementation)
Args:
**kwargs: Keyword arguments passed to BaseDetector and YOLOv5Face. Supported keys include:
model_name (YOLOv5FaceWeights, optional): Predefined model enum (e.g., `YOLOV5S`).
model_name (YOLOv5FaceWeights): Predefined model enum (e.g., `YOLOV5S`).
Specifies the YOLOv5-Face variant to load. Defaults to YOLOV5S.
conf_thresh (float, optional): Confidence threshold for filtering detections. Defaults to 0.25.
nms_thresh (float, optional): Non-Maximum Suppression threshold. Defaults to 0.45.
input_size (int, optional): Input image size. Defaults to 640.
conf_thresh (float): Confidence threshold for filtering detections. Defaults to 0.6.
nms_thresh (float): Non-Maximum Suppression threshold. Defaults to 0.5.
input_size (int): Input image size. Defaults to 640.
Note: ONNX model is fixed at 640. Changing this will cause inference errors.
max_det (int, optional): Maximum number of detections to return. Defaults to 750.
**kwargs: Advanced options:
max_det (int): Maximum number of detections to return. Defaults to 750.
Attributes:
model_name (YOLOv5FaceWeights): Selected model variant.
conf_thresh (float): Threshold used to filter low-confidence detections.
nms_thresh (float): Threshold used during NMS to suppress overlapping boxes.
input_size (int): Image size to which inputs are resized before inference.
@@ -47,34 +49,45 @@ class YOLOv5Face(BaseDetector):
RuntimeError: If the ONNX model fails to load or initialize.
"""
def __init__(self, **kwargs) -> None:
super().__init__(**kwargs)
def __init__(
self,
*,
model_name: YOLOv5FaceWeights = YOLOv5FaceWeights.YOLOV5S,
conf_thresh: float = 0.6,
nms_thresh: float = 0.5,
input_size: int = 640,
**kwargs: Any,
) -> None:
super().__init__(
model_name=model_name,
conf_thresh=conf_thresh,
nms_thresh=nms_thresh,
input_size=input_size,
**kwargs,
)
self._supports_landmarks = True # YOLOv5-Face supports landmarks
model_name = kwargs.get('model_name', YOLOv5FaceWeights.YOLOV5S)
conf_thresh = kwargs.get('conf_thresh', 0.6) # 0.6 is default from original YOLOv5-Face repository
nms_thresh = kwargs.get('nms_thresh', 0.5) # 0.5 is default from original YOLOv5-Face repository
input_size = kwargs.get('input_size', 640)
max_det = kwargs.get('max_det', 750)
# Validate input size
if input_size != 640:
raise ValueError(
f'YOLOv5Face only supports input_size=640 (got {input_size}). The ONNX model has a fixed input shape.'
)
self.model_name = model_name
self.conf_thresh = conf_thresh
self.nms_thresh = nms_thresh
self.input_size = input_size
self.max_det = max_det
# Advanced options from kwargs
self.max_det = kwargs.get('max_det', 750)
Logger.info(
f'Initializing YOLOv5Face with model={model_name}, conf_thresh={conf_thresh}, '
f'nms_thresh={nms_thresh}, input_size={input_size}'
f'Initializing YOLOv5Face with model={self.model_name}, conf_thresh={self.conf_thresh}, '
f'nms_thresh={self.nms_thresh}, input_size={self.input_size}'
)
# Get path to model weights
self._model_path = verify_model_weights(model_name)
self._model_path = verify_model_weights(self.model_name)
Logger.info(f'Verified model weights located at: {self._model_path}')
# Initialize model
@@ -242,6 +255,7 @@ class YOLOv5Face(BaseDetector):
def detect(
self,
image: np.ndarray,
*,
max_num: int = 0,
metric: Literal['default', 'max'] = 'max',
center_weight: float = 2.0,

View File

@@ -2,59 +2,127 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import List, Union
from typing import List, Tuple, Union
import cv2
import numpy as np
def draw_detections(
*,
image: np.ndarray,
bboxes: Union[List[np.ndarray], List[List[float]]],
scores: Union[np.ndarray, List[float]],
landmarks: Union[List[np.ndarray], List[List[List[float]]]],
vis_threshold: float = 0.6,
draw_score: bool = False,
fancy_bbox: bool = True,
):
"""
Draws bounding boxes, scores, and landmarks from separate lists onto an image.
Draws bounding boxes, landmarks, and optional scores on an image.
Args:
image (np.ndarray): The image to draw on.
bboxes (List[np.ndarray] or List[List[float]]): List of bounding boxes. Each bbox can be
np.ndarray with shape (4,) or list [x1, y1, x2, y2].
scores (List[float] or np.ndarray): List or array of confidence scores.
landmarks (List[np.ndarray] or List[List[List[float]]]): List of landmark sets. Each landmark
set can be np.ndarray with shape (5, 2) or nested list [[[x,y],...],...].
vis_threshold (float): Confidence threshold for filtering which detections to draw.
image: Input image to draw on.
bboxes: List of bounding boxes [x1, y1, x2, y2].
scores: List of confidence scores.
landmarks: List of landmark sets with shape (5, 2).
vis_threshold: Confidence threshold for filtering. Defaults to 0.6.
draw_score: Whether to draw confidence scores. Defaults to False.
"""
_colors = [(0, 0, 255), (0, 255, 255), (255, 0, 255), (0, 255, 0), (255, 0, 0)]
colors = [(0, 0, 255), (0, 255, 255), (255, 0, 255), (0, 255, 0), (255, 0, 0)]
# Filter detections by score
# Calculate line thickness based on image size
line_thickness = max(round(sum(image.shape[:2]) / 2 * 0.003), 2)
# Filter detections by confidence threshold
keep_indices = [i for i, score in enumerate(scores) if score >= vis_threshold]
# Draw the filtered detections
for i in keep_indices:
bbox = np.array(bboxes[i], dtype=np.int32)
score = scores[i]
landmark_set = np.array(landmarks[i], dtype=np.int32)
# Calculate adaptive thickness
thickness = max(1, int(min(bbox[2] - bbox[0], bbox[3] - bbox[1]) / 100))
# Calculate dynamic font scale based on bbox height
bbox_h = bbox[3] - bbox[1]
font_scale = max(0.4, min(0.7, bbox_h / 200))
font_thickness = 2
# Draw bounding box
cv2.rectangle(image, tuple(bbox[:2]), tuple(bbox[2:]), (0, 0, 255), thickness)
if fancy_bbox:
draw_fancy_bbox(image, bbox, color=(0, 255, 0), thickness=line_thickness, proportion=0.2)
else:
cv2.rectangle(image, tuple(bbox[:2]), tuple(bbox[2:]), (0, 255, 0), line_thickness)
# Draw score
# Draw confidence score with background
if draw_score:
text = f'{score:.2f}'
(text_width, text_height), baseline = cv2.getTextSize(
text, cv2.FONT_HERSHEY_SIMPLEX, font_scale, font_thickness
)
# Draw background rectangle
cv2.rectangle(
image,
(bbox[0], bbox[1] - text_height - baseline - 10),
(bbox[0] + text_width + 10, bbox[1]),
(0, 255, 0),
-1,
)
# Draw text
cv2.putText(
image,
f'{score:.2f}',
(bbox[0], bbox[1] - 10),
text,
(bbox[0] + 5, bbox[1] - 5),
cv2.FONT_HERSHEY_SIMPLEX,
0.5,
(255, 255, 255),
thickness,
font_scale,
(0, 0, 0),
font_thickness,
)
# Draw landmarks
for j, point in enumerate(landmark_set):
cv2.circle(image, tuple(point), thickness + 1, _colors[j], -1)
cv2.circle(image, tuple(point), line_thickness + 1, colors[j], -1)
def draw_fancy_bbox(
image: np.ndarray,
bbox: np.ndarray,
color: Tuple[int, int, int] = (0, 255, 0),
thickness: int = 3,
proportion: float = 0.2,
):
"""
Draws a bounding box with fancy corners on an image.
Args:
image: Input image to draw on.
bbox: Bounding box coordinates [x1, y1, x2, y2].
color: Color of the bounding box. Defaults to green.
thickness: Thickness of the bounding box lines. Defaults to 3.
proportion: Proportion of the corner length to the width/height of the bounding box. Defaults to 0.2.
"""
x1, y1, x2, y2 = map(int, bbox)
width = x2 - x1
height = y2 - y1
corner_length = int(proportion * min(width, height))
# Draw the rectangle
cv2.rectangle(image, (x1, y1), (x2, y2), color, 1)
# Top-left corner
cv2.line(image, (x1, y1), (x1 + corner_length, y1), color, thickness)
cv2.line(image, (x1, y1), (x1, y1 + corner_length), color, thickness)
# Top-right corner
cv2.line(image, (x2, y1), (x2 - corner_length, y1), color, thickness)
cv2.line(image, (x2, y1), (x2, y1 + corner_length), color, thickness)
# Bottom-left corner
cv2.line(image, (x1, y2), (x1, y2 - corner_length), color, thickness)
cv2.line(image, (x1, y2), (x1 + corner_length, y2), color, thickness)
# Bottom-right corner
cv2.line(image, (x2, y2), (x2, y2 - corner_length), color, thickness)
cv2.line(image, (x2, y2), (x2 - corner_length, y2), color, thickness)