Mirror of https://github.com/yakhyo/uniface.git
Synced 2026-05-15 04:37:49 +00:00

Compare commits (5 commits)
| SHA1 |
|---|
| 331f46be7c |
| 9991fae62a |
| b74ab95d39 |
| d2b0303bfe |
| 5f74487eb3 |
Binary files added (vendored, not shown):

| File | Size |
|---|---|
| .github/logos/new/uniface_enhanced.webp | 427 KiB |
| .github/logos/new/uniface_high_res_original.png | 1.7 MiB |
| .github/logos/new/uniface_rounded.png | 1.8 MiB |
| .github/logos/new/uniface_rounded_150px.png | 1.9 MiB |
| .github/logos/new/uniface_rounded_q80.png | 872 KiB |
| .github/logos/new/uniface_rounded_q80.webp | 62 KiB |
.github/workflows/ci.yml (vendored, 2 changes)

@@ -1,4 +1,4 @@
-name: CI
+name: Build

 on:
   push:
.gitignore (vendored, 1 change)

@@ -1,4 +1,5 @@
 tmp_*
+.vscode/

 # Byte-compiled / optimized / DLL files
 __pycache__/
README.md (136 changes)

@@ -1,32 +1,33 @@
-# UniFace: All-in-One Face Analysis Library
+<h1 align="center">UniFace: All-in-One Face Analysis Library</h1>

 <div align="center">

 [](https://pypi.org/project/uniface/)
 [](https://www.python.org/)
 [](https://opensource.org/licenses/MIT)
 [](https://github.com/yakhyo/uniface/actions)
 [](https://pepy.tech/projects/uniface)
 [](https://yakhyo.github.io/uniface/)
 [](https://www.kaggle.com/yakhyokhuja/code)

 </div>

 <div align="center">
-<img src=".github/logos/logo_web.webp" width=80%>
+<img src="https://raw.githubusercontent.com/yakhyo/uniface/main/.github/logos/new/uniface_rounded_q80.webp" width="90%" alt="UniFace - All-in-One Open-Source Face Analysis Library">
 </div>

 ---

 **UniFace** is a lightweight, production-ready face analysis library built on ONNX Runtime. It provides high-performance face detection, recognition, landmark detection, face parsing, gaze estimation, and attribute analysis with hardware acceleration support across platforms.

 > 💬 **Have questions?** [Chat with this codebase on DeepWiki](https://deepwiki.com/yakhyo/uniface) - AI-powered docs that let you ask anything about UniFace.

 ---

 ## Features

 - **Face Detection** — RetinaFace, SCRFD, YOLOv5-Face, and YOLOv8-Face with 5-point landmarks
 - **Face Recognition** — ArcFace, MobileFace, and SphereFace embeddings
-- **Facial Landmarks** — 106-point landmark localization
-- **Face Parsing** — BiSeNet semantic segmentation (19 classes)
+- **Facial Landmarks** — 106-point landmark localization module (separate from 5-point detector landmarks)
+- **Face Parsing** — BiSeNet semantic segmentation (19 classes), XSeg face masking
 - **Gaze Estimation** — Real-time gaze direction with MobileGaze
 - **Attribute Analysis** — Age, gender, race (FairFace), and emotion
 - **Anti-Spoofing** — Face liveness detection with MiniFASNet
@@ -37,31 +38,55 @@

 ## Installation

+**Standard installation**
+
 ```bash
-# Standard installation
 pip install uniface
 ```

-# GPU support (CUDA)
+**GPU support (CUDA)**

 ```bash
 pip install uniface[gpu]
 ```

-# From source
+**From source (latest version)**

 ```bash
 git clone https://github.com/yakhyo/uniface.git
 cd uniface && pip install -e .
 ```

+**Optional dependencies**
+
+- Emotion model uses TorchScript and requires `torch`:
+  `pip install torch` (choose the correct build for your OS/CUDA)
+- YOLOv5-Face and YOLOv8-Face support faster NMS with `torchvision`:
+  `pip install torch torchvision` then use `nms_mode='torchvision'`

 ---
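The `nms_mode` note in the diff above refers to non-maximum suppression over detector boxes. As a rough illustration of what NMS does (this is a generic sketch, not UniFace's or torchvision's implementation), a pure-Python greedy NMS over `(x1, y1, x2, y2)` boxes with scores looks like:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.45):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep
```

Library-backed NMS (e.g. torchvision's) is vectorized and much faster; this sketch only shows the selection rule.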
-## Quick Example
+## Model Downloads and Cache
+
+Models are downloaded automatically on first use and verified via SHA-256.
+
+Default cache location: `~/.uniface/models`
+
+You can override it with `UNIFACE_CACHE_DIR=/your/cache/path`
+
+---
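The cache behaviour described above (env-var override plus SHA-256 verification) can be sketched in plain Python. `resolve_cache_dir` and `verify_sha256` are hypothetical helper names for illustration, not UniFace's actual internals:

```python
import hashlib
import os

def resolve_cache_dir():
    """Use UNIFACE_CACHE_DIR if set, else fall back to ~/.uniface/models."""
    override = os.environ.get("UNIFACE_CACHE_DIR")
    if override:
        return override
    return os.path.join(os.path.expanduser("~"), ".uniface", "models")

def verify_sha256(path, expected_hex):
    """Compare a downloaded file's SHA-256 digest against the expected value."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex
```

Verifying the digest after download is what lets a cached model be trusted across runs without re-downloading.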
+## Quick Example (Detection)

 ```python
 import cv2
 from uniface import RetinaFace

 # Initialize detector (models auto-download on first use)
 detector = RetinaFace()

 # Detect faces
 image = cv2.imread("photo.jpg")
 if image is None:
     raise ValueError("Failed to load image. Check the path to 'photo.jpg'.")

 faces = detector.detect(image)

 for face in faces:
@@ -71,14 +96,52 @@ for face in faces:
 ```
 <div align="center">
-<img src="assets/test_result.png">
+<img src="https://raw.githubusercontent.com/yakhyo/uniface/main/assets/test_result.png" width="90%">
 <p>Face Detection Model Output</p>
 </div>

 ---

+## Example (Face Analyzer)
+
+```python
+import cv2
+from uniface import RetinaFace, ArcFace, FaceAnalyzer
+
+detector = RetinaFace()
+recognizer = ArcFace()
+
+analyzer = FaceAnalyzer(detector, recognizer=recognizer)
+
+image = cv2.imread("photo.jpg")
+if image is None:
+    raise ValueError("Failed to load image. Check the path to 'photo.jpg'.")
+
+faces = analyzer.analyze(image)
+
+for face in faces:
+    print(face.bbox, face.embedding.shape if face.embedding is not None else None)
+```
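Embeddings like the ones printed in the diff above are typically compared with cosine similarity. A dependency-free sketch of that comparison (this helper is illustrative, not part of the UniFace API):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Values near 1.0 indicate the same identity; a decision threshold is model- and dataset-dependent.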
 ---

+## Execution Providers (ONNX Runtime)
+
+```python
+from uniface import RetinaFace
+
+# Force CPU-only inference
+detector = RetinaFace(providers=["CPUExecutionProvider"])
+```
+
+See more in the docs:
+https://yakhyo.github.io/uniface/concepts/execution-providers/

 ---
 ## Documentation

-📚 **Full documentation**: [yakhyo.github.io/uniface](https://yakhyo.github.io/uniface/)
+Full documentation: https://yakhyo.github.io/uniface/

 | Resource | Description |
 |----------|-------------|
@@ -88,7 +151,9 @@
 | [Tutorials](https://yakhyo.github.io/uniface/recipes/image-pipeline/) | Step-by-step workflow examples |
 | [Guides](https://yakhyo.github.io/uniface/concepts/overview/) | Architecture and design principles |

-### Jupyter Notebooks
+---
+
+## Jupyter Notebooks

 | Example | Colab | Description |
 |---------|:-----:|-------------|
@@ -100,6 +165,20 @@
 | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) | [](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/06_face_parsing.ipynb) | Semantic face segmentation |
 | [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) | [](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | Privacy-preserving blur |
 | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) | [](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | Gaze direction estimation |
+| [09_face_segmentation.ipynb](examples/09_face_segmentation.ipynb) | [](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/09_face_segmentation.ipynb) | Face segmentation with XSeg |

 ---

+## Licensing and Model Usage
+
+UniFace is MIT-licensed, but several pretrained models carry their own licenses.
+Review: https://yakhyo.github.io/uniface/license-attribution/
+
+Notable examples:
+
+- YOLOv5-Face and YOLOv8-Face weights are GPL-3.0
+- FairFace weights are CC BY 4.0
+
+If you plan commercial use, verify model license compatibility.

 ---
@@ -107,12 +186,13 @@

 | Feature | Repository | Training | Description |
 |---------|------------|:--------:|-------------|
-| Detection | [retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) | [x] | RetinaFace PyTorch Training & Export |
+| Detection | [retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) | ✓ | RetinaFace PyTorch Training & Export |
 | Detection | [yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference) | - | YOLOv5-Face ONNX Inference |
 | Detection | [yolov8-face-onnx-inference](https://github.com/yakhyo/yolov8-face-onnx-inference) | - | YOLOv8-Face ONNX Inference |
-| Recognition | [face-recognition](https://github.com/yakhyo/face-recognition) | [x] | MobileFace, SphereFace Training |
-| Parsing | [face-parsing](https://github.com/yakhyo/face-parsing) | [x] | BiSeNet Face Parsing |
-| Gaze | [gaze-estimation](https://github.com/yakhyo/gaze-estimation) | [x] | MobileGaze Training |
+| Recognition | [face-recognition](https://github.com/yakhyo/face-recognition) | ✓ | MobileFace, SphereFace Training |
+| Parsing | [face-parsing](https://github.com/yakhyo/face-parsing) | ✓ | BiSeNet Face Parsing |
+| Parsing | [face-segmentation](https://github.com/yakhyo/face-segmentation) | - | XSeg Face Segmentation |
+| Gaze | [gaze-estimation](https://github.com/yakhyo/gaze-estimation) | ✓ | MobileGaze Training |
 | Anti-Spoofing | [face-anti-spoofing](https://github.com/yakhyo/face-anti-spoofing) | - | MiniFASNet Inference |
 | Attributes | [fairface-onnx](https://github.com/yakhyo/fairface-onnx) | - | FairFace ONNX Inference |

@@ -122,7 +202,15 @@

 ## Contributing

-Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
+Contributions are welcome. Please see [CONTRIBUTING.md](CONTRIBUTING.md).

 ## Support

 If you find this project useful, consider giving it a ⭐ on GitHub — it helps others discover it!

+Questions or feedback:
+
+- GitHub Issues: https://github.com/yakhyo/uniface/issues
+- DeepWiki Q&A: https://deepwiki.com/yakhyo/uniface

 ## License
@@ -17,8 +17,8 @@ detector = RetinaFace()

 **Priority order:**

-1. **CUDAExecutionProvider** - NVIDIA GPU
-2. **CoreMLExecutionProvider** - Apple Silicon
+1. **CoreMLExecutionProvider** - Apple Silicon
+2. **CUDAExecutionProvider** - NVIDIA GPU
 3. **CPUExecutionProvider** - Fallback

 ---
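Provider fallback of the kind shown in this hunk amounts to picking the first preferred provider that the runtime reports as available. A small sketch (provider names and ordering are taken from the diff above; the helper itself is hypothetical, not UniFace code):

```python
PRIORITY = [
    "CoreMLExecutionProvider",  # Apple Silicon
    "CUDAExecutionProvider",    # NVIDIA GPU
    "CPUExecutionProvider",     # Fallback
]

def pick_provider(available):
    """Return the first provider in PRIORITY that is actually available."""
    for name in PRIORITY:
        if name in available:
            return name
    raise RuntimeError("no supported execution provider available")
```

With ONNX Runtime, the `available` list would come from `onnxruntime.get_available_providers()`.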
@@ -199,16 +199,16 @@ print(f"Classes: {np.unique(mask)}")  # [0, 1, 2, ...]

 | ID | Class | ID | Class |
 |----|-------|----|-------|
-| 0 | Background | 10 | Ear Ring |
-| 1 | Skin | 11 | Nose |
-| 2 | Left Eyebrow | 12 | Mouth |
-| 3 | Right Eyebrow | 13 | Upper Lip |
-| 4 | Left Eye | 14 | Lower Lip |
-| 5 | Right Eye | 15 | Neck |
-| 6 | Eye Glasses | 16 | Neck Lace |
-| 7 | Left Ear | 17 | Cloth |
-| 8 | Right Ear | 18 | Hair |
-| 9 | Hat | | |
+| 0 | Background | 10 | Nose |
+| 1 | Skin | 11 | Mouth |
+| 2 | Left Eyebrow | 12 | Upper Lip |
+| 3 | Right Eyebrow | 13 | Lower Lip |
+| 4 | Left Eye | 14 | Neck |
+| 5 | Right Eye | 15 | Necklace |
+| 6 | Eyeglasses | 16 | Cloth |
+| 7 | Left Ear | 17 | Hair |
+| 8 | Right Ear | 18 | Hat |
+| 9 | Earring | | |

 ---
@@ -32,9 +32,9 @@ Default cache directory:

 ```
 ~/.uniface/models/
-├── retinaface_mv2.onnx
-├── w600k_mbf.onnx
-├── 2d106det.onnx
+├── retinaface_mnet_v2.onnx
+├── arcface_mnet.onnx
+├── 2d_106.onnx
+├── gaze_resnet34.onnx
+├── parsing_resnet18.onnx
 └── ...
@@ -151,12 +151,18 @@ for face in faces:

 For convenience, `FaceAnalyzer` combines multiple modules:

 ```python
-from uniface import FaceAnalyzer
+from uniface import FaceAnalyzer, RetinaFace, ArcFace, AgeGender, FairFace
+
+detector = RetinaFace()
+recognizer = ArcFace()
+age_gender = AgeGender()
+fairface = FairFace()

 analyzer = FaceAnalyzer(
-    detect=True,
-    recognize=True,
-    attributes=True
+    detector,
+    recognizer=recognizer,
+    age_gender=age_gender,
+    fairface=fairface,
 )

 faces = analyzer.analyze(image)
@@ -10,12 +10,16 @@ template: home.html

 # UniFace { .hero-title }

-<p class="hero-subtitle">A lightweight, production-ready face analysis library built on ONNX Runtime</p>
+<p class="hero-subtitle">All-in-One Open-Source Face Analysis Library</p>

 [](https://pypi.org/project/uniface/)
 [](https://www.python.org/)
 [](https://opensource.org/licenses/MIT)
 [](https://github.com/yakhyo/uniface/actions)
 [](https://pepy.tech/projects/uniface)
 [](https://www.kaggle.com/yakhyokhuja/code)

+<img src="https://raw.githubusercontent.com/yakhyo/uniface/main/.github/logos/new/uniface_rounded_q80.webp" alt="UniFace - All-in-One Open-Source Face Analysis Library" style="max-width: 90%; margin: 1rem 0;">

 [Get Started](quickstart.md){ .md-button .md-button--primary }
 [View on GitHub](https://github.com/yakhyo/uniface){ .md-button }
@@ -107,7 +107,9 @@ UniFace has minimal dependencies:

 |---------|---------|
 | `numpy` | Array operations |
 | `opencv-python` | Image processing |
 | `onnx` | ONNX model format support |
 | `onnxruntime` | Model inference |
 | `scikit-image` | Geometric transforms |
 | `requests` | Model download |
 | `tqdm` | Progress bars |
@@ -22,7 +22,7 @@ RetinaFace models are trained on the WIDER FACE dataset.

 !!! info "Accuracy & Benchmarks"
     **Accuracy**: WIDER FACE validation set (Easy/Medium/Hard subsets) - from [RetinaFace paper](https://arxiv.org/abs/1905.00641)

-    **Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --iterations 100`
+    **Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image>`

 ---
@@ -32,13 +32,13 @@ SCRFD (Sample and Computation Redistribution for Efficient Face Detection) models

 | Model Name | Params | Size | Easy | Medium | Hard |
 | ---------------- | ------ | ----- | ------ | ------ | ------ |
-| `SCRFD_500M` | 0.6M | 2.5MB | 90.57% | 88.12% | 68.51% |
-| `SCRFD_10G` :material-check-circle: | 4.2M | 17MB | 95.16% | 93.87% | 83.05% |
+| `SCRFD_500M_KPS` | 0.6M | 2.5MB | 90.57% | 88.12% | 68.51% |
+| `SCRFD_10G_KPS` :material-check-circle: | 4.2M | 17MB | 95.16% | 93.87% | 83.05% |

 !!! info "Accuracy & Benchmarks"
     **Accuracy**: WIDER FACE validation set - from [SCRFD paper](https://arxiv.org/abs/2105.04714)

-    **Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --iterations 100`
+    **Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image>`

 ---
@@ -55,7 +55,7 @@ YOLOv5-Face models provide detection with 5-point facial landmarks, trained on WIDER FACE

 !!! info "Accuracy & Benchmarks"
     **Accuracy**: WIDER FACE validation set - from [YOLOv5-Face paper](https://arxiv.org/abs/2105.12931)

-    **Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --iterations 100`
+    **Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image>`

 !!! note "Fixed Input Size"
     All YOLOv5-Face models use a fixed input size of 640×640.
@@ -300,6 +300,32 @@ BiSeNet (Bilateral Segmentation Network) models for semantic face parsing.

 ---

+### XSeg
+
+XSeg from DeepFaceLab outputs masks for face regions. Requires 5-point landmarks for face alignment.
+
+| Model Name | Size | Output |
+|------------|--------|--------|
+| `DEFAULT` | 67 MB | Mask [0, 1] |
+
+!!! info "Model Details"
+    **Origin**: DeepFaceLab
+
+    **Input**: NHWC format, normalized to [0, 1]
+
+    **Alignment**: Requires 5-point landmarks (not bbox crops)
+
+    **Applications:**
+
+    - Face region extraction
+    - Face swapping pipelines
+    - Occlusion handling
+
+!!! note "Input Requirements"
+    Requires 5-point facial landmarks. Use a face detector like RetinaFace to obtain landmarks first.

 ---

 ## Anti-Spoofing Models

 ### MiniFASNet Family
@@ -343,6 +369,7 @@ Models are automatically downloaded and cached on first use.

 - **Face Recognition Training**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) - ArcFace, MobileFace, SphereFace training code
 - **Gaze Estimation Training**: [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation) - MobileGaze training code and pretrained weights
 - **Face Parsing Training**: [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing) - BiSeNet training code and pretrained weights
+- **Face Segmentation**: [yakhyo/face-segmentation](https://github.com/yakhyo/face-segmentation) - XSeg ONNX Inference
 - **Face Anti-Spoofing**: [yakhyo/face-anti-spoofing](https://github.com/yakhyo/face-anti-spoofing) - MiniFASNet ONNX inference (weights from [minivision-ai/Silent-Face-Anti-Spoofing](https://github.com/minivision-ai/Silent-Face-Anti-Spoofing))
 - **FairFace**: [yakhyo/fairface-onnx](https://github.com/yakhyo/fairface-onnx) - FairFace ONNX inference for race, gender, age prediction
 - **InsightFace**: [deepinsight/insightface](https://github.com/deepinsight/insightface) - Model architectures and pretrained weights
@@ -125,7 +125,7 @@ from uniface.attribute import Emotion
 from uniface.constants import DDAMFNWeights

 detector = RetinaFace()
-emotion = Emotion(model_name=DDAMFNWeights.AFFECNET7)
+emotion = Emotion(model_weights=DDAMFNWeights.AFFECNET7)

 faces = detector.detect(image)
@@ -169,10 +169,10 @@ from uniface.attribute import Emotion
 from uniface.constants import DDAMFNWeights

 # 7-class emotion
-emotion = Emotion(model_name=DDAMFNWeights.AFFECNET7)
+emotion = Emotion(model_weights=DDAMFNWeights.AFFECNET7)

 # 8-class emotion
-emotion = Emotion(model_name=DDAMFNWeights.AFFECNET8)
+emotion = Emotion(model_weights=DDAMFNWeights.AFFECNET8)
 ```

 ---
@@ -206,12 +206,11 @@ for face in faces:

 ### Using FaceAnalyzer

 ```python
-from uniface import FaceAnalyzer
+from uniface import FaceAnalyzer, RetinaFace, AgeGender

 analyzer = FaceAnalyzer(
-    detect=True,
-    recognize=False,
-    attributes=True  # Uses AgeGender
+    RetinaFace(),
+    age_gender=AgeGender(),
 )

 faces = analyzer.analyze(image)
@@ -296,7 +296,7 @@ cv2.imwrite("result.jpg", image)

 Benchmark on your hardware:

 ```bash
-python tools/detection.py --source image.jpg --iterations 100
+python tools/detection.py --source image.jpg
 ```

 ---
@@ -1,15 +1,16 @@
 # Parsing

-Face parsing segments faces into semantic components (skin, eyes, nose, mouth, hair, etc.).
+Face parsing segments faces into semantic components or face regions.

 ---

 ## Available Models

-| Model | Backbone | Size | Classes |
-|-------|----------|------|---------|
-| **BiSeNet ResNet18** :material-check-circle: | ResNet18 | 51 MB | 19 |
-| BiSeNet ResNet34 | ResNet34 | 89 MB | 19 |
+| Model | Backbone | Size | Output |
+|-------|----------|------|--------|
+| **BiSeNet ResNet18** :material-check-circle: | ResNet18 | 51 MB | 19 classes |
+| BiSeNet ResNet34 | ResNet34 | 89 MB | 19 classes |
+| XSeg | - | 67 MB | Mask |

 ---
@@ -45,16 +46,16 @@ cv2.imwrite("parsed.jpg", vis_bgr)

 | ID | Class | ID | Class |
 |----|-------|----|-------|
-| 0 | Background | 10 | Ear Ring |
-| 1 | Skin | 11 | Nose |
-| 2 | Left Eyebrow | 12 | Mouth |
-| 3 | Right Eyebrow | 13 | Upper Lip |
-| 4 | Left Eye | 14 | Lower Lip |
-| 5 | Right Eye | 15 | Neck |
-| 6 | Eye Glasses | 16 | Neck Lace |
-| 7 | Left Ear | 17 | Cloth |
-| 8 | Right Ear | 18 | Hair |
-| 9 | Hat | | |
+| 0 | Background | 10 | Nose |
+| 1 | Skin | 11 | Mouth |
+| 2 | Left Eyebrow | 12 | Upper Lip |
+| 3 | Right Eyebrow | 13 | Lower Lip |
+| 4 | Left Eye | 14 | Neck |
+| 5 | Right Eye | 15 | Necklace |
+| 6 | Eyeglasses | 16 | Cloth |
+| 7 | Left Ear | 17 | Hair |
+| 8 | Right Ear | 18 | Hat |
+| 9 | Earring | | |

 ---
@@ -125,7 +126,7 @@ mask = parser.parse(face_image)

 # Extract specific component
 SKIN = 1
-HAIR = 18
+HAIR = 17
 LEFT_EYE = 4
 RIGHT_EYE = 5
@@ -148,10 +149,10 @@ mask = parser.parse(face_image)

 component_names = {
     0: 'Background', 1: 'Skin', 2: 'L-Eyebrow', 3: 'R-Eyebrow',
-    4: 'L-Eye', 5: 'R-Eye', 6: 'Glasses', 7: 'L-Ear', 8: 'R-Ear',
-    9: 'Hat', 10: 'Earring', 11: 'Nose', 12: 'Mouth',
-    13: 'U-Lip', 14: 'L-Lip', 15: 'Neck', 16: 'Necklace',
-    17: 'Cloth', 18: 'Hair'
+    4: 'L-Eye', 5: 'R-Eye', 6: 'Eyeglasses', 7: 'L-Ear', 8: 'R-Ear',
+    9: 'Earring', 10: 'Nose', 11: 'Mouth',
+    12: 'U-Lip', 13: 'L-Lip', 14: 'Neck', 15: 'Necklace',
+    16: 'Cloth', 17: 'Hair', 18: 'Hat'
 }

 for class_id in np.unique(mask):
@@ -218,7 +219,7 @@ def replace_background(image, mask, background):

 ```python
 def get_hair_mask(mask):
     """Extract clean hair mask."""
-    hair_mask = (mask == 18).astype(np.uint8) * 255
+    hair_mask = (mask == 17).astype(np.uint8) * 255

     # Clean up with morphological operations
     kernel = np.ones((5, 5), np.uint8)
@@ -248,12 +249,83 @@ vis_result = vis_parsing_maps(

 ---

+## XSeg
+
+XSeg outputs a mask for face regions. Unlike BiSeNet, which works on bbox crops, XSeg requires 5-point landmarks for face alignment.
+
+### Basic Usage
+
+```python
+import cv2
+from uniface import RetinaFace
+from uniface.parsing import XSeg
+
+detector = RetinaFace()
+parser = XSeg()
+
+image = cv2.imread("photo.jpg")
+faces = detector.detect(image)
+
+for face in faces:
+    if face.landmarks is not None:
+        mask = parser.parse(image, face.landmarks)
+        print(f"Mask shape: {mask.shape}")  # (H, W), values in [0, 1]
+```
+### Parameters
+
+```python
+from uniface.parsing import XSeg
+
+# Default settings
+parser = XSeg()
+
+# Custom settings
+parser = XSeg(
+    align_size=256,  # Face alignment size
+    blur_sigma=5,    # Gaussian blur for smoothing (0 = raw)
+)
+```
+
+| Parameter | Default | Description |
+|-----------|---------|-------------|
+| `align_size` | 256 | Face alignment output size |
+| `blur_sigma` | 0 | Mask smoothing (0 = no blur) |
+### Methods
+
+```python
+# Full pipeline: align -> segment -> warp back to original space
+mask = parser.parse(image, landmarks)
+
+# For pre-aligned face crops
+mask = parser.parse_aligned(face_crop)
+
+# Get mask + crop + inverse matrix for custom warping
+mask, face_crop, inverse_matrix = parser.parse_with_inverse(image, landmarks)
+```
+
+### BiSeNet vs XSeg
+
+| Feature | BiSeNet | XSeg |
+|---------|---------|------|
+| Output | 19 class labels | Mask [0, 1] |
+| Input | Bbox crop | Requires landmarks |
+| Use case | Facial components | Face region extraction |
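Since XSeg returns a soft mask in [0, 1] while BiSeNet returns class labels, downstream code often binarizes the XSeg mask before compositing. A dependency-free sketch over nested lists (the 0.5 threshold is an illustrative assumption, not a UniFace default):

```python
def binarize_mask(mask, threshold=0.5):
    """Turn a soft [0, 1] mask (nested lists) into a 0/1 mask."""
    return [[1 if value >= threshold else 0 for value in row] for row in mask]
```

In practice this would be a single vectorized comparison on a NumPy array, e.g. `(mask >= 0.5).astype(np.uint8)`.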
 ---

 ## Factory Function

 ```python
 from uniface import create_face_parser
+from uniface.constants import ParsingWeights, XSegWeights

-parser = create_face_parser()  # Returns BiSeNet
+# BiSeNet (default)
+parser = create_face_parser()
+
+# XSeg
+parser = create_face_parser(XSegWeights.DEFAULT)
 ```

 ---
@@ -64,7 +64,7 @@ blurrer = BlurFace(method='pixelate', pixel_blocks=10)

 | Parameter | Default | Description |
 |-----------|---------|-------------|
-| `pixel_blocks` | 10 | Number of blocks (lower = more pixelated) |
+| `pixel_blocks` | 15 | Number of blocks (lower = more pixelated) |
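The pixelate method above works by averaging pixel values over a coarse grid of blocks, which is why a lower `pixel_blocks` count looks more pixelated. A rough grayscale sketch over nested lists (illustrative only, not BlurFace's actual implementation):

```python
def pixelate(image, blocks=4):
    """Average each block of a 2D grayscale image given as nested lists of ints."""
    h, w = len(image), len(image[0])
    bh, bw = max(1, h // blocks), max(1, w // blocks)
    out = [row[:] for row in image]
    for y0 in range(0, h, bh):
        for x0 in range(0, w, bw):
            # Collect the block's pixels and replace them with their mean
            block = [image[y][x]
                     for y in range(y0, min(y0 + bh, h))
                     for x in range(x0, min(x0 + bw, w))]
            mean = sum(block) // len(block)
            for y in range(y0, min(y0 + bh, h)):
                for x in range(x0, min(x0 + bw, w)):
                    out[y][x] = mean
    return out
```

Real implementations do the same thing by downscaling the face region and upscaling it back with nearest-neighbor interpolation.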

 ### Gaussian
@@ -69,20 +69,21 @@ spoofer = MiniFASNet(model_name=MiniFASNetWeights.V1SE)

 ## Confidence Thresholds

-The default threshold is 0.5. Adjust for your use case:
+`result.is_real` is based on the model's top predicted class (argmax). If you want stricter behavior,
+apply your own confidence threshold:

 ```python
 result = spoofer.predict(image, face.bbox)

 # High security (fewer false accepts)
 HIGH_THRESHOLD = 0.7
-if result.confidence > HIGH_THRESHOLD:
+if result.is_real and result.confidence > HIGH_THRESHOLD:
     print("Real (high confidence)")
 else:
     print("Suspicious")

-# Balanced
-if result.is_real:  # Uses default 0.5 threshold
+# Balanced (argmax decision)
+if result.is_real:
     print("Real")
 else:
     print("Fake")
@@ -16,6 +16,7 @@ Run UniFace examples directly in your browser with Google Colab, or download and

 | [Face Parsing](https://github.com/yakhyo/uniface/blob/main/examples/06_face_parsing.ipynb) | [](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/06_face_parsing.ipynb) | Semantic face segmentation |
 | [Face Anonymization](https://github.com/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | [](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | Privacy-preserving blur |
 | [Gaze Estimation](https://github.com/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | [](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | Gaze direction estimation |
+| [Face Segmentation](https://github.com/yakhyo/uniface/blob/main/examples/09_face_segmentation.ipynb) | [](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/09_face_segmentation.ipynb) | Face segmentation with XSeg |

 ---
@@ -27,29 +27,29 @@ import numpy as np

 class MyDetector(BaseDetector):
     def __init__(self, model_path: str, confidence_threshold: float = 0.5):
+        super().__init__(confidence_threshold=confidence_threshold)
         self.session = create_onnx_session(model_path)
-        self.threshold = confidence_threshold
+
+    def preprocess(self, image: np.ndarray) -> np.ndarray:
+        # Your preprocessing logic
+        # e.g., resize, normalize, transpose
+        raise NotImplementedError
+
+    def postprocess(self, outputs, shape) -> list[Face]:
+        # Your postprocessing logic
+        # e.g., decode boxes, apply NMS, create Face objects
+        raise NotImplementedError

     def detect(self, image: np.ndarray) -> list[Face]:
         # 1. Preprocess image
-        input_tensor = self._preprocess(image)
+        input_tensor = self.preprocess(image)

         # 2. Run inference
         outputs = self.session.run(None, {'input': input_tensor})

         # 3. Postprocess outputs to Face objects
-        faces = self._postprocess(outputs, image.shape)
-        return faces
-
-    def _preprocess(self, image):
-        # Your preprocessing logic
-        # e.g., resize, normalize, transpose
-        pass
-
-    def _postprocess(self, outputs, shape):
-        # Your postprocessing logic
-        # e.g., decode boxes, apply NMS, create Face objects
-        pass
+        return self.postprocess(outputs, image.shape)
 ```

 ---
@@ -57,36 +57,14 @@ class MyDetector(BaseDetector):

 ## Add Custom Recognition Model

 ```python
-from uniface.recognition.base import BaseRecognizer
-from uniface.onnx_utils import create_onnx_session
-from uniface import face_alignment
-import numpy as np
+from uniface.recognition.base import BaseRecognizer, PreprocessConfig

 class MyRecognizer(BaseRecognizer):
-    def __init__(self, model_path: str):
-        self.session = create_onnx_session(model_path)
+    def __init__(self, model_path: str, providers=None):
+        preprocessing = PreprocessConfig(input_mean=127.5, input_std=127.5, input_size=(112, 112))
+        super().__init__(model_path, preprocessing, providers=providers)

-    def get_normalized_embedding(
-        self,
-        image: np.ndarray,
-        landmarks: np.ndarray
-    ) -> np.ndarray:
-        # 1. Align face
-        aligned = face_alignment(image, landmarks)
-
-        # 2. Preprocess
-        input_tensor = self._preprocess(aligned)
-
-        # 3. Run inference
-        embedding = self.session.run(None, {'input': input_tensor})[0]
-
-        # 4. Normalize
-        embedding = embedding / np.linalg.norm(embedding)
-        return embedding
-
-    def _preprocess(self, image):
-        # Your preprocessing logic
-        pass
+    # Optional: override preprocess() if your model expects custom normalization.
 ```
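The `PreprocessConfig(input_mean=127.5, input_std=127.5, ...)` shown in this hunk describes the usual `(pixel - mean) / std` normalization for 112×112 recognition inputs. A plain-Python sketch of that one step (illustrative only, not the BaseRecognizer internals):

```python
def normalize_pixels(pixels, mean=127.5, std=127.5):
    """Map uint8 pixel values in [0, 255] to roughly [-1, 1]."""
    return [(p - mean) / std for p in pixels]
```

With mean and std both 127.5, pixel 0 maps to -1.0, 127.5 to 0.0, and 255 to 1.0, which is the range many face embedding models expect.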
|
||||
|
||||
---
|
||||
|
||||
@@ -67,14 +67,18 @@ cv2.imwrite("result.jpg", result_image)

For convenience, use the built-in `FaceAnalyzer`:

```python
from uniface import FaceAnalyzer
from uniface import FaceAnalyzer, RetinaFace, ArcFace, AgeGender
import cv2

# Initialize with desired modules
detector = RetinaFace()
recognizer = ArcFace()
age_gender = AgeGender()

analyzer = FaceAnalyzer(
    detect=True,
    recognize=True,
    attributes=True
    detector,
    recognizer=recognizer,
    age_gender=age_gender,
)

# Process image
```
495 examples/09_face_segmentation.ipynb Normal file
File diff suppressed because one or more lines are too long
@@ -1,6 +1,6 @@
[project]
name = "uniface"
version = "2.2.1"
version = "2.3.0"
description = "UniFace: A Comprehensive Library for Face Detection, Recognition, Landmark Analysis, Face Parsing, Gaze Estimation, Age, and Gender Detection"
readme = "README.md"
license = "MIT"
@@ -2,15 +2,15 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo

"""Tests for BiSeNet face parsing model."""
"""Tests for face parsing models (BiSeNet and XSeg)."""

from __future__ import annotations

import numpy as np
import pytest

from uniface.constants import ParsingWeights
from uniface.parsing import BiSeNet, create_face_parser
from uniface.constants import ParsingWeights, XSegWeights
from uniface.parsing import BiSeNet, XSeg, create_face_parser


def test_bisenet_initialization():

@@ -120,3 +120,151 @@ def test_bisenet_different_input_sizes():

        assert mask.shape == (h, w), f'Failed for size {h}x{w}'
        assert mask.dtype == np.uint8


# XSeg Tests


def test_xseg_initialization():
    """Test XSeg initialization."""
    parser = XSeg()
    assert parser is not None
    assert parser.input_size == (256, 256)
    assert parser.align_size == 256
    assert parser.blur_sigma == 0


def test_xseg_with_custom_params():
    """Test XSeg with custom parameters."""
    parser = XSeg(align_size=512, blur_sigma=5)
    assert parser.align_size == 512
    assert parser.blur_sigma == 5


def test_xseg_preprocess():
    """Test XSeg preprocessing."""
    parser = XSeg()

    # Create a dummy aligned face crop
    face_crop = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)

    # Preprocess
    preprocessed = parser.preprocess(face_crop)

    assert preprocessed.shape == (1, 256, 256, 3)  # NHWC format
    assert preprocessed.dtype == np.float32
    assert preprocessed.min() >= 0
    assert preprocessed.max() <= 1


def test_xseg_postprocess():
    """Test XSeg postprocessing."""
    parser = XSeg()

    # Create dummy model output (NHWC format)
    dummy_output = np.random.rand(1, 256, 256, 1).astype(np.float32)

    # Postprocess
    mask = parser.postprocess(dummy_output, crop_size=(256, 256))

    assert mask.shape == (256, 256)
    assert mask.dtype == np.float32
    assert mask.min() >= 0
    assert mask.max() <= 1


def test_xseg_parse_aligned():
    """Test XSeg parse_aligned method."""
    parser = XSeg()

    # Create a dummy aligned face crop
    face_crop = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)

    # Parse
    mask = parser.parse_aligned(face_crop)

    assert mask.shape == (256, 256)
    assert mask.dtype == np.float32
    assert mask.min() >= 0
    assert mask.max() <= 1


def test_xseg_parse_with_landmarks():
    """Test XSeg parse method with landmarks."""
    parser = XSeg()

    # Create a dummy image
    image = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)

    # Create dummy 5-point landmarks
    landmarks = np.array(
        [
            [250, 200],  # left eye
            [390, 200],  # right eye
            [320, 280],  # nose
            [260, 350],  # left mouth
            [380, 350],  # right mouth
        ],
        dtype=np.float32,
    )

    # Parse
    mask = parser.parse(image, landmarks)

    assert mask.shape == (480, 640)
    assert mask.dtype == np.float32
    assert mask.min() >= 0
    assert mask.max() <= 1


def test_xseg_parse_invalid_landmarks():
    """Test XSeg parse with invalid landmarks shape."""
    parser = XSeg()
    image = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)

    # Wrong shape
    invalid_landmarks = np.array([[0, 0], [1, 1], [2, 2]])

    with pytest.raises(ValueError, match='Landmarks must have shape'):
        parser.parse(image, invalid_landmarks)


def test_xseg_parse_with_inverse():
    """Test XSeg parse_with_inverse method."""
    parser = XSeg()

    # Create a dummy image
    image = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)

    # Create dummy 5-point landmarks
    landmarks = np.array(
        [
            [250, 200],
            [390, 200],
            [320, 280],
            [260, 350],
            [380, 350],
        ],
        dtype=np.float32,
    )

    # Parse with inverse
    mask, face_crop, inverse_matrix = parser.parse_with_inverse(image, landmarks)

    assert mask.shape == (256, 256)
    assert face_crop.shape == (256, 256, 3)
    assert inverse_matrix.shape == (2, 3)


def test_create_face_parser_xseg_enum():
    """Test factory function with XSeg enum."""
    parser = create_face_parser(XSegWeights.DEFAULT)
    assert parser is not None
    assert isinstance(parser, XSeg)


def test_create_face_parser_xseg_string():
    """Test factory function with XSeg string."""
    parser = create_face_parser('xseg')
    assert parser is not None
    assert isinstance(parser, XSeg)
@@ -17,7 +17,8 @@ CLI utilities for testing and running UniFace features.
| `face_search.py` | Real-time face matching against reference |
| `fairface.py` | FairFace attribute prediction (race, gender, age) |
| `spoofing.py` | Face anti-spoofing detection |
| `face_parsing.py` | Face semantic segmentation |
| `face_parsing.py` | Face semantic segmentation (BiSeNet) |
| `xseg.py` | Face segmentation (XSeg) |
| `video_detection.py` | Face detection on video files with progress bar |
| `batch_process.py` | Batch process folder of images |
| `download_model.py` | Download model weights |

@@ -63,10 +64,14 @@ python tools/landmarks.py --source 0
python tools/fairface.py --source assets/test.jpg
python tools/fairface.py --source 0

# Face parsing
# Face parsing (BiSeNet)
python tools/face_parsing.py --source assets/test.jpg
python tools/face_parsing.py --source 0

# Face segmentation (XSeg)
python tools/xseg.py --source assets/test.jpg
python tools/xseg.py --source 0

# Face anti-spoofing
python tools/spoofing.py --source assets/test.jpg
python tools/spoofing.py --source 0

250 tools/xseg.py Normal file
@@ -0,0 +1,250 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo

"""XSeg face segmentation on detected faces.

Usage:
    python tools/xseg.py --source path/to/image.jpg
    python tools/xseg.py --source path/to/video.mp4
    python tools/xseg.py --source 0  # webcam
"""

from __future__ import annotations

import argparse
import os
from pathlib import Path

import cv2
import numpy as np

from uniface import RetinaFace, XSeg

IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}


def get_source_type(source: str) -> str:
    """Determine if source is image, video, or camera."""
    if source.isdigit():
        return 'camera'
    path = Path(source)
    suffix = path.suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return 'image'
    elif suffix in VIDEO_EXTENSIONS:
        return 'video'
    else:
        return 'unknown'


def apply_mask_visualization(image: np.ndarray, mask: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Apply colored mask overlay for visualization."""
    overlay = image.copy().astype(np.float32)
    mask_3ch = np.stack([mask * 0.3, mask * 0.7, mask * 0.3], axis=-1)
    overlay = overlay * (1 - mask[..., None] * alpha) + mask_3ch * 255 * alpha

    return overlay.clip(0, 255).astype(np.uint8)


def process_image(
    detector: RetinaFace,
    parser: XSeg,
    image_path: str,
    save_dir: str = 'outputs',
) -> None:
    """Process a single image."""
    image = cv2.imread(image_path)
    if image is None:
        print(f"Error: Failed to load image from '{image_path}'")
        return

    faces = detector.detect(image)
    print(f'Detected {len(faces)} face(s)')

    if len(faces) == 0:
        print('No faces detected.')
        return

    # Accumulate masks from all faces
    full_mask = np.zeros((image.shape[0], image.shape[1]), dtype=np.float32)
    for i, face in enumerate(faces):
        if face.landmarks is None:
            print(f'  Face {i + 1}: skipped (no landmarks)')
            continue

        mask = parser.parse(image, face.landmarks)
        full_mask = np.maximum(full_mask, mask)
        print(f'  Face {i + 1}: done')

    # Apply visualization
    result_image = apply_mask_visualization(image, full_mask)

    # Draw bounding boxes
    for face in faces:
        x1, y1, x2, y2 = map(int, face.bbox[:4])
        cv2.rectangle(result_image, (x1, y1), (x2, y2), (0, 255, 0), 2)

    # Save results
    os.makedirs(save_dir, exist_ok=True)
    output_path = os.path.join(save_dir, f'{Path(image_path).stem}_xseg.jpg')
    cv2.imwrite(output_path, result_image)
    print(f'Output saved: {output_path}')

    mask_path = os.path.join(save_dir, f'{Path(image_path).stem}_xseg_mask.png')
    mask_uint8 = (full_mask * 255).astype(np.uint8)
    cv2.imwrite(mask_path, mask_uint8)
    print(f'Mask saved: {mask_path}')


def process_video(
    detector: RetinaFace,
    parser: XSeg,
    video_path: str,
    save_dir: str = 'outputs',
) -> None:
    """Process a video file."""
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        print(f"Error: Cannot open video file '{video_path}'")
        return

    fps = cap.get(cv2.CAP_PROP_FPS)
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))

    os.makedirs(save_dir, exist_ok=True)
    output_path = os.path.join(save_dir, f'{Path(video_path).stem}_xseg.mp4')
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))

    print(f'Processing video: {video_path} ({total_frames} frames)')
    frame_count = 0

    while True:
        ret, frame = cap.read()
        if not ret:
            break

        frame_count += 1
        faces = detector.detect(frame)

        # Accumulate masks from all faces
        full_mask = np.zeros((frame.shape[0], frame.shape[1]), dtype=np.float32)
        for face in faces:
            if face.landmarks is None:
                continue
            mask = parser.parse(frame, face.landmarks)
            full_mask = np.maximum(full_mask, mask)

        # Apply visualization
        result_frame = apply_mask_visualization(frame, full_mask)

        # Draw bounding boxes
        for face in faces:
            x1, y1, x2, y2 = map(int, face.bbox[:4])
            cv2.rectangle(result_frame, (x1, y1), (x2, y2), (0, 255, 0), 2)

        cv2.putText(result_frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        out.write(result_frame)

        if frame_count % 100 == 0:
            print(f'  Processed {frame_count}/{total_frames} frames...')

    cap.release()
    out.release()
    print(f'Done! Output saved: {output_path}')


def run_camera(
    detector: RetinaFace,
    parser: XSeg,
    camera_id: int = 0,
) -> None:
    """Run real-time detection on webcam."""
    cap = cv2.VideoCapture(camera_id)
    if not cap.isOpened():
        print(f'Cannot open camera {camera_id}')
        return

    print("Press 'q' to quit")

    while True:
        ret, frame = cap.read()
        if not ret:
            break

        frame = cv2.flip(frame, 1)
        faces = detector.detect(frame)

        # Accumulate masks from all faces
        full_mask = np.zeros((frame.shape[0], frame.shape[1]), dtype=np.float32)
        for face in faces:
            if face.landmarks is None:
                continue
            mask = parser.parse(frame, face.landmarks)
            full_mask = np.maximum(full_mask, mask)

        # Apply visualization
        result_frame = apply_mask_visualization(frame, full_mask)

        # Draw bounding boxes
        for face in faces:
            x1, y1, x2, y2 = map(int, face.bbox[:4])
            cv2.rectangle(result_frame, (x1, y1), (x2, y2), (0, 255, 0), 2)

        cv2.putText(result_frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow('XSeg Face Segmentation', result_frame)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()


def main() -> None:
    arg_parser = argparse.ArgumentParser(description='XSeg face segmentation')
    arg_parser.add_argument('--source', type=str, required=True, help='Image/video path or camera ID (0, 1, ...)')
    arg_parser.add_argument('--save-dir', type=str, default='outputs', help='Output directory')
    arg_parser.add_argument(
        '--blur',
        type=float,
        default=0,
        help='Gaussian blur sigma for mask smoothing (default: 0 = raw)',
    )
    arg_parser.add_argument(
        '--align-size',
        type=int,
        default=256,
        help='Face alignment size (default: 256)',
    )
    args = arg_parser.parse_args()

    # Initialize models
    detector = RetinaFace()
    parser = XSeg(blur_sigma=args.blur, align_size=args.align_size)

    source_type = get_source_type(args.source)

    if source_type == 'camera':
        run_camera(detector, parser, int(args.source))
    elif source_type == 'image':
        if not os.path.exists(args.source):
            print(f'Error: Image not found: {args.source}')
            return
        process_image(detector, parser, args.source, args.save_dir)
    elif source_type == 'video':
        if not os.path.exists(args.source):
            print(f'Error: Video not found: {args.source}')
            return
        process_video(detector, parser, args.source, args.save_dir)
    else:
        print(f"Error: Unknown source type for '{args.source}'")
        print('Supported formats: images (.jpg, .png, ...), videos (.mp4, .avi, ...), or camera ID (0, 1, ...)')


if __name__ == '__main__':
    main()
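The green tint in `apply_mask_visualization` above is a per-pixel alpha blend: `image * (1 - mask * alpha) + tint * mask * 255 * alpha` with tint weights `(0.3, 0.7, 0.3)` on the BGR channels. A scalar sketch of that blend for one pixel (pure Python; the sample values are illustrative):

```python
def blend_pixel(pixel, mask_value, alpha=0.5, tint=(0.3, 0.7, 0.3)):
    # Same arithmetic as apply_mask_visualization, one BGR pixel at a time
    out = []
    for channel, t in zip(pixel, tint):
        v = channel * (1 - mask_value * alpha) + t * mask_value * 255 * alpha
        out.append(min(255, max(0, round(v))))
    return tuple(out)

# Full mask coverage pulls the pixel toward the green tint
print(blend_pixel((100, 100, 100), 1.0))  # (88, 139, 88)
# Zero mask leaves the pixel unchanged
print(blend_pixel((100, 100, 100), 0.0))  # (100, 100, 100)
```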
@@ -28,7 +28,7 @@ from __future__ import annotations

__license__ = 'MIT'
__author__ = 'Yakhyokhuja Valikhujaev'
__version__ = '2.2.1'
__version__ = '2.3.0'

from uniface.face_utils import compute_similarity, face_alignment
from uniface.log import Logger, enable_logging

@@ -48,7 +48,7 @@ from .detection import (
)
from .gaze import MobileGaze, create_gaze_estimator
from .landmark import Landmark106, create_landmarker
from .parsing import BiSeNet, create_face_parser
from .parsing import BiSeNet, XSeg, create_face_parser
from .privacy import BlurFace, anonymize_faces
from .recognition import AdaFace, ArcFace, MobileFace, SphereFace, create_recognizer
from .spoofing import MiniFASNet, create_spoofer

@@ -95,6 +95,7 @@ __all__ = [
    'MobileGaze',
    # Parsing models
    'BiSeNet',
    'XSeg',
    # Attribute models
    'AgeGender',
    'AttributeResult',

@@ -150,6 +150,15 @@ class ParsingWeights(str, Enum):
    RESNET34 = "parsing_resnet34"


class XSegWeights(str, Enum):
    """
    XSeg face segmentation model from DeepFaceLab.
    Outputs mask for face region.
    https://github.com/iperov/DeepFaceLab
    """
    DEFAULT = "xseg"


class MiniFASNetWeights(str, Enum):
    """
    MiniFASNet: Lightweight Face Anti-Spoofing models.

@@ -217,6 +226,8 @@ MODEL_URLS: dict[Enum, str] = {
    # Anti-Spoofing (MiniFASNet)
    MiniFASNetWeights.V1SE: 'https://github.com/yakhyo/face-anti-spoofing/releases/download/weights/MiniFASNetV1SE.onnx',
    MiniFASNetWeights.V2: 'https://github.com/yakhyo/face-anti-spoofing/releases/download/weights/MiniFASNetV2.onnx',
    # XSeg
    XSegWeights.DEFAULT: 'https://github.com/yakhyo/face-segmentation/releases/download/weights/xseg.onnx',
}

MODEL_SHA256: dict[Enum, str] = {

@@ -272,6 +283,8 @@ MODEL_SHA256: dict[Enum, str] = {
    # Anti-Spoofing (MiniFASNet)
    MiniFASNetWeights.V1SE: 'ebab7f90c7833fbccd46d3a555410e78d969db5438e169b6524be444862b3676',
    MiniFASNetWeights.V2: 'b32929adc2d9c34b9486f8c4c7bc97c1b69bc0ea9befefc380e4faae4e463907',
    # XSeg
    XSegWeights.DEFAULT: '0b57328efcb839d85973164b617ceee9dfe6cfcb2c82e8a033bba9f4f09b27e5',
}

CHUNK_SIZE = 8192
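The `MODEL_SHA256` table pairs with `CHUNK_SIZE` for integrity-checking downloaded weights. `verify_model_weights` itself is not shown in this diff, so the following is only a sketch of the typical chunked SHA-256 check such a function performs, not the library's actual code:

```python
import hashlib
import io

CHUNK_SIZE = 8192  # same chunk size as uniface.constants

def sha256_of_stream(stream) -> str:
    # Hash the stream in CHUNK_SIZE blocks so large .onnx files
    # never have to be loaded into memory at once
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(CHUNK_SIZE), b''):
        digest.update(chunk)
    return digest.hexdigest()

data = b'x' * 20000  # spans multiple chunks
print(sha256_of_stream(io.BytesIO(data)) == hashlib.sha256(data).hexdigest())  # True
```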
@@ -4,27 +4,25 @@

from __future__ import annotations

from uniface.constants import ParsingWeights
from uniface.constants import ParsingWeights, XSegWeights

from .base import BaseFaceParser
from .bisenet import BiSeNet
from .xseg import XSeg

__all__ = ['BaseFaceParser', 'BiSeNet', 'create_face_parser']
__all__ = ['BaseFaceParser', 'BiSeNet', 'XSeg', 'create_face_parser']


def create_face_parser(
    model_name: str | ParsingWeights = ParsingWeights.RESNET18,
    model_name: str | ParsingWeights | XSegWeights = ParsingWeights.RESNET18,
    **kwargs,
) -> BaseFaceParser:
    """Factory function to create a face parsing model instance.

    This function provides a convenient way to instantiate face parsing models
    without directly importing the specific model classes.
    """Factory function to create a face parsing model.

    Args:
        model_name: The face parsing model to create. Can be either a string
            or a ParsingWeights enum value. Available options:
            - 'parsing_resnet18' or ParsingWeights.RESNET18 (default)
            - 'parsing_resnet34' or ParsingWeights.RESNET34
        model_name: Model to create. Options: ParsingWeights.RESNET18/RESNET34 (BiSeNet),
            XSegWeights.DEFAULT (XSeg, requires landmarks).
        **kwargs: Additional arguments passed to the model constructor.

    Returns:
        An instance of the requested face parsing model.

@@ -33,20 +31,32 @@ def create_face_parser(
        ValueError: If the model_name is not recognized.

    Example:
        >>> from uniface.parsing import create_face_parser
        >>> from uniface.constants import ParsingWeights
        >>> parser = create_face_parser(ParsingWeights.RESNET18)
        >>> mask = parser.parse(face_crop)
    """
    # Handle XSegWeights
    if isinstance(model_name, XSegWeights):
        return XSeg(model_name=model_name, **kwargs)

    # Convert string to enum if necessary
    if isinstance(model_name, str):
        # Try XSegWeights first
        try:
            xseg_model = XSegWeights(model_name)
            return XSeg(model_name=xseg_model, **kwargs)
        except ValueError:
            pass

        # Try ParsingWeights
        try:
            model_name = ParsingWeights(model_name)
        except ValueError as e:
            valid_models = [e.value for e in ParsingWeights]
            valid_parsing = [m.value for m in ParsingWeights]
            valid_xseg = [m.value for m in XSegWeights]
            valid_models = valid_parsing + valid_xseg
            raise ValueError(
                f"Unknown face parsing model: '{model_name}'. Valid options are: {', '.join(valid_models)}"
            ) from e

    # All parsing models use the same BiSeNet class
    return BiSeNet(model_name=model_name)
    # BiSeNet models
    return BiSeNet(model_name=model_name, **kwargs)
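The string dispatch in `create_face_parser` above follows a common pattern: try to coerce the string into each weights enum in turn, treating `ValueError` as "not a member of this enum". A standalone sketch of just that dispatch (the enums are re-declared here for illustration, with the same values as the diff):

```python
from enum import Enum

class ParsingWeights(str, Enum):
    RESNET18 = 'parsing_resnet18'
    RESNET34 = 'parsing_resnet34'

class XSegWeights(str, Enum):
    DEFAULT = 'xseg'

def resolve_weights(name: str) -> Enum:
    # Enum(value) raises ValueError when the value is not a member,
    # so each failed coercion simply falls through to the next enum
    for enum_cls in (XSegWeights, ParsingWeights):
        try:
            return enum_cls(name)
        except ValueError:
            continue
    valid = [m.value for m in ParsingWeights] + [m.value for m in XSegWeights]
    raise ValueError(f"Unknown face parsing model: '{name}'. Valid options: {', '.join(valid)}")

print(resolve_weights('xseg') is XSegWeights.DEFAULT)  # True
```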
242 uniface/parsing/xseg.py Normal file
@@ -0,0 +1,242 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo

from __future__ import annotations

import cv2
import numpy as np

from uniface.constants import XSegWeights
from uniface.face_utils import face_alignment
from uniface.log import Logger
from uniface.model_store import verify_model_weights
from uniface.onnx_utils import create_onnx_session

from .base import BaseFaceParser

__all__ = ['XSeg']


class XSeg(BaseFaceParser):
    """
    XSeg: Face Segmentation Model from DeepFaceLab with ONNX Runtime.

    XSeg outputs a mask for face regions. Unlike BiSeNet which works
    on bbox crops, XSeg requires 5-point landmarks for face alignment. The model
    uses NHWC input format and outputs values in [0, 1] range.

    Reference:
        https://github.com/iperov/DeepFaceLab

    Args:
        model_name (XSegWeights): The enum specifying the XSeg model to load.
            Defaults to `XSegWeights.DEFAULT`.
        align_size (int): Face alignment output size. Must be multiple of 112 or 128.
            Defaults to 256.
        blur_sigma (float): Gaussian blur sigma for mask smoothing.
            0 = raw output (no blur). Defaults to 0.
        providers (list[str] | None): ONNX Runtime execution providers. If None,
            auto-detects the best available provider.

    Attributes:
        align_size (int): Face alignment output size.
        blur_sigma (float): Blur sigma for post-processing.
        input_size (tuple[int, int]): Model input dimensions (width, height).

    Example:
        >>> from uniface.parsing import XSeg
        >>> from uniface import RetinaFace
        >>>
        >>> detector = RetinaFace()
        >>> parser = XSeg()
        >>>
        >>> faces = detector.detect(image)
        >>> for face in faces:
        ...     if face.landmarks is not None:
        ...         mask = parser.parse(image, face.landmarks)
        ...         print(f'Mask shape: {mask.shape}')
    """

    def __init__(
        self,
        model_name: XSegWeights = XSegWeights.DEFAULT,
        align_size: int = 256,
        blur_sigma: float = 0,
        providers: list[str] | None = None,
    ) -> None:
        Logger.info(f'Initializing XSeg with model={model_name}, align_size={align_size}, blur_sigma={blur_sigma}')

        self.align_size = align_size
        self.blur_sigma = blur_sigma
        self.providers = providers

        self.model_path = verify_model_weights(model_name)
        self._initialize_model()

    def _initialize_model(self) -> None:
        """
        Initialize the ONNX model from the stored model path.

        Raises:
            RuntimeError: If the model fails to load or initialize.
        """
        try:
            self.session = create_onnx_session(self.model_path, providers=self.providers)

            # Get input configuration
            input_cfg = self.session.get_inputs()[0]
            input_shape = input_cfg.shape
            self.input_name = input_cfg.name

            # NHWC format: (N, H, W, C)
            if isinstance(input_shape[1], int) and isinstance(input_shape[2], int):
                self.input_size = (input_shape[2], input_shape[1])
            else:
                self.input_size = (256, 256)
                Logger.info(f'Dynamic input shape detected, using default: {self.input_size}')

            # Get output configuration
            outputs = self.session.get_outputs()
            self.output_names = [output.name for output in outputs]

            Logger.info(f'XSeg initialized with input size {self.input_size}')

        except Exception as e:
            Logger.error(f"Failed to load XSeg model from '{self.model_path}'", exc_info=True)
            raise RuntimeError(f'Failed to initialize XSeg model: {e}') from e

    def preprocess(self, face_crop: np.ndarray) -> np.ndarray:
        """
        Preprocess an aligned face crop for inference.

        Args:
            face_crop (np.ndarray): An aligned face crop in BGR format.

        Returns:
            np.ndarray: Preprocessed image tensor with shape (1, H, W, 3).
        """
        # Resize to model input size
        image = cv2.resize(face_crop, self.input_size, interpolation=cv2.INTER_LINEAR)

        # Normalize to [0, 1]
        image = image.astype(np.float32) / 255.0

        # Add batch dimension (NHWC format)
        image = np.expand_dims(image, axis=0)

        return image

    def postprocess(self, outputs: np.ndarray, crop_size: tuple[int, int]) -> np.ndarray:
        """
        Postprocess model output to segmentation mask.

        Args:
            outputs (np.ndarray): Raw model output.
            crop_size (tuple[int, int]): Size to resize mask to (width, height).

        Returns:
            np.ndarray: Segmentation mask as float32 in range [0, 1].
        """
        # Squeeze and clip to valid range
        mask = outputs.squeeze().clip(0, 1).astype(np.float32)

        # Resize back to crop size
        mask = cv2.resize(mask, crop_size, interpolation=cv2.INTER_LINEAR)

        # Apply optional blur and threshold
        if self.blur_sigma > 0:
            mask = cv2.GaussianBlur(mask, (0, 0), self.blur_sigma)
            mask = (mask.clip(0.5, 1) - 0.5) * 2

        return mask

    def parse(self, image: np.ndarray, landmarks: np.ndarray) -> np.ndarray:
        """
        Perform face segmentation using 5-point landmarks.

        Args:
            image (np.ndarray): Input image in BGR format.
            landmarks (np.ndarray): 5-point facial landmarks with shape (5, 2).

        Returns:
            np.ndarray: Segmentation mask in original image space, values in [0, 1].

        Raises:
            ValueError: If landmarks shape is not (5, 2).
        """
        if landmarks.shape != (5, 2):
            raise ValueError(f'Landmarks must have shape (5, 2), got {landmarks.shape}')

        # Align face using landmarks
        face_crop, inverse_matrix = face_alignment(image, landmarks, image_size=self.align_size)

        # Run inference
        crop_size = (face_crop.shape[1], face_crop.shape[0])
        input_tensor = self.preprocess(face_crop)
        outputs = self.session.run(self.output_names, {self.input_name: input_tensor})

        # Postprocess mask
        mask = self.postprocess(outputs[0], crop_size)

        # Warp mask back to original image space
        h, w = image.shape[:2]
        warped_mask = cv2.warpAffine(
            mask,
            inverse_matrix,
            (w, h),
            flags=cv2.INTER_LINEAR,
            borderMode=cv2.BORDER_CONSTANT,
            borderValue=0,
        )

        return warped_mask

    def parse_aligned(self, face_crop: np.ndarray) -> np.ndarray:
        """
        Perform segmentation on an already aligned face crop.

        Args:
            face_crop (np.ndarray): An aligned face crop in BGR format.

        Returns:
            np.ndarray: Segmentation mask with same size as input, values in [0, 1].
        """
        crop_size = (face_crop.shape[1], face_crop.shape[0])

        # Run inference
        input_tensor = self.preprocess(face_crop)
        outputs = self.session.run(self.output_names, {self.input_name: input_tensor})

        return self.postprocess(outputs[0], crop_size)

    def parse_with_inverse(
        self,
        image: np.ndarray,
        landmarks: np.ndarray,
    ) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
        """
        Parse face and return mask with inverse matrix for custom warping.

        Args:
            image (np.ndarray): Input image in BGR format.
            landmarks (np.ndarray): 5-point facial landmarks with shape (5, 2).

        Returns:
            Tuple of (mask, face_crop, inverse_matrix).
        """
        if landmarks.shape != (5, 2):
            raise ValueError(f'Landmarks must have shape (5, 2), got {landmarks.shape}')

        # Align face using landmarks
        face_crop, inverse_matrix = face_alignment(image, landmarks, image_size=self.align_size)

        # Run inference
        crop_size = (face_crop.shape[1], face_crop.shape[0])
        input_tensor = self.preprocess(face_crop)
        outputs = self.session.run(self.output_names, {self.input_name: input_tensor})

        # Postprocess mask (in crop space)
        mask = self.postprocess(outputs[0], crop_size)

        return mask, face_crop, inverse_matrix
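When `blur_sigma > 0`, `postprocess` re-sharpens the blurred mask with `(mask.clip(0.5, 1) - 0.5) * 2`: values below 0.5 collapse to 0 and the 0.5–1 band is stretched back over 0–1, which keeps soft edges while discarding faint blur halos. The scalar version of that remap (pure Python, for illustration):

```python
def remap(m: float) -> float:
    # (clip(m, 0.5, 1) - 0.5) * 2 from XSeg.postprocess, one value at a time
    return (min(max(m, 0.5), 1.0) - 0.5) * 2

print([remap(v) for v in (0.0, 0.25, 0.5, 0.75, 1.0)])  # [0.0, 0.0, 0.0, 0.5, 1.0]
```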