5 Commits

Author SHA1 Message Date
Yakhyokhuja Valikhujaev
331f46be7c release: Update release version and docs (#79) 2026-02-05 21:45:28 +09:00
Yakhyokhuja Valikhujaev
9991fae62a docs: Update UniFace library documentation and README.md (#78)
* docs: Update wrong/missing references

* docs: Update README.md
2026-02-04 20:45:02 +09:00
Yakhyokhuja Valikhujaev
b74ab95d39 docs: Update UniFace github image (#75) 2026-01-25 17:07:40 +09:00
Yakhyokhuja Valikhujaev
d2b0303bfe docs: Add additional badges to README.md (#74)
* Update badges in README.md
* Update ci.yml
2026-01-24 22:25:09 +09:00
Yakhyokhuja Valikhujaev
5f74487eb3 feat: Add XSeg for Face Segmentation (#72)
* feat: Add XSeg for Face Segmentation DeepFaceLab

* docs: Update model inference related reference

* chore: Update jupyter notebook example for face segmentation
2026-01-22 22:33:31 +09:00
33 changed files with 1502 additions and 155 deletions

BIN .github/logos/new/uniface_enhanced.webp (vendored, new file, 427 KiB)
BIN (file name not shown, 1.7 MiB)
BIN .github/logos/new/uniface_rounded.png (vendored, new file, 1.8 MiB)
BIN (file name not shown, 1.9 MiB)
BIN (file name not shown, 872 KiB)
BIN (file name not shown, 62 KiB)

.github/workflows/ci.yml

@@ -1,4 +1,4 @@
name: CI
name: Build
on:
push:

.gitignore (vendored): 1 change

@@ -1,4 +1,5 @@
tmp_*
.vscode/
# Byte-compiled / optimized / DLL files
__pycache__/

README.md: 136 changes

@@ -1,32 +1,33 @@
# UniFace: All-in-One Face Analysis Library
<h1 align="center">UniFace: All-in-One Face Analysis Library</h1>
<div align="center">
[![PyPI](https://img.shields.io/pypi/v/uniface.svg?label=PyPI)](https://pypi.org/project/uniface/)
[![Python](https://img.shields.io/badge/Python-3.10%2B-blue)](https://www.python.org/)
[![PyPI Version](https://img.shields.io/pypi/v/uniface.svg?label=Version)](https://pypi.org/project/uniface/)
[![Python Version](https://img.shields.io/badge/Python-3.10%2B-blue)](https://www.python.org/)
[![License](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![CI](https://github.com/yakhyo/uniface/actions/workflows/ci.yml/badge.svg)](https://github.com/yakhyo/uniface/actions)
[![Github Build Status](https://github.com/yakhyo/uniface/actions/workflows/ci.yml/badge.svg)](https://github.com/yakhyo/uniface/actions)
[![PyPI Downloads](https://static.pepy.tech/personalized-badge/uniface?period=total&units=INTERNATIONAL_SYSTEM&left_color=GRAY&right_color=BLUE&left_text=Downloads)](https://pepy.tech/projects/uniface)
[![Docs](https://img.shields.io/badge/Docs-UniFace-blue.svg)](https://yakhyo.github.io/uniface/)
[![UniFace Documentation](https://img.shields.io/badge/Docs-UniFace-blue.svg)](https://yakhyo.github.io/uniface/)
[![Kaggle Badge](https://img.shields.io/badge/Notebooks-Kaggle?label=Kaggle&color=blue)](https://www.kaggle.com/yakhyokhuja/code)
</div>
<div align="center">
<img src=".github/logos/logo_web.webp" width=80%>
<img src="https://raw.githubusercontent.com/yakhyo/uniface/main/.github/logos/new/uniface_rounded_q80.webp" width="90%" alt="UniFace - All-in-One Open-Source Face Analysis Library">
</div>
---
**UniFace** is a lightweight, production-ready face analysis library built on ONNX Runtime. It provides high-performance face detection, recognition, landmark detection, face parsing, gaze estimation, and attribute analysis with hardware acceleration support across platforms.
> 💬 **Have questions?** [Chat with this codebase on DeepWiki](https://deepwiki.com/yakhyo/uniface) - AI-powered docs that let you ask anything about UniFace.
---
## Features
- **Face Detection** — RetinaFace, SCRFD, YOLOv5-Face, and YOLOv8-Face with 5-point landmarks
- **Face Recognition** — ArcFace, MobileFace, and SphereFace embeddings
- **Facial Landmarks** — 106-point landmark localization
- **Face Parsing** — BiSeNet semantic segmentation (19 classes)
- **Facial Landmarks** — 106-point landmark localization module (separate from 5-point detector landmarks)
- **Face Parsing** — BiSeNet semantic segmentation (19 classes), XSeg face masking
- **Gaze Estimation** — Real-time gaze direction with MobileGaze
- **Attribute Analysis** — Age, gender, race (FairFace), and emotion
- **Anti-Spoofing** — Face liveness detection with MiniFASNet
@@ -37,31 +38,55 @@
## Installation
**Standard installation**
```bash
# Standard installation
pip install uniface
```
# GPU support (CUDA)
**GPU support (CUDA)**
```bash
pip install uniface[gpu]
```
# From source
**From source (latest version)**
```bash
git clone https://github.com/yakhyo/uniface.git
cd uniface && pip install -e .
```
**Optional dependencies**
- Emotion model uses TorchScript and requires `torch`:
`pip install torch` (choose the correct build for your OS/CUDA)
- YOLOv5-Face and YOLOv8-Face support faster NMS with `torchvision`:
`pip install torch torchvision` then use `nms_mode='torchvision'`
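As a rough sketch of the torchvision option (the detector class name `YOLOv5Face` is assumed here for illustration; check the API reference for the exact import):
```python
# Hypothetical sketch: torchvision-backed NMS for a YOLO-family detector.
# Assumes torch and torchvision are installed and that the detector
# accepts the `nms_mode` argument described above.
from uniface import YOLOv5Face  # class name assumed for this sketch

detector = YOLOv5Face(nms_mode='torchvision')
```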
---
## Quick Example
## Model Downloads and Cache
Models are downloaded automatically on first use and verified via SHA-256.
Default cache location: `~/.uniface/models`
You can override it with `UNIFACE_CACHE_DIR=/your/cache/path`
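A minimal sketch of the override from Python (assuming the variable is read when a model is first initialized):
```python
import os

# Assumption in this sketch: set the cache path before creating any model.
os.environ['UNIFACE_CACHE_DIR'] = '/data/uniface-models'

from uniface import RetinaFace

detector = RetinaFace()  # weights download into /data/uniface-models on first use
```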
---
## Quick Example (Detection)
```python
import cv2
from uniface import RetinaFace
# Initialize detector (models auto-download on first use)
detector = RetinaFace()
# Detect faces
image = cv2.imread("photo.jpg")
if image is None:
    raise ValueError("Failed to load image. Check the path to 'photo.jpg'.")
faces = detector.detect(image)
for face in faces:
@@ -71,14 +96,52 @@ for face in faces:
```
<div align="center">
<img src="assets/test_result.png">
<img src="https://raw.githubusercontent.com/yakhyo/uniface/main/assets/test_result.png" width="90%">
<p>Face Detection Model Output</p>
</div>
---
## Example (Face Analyzer)
```python
import cv2
from uniface import RetinaFace, ArcFace, FaceAnalyzer
detector = RetinaFace()
recognizer = ArcFace()
analyzer = FaceAnalyzer(detector, recognizer=recognizer)
image = cv2.imread("photo.jpg")
if image is None:
    raise ValueError("Failed to load image. Check the path to 'photo.jpg'.")
faces = analyzer.analyze(image)
for face in faces:
    print(face.bbox, face.embedding.shape if face.embedding is not None else None)
```
---
## Execution Providers (ONNX Runtime)
```python
from uniface import RetinaFace
# Force CPU-only inference
detector = RetinaFace(providers=["CPUExecutionProvider"])
```
See more in the docs:
https://yakhyo.github.io/uniface/concepts/execution-providers/
---
## Documentation
📚 **Full documentation**: [yakhyo.github.io/uniface](https://yakhyo.github.io/uniface/)
Full documentation: https://yakhyo.github.io/uniface/
| Resource | Description |
|----------|-------------|
@@ -88,7 +151,9 @@ for face in faces:
| [Tutorials](https://yakhyo.github.io/uniface/recipes/image-pipeline/) | Step-by-step workflow examples |
| [Guides](https://yakhyo.github.io/uniface/concepts/overview/) | Architecture and design principles |
### Jupyter Notebooks
---
## Jupyter Notebooks
| Example | Colab | Description |
|---------|:-----:|-------------|
@@ -100,6 +165,20 @@ for face in faces:
| [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/06_face_parsing.ipynb) | Semantic face segmentation |
| [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | Privacy-preserving blur |
| [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | Gaze direction estimation |
| [09_face_segmentation.ipynb](examples/09_face_segmentation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/09_face_segmentation.ipynb) | Face segmentation with XSeg |
---
## Licensing and Model Usage
UniFace is MIT-licensed, but several pretrained models carry their own licenses.
Review: https://yakhyo.github.io/uniface/license-attribution/
Notable examples:
- YOLOv5-Face and YOLOv8-Face weights are GPL-3.0
- FairFace weights are CC BY 4.0
If you plan commercial use, verify model license compatibility.
---
@@ -107,12 +186,13 @@ for face in faces:
| Feature | Repository | Training | Description |
|---------|------------|:--------:|-------------|
| Detection | [retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) | [x] | RetinaFace PyTorch Training & Export |
| Detection | [retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) | ✓ | RetinaFace PyTorch Training & Export |
| Detection | [yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference) | - | YOLOv5-Face ONNX Inference |
| Detection | [yolov8-face-onnx-inference](https://github.com/yakhyo/yolov8-face-onnx-inference) | - | YOLOv8-Face ONNX Inference |
| Recognition | [face-recognition](https://github.com/yakhyo/face-recognition) | [x] | MobileFace, SphereFace Training |
| Parsing | [face-parsing](https://github.com/yakhyo/face-parsing) | [x] | BiSeNet Face Parsing |
| Gaze | [gaze-estimation](https://github.com/yakhyo/gaze-estimation) | [x] | MobileGaze Training |
| Recognition | [face-recognition](https://github.com/yakhyo/face-recognition) | ✓ | MobileFace, SphereFace Training |
| Parsing | [face-parsing](https://github.com/yakhyo/face-parsing) | ✓ | BiSeNet Face Parsing |
| Parsing | [face-segmentation](https://github.com/yakhyo/face-segmentation) | - | XSeg Face Segmentation |
| Gaze | [gaze-estimation](https://github.com/yakhyo/gaze-estimation) | ✓ | MobileGaze Training |
| Anti-Spoofing | [face-anti-spoofing](https://github.com/yakhyo/face-anti-spoofing) | - | MiniFASNet Inference |
| Attributes | [fairface-onnx](https://github.com/yakhyo/fairface-onnx) | - | FairFace ONNX Inference |
@@ -122,7 +202,15 @@ for face in faces:
## Contributing
Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
Contributions are welcome. Please see [CONTRIBUTING.md](CONTRIBUTING.md).
## Support
If you find this project useful, consider giving it a ⭐ on GitHub — it helps others discover it!
Questions or feedback:
- GitHub Issues: https://github.com/yakhyo/uniface/issues
- DeepWiki Q&A: https://deepwiki.com/yakhyo/uniface
## License


@@ -17,8 +17,8 @@ detector = RetinaFace()
**Priority order:**
1. **CUDAExecutionProvider** - NVIDIA GPU
2. **CoreMLExecutionProvider** - Apple Silicon
1. **CoreMLExecutionProvider** - Apple Silicon
2. **CUDAExecutionProvider** - NVIDIA GPU
3. **CPUExecutionProvider** - Fallback
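To override the auto-detected order, pass an explicit provider list (a minimal sketch; the provider names are standard ONNX Runtime identifiers):
```python
from uniface import RetinaFace

# Prefer CUDA, fall back to CPU if CUDA is unavailable
detector = RetinaFace(providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
```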
---


@@ -199,16 +199,16 @@ print(f"Classes: {np.unique(mask)}") # [0, 1, 2, ...]
| ID | Class | ID | Class |
|----|-------|----|-------|
| 0 | Background | 10 | Ear Ring |
| 1 | Skin | 11 | Nose |
| 2 | Left Eyebrow | 12 | Mouth |
| 3 | Right Eyebrow | 13 | Upper Lip |
| 4 | Left Eye | 14 | Lower Lip |
| 5 | Right Eye | 15 | Neck |
| 6 | Eye Glasses | 16 | Neck Lace |
| 7 | Left Ear | 17 | Cloth |
| 8 | Right Ear | 18 | Hair |
| 9 | Hat | | |
| 0 | Background | 10 | Nose |
| 1 | Skin | 11 | Mouth |
| 2 | Left Eyebrow | 12 | Upper Lip |
| 3 | Right Eyebrow | 13 | Lower Lip |
| 4 | Left Eye | 14 | Neck |
| 5 | Right Eye | 15 | Necklace |
| 6 | Eyeglasses | 16 | Cloth |
| 7 | Left Ear | 17 | Hair |
| 8 | Right Ear | 18 | Hat |
| 9 | Earring | | |
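A quick sketch of reading components under the updated IDs (the `parser` and `face_crop` follow the BiSeNet usage shown above this table):
```python
import numpy as np

mask = parser.parse(face_crop)  # (H, W) uint8 class map
nose_mask = (mask == 10)        # Nose under the new ID scheme
hair_mask = (mask == 17)        # Hair moved from ID 18 to 17
print(np.unique(mask))          # class IDs present in this face
```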
---


@@ -32,9 +32,9 @@ Default cache directory:
```
~/.uniface/models/
├── retinaface_mv2.onnx
├── w600k_mbf.onnx
├── 2d106det.onnx
├── retinaface_mnet_v2.onnx
├── arcface_mnet.onnx
├── 2d_106.onnx
├── gaze_resnet34.onnx
├── parsing_resnet18.onnx
└── ...


@@ -151,12 +151,18 @@ for face in faces:
For convenience, `FaceAnalyzer` combines multiple modules:
```python
from uniface import FaceAnalyzer
from uniface import FaceAnalyzer, RetinaFace, ArcFace, AgeGender, FairFace
detector = RetinaFace()
recognizer = ArcFace()
age_gender = AgeGender()
fairface = FairFace()
analyzer = FaceAnalyzer(
    detect=True,
    recognize=True,
    attributes=True
    detector,
    recognizer=recognizer,
    age_gender=age_gender,
    fairface=fairface,
)
faces = analyzer.analyze(image)


@@ -10,12 +10,16 @@ template: home.html
# UniFace { .hero-title }
<p class="hero-subtitle">A lightweight, production-ready face analysis library built on ONNX Runtime</p>
<p class="hero-subtitle">All-in-One Open-Source Face Analysis Library</p>
[![PyPI](https://img.shields.io/pypi/v/uniface.svg?label=PyPI)](https://pypi.org/project/uniface/)
[![Python](https://img.shields.io/badge/Python-3.10%2B-blue)](https://www.python.org/)
[![PyPI Version](https://img.shields.io/pypi/v/uniface.svg?label=Version)](https://pypi.org/project/uniface/)
[![Python Version](https://img.shields.io/badge/Python-3.10%2B-blue)](https://www.python.org/)
[![License](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![Github Build Status](https://github.com/yakhyo/uniface/actions/workflows/ci.yml/badge.svg)](https://github.com/yakhyo/uniface/actions)
[![PyPI Downloads](https://static.pepy.tech/personalized-badge/uniface?period=total&units=INTERNATIONAL_SYSTEM&left_color=GRAY&right_color=BLUE&left_text=Downloads)](https://pepy.tech/projects/uniface)
[![Kaggle Badge](https://img.shields.io/badge/Notebooks-Kaggle?label=Kaggle&color=blue)](https://www.kaggle.com/yakhyokhuja/code)
<img src="https://raw.githubusercontent.com/yakhyo/uniface/main/.github/logos/new/uniface_rounded_q80.webp" alt="UniFace - All-in-One Open-Source Face Analysis Library" style="max-width: 90%; margin: 1rem 0;">
[Get Started](quickstart.md){ .md-button .md-button--primary }
[View on GitHub](https://github.com/yakhyo/uniface){ .md-button }


@@ -107,7 +107,9 @@ UniFace has minimal dependencies:
|---------|---------|
| `numpy` | Array operations |
| `opencv-python` | Image processing |
| `onnx` | ONNX model format support |
| `onnxruntime` | Model inference |
| `scikit-image` | Geometric transforms |
| `requests` | Model download |
| `tqdm` | Progress bars |


@@ -22,7 +22,7 @@ RetinaFace models are trained on the WIDER FACE dataset.
!!! info "Accuracy & Benchmarks"
    **Accuracy**: WIDER FACE validation set (Easy/Medium/Hard subsets) - from [RetinaFace paper](https://arxiv.org/abs/1905.00641)
    **Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --iterations 100`
    **Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image>`
---
@@ -32,13 +32,13 @@ SCRFD (Sample and Computation Redistribution for Efficient Face Detection) model
| Model Name | Params | Size | Easy | Medium | Hard |
| ---------------- | ------ | ----- | ------ | ------ | ------ |
| `SCRFD_500M` | 0.6M | 2.5MB | 90.57% | 88.12% | 68.51% |
| `SCRFD_10G` :material-check-circle: | 4.2M | 17MB | 95.16% | 93.87% | 83.05% |
| `SCRFD_500M_KPS` | 0.6M | 2.5MB | 90.57% | 88.12% | 68.51% |
| `SCRFD_10G_KPS` :material-check-circle: | 4.2M | 17MB | 95.16% | 93.87% | 83.05% |
!!! info "Accuracy & Benchmarks"
    **Accuracy**: WIDER FACE validation set - from [SCRFD paper](https://arxiv.org/abs/2105.04714)
    **Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --iterations 100`
    **Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image>`
---
@@ -55,7 +55,7 @@ YOLOv5-Face models provide detection with 5-point facial landmarks, trained on W
!!! info "Accuracy & Benchmarks"
    **Accuracy**: WIDER FACE validation set - from [YOLOv5-Face paper](https://arxiv.org/abs/2105.12931)
    **Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image> --iterations 100`
    **Speed**: Benchmark on your own hardware using `python tools/detection.py --source <image>`

!!! note "Fixed Input Size"
    All YOLOv5-Face models use a fixed input size of 640×640.
@@ -300,6 +300,32 @@ BiSeNet (Bilateral Segmentation Network) models for semantic face parsing. Segme
---
### XSeg
XSeg from DeepFaceLab outputs masks for face regions. Requires 5-point landmarks for face alignment.
| Model Name | Size | Output |
|------------|--------|--------|
| `DEFAULT` | 67 MB | Mask [0, 1] |
!!! info "Model Details"
    **Origin**: DeepFaceLab
    **Input**: NHWC format, normalized to [0, 1]
    **Alignment**: Requires 5-point landmarks (not bbox crops)
**Applications:**
- Face region extraction
- Face swapping pipelines
- Occlusion handling
!!! note "Input Requirements"
    Requires 5-point facial landmarks. Use a face detector like RetinaFace to obtain landmarks first.
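A minimal sketch of that two-step flow, mirroring the parsing guide:
```python
import cv2
from uniface import RetinaFace
from uniface.parsing import XSeg

detector = RetinaFace()
parser = XSeg()

image = cv2.imread("photo.jpg")
for face in detector.detect(image):
    if face.landmarks is not None:  # 5-point landmarks from the detector
        mask = parser.parse(image, face.landmarks)
```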
---
## Anti-Spoofing Models
### MiniFASNet Family
@@ -343,6 +369,7 @@ Models are automatically downloaded and cached on first use.
- **Face Recognition Training**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) - ArcFace, MobileFace, SphereFace training code
- **Gaze Estimation Training**: [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation) - MobileGaze training code and pretrained weights
- **Face Parsing Training**: [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing) - BiSeNet training code and pretrained weights
- **Face Segmentation**: [yakhyo/face-segmentation](https://github.com/yakhyo/face-segmentation) - XSeg ONNX Inference
- **Face Anti-Spoofing**: [yakhyo/face-anti-spoofing](https://github.com/yakhyo/face-anti-spoofing) - MiniFASNet ONNX inference (weights from [minivision-ai/Silent-Face-Anti-Spoofing](https://github.com/minivision-ai/Silent-Face-Anti-Spoofing))
- **FairFace**: [yakhyo/fairface-onnx](https://github.com/yakhyo/fairface-onnx) - FairFace ONNX inference for race, gender, age prediction
- **InsightFace**: [deepinsight/insightface](https://github.com/deepinsight/insightface) - Model architectures and pretrained weights


@@ -125,7 +125,7 @@ from uniface.attribute import Emotion
from uniface.constants import DDAMFNWeights
detector = RetinaFace()
emotion = Emotion(model_name=DDAMFNWeights.AFFECNET7)
emotion = Emotion(model_weights=DDAMFNWeights.AFFECNET7)
faces = detector.detect(image)
@@ -169,10 +169,10 @@ from uniface.attribute import Emotion
from uniface.constants import DDAMFNWeights
# 7-class emotion
emotion = Emotion(model_name=DDAMFNWeights.AFFECNET7)
emotion = Emotion(model_weights=DDAMFNWeights.AFFECNET7)
# 8-class emotion
emotion = Emotion(model_name=DDAMFNWeights.AFFECNET8)
emotion = Emotion(model_weights=DDAMFNWeights.AFFECNET8)
```
---
@@ -206,12 +206,11 @@ for face in faces:
### Using FaceAnalyzer
```python
from uniface import FaceAnalyzer
from uniface import FaceAnalyzer, RetinaFace, AgeGender
analyzer = FaceAnalyzer(
    detect=True,
    recognize=False,
    attributes=True # Uses AgeGender
    RetinaFace(),
    age_gender=AgeGender(),
)
faces = analyzer.analyze(image)


@@ -296,7 +296,7 @@ cv2.imwrite("result.jpg", image)
Benchmark on your hardware:
```bash
python tools/detection.py --source image.jpg --iterations 100
python tools/detection.py --source image.jpg
```
---


@@ -1,15 +1,16 @@
# Parsing
Face parsing segments faces into semantic components (skin, eyes, nose, mouth, hair, etc.).
Face parsing segments faces into semantic components or face regions.
---
## Available Models
| Model | Backbone | Size | Classes |
|-------|----------|------|---------|
| **BiSeNet ResNet18** :material-check-circle: | ResNet18 | 51 MB | 19 |
| BiSeNet ResNet34 | ResNet34 | 89 MB | 19 |
| Model | Backbone | Size | Output |
|-------|----------|------|--------|
| **BiSeNet ResNet18** :material-check-circle: | ResNet18 | 51 MB | 19 classes |
| BiSeNet ResNet34 | ResNet34 | 89 MB | 19 classes |
| XSeg | - | 67 MB | Mask |
---
@@ -45,16 +46,16 @@ cv2.imwrite("parsed.jpg", vis_bgr)
| ID | Class | ID | Class |
|----|-------|----|-------|
| 0 | Background | 10 | Ear Ring |
| 1 | Skin | 11 | Nose |
| 2 | Left Eyebrow | 12 | Mouth |
| 3 | Right Eyebrow | 13 | Upper Lip |
| 4 | Left Eye | 14 | Lower Lip |
| 5 | Right Eye | 15 | Neck |
| 6 | Eye Glasses | 16 | Neck Lace |
| 7 | Left Ear | 17 | Cloth |
| 8 | Right Ear | 18 | Hair |
| 9 | Hat | | |
| 0 | Background | 10 | Nose |
| 1 | Skin | 11 | Mouth |
| 2 | Left Eyebrow | 12 | Upper Lip |
| 3 | Right Eyebrow | 13 | Lower Lip |
| 4 | Left Eye | 14 | Neck |
| 5 | Right Eye | 15 | Necklace |
| 6 | Eyeglasses | 16 | Cloth |
| 7 | Left Ear | 17 | Hair |
| 8 | Right Ear | 18 | Hat |
| 9 | Earring | | |
---
@@ -125,7 +126,7 @@ mask = parser.parse(face_image)
# Extract specific component
SKIN = 1
HAIR = 18
HAIR = 17
LEFT_EYE = 4
RIGHT_EYE = 5
@@ -148,10 +149,10 @@ mask = parser.parse(face_image)
component_names = {
    0: 'Background', 1: 'Skin', 2: 'L-Eyebrow', 3: 'R-Eyebrow',
    4: 'L-Eye', 5: 'R-Eye', 6: 'Glasses', 7: 'L-Ear', 8: 'R-Ear',
    9: 'Hat', 10: 'Earring', 11: 'Nose', 12: 'Mouth',
    13: 'U-Lip', 14: 'L-Lip', 15: 'Neck', 16: 'Necklace',
    17: 'Cloth', 18: 'Hair'
    4: 'L-Eye', 5: 'R-Eye', 6: 'Eyeglasses', 7: 'L-Ear', 8: 'R-Ear',
    9: 'Earring', 10: 'Nose', 11: 'Mouth',
    12: 'U-Lip', 13: 'L-Lip', 14: 'Neck', 15: 'Necklace',
    16: 'Cloth', 17: 'Hair', 18: 'Hat'
}
for class_id in np.unique(mask):
@@ -218,7 +219,7 @@ def replace_background(image, mask, background):
```python
def get_hair_mask(mask):
    """Extract clean hair mask."""
    hair_mask = (mask == 18).astype(np.uint8) * 255
    hair_mask = (mask == 17).astype(np.uint8) * 255

    # Clean up with morphological operations
    kernel = np.ones((5, 5), np.uint8)
@@ -248,12 +249,83 @@ vis_result = vis_parsing_maps(
---
## XSeg
XSeg outputs a mask for face regions. Unlike BiSeNet which works on bbox crops, XSeg requires 5-point landmarks for face alignment.
### Basic Usage
```python
import cv2
from uniface import RetinaFace
from uniface.parsing import XSeg
detector = RetinaFace()
parser = XSeg()
image = cv2.imread("photo.jpg")
faces = detector.detect(image)
for face in faces:
    if face.landmarks is not None:
        mask = parser.parse(image, face.landmarks)
        print(f"Mask shape: {mask.shape}")  # (H, W), values in [0, 1]
```
### Parameters
```python
from uniface.parsing import XSeg
# Default settings
parser = XSeg()
# Custom settings
parser = XSeg(
    align_size=256,  # Face alignment size
    blur_sigma=5,    # Gaussian blur for smoothing (0 = raw)
)
```
| Parameter | Default | Description |
|-----------|---------|-------------|
| `align_size` | 256 | Face alignment output size |
| `blur_sigma` | 0 | Mask smoothing (0 = no blur) |
### Methods
```python
# Full pipeline: align -> segment -> warp back to original space
mask = parser.parse(image, landmarks)
# For pre-aligned face crops
mask = parser.parse_aligned(face_crop)
# Get mask + crop + inverse matrix for custom warping
mask, face_crop, inverse_matrix = parser.parse_with_inverse(image, landmarks)
```
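The inverse matrix from `parse_with_inverse` lets you warp the crop-space mask back yourself; a short sketch with OpenCV (this mirrors what `parse` does internally):
```python
import cv2

mask, face_crop, inverse_matrix = parser.parse_with_inverse(image, landmarks)

# Warp the crop-space mask back into the original image frame
h, w = image.shape[:2]
full_mask = cv2.warpAffine(mask, inverse_matrix, (w, h), flags=cv2.INTER_LINEAR)
```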
### BiSeNet vs XSeg
| Feature | BiSeNet | XSeg |
|---------|---------|------|
| Output | 19 class labels | Mask [0, 1] |
| Input | Bbox crop | Requires landmarks |
| Use case | Facial components | Face region extraction |
---
## Factory Function
```python
from uniface import create_face_parser
from uniface.constants import ParsingWeights, XSegWeights
parser = create_face_parser() # Returns BiSeNet
# BiSeNet (default)
parser = create_face_parser()
# XSeg
parser = create_face_parser(XSegWeights.DEFAULT)
```
---


@@ -64,7 +64,7 @@ blurrer = BlurFace(method='pixelate', pixel_blocks=10)
| Parameter | Default | Description |
|-----------|---------|-------------|
| `pixel_blocks` | 10 | Number of blocks (lower = more pixelated) |
| `pixel_blocks` | 15 | Number of blocks (lower = more pixelated) |
### Gaussian


@@ -69,20 +69,21 @@ spoofer = MiniFASNet(model_name=MiniFASNetWeights.V1SE)
## Confidence Thresholds
The default threshold is 0.5. Adjust for your use case:
`result.is_real` is based on the model's top predicted class (argmax). If you want stricter behavior,
apply your own confidence threshold:
```python
result = spoofer.predict(image, face.bbox)

# High security (fewer false accepts)
HIGH_THRESHOLD = 0.7
if result.confidence > HIGH_THRESHOLD:
if result.is_real and result.confidence > HIGH_THRESHOLD:
    print("Real (high confidence)")
else:
    print("Suspicious")

# Balanced
if result.is_real: # Uses default 0.5 threshold
# Balanced (argmax decision)
if result.is_real:
    print("Real")
else:
    print("Fake")


@@ -16,6 +16,7 @@ Run UniFace examples directly in your browser with Google Colab, or download and
| [Face Parsing](https://github.com/yakhyo/uniface/blob/main/examples/06_face_parsing.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/06_face_parsing.ipynb) | Semantic face segmentation |
| [Face Anonymization](https://github.com/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | Privacy-preserving blur |
| [Gaze Estimation](https://github.com/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | Gaze direction estimation |
| [Face Segmentation](https://github.com/yakhyo/uniface/blob/main/examples/09_face_segmentation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/09_face_segmentation.ipynb) | Face segmentation with XSeg |
---


@@ -27,29 +27,29 @@ import numpy as np
class MyDetector(BaseDetector):
    def __init__(self, model_path: str, confidence_threshold: float = 0.5):
        super().__init__(confidence_threshold=confidence_threshold)
        self.session = create_onnx_session(model_path)
        self.threshold = confidence_threshold

    def preprocess(self, image: np.ndarray) -> np.ndarray:
        # Your preprocessing logic
        # e.g., resize, normalize, transpose
        raise NotImplementedError

    def postprocess(self, outputs, shape) -> list[Face]:
        # Your postprocessing logic
        # e.g., decode boxes, apply NMS, create Face objects
        raise NotImplementedError

    def detect(self, image: np.ndarray) -> list[Face]:
        # 1. Preprocess image
        input_tensor = self._preprocess(image)
        input_tensor = self.preprocess(image)

        # 2. Run inference
        outputs = self.session.run(None, {'input': input_tensor})

        # 3. Postprocess outputs to Face objects
        faces = self._postprocess(outputs, image.shape)
        return faces

    def _preprocess(self, image):
        # Your preprocessing logic
        # e.g., resize, normalize, transpose
        pass

    def _postprocess(self, outputs, shape):
        # Your postprocessing logic
        # e.g., decode boxes, apply NMS, create Face objects
        pass
        return self.postprocess(outputs, image.shape)
```
---
@@ -57,36 +57,14 @@ class MyDetector(BaseDetector):
## Add Custom Recognition Model
```python
from uniface.recognition.base import BaseRecognizer
from uniface.onnx_utils import create_onnx_session
from uniface import face_alignment
import numpy as np
from uniface.recognition.base import BaseRecognizer, PreprocessConfig


class MyRecognizer(BaseRecognizer):
    def __init__(self, model_path: str):
        self.session = create_onnx_session(model_path)
    def __init__(self, model_path: str, providers=None):
        preprocessing = PreprocessConfig(input_mean=127.5, input_std=127.5, input_size=(112, 112))
        super().__init__(model_path, preprocessing, providers=providers)

    def get_normalized_embedding(
        self,
        image: np.ndarray,
        landmarks: np.ndarray
    ) -> np.ndarray:
        # 1. Align face
        aligned = face_alignment(image, landmarks)

        # 2. Preprocess
        input_tensor = self._preprocess(aligned)

        # 3. Run inference
        embedding = self.session.run(None, {'input': input_tensor})[0]

        # 4. Normalize
        embedding = embedding / np.linalg.norm(embedding)
        return embedding

    def _preprocess(self, image):
        # Your preprocessing logic
        pass

    # Optional: override preprocess() if your model expects custom normalization.
```
---


@@ -67,14 +67,18 @@ cv2.imwrite("result.jpg", result_image)
For convenience, use the built-in `FaceAnalyzer`:
```python
from uniface import FaceAnalyzer
from uniface import FaceAnalyzer, RetinaFace, ArcFace, AgeGender
import cv2
# Initialize with desired modules
detector = RetinaFace()
recognizer = ArcFace()
age_gender = AgeGender()
analyzer = FaceAnalyzer(
    detect=True,
    recognize=True,
    attributes=True
    detector,
    recognizer=recognizer,
    age_gender=age_gender,
)
# Process image

File diff suppressed because one or more lines are too long

pyproject.toml

@@ -1,6 +1,6 @@
[project]
name = "uniface"
version = "2.2.1"
version = "2.3.0"
description = "UniFace: A Comprehensive Library for Face Detection, Recognition, Landmark Analysis, Face Parsing, Gaze Estimation, Age, and Gender Detection"
readme = "README.md"
license = "MIT"


@@ -2,15 +2,15 @@
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Tests for BiSeNet face parsing model."""
"""Tests for face parsing models (BiSeNet and XSeg)."""
from __future__ import annotations
import numpy as np
import pytest
from uniface.constants import ParsingWeights
from uniface.parsing import BiSeNet, create_face_parser
from uniface.constants import ParsingWeights, XSegWeights
from uniface.parsing import BiSeNet, XSeg, create_face_parser
def test_bisenet_initialization():
@@ -120,3 +120,151 @@ def test_bisenet_different_input_sizes():
        assert mask.shape == (h, w), f'Failed for size {h}x{w}'
        assert mask.dtype == np.uint8


# XSeg Tests
def test_xseg_initialization():
    """Test XSeg initialization."""
    parser = XSeg()
    assert parser is not None
    assert parser.input_size == (256, 256)
    assert parser.align_size == 256
    assert parser.blur_sigma == 0


def test_xseg_with_custom_params():
    """Test XSeg with custom parameters."""
    parser = XSeg(align_size=512, blur_sigma=5)
    assert parser.align_size == 512
    assert parser.blur_sigma == 5


def test_xseg_preprocess():
    """Test XSeg preprocessing."""
    parser = XSeg()

    # Create a dummy aligned face crop
    face_crop = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)

    # Preprocess
    preprocessed = parser.preprocess(face_crop)

    assert preprocessed.shape == (1, 256, 256, 3)  # NHWC format
    assert preprocessed.dtype == np.float32
    assert preprocessed.min() >= 0
    assert preprocessed.max() <= 1


def test_xseg_postprocess():
    """Test XSeg postprocessing."""
    parser = XSeg()

    # Create dummy model output (NHWC format)
    dummy_output = np.random.rand(1, 256, 256, 1).astype(np.float32)

    # Postprocess
    mask = parser.postprocess(dummy_output, crop_size=(256, 256))

    assert mask.shape == (256, 256)
    assert mask.dtype == np.float32
    assert mask.min() >= 0
    assert mask.max() <= 1


def test_xseg_parse_aligned():
    """Test XSeg parse_aligned method."""
    parser = XSeg()

    # Create a dummy aligned face crop
    face_crop = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)

    # Parse
    mask = parser.parse_aligned(face_crop)

    assert mask.shape == (256, 256)
    assert mask.dtype == np.float32
    assert mask.min() >= 0
    assert mask.max() <= 1


def test_xseg_parse_with_landmarks():
    """Test XSeg parse method with landmarks."""
    parser = XSeg()

    # Create a dummy image
    image = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)

    # Create dummy 5-point landmarks
    landmarks = np.array(
        [
            [250, 200],  # left eye
            [390, 200],  # right eye
            [320, 280],  # nose
            [260, 350],  # left mouth
            [380, 350],  # right mouth
        ],
        dtype=np.float32,
    )

    # Parse
    mask = parser.parse(image, landmarks)

    assert mask.shape == (480, 640)
    assert mask.dtype == np.float32
    assert mask.min() >= 0
    assert mask.max() <= 1


def test_xseg_parse_invalid_landmarks():
    """Test XSeg parse with invalid landmarks shape."""
    parser = XSeg()

    image = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)

    # Wrong shape
    invalid_landmarks = np.array([[0, 0], [1, 1], [2, 2]])

    with pytest.raises(ValueError, match='Landmarks must have shape'):
        parser.parse(image, invalid_landmarks)


def test_xseg_parse_with_inverse():
    """Test XSeg parse_with_inverse method."""
    parser = XSeg()

    # Create a dummy image
    image = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)

    # Create dummy 5-point landmarks
    landmarks = np.array(
        [
            [250, 200],
            [390, 200],
            [320, 280],
            [260, 350],
            [380, 350],
        ],
        dtype=np.float32,
    )

    # Parse with inverse
    mask, face_crop, inverse_matrix = parser.parse_with_inverse(image, landmarks)

    assert mask.shape == (256, 256)
    assert face_crop.shape == (256, 256, 3)
    assert inverse_matrix.shape == (2, 3)


def test_create_face_parser_xseg_enum():
    """Test factory function with XSeg enum."""
    parser = create_face_parser(XSegWeights.DEFAULT)
    assert parser is not None
    assert isinstance(parser, XSeg)


def test_create_face_parser_xseg_string():
    """Test factory function with XSeg string."""
    parser = create_face_parser('xseg')
    assert parser is not None
    assert isinstance(parser, XSeg)


@@ -17,7 +17,8 @@ CLI utilities for testing and running UniFace features.
| `face_search.py` | Real-time face matching against reference |
| `fairface.py` | FairFace attribute prediction (race, gender, age) |
| `spoofing.py` | Face anti-spoofing detection |
| `face_parsing.py` | Face semantic segmentation |
| `face_parsing.py` | Face semantic segmentation (BiSeNet) |
| `xseg.py` | Face segmentation (XSeg) |
| `video_detection.py` | Face detection on video files with progress bar |
| `batch_process.py` | Batch process folder of images |
| `download_model.py` | Download model weights |
@@ -63,10 +64,14 @@ python tools/landmarks.py --source 0
python tools/fairface.py --source assets/test.jpg
python tools/fairface.py --source 0
# Face parsing
# Face parsing (BiSeNet)
python tools/face_parsing.py --source assets/test.jpg
python tools/face_parsing.py --source 0
# Face segmentation (XSeg)
python tools/xseg.py --source assets/test.jpg
python tools/xseg.py --source 0
# Face anti-spoofing
python tools/spoofing.py --source assets/test.jpg
python tools/spoofing.py --source 0

tools/xseg.py (new file): 250 lines

@@ -0,0 +1,250 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""XSeg face segmentation on detected faces.
Usage:
python tools/xseg.py --source path/to/image.jpg
python tools/xseg.py --source path/to/video.mp4
python tools/xseg.py --source 0 # webcam
"""
from __future__ import annotations
import argparse
import os
from pathlib import Path
import cv2
import numpy as np
from uniface import RetinaFace, XSeg
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff'}
VIDEO_EXTENSIONS = {'.mp4', '.avi', '.mov', '.mkv', '.webm', '.flv'}
def get_source_type(source: str) -> str:
"""Determine if source is image, video, or camera."""
if source.isdigit():
return 'camera'
path = Path(source)
suffix = path.suffix.lower()
if suffix in IMAGE_EXTENSIONS:
return 'image'
elif suffix in VIDEO_EXTENSIONS:
return 'video'
else:
return 'unknown'
def apply_mask_visualization(image: np.ndarray, mask: np.ndarray, alpha: float = 0.5) -> np.ndarray:
"""Apply colored mask overlay for visualization."""
overlay = image.copy().astype(np.float32)
mask_3ch = np.stack([mask * 0.3, mask * 0.7, mask * 0.3], axis=-1)
overlay = overlay * (1 - mask[..., None] * alpha) + mask_3ch * 255 * alpha
return overlay.clip(0, 255).astype(np.uint8)
def process_image(
detector: RetinaFace,
parser: XSeg,
image_path: str,
save_dir: str = 'outputs',
) -> None:
"""Process a single image."""
image = cv2.imread(image_path)
if image is None:
print(f"Error: Failed to load image from '{image_path}'")
return
faces = detector.detect(image)
print(f'Detected {len(faces)} face(s)')
if len(faces) == 0:
print('No faces detected.')
return
# Accumulate masks from all faces
full_mask = np.zeros((image.shape[0], image.shape[1]), dtype=np.float32)
for i, face in enumerate(faces):
if face.landmarks is None:
print(f' Face {i + 1}: skipped (no landmarks)')
continue
mask = parser.parse(image, face.landmarks)
full_mask = np.maximum(full_mask, mask)
print(f' Face {i + 1}: done')
# Apply visualization
result_image = apply_mask_visualization(image, full_mask)
# Draw bounding boxes
for face in faces:
x1, y1, x2, y2 = map(int, face.bbox[:4])
cv2.rectangle(result_image, (x1, y1), (x2, y2), (0, 255, 0), 2)
# Save results
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(image_path).stem}_xseg.jpg')
cv2.imwrite(output_path, result_image)
print(f'Output saved: {output_path}')
mask_path = os.path.join(save_dir, f'{Path(image_path).stem}_xseg_mask.png')
mask_uint8 = (full_mask * 255).astype(np.uint8)
cv2.imwrite(mask_path, mask_uint8)
print(f'Mask saved: {mask_path}')
def process_video(
detector: RetinaFace,
parser: XSeg,
video_path: str,
save_dir: str = 'outputs',
) -> None:
"""Process a video file."""
cap = cv2.VideoCapture(video_path)
if not cap.isOpened():
print(f"Error: Cannot open video file '{video_path}'")
return
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(video_path).stem}_xseg.mp4')
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
print(f'Processing video: {video_path} ({total_frames} frames)')
frame_count = 0
while True:
ret, frame = cap.read()
if not ret:
break
frame_count += 1
faces = detector.detect(frame)
# Accumulate masks from all faces
full_mask = np.zeros((frame.shape[0], frame.shape[1]), dtype=np.float32)
for face in faces:
if face.landmarks is None:
continue
mask = parser.parse(frame, face.landmarks)
full_mask = np.maximum(full_mask, mask)
# Apply visualization
result_frame = apply_mask_visualization(frame, full_mask)
# Draw bounding boxes
for face in faces:
x1, y1, x2, y2 = map(int, face.bbox[:4])
cv2.rectangle(result_frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
cv2.putText(result_frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
out.write(result_frame)
if frame_count % 100 == 0:
print(f' Processed {frame_count}/{total_frames} frames...')
cap.release()
out.release()
print(f'Done! Output saved: {output_path}')
def run_camera(
detector: RetinaFace,
parser: XSeg,
camera_id: int = 0,
) -> None:
"""Run real-time detection on webcam."""
cap = cv2.VideoCapture(camera_id)
if not cap.isOpened():
print(f'Cannot open camera {camera_id}')
return
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)
# Accumulate masks from all faces
full_mask = np.zeros((frame.shape[0], frame.shape[1]), dtype=np.float32)
for face in faces:
if face.landmarks is None:
continue
mask = parser.parse(frame, face.landmarks)
full_mask = np.maximum(full_mask, mask)
# Apply visualization
result_frame = apply_mask_visualization(frame, full_mask)
# Draw bounding boxes
for face in faces:
x1, y1, x2, y2 = map(int, face.bbox[:4])
cv2.rectangle(result_frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
cv2.putText(result_frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.imshow('XSeg Face Segmentation', result_frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
def main() -> None:
arg_parser = argparse.ArgumentParser(description='XSeg face segmentation')
arg_parser.add_argument('--source', type=str, required=True, help='Image/video path or camera ID (0, 1, ...)')
arg_parser.add_argument('--save-dir', type=str, default='outputs', help='Output directory')
arg_parser.add_argument(
'--blur',
type=float,
default=0,
help='Gaussian blur sigma for mask smoothing (default: 0 = raw)',
)
arg_parser.add_argument(
'--align-size',
type=int,
default=256,
help='Face alignment size (default: 256)',
)
args = arg_parser.parse_args()
# Initialize models
detector = RetinaFace()
parser = XSeg(blur_sigma=args.blur, align_size=args.align_size)
source_type = get_source_type(args.source)
if source_type == 'camera':
run_camera(detector, parser, int(args.source))
elif source_type == 'image':
if not os.path.exists(args.source):
print(f'Error: Image not found: {args.source}')
return
process_image(detector, parser, args.source, args.save_dir)
elif source_type == 'video':
if not os.path.exists(args.source):
print(f'Error: Video not found: {args.source}')
return
process_video(detector, parser, args.source, args.save_dir)
else:
print(f"Error: Unknown source type for '{args.source}'")
print('Supported formats: images (.jpg, .png, ...), videos (.mp4, .avi, ...), or camera ID (0, 1, ...)')
if __name__ == '__main__':
main()

uniface/__init__.py

@@ -28,7 +28,7 @@ from __future__ import annotations
__license__ = 'MIT'
__author__ = 'Yakhyokhuja Valikhujaev'
__version__ = '2.2.1'
__version__ = '2.3.0'
from uniface.face_utils import compute_similarity, face_alignment
from uniface.log import Logger, enable_logging
@@ -48,7 +48,7 @@ from .detection import (
)
from .gaze import MobileGaze, create_gaze_estimator
from .landmark import Landmark106, create_landmarker
from .parsing import BiSeNet, create_face_parser
from .parsing import BiSeNet, XSeg, create_face_parser
from .privacy import BlurFace, anonymize_faces
from .recognition import AdaFace, ArcFace, MobileFace, SphereFace, create_recognizer
from .spoofing import MiniFASNet, create_spoofer
@@ -95,6 +95,7 @@ __all__ = [
    'MobileGaze',
    # Parsing models
    'BiSeNet',
    'XSeg',
    # Attribute models
    'AgeGender',
    'AttributeResult',

uniface/constants.py

@@ -150,6 +150,15 @@ class ParsingWeights(str, Enum):
RESNET34 = "parsing_resnet34"
class XSegWeights(str, Enum):
"""
XSeg face segmentation model from DeepFaceLab.
Outputs mask for face region.
https://github.com/iperov/DeepFaceLab
"""
DEFAULT = "xseg"
class MiniFASNetWeights(str, Enum):
"""
MiniFASNet: Lightweight Face Anti-Spoofing models.
@@ -217,6 +226,8 @@ MODEL_URLS: dict[Enum, str] = {
# Anti-Spoofing (MiniFASNet)
MiniFASNetWeights.V1SE: 'https://github.com/yakhyo/face-anti-spoofing/releases/download/weights/MiniFASNetV1SE.onnx',
MiniFASNetWeights.V2: 'https://github.com/yakhyo/face-anti-spoofing/releases/download/weights/MiniFASNetV2.onnx',
# XSeg
XSegWeights.DEFAULT: 'https://github.com/yakhyo/face-segmentation/releases/download/weights/xseg.onnx',
}
MODEL_SHA256: dict[Enum, str] = {
@@ -272,6 +283,8 @@ MODEL_SHA256: dict[Enum, str] = {
# Anti-Spoofing (MiniFASNet)
MiniFASNetWeights.V1SE: 'ebab7f90c7833fbccd46d3a555410e78d969db5438e169b6524be444862b3676',
MiniFASNetWeights.V2: 'b32929adc2d9c34b9486f8c4c7bc97c1b69bc0ea9befefc380e4faae4e463907',
# XSeg
XSegWeights.DEFAULT: '0b57328efcb839d85973164b617ceee9dfe6cfcb2c82e8a033bba9f4f09b27e5',
}
CHUNK_SIZE = 8192
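The downloader presumably streams files in `CHUNK_SIZE` blocks and compares the digest against `MODEL_SHA256`; a standalone sketch of such a check (not the library's actual helper):
```python
import hashlib
import os

def sha256_of(path: str, chunk_size: int = 8192) -> str:
    """Stream a file through SHA-256 without loading it fully into memory."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

expected = '0b57328efcb839d85973164b617ceee9dfe6cfcb2c82e8a033bba9f4f09b27e5'  # XSeg
path = os.path.expanduser('~/.uniface/models/xseg.onnx')
print(sha256_of(path) == expected)
```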

uniface/parsing/__init__.py

@@ -4,27 +4,25 @@
from __future__ import annotations
from uniface.constants import ParsingWeights
from uniface.constants import ParsingWeights, XSegWeights
from .base import BaseFaceParser
from .bisenet import BiSeNet
from .xseg import XSeg
__all__ = ['BaseFaceParser', 'BiSeNet', 'create_face_parser']
__all__ = ['BaseFaceParser', 'BiSeNet', 'XSeg', 'create_face_parser']
def create_face_parser(
    model_name: str | ParsingWeights = ParsingWeights.RESNET18,
    model_name: str | ParsingWeights | XSegWeights = ParsingWeights.RESNET18,
    **kwargs,
) -> BaseFaceParser:
    """Factory function to create a face parsing model instance.

    This function provides a convenient way to instantiate face parsing models
    without directly importing the specific model classes.
    """Factory function to create a face parsing model.

    Args:
        model_name: The face parsing model to create. Can be either a string
            or a ParsingWeights enum value. Available options:
            - 'parsing_resnet18' or ParsingWeights.RESNET18 (default)
            - 'parsing_resnet34' or ParsingWeights.RESNET34
        model_name: Model to create. Options: ParsingWeights.RESNET18/RESNET34 (BiSeNet),
            XSegWeights.DEFAULT (XSeg, requires landmarks).
        **kwargs: Additional arguments passed to the model constructor.

    Returns:
        An instance of the requested face parsing model.
@@ -33,20 +31,32 @@ def create_face_parser(
        ValueError: If the model_name is not recognized.

    Example:
        >>> from uniface.parsing import create_face_parser
        >>> from uniface.constants import ParsingWeights
        >>> parser = create_face_parser(ParsingWeights.RESNET18)
        >>> mask = parser.parse(face_crop)
    """
    # Handle XSegWeights
    if isinstance(model_name, XSegWeights):
        return XSeg(model_name=model_name, **kwargs)

    # Convert string to enum if necessary
    if isinstance(model_name, str):
        # Try XSegWeights first
        try:
            xseg_model = XSegWeights(model_name)
            return XSeg(model_name=xseg_model, **kwargs)
        except ValueError:
            pass

        # Try ParsingWeights
        try:
            model_name = ParsingWeights(model_name)
        except ValueError as e:
            valid_models = [e.value for e in ParsingWeights]
            valid_parsing = [m.value for m in ParsingWeights]
            valid_xseg = [m.value for m in XSegWeights]
            valid_models = valid_parsing + valid_xseg
            raise ValueError(
                f"Unknown face parsing model: '{model_name}'. Valid options are: {', '.join(valid_models)}"
            ) from e

    # All parsing models use the same BiSeNet class
    return BiSeNet(model_name=model_name)
    # BiSeNet models
    return BiSeNet(model_name=model_name, **kwargs)

uniface/parsing/xseg.py (new file): 242 lines

@@ -0,0 +1,242 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations

import cv2
import numpy as np

from uniface.constants import XSegWeights
from uniface.face_utils import face_alignment
from uniface.log import Logger
from uniface.model_store import verify_model_weights
from uniface.onnx_utils import create_onnx_session

from .base import BaseFaceParser

__all__ = ['XSeg']


class XSeg(BaseFaceParser):
    """
    XSeg: Face Segmentation Model from DeepFaceLab with ONNX Runtime.

    XSeg outputs a mask for face regions. Unlike BiSeNet which works
    on bbox crops, XSeg requires 5-point landmarks for face alignment. The model
    uses NHWC input format and outputs values in [0, 1] range.

    Reference:
        https://github.com/iperov/DeepFaceLab

    Args:
        model_name (XSegWeights): The enum specifying the XSeg model to load.
            Defaults to `XSegWeights.DEFAULT`.
        align_size (int): Face alignment output size. Must be multiple of 112 or 128.
            Defaults to 256.
        blur_sigma (float): Gaussian blur sigma for mask smoothing.
            0 = raw output (no blur). Defaults to 0.
        providers (list[str] | None): ONNX Runtime execution providers. If None,
            auto-detects the best available provider.

    Attributes:
        align_size (int): Face alignment output size.
        blur_sigma (float): Blur sigma for post-processing.
        input_size (tuple[int, int]): Model input dimensions (width, height).

    Example:
        >>> from uniface.parsing import XSeg
        >>> from uniface import RetinaFace
        >>>
        >>> detector = RetinaFace()
        >>> parser = XSeg()
        >>>
        >>> faces = detector.detect(image)
        >>> for face in faces:
        ...     if face.landmarks is not None:
        ...         mask = parser.parse(image, face.landmarks)
        ...         print(f'Mask shape: {mask.shape}')
    """

    def __init__(
        self,
        model_name: XSegWeights = XSegWeights.DEFAULT,
        align_size: int = 256,
        blur_sigma: float = 0,
        providers: list[str] | None = None,
    ) -> None:
        Logger.info(f'Initializing XSeg with model={model_name}, align_size={align_size}, blur_sigma={blur_sigma}')

        self.align_size = align_size
        self.blur_sigma = blur_sigma
        self.providers = providers

        self.model_path = verify_model_weights(model_name)
        self._initialize_model()

    def _initialize_model(self) -> None:
        """
        Initialize the ONNX model from the stored model path.

        Raises:
            RuntimeError: If the model fails to load or initialize.
        """
        try:
            self.session = create_onnx_session(self.model_path, providers=self.providers)

            # Get input configuration
            input_cfg = self.session.get_inputs()[0]
            input_shape = input_cfg.shape
            self.input_name = input_cfg.name

            # NHWC format: (N, H, W, C)
            if isinstance(input_shape[1], int) and isinstance(input_shape[2], int):
                self.input_size = (input_shape[2], input_shape[1])
            else:
                self.input_size = (256, 256)
                Logger.info(f'Dynamic input shape detected, using default: {self.input_size}')

            # Get output configuration
            outputs = self.session.get_outputs()
            self.output_names = [output.name for output in outputs]

            Logger.info(f'XSeg initialized with input size {self.input_size}')
        except Exception as e:
            Logger.error(f"Failed to load XSeg model from '{self.model_path}'", exc_info=True)
            raise RuntimeError(f'Failed to initialize XSeg model: {e}') from e

    def preprocess(self, face_crop: np.ndarray) -> np.ndarray:
        """
        Preprocess an aligned face crop for inference.

        Args:
            face_crop (np.ndarray): An aligned face crop in BGR format.

        Returns:
            np.ndarray: Preprocessed image tensor with shape (1, H, W, 3).
        """
        # Resize to model input size
        image = cv2.resize(face_crop, self.input_size, interpolation=cv2.INTER_LINEAR)

        # Normalize to [0, 1]
        image = image.astype(np.float32) / 255.0

        # Add batch dimension (NHWC format)
        image = np.expand_dims(image, axis=0)
        return image

    def postprocess(self, outputs: np.ndarray, crop_size: tuple[int, int]) -> np.ndarray:
        """
        Postprocess model output to segmentation mask.

        Args:
            outputs (np.ndarray): Raw model output.
            crop_size (tuple[int, int]): Size to resize mask to (width, height).

        Returns:
            np.ndarray: Segmentation mask as float32 in range [0, 1].
        """
        # Squeeze and clip to valid range
        mask = outputs.squeeze().clip(0, 1).astype(np.float32)

        # Resize back to crop size
        mask = cv2.resize(mask, crop_size, interpolation=cv2.INTER_LINEAR)

        # Apply optional blur and threshold
        if self.blur_sigma > 0:
            mask = cv2.GaussianBlur(mask, (0, 0), self.blur_sigma)
            mask = (mask.clip(0.5, 1) - 0.5) * 2

        return mask

    def parse(self, image: np.ndarray, landmarks: np.ndarray) -> np.ndarray:
        """
        Perform face segmentation using 5-point landmarks.

        Args:
            image (np.ndarray): Input image in BGR format.
            landmarks (np.ndarray): 5-point facial landmarks with shape (5, 2).

        Returns:
            np.ndarray: Segmentation mask in original image space, values in [0, 1].

        Raises:
            ValueError: If landmarks shape is not (5, 2).
        """
        if landmarks.shape != (5, 2):
            raise ValueError(f'Landmarks must have shape (5, 2), got {landmarks.shape}')

        # Align face using landmarks
        face_crop, inverse_matrix = face_alignment(image, landmarks, image_size=self.align_size)

        # Run inference
        crop_size = (face_crop.shape[1], face_crop.shape[0])
        input_tensor = self.preprocess(face_crop)
        outputs = self.session.run(self.output_names, {self.input_name: input_tensor})

        # Postprocess mask
        mask = self.postprocess(outputs[0], crop_size)

        # Warp mask back to original image space
        h, w = image.shape[:2]
        warped_mask = cv2.warpAffine(
            mask,
            inverse_matrix,
            (w, h),
            flags=cv2.INTER_LINEAR,
            borderMode=cv2.BORDER_CONSTANT,
            borderValue=0,
        )

        return warped_mask

    def parse_aligned(self, face_crop: np.ndarray) -> np.ndarray:
        """
        Perform segmentation on an already aligned face crop.

        Args:
            face_crop (np.ndarray): An aligned face crop in BGR format.

        Returns:
            np.ndarray: Segmentation mask with same size as input, values in [0, 1].
        """
        crop_size = (face_crop.shape[1], face_crop.shape[0])

        # Run inference
        input_tensor = self.preprocess(face_crop)
        outputs = self.session.run(self.output_names, {self.input_name: input_tensor})

        return self.postprocess(outputs[0], crop_size)

    def parse_with_inverse(
        self,
        image: np.ndarray,
        landmarks: np.ndarray,
    ) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
        """
        Parse face and return mask with inverse matrix for custom warping.

        Args:
            image (np.ndarray): Input image in BGR format.
            landmarks (np.ndarray): 5-point facial landmarks with shape (5, 2).

        Returns:
            Tuple of (mask, face_crop, inverse_matrix).
        """
        if landmarks.shape != (5, 2):
            raise ValueError(f'Landmarks must have shape (5, 2), got {landmarks.shape}')

        # Align face using landmarks
        face_crop, inverse_matrix = face_alignment(image, landmarks, image_size=self.align_size)

        # Run inference
        crop_size = (face_crop.shape[1], face_crop.shape[0])
        input_tensor = self.preprocess(face_crop)
        outputs = self.session.run(self.output_names, {self.input_name: input_tensor})

        # Postprocess mask (in crop space)
        mask = self.postprocess(outputs[0], crop_size)

        return mask, face_crop, inverse_matrix