33 Commits
v3.0.0 ... main

Author SHA1 Message Date
Yakhyokhuja Valikhujaev
7882ec5cb4 chore: Update docstrings and comments (#119) 2026-05-11 01:07:14 +09:00
dependabot[bot]
d51d030545 chore(deps): bump gitpython from 3.1.49 to 3.1.50 (#118)
Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.49 to 3.1.50.
- [Release notes](https://github.com/gitpython-developers/GitPython/releases)
- [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES)
- [Commits](https://github.com/gitpython-developers/GitPython/compare/3.1.49...3.1.50)

---
updated-dependencies:
- dependency-name: gitpython
  dependency-version: 3.1.50
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-09 19:16:08 +09:00
github-actions[bot]
5a767847da chore: Release v3.6.0 2026-05-08 12:25:34 +00:00
github-actions[bot]
4a22f903f0 chore: Release v3.6.0rc2 2026-05-08 12:20:29 +00:00
Yakhyokhuja Valikhujaev
43a46e11df ci: Refresh uv.lock during release pipeline (#117) 2026-05-08 21:13:44 +09:00
github-actions[bot]
025b93ab8b chore: Release v3.6.0rc1 2026-05-08 03:27:30 +00:00
Yakhyokhuja Valikhujaev
8bf87d958f feat: Add PIPNet for facial landmarks detection (#116)
* docs: Add PipNet model documentation

* feat: Add PipNet for face landmark detection
2026-05-08 12:25:00 +09:00
Yakhyokhuja Valikhujaev
b813dc2ee7 ref: Update package mngt and optimize the vector store functions (#115)
* ref: Update download and hash chunk sizes to speed up

* build: Adopt uv with uv.lock and drop requirements.txt

* ref: Centralize softmax helper and minor cleanups
2026-05-06 01:47:27 +09:00
Yakhyokhuja Valikhujaev
73fc291930 ci: Resolve deprecation warnings in pipeline (#114)
* ci: Resolve deprecation warnings in pipeline
2026-04-28 00:52:46 +09:00
github-actions[bot]
400bb72217 chore: Release v3.5.3 2026-04-27 15:24:18 +00:00
Yakhyokhuja Valikhujaev
a0a12d5eca fix: Fix pypi publish re-run issue (#113) 2026-04-28 00:22:12 +09:00
github-actions[bot]
a34f376da0 chore: Release v3.5.2 2026-04-27 15:04:20 +00:00
Yakhyokhuja Valikhujaev
2b29706615 ci: Add end-to-end deployment pipeline and fix docs auto-trigger (#112) 2026-04-27 23:59:09 +09:00
github-actions[bot]
f6d3cf33f0 chore: Release v3.5.1 2026-04-27 11:53:21 +00:00
Yakhyokhuja Valikhujaev
0eb042425c chore: Minor changes to workflow names and docs (#111) 2026-04-27 20:51:50 +09:00
github-actions[bot]
35c0b6d539 chore: Release v3.5.1rc1 2026-04-25 15:17:54 +00:00
Yakhyokhuja Valikhujaev
13c4ac83d8 feat: Update the release workflow and package installation command (#110)
* fix: Fix installation conflict between onnxruntime and onnxruntime-gpu

* fix: Fix CI, notebooks, type hints, and packaging issues found in audit

* feat: Add new release config

* ci: Automate release pipeline and document release process
2026-04-25 23:59:00 +09:00
Yakhyokhuja Valikhujaev
6ce397b811 feat: Add MODNet portrait matting (#108)
* feat: Add MODNet portrait matting

* docs: Update docs and example of portrait matting

* fix: Fix linting issue
2026-04-11 23:30:32 +09:00
Yakhyokhuja Valikhujaev
9bf54f5f78 feat: Add EdgeFace recognition model (#105)
* refactor: Split recognition models into separate files

* feat: Add EdgeFace recognition model

* release: Bump version to v3.4.0
2026-04-04 20:11:28 +09:00
Yakhyokhuja Valikhujaev
c87ec1ad0f docs: Add example images and update MkDocs files (#104)
* chore: Add example inference results

* docs: Update MkDocs and README files
2026-04-04 18:28:27 +09:00
Yakhyokhuja Valikhujaev
9e56a86963 chore: Update docs and clean up notebook outputs before before commit (#102)
* chore: Add links for repo and docs on example notebooks

* ref: Compress jupyter notebook sizes

* ci: Add nbstripout pre-commit hook for notebook output stripping

* docs: Add coding agent docs and commit message tag
2026-04-03 10:10:51 +09:00
Yakhyokhuja Valikhujaev
426bd71505 release: Release UniFace v3.3.0 - Python 3.10 support, stores refactor, docs and examples refresh (#101)
* docs: Update docs and examples

* chore: Update tools folder testing for development

* feat: Update indexing to stores and drawing logic

* chore: Update the release version to 3.3.0

* feat: Add python 3.10 support

* build: Add python support for worklows and publishing

* chore: Update all example notebooks
2026-03-28 22:30:56 +09:00
LiberiFatali
ede8b27091 chore: Add example notebook for face recognition (#100) 2026-03-28 05:27:27 +09:00
Yakhyokhuja Valikhujaev
02c77ce5db feat: Add head pose estimation model (#99)
* feat: Add Head Pose Estimation  with 6 different models

* chore: Update jupyter notebook examples

* docs: Update head pose estimation related docs
2026-03-26 22:57:05 +09:00
Yakhyokhuja Valikhujaev
d70d6a254f ref: Unify attribute/detector base classes and fix tools reliability (#98)
* refactor: unify attribute API, deduplicate detectors, and fix embedding shape

* refactor: unify attribute API and deduplicate detector code

* chore: Update docs page build on tags and frame validation before flip
2026-03-25 23:43:56 +09:00
Yakhyokhuja Valikhujaev
7d37633b1a chore: drop Python 3.10 support, bump scikit-image to >=0.26.0 (#96) 2026-03-19 10:04:52 +09:00
Yakhyokhuja Valikhujaev
bc413df4a8 docs: Add release changelog markdown file (#92) 2026-03-19 09:46:16 +09:00
Marc-Antoine BERTIN
8db0577991 feat: Add Python 3.14 support (#95)
- Relax requires-python upper bound from <3.14 to <3.15
- Add Python 3.14 classifier to pyproject.toml
- Add Python 3.14 to CI test matrix (ubuntu-latest)
- Fix SimilarityTransform.estimate() deprecation warning (scikit-image >=0.26)
  by switching to SimilarityTransform.from_estimate() class constructor

All 147 tests pass on Python 3.14.3 with no warnings.

Co-authored-by: marc-antoine <marcantoine.bertin@storyzy.com>
2026-03-19 09:41:44 +09:00
Yakhyokhuja Valikhujaev
3682a2124f release: Release UniFace version v3.1.0 (#91)
* release: Release UniFace version v3.1.0

* docs: Change classifiers to stable from beta
2026-03-11 12:21:33 +09:00
Yakhyokhuja Valikhujaev
2ef6a1ebe8 refactor: Use dataclass-based model info in model management (#90)
- Refactor model management section: Using data classes for more robust model management.
2026-03-11 12:05:43 +09:00
Yakhyokhuja Valikhujaev
78a2dba7c7 feat: Add FAISS vectore database for fast face search (#88) 2026-03-05 22:46:03 +09:00
Yakhyokhuja Valikhujaev
87e496d1f5 feat: Add FAISS vector DB support for fast search (#86)
* feat: Add FAISS: VectorDB for face embedding search

* docs: Update Documentation
2026-03-03 12:12:05 +09:00
Yakhyokhuja Valikhujaev
5604ebf4f1 docs: Add datasets information in the docs (#85) 2026-02-18 16:02:37 +09:00
151 changed files with 9270 additions and 2371 deletions

Binary file not shown.

Before

Width:  |  Height:  |  Size: 716 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 673 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 826 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 563 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 33 KiB

View File

Before

Width:  |  Height:  |  Size: 427 KiB

After

Width:  |  Height:  |  Size: 427 KiB

View File

Before

Width:  |  Height:  |  Size: 1.7 MiB

After

Width:  |  Height:  |  Size: 1.7 MiB

View File

Before

Width:  |  Height:  |  Size: 1.8 MiB

After

Width:  |  Height:  |  Size: 1.8 MiB

View File

Before

Width:  |  Height:  |  Size: 1.9 MiB

After

Width:  |  Height:  |  Size: 1.9 MiB

View File

Before

Width:  |  Height:  |  Size: 872 KiB

After

Width:  |  Height:  |  Size: 872 KiB

View File

Before

Width:  |  Height:  |  Size: 62 KiB

After

Width:  |  Height:  |  Size: 62 KiB

View File

@@ -1,4 +1,4 @@
name: Build
name: CI
on:
push:
@@ -17,10 +17,10 @@ jobs:
runs-on: ubuntu-latest
timeout-minutes: 5
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
- uses: actions/checkout@v5
- uses: actions/setup-python@v6
with:
python-version: "3.10"
python-version: "3.11"
- uses: pre-commit/action@v3.0.1
test:
@@ -35,8 +35,14 @@ jobs:
# Full Python range on Linux (fastest runner)
- os: ubuntu-latest
python-version: "3.10"
- os: ubuntu-latest
python-version: "3.11"
- os: ubuntu-latest
python-version: "3.12"
- os: ubuntu-latest
python-version: "3.13"
- os: ubuntu-latest
python-version: "3.14"
- os: macos-latest
python-version: "3.13"
- os: windows-latest
@@ -44,28 +50,25 @@ jobs:
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@v5
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
- name: Install uv
uses: astral-sh/setup-uv@v6
with:
enable-cache: true
python-version: ${{ matrix.python-version }}
cache: "pip"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install .[dev]
run: uv sync --locked --extra cpu --extra dev
- name: Check ONNX Runtime providers
run: |
python -c "import onnxruntime as ort; print('Available providers:', ort.get_available_providers())"
run: uv run python -c "import onnxruntime as ort; print('Available providers:', ort.get_available_providers())"
- name: Run tests
run: pytest -v --tb=short
run: uv run pytest -v --tb=short
- name: Test package imports
run: python -c "import uniface; print(f'uniface {uniface.__version__} loaded with {len(uniface.__all__)} exports')"
run: uv run python -c "import uniface; print(f'uniface {uniface.__version__} loaded with {len(uniface.__all__)} exports')"
build:
runs-on: ubuntu-latest
@@ -74,10 +77,10 @@ jobs:
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@v5
- name: Set up Python
uses: actions/setup-python@v5
uses: actions/setup-python@v6
with:
python-version: "3.11"
cache: "pip"

View File

@@ -1,8 +1,6 @@
name: Deploy docs
name: Deploy Documentation
on:
push:
branches: [main]
workflow_dispatch:
permissions:
@@ -12,26 +10,28 @@ jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
with:
fetch-depth: 0 # Fetch full history for git-committers and git-revision-date plugins
fetch-depth: 0
- uses: actions/setup-python@v5
- name: Install uv
uses: astral-sh/setup-uv@v6
with:
enable-cache: true
python-version: "3.11"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install mkdocs-material pymdown-extensions mkdocs-git-committers-plugin-2 mkdocs-git-revision-date-localized-plugin
run: uv sync --locked --extra docs
- name: Build docs
env:
MKDOCS_GIT_COMMITTERS_APIKEY: ${{ secrets.MKDOCS_GIT_COMMITTERS_APIKEY }}
run: mkdocs build --strict
run: uv run mkdocs build --strict
- name: Deploy to GitHub Pages
uses: peaceiris/actions-gh-pages@v4
env:
FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: true
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
publish_dir: ./site

229
.github/workflows/pipeline.yml vendored Normal file
View File

@@ -0,0 +1,229 @@
name: Release Pipeline
on:
workflow_dispatch:
inputs:
version:
description: 'Version (e.g. 3.6.0, 3.6.0b1, 3.6.0rc1)'
required: true
concurrency:
group: pipeline
cancel-in-progress: false
jobs:
validate:
runs-on: ubuntu-latest
timeout-minutes: 5
outputs:
is_prerelease: ${{ steps.prerelease.outputs.is_prerelease }}
steps:
- name: Checkout code
uses: actions/checkout@v5
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: "3.11"
- name: Validate version (PEP 440)
run: |
python - <<'EOF'
import re, sys
v = "${{ inputs.version }}"
if not re.fullmatch(r'\d+\.\d+\.\d+((a|b|rc)\d+|\.dev\d+)?', v):
print(f"Invalid version: {v}")
print("Expected forms: 3.6.0, 3.6.0a1, 3.6.0b1, 3.6.0rc1, 3.6.0.dev1")
sys.exit(1)
EOF
- name: Check tag does not exist
run: |
if git rev-parse "v${{ inputs.version }}" >/dev/null 2>&1; then
echo "Tag v${{ inputs.version }} already exists."
exit 1
fi
- name: Detect pre-release
id: prerelease
run: |
if [[ "${{ inputs.version }}" =~ (a|b|rc|\.dev)[0-9]+ ]]; then
echo "is_prerelease=true" >> $GITHUB_OUTPUT
else
echo "is_prerelease=false" >> $GITHUB_OUTPUT
fi
test:
runs-on: ubuntu-latest
timeout-minutes: 15
needs: validate
strategy:
fail-fast: false
matrix:
python-version: ["3.10", "3.11", "3.12", "3.13", "3.14"]
steps:
- name: Checkout code
uses: actions/checkout@v5
- name: Install uv
uses: astral-sh/setup-uv@v6
with:
enable-cache: true
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: uv sync --locked --extra cpu --extra dev
- name: Run tests
run: uv run pytest -v --tb=short
release:
runs-on: ubuntu-latest
timeout-minutes: 5
needs: test
permissions:
contents: write
steps:
- name: Checkout code
uses: actions/checkout@v5
with:
fetch-depth: 0
token: ${{ secrets.RELEASE_TOKEN }}
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: "3.11"
- name: Update pyproject.toml
run: |
python - <<'EOF'
import re, pathlib
p = pathlib.Path('pyproject.toml')
text = p.read_text()
new = re.sub(r'^version\s*=\s*".*"', f'version = "${{ inputs.version }}"', text, count=1, flags=re.M)
if new == text:
raise SystemExit("Failed to update version in pyproject.toml")
p.write_text(new)
EOF
- name: Update uniface/__init__.py
run: |
python - <<'EOF'
import re, pathlib
p = pathlib.Path('uniface/__init__.py')
text = p.read_text()
new = re.sub(r"^__version__\s*=\s*'.*'", f"__version__ = '${{ inputs.version }}'", text, count=1, flags=re.M)
if new == text:
raise SystemExit("Failed to update __version__ in uniface/__init__.py")
p.write_text(new)
EOF
- name: Install uv
uses: astral-sh/setup-uv@v6
with:
enable-cache: true
python-version: "3.11"
- name: Refresh uv.lock with new project version
run: uv lock --upgrade-package uniface
- name: Commit, tag, push
run: |
git config user.name "github-actions[bot]"
git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
git add pyproject.toml uniface/__init__.py uv.lock
git commit -m "chore: Release v${{ inputs.version }}"
git tag "v${{ inputs.version }}"
git push origin HEAD:${{ github.ref_name }}
git push origin "v${{ inputs.version }}"
publish:
runs-on: ubuntu-latest
timeout-minutes: 10
needs: [validate, release]
permissions:
contents: write
id-token: write
environment:
name: pypi
url: https://pypi.org/project/uniface/
steps:
- name: Checkout tag
uses: actions/checkout@v5
with:
ref: v${{ inputs.version }}
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: "3.11"
cache: 'pip'
- name: Install build tools
run: |
python -m pip install --upgrade pip
python -m pip install build twine
- name: Build package
run: python -m build
- name: Check package
run: twine check dist/*
- name: Publish to PyPI
env:
TWINE_USERNAME: __token__
TWINE_PASSWORD: ${{ secrets.PYPI_API_TOKEN }}
run: twine upload dist/*
- name: Create GitHub Release
uses: softprops/action-gh-release@v2
with:
tag_name: v${{ inputs.version }}
files: dist/*
generate_release_notes: true
prerelease: ${{ needs.validate.outputs.is_prerelease }}
docs:
runs-on: ubuntu-latest
timeout-minutes: 10
needs: [validate, publish]
if: needs.validate.outputs.is_prerelease == 'false'
permissions:
contents: write
steps:
- name: Checkout tag
uses: actions/checkout@v5
with:
ref: v${{ inputs.version }}
fetch-depth: 0
- name: Install uv
uses: astral-sh/setup-uv@v6
with:
enable-cache: true
python-version: "3.11"
- name: Install dependencies
run: uv sync --locked --extra docs
- name: Build docs
env:
MKDOCS_GIT_COMMITTERS_APIKEY: ${{ secrets.MKDOCS_GIT_COMMITTERS_APIKEY }}
run: uv run mkdocs build --strict
- name: Deploy to GitHub Pages
uses: peaceiris/actions-gh-pages@v4
env:
FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: true
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
publish_dir: ./site
destination_dir: docs

View File

@@ -1,119 +0,0 @@
name: Publish to PyPI
on:
push:
tags:
- "v*.*.*" # Trigger only on version tags like v0.1.9
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
validate:
runs-on: ubuntu-latest
timeout-minutes: 5
outputs:
version: ${{ steps.get_version.outputs.version }}
tag_version: ${{ steps.get_version.outputs.tag_version }}
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.11" # Needs 3.11+ for tomllib
- name: Get version from tag and pyproject.toml
id: get_version
run: |
TAG_VERSION=${GITHUB_REF#refs/tags/v}
echo "tag_version=$TAG_VERSION" >> $GITHUB_OUTPUT
PYPROJECT_VERSION=$(python -c "import tomllib; print(tomllib.load(open('pyproject.toml','rb'))['project']['version'])")
echo "version=$PYPROJECT_VERSION" >> $GITHUB_OUTPUT
echo "Tag version: v$TAG_VERSION"
echo "pyproject.toml version: $PYPROJECT_VERSION"
- name: Verify version match
run: |
if [ "${{ steps.get_version.outputs.tag_version }}" != "${{ steps.get_version.outputs.version }}" ]; then
echo "Error: Tag version (${{ steps.get_version.outputs.tag_version }}) does not match pyproject.toml version (${{ steps.get_version.outputs.version }})"
exit 1
fi
echo "Version validation passed: ${{ steps.get_version.outputs.version }}"
test:
runs-on: ubuntu-latest
timeout-minutes: 15
needs: validate
strategy:
fail-fast: false
matrix:
python-version: ["3.10", "3.13"]
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
cache: 'pip'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install .[dev]
- name: Run tests
run: pytest -v
publish:
runs-on: ubuntu-latest
timeout-minutes: 10
needs: [validate, test]
permissions:
contents: write
id-token: write
environment:
name: pypi
url: https://pypi.org/project/uniface/
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.10"
cache: 'pip'
- name: Install build tools
run: |
python -m pip install --upgrade pip
python -m pip install build twine
- name: Build package
run: python -m build
- name: Check package
run: twine check dist/*
- name: Publish to PyPI
env:
TWINE_USERNAME: __token__
TWINE_PASSWORD: ${{ secrets.PYPI_API_TOKEN }}
run: twine upload dist/*
- name: Create GitHub Release
uses: softprops/action-gh-release@v1
with:
files: dist/*
generate_release_notes: true

View File

@@ -18,6 +18,13 @@ repos:
- id: debug-statements
- id: check-ast
# Strip Jupyter notebook outputs
- repo: https://github.com/kynan/nbstripout
rev: 0.9.1
hooks:
- id: nbstripout
files: ^examples/
# Ruff - Fast Python linter and formatter
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.14.10

6
AGENTS.md Normal file
View File

@@ -0,0 +1,6 @@
<!-- Cursor agent instructions — shared with CLAUDE.md -->
<!-- See CLAUDE.md for full project instructions for AI coding agents. -->
# AGENTS.md
Please read and follow all instructions in [CLAUDE.md](./CLAUDE.md).

81
CLAUDE.md Normal file
View File

@@ -0,0 +1,81 @@
# CLAUDE.md
Project instructions for AI coding agents.
## Project Overview
UniFace is a Python library for face detection, recognition, tracking, landmark analysis, face parsing, gaze estimation, age/gender detection. It uses ONNX Runtime for inference.
## Code Style
- Python 3.10+ with type hints
- Line length: 120
- Single quotes for strings, double quotes for docstrings
- Google-style docstrings
- Formatter/linter: Ruff (config in `pyproject.toml`)
- Run `ruff format .` and `ruff check . --fix` before committing
## Commit Messages
Follow [Conventional Commits](https://www.conventionalcommits.org/) with a **capitalized** description:
```
<type>: <Capitalized short description>
```
Types: `feat`, `fix`, `docs`, `style`, `refactor`, `perf`, `test`, `build`, `ci`, `chore`
Examples:
- `feat: Add gaze estimation model`
- `fix: Correct bounding box scaling for non-square images`
- `ci: Add nbstripout pre-commit hook`
- `docs: Update installation instructions`
- `refactor: Unify attribute/detector base classes`
## Testing
```bash
pytest -v --tb=short
```
Tests live in `tests/`. Run the full suite before submitting changes.
## Pre-commit
Pre-commit hooks handle formatting, linting, security checks, and notebook output stripping. Always run:
```bash
pre-commit install
pre-commit run --all-files
```
## Project Structure
```
uniface/ # Main package
detection/ # Face detection models (SCRFD, RetinaFace, YOLOv5, YOLOv8)
recognition/ # Face recognition/verification (AdaFace, ArcFace, EdgeFace, MobileFace, SphereFace)
landmark/ # Facial landmark models
tracking/ # Object tracking (ByteTrack)
parsing/ # Face parsing/segmentation (BiSeNet, XSeg)
gaze/ # Gaze estimation
headpose/ # Head pose estimation
attribute/ # Age, gender, emotion detection
spoofing/ # Anti-spoofing (MiniFASNet)
privacy/ # Face anonymization
stores/ # Vector stores (FAISS)
constants.py # Model weight URLs and checksums
model_store.py # Model download/cache management
analyzer.py # High-level FaceAnalyzer API
types.py # Shared type definitions
tests/ # Unit tests
examples/ # Jupyter notebooks (outputs are auto-stripped)
docs/ # MkDocs documentation
```
## Key Conventions
- New models: add class in submodule, register weights in `constants.py`, export in `__init__.py`
- Dependencies: managed in `pyproject.toml`
- All ONNX models are downloaded on demand with SHA256 verification
- Do not commit notebook outputs; `nbstripout` pre-commit hook handles this

View File

@@ -21,25 +21,31 @@ Thank you for considering contributing to UniFace! We welcome contributions of a
## Development Setup
We use [uv](https://docs.astral.sh/uv/) for reproducible dev installs. The committed `uv.lock` pins every transitive dependency so contributors and CI resolve to identical versions.
```bash
# Install uv (https://docs.astral.sh/uv/getting-started/installation/)
curl -LsSf https://astral.sh/uv/install.sh | sh
git clone https://github.com/yakhyo/uniface.git
cd uniface
pip install -e ".[dev]"
# Sync runtime + cpu + dev extras from uv.lock (use --extra gpu instead of cpu for CUDA)
uv sync --extra cpu --extra dev
```
`uv sync` creates a project-local `.venv/` and installs everything pinned in `uv.lock`. Run commands with `uv run <cmd>` (e.g. `uv run pytest`), or activate the venv with `source .venv/bin/activate`.
### Setting Up Pre-commit Hooks
We use [pre-commit](https://pre-commit.com/) to ensure code quality and consistency. Install and configure it:
We use [pre-commit](https://pre-commit.com/) to ensure code quality and consistency. `pre-commit` is included in the `[dev]` extra, so it's already installed after `uv sync`.
```bash
# Install pre-commit
pip install pre-commit
# Install the git hooks
pre-commit install
uv run pre-commit install
# (Optional) Run against all files
pre-commit run --all-files
uv run pre-commit run --all-files
```
Once installed, pre-commit will automatically run on every commit to check:
@@ -184,6 +190,48 @@ Example notebooks demonstrating library usage:
| Face Parsing | [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) |
| Face Anonymization | [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) |
| Gaze Estimation | [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) |
| Face Segmentation | [09_face_segmentation.ipynb](examples/09_face_segmentation.ipynb) |
| Face Vector Store | [10_face_vector_store.ipynb](examples/10_face_vector_store.ipynb) |
| Head Pose Estimation | [11_head_pose_estimation.ipynb](examples/11_head_pose_estimation.ipynb) |
## Release Process
Releases are fully automated via GitHub Actions. Only maintainers with branch-protection bypass privileges on `main` can trigger a release.
### Cutting a release
1. Go to **Actions → Release Pipeline → Run workflow** on GitHub.
2. Enter the version following [PEP 440](https://peps.python.org/pep-0440/):
- Stable: `0.7.0`, `1.0.0`
- Pre-release: `0.7.0rc1`, `0.7.0b1`, `0.7.0a1`, `0.7.0.dev1`
3. Click **Run workflow**.
### What happens automatically
The `Release Pipeline` workflow runs all stages in sequence:
1. **Validate** — checks the version string against PEP 440 and confirms the tag does not already exist.
2. **Test** — runs the test suite on Python 3.103.14.
3. **Release** — updates `pyproject.toml` and `uniface/__init__.py`, commits `chore: Release vX.Y.Z` to `main`, creates and pushes tag `vX.Y.Z`.
4. **Publish** — builds the package, uploads to PyPI, and creates a GitHub Release (flagged as pre-release for `a`/`b`/`rc`/`.dev` versions).
5. **Deploy docs** — runs only for **stable** versions. Pre-releases do not update the live documentation site.
### Verifying a release
- PyPI: <https://pypi.org/project/uniface/>
- GitHub Releases: <https://github.com/yakhyo/uniface/releases>
- Docs (stable only): <https://yakhyo.github.io/uniface/>
### Installing a pre-release
End users can opt in to pre-releases with the `--pre` flag:
```bash
pip install uniface --pre # latest pre-release
pip install uniface==0.7.0rc1 # specific pre-release
```
Without `--pre`, `pip install uniface` always resolves to the latest stable version.
## Questions?

191
README.md
View File

@@ -1,4 +1,4 @@
<h1 align="center">UniFace: All-in-One Face Analysis Library</h1>
<h1 align="center">UniFace: A Unified Face Analysis Library for Python</h1>
<div align="center">
@@ -14,49 +14,93 @@
</div>
<div align="center">
<img src="https://raw.githubusercontent.com/yakhyo/uniface/main/.github/logos/new/uniface_rounded_q80.webp" width="90%" alt="UniFace - All-in-One Open-Source Face Analysis Library">
<img src="https://raw.githubusercontent.com/yakhyo/uniface/main/.github/logos/uniface_rounded_q80.webp" width="90%" alt="UniFace - A Unified Face Analysis Library for Python">
</div>
---
**UniFace** is a lightweight, production-ready face analysis library built on ONNX Runtime. It provides high-performance face detection, recognition, landmark detection, face parsing, gaze estimation, and attribute analysis with hardware acceleration support across platforms.
**UniFace** is a lightweight, production-ready Python library for face detection, recognition, tracking, landmark analysis, face parsing, gaze estimation, and face attributes.
---
## Features
- **Face Detection** — RetinaFace, SCRFD, YOLOv5-Face, and YOLOv8-Face with 5-point landmarks
- **Face Recognition** — ArcFace, MobileFace, and SphereFace embeddings
- **Face Recognition** — AdaFace, ArcFace, EdgeFace, MobileFace, and SphereFace embeddings
- **Face Tracking** — Multi-object tracking with [BYTETracker](https://github.com/yakhyo/bytetrack-tracker) for persistent IDs across video frames
- **Facial Landmarks** — 106-point landmark localization module (separate from 5-point detector landmarks)
- **Facial Landmarks** — 106-point (2d106det) and 98 / 68-point (PIPNet) landmark localization (separate from the 5-point detector landmarks)
- **Face Parsing** — BiSeNet semantic segmentation (19 classes), XSeg face masking
- **Portrait Matting** — Trimap-free alpha matte with MODNet (background removal, green screen, compositing)
- **Gaze Estimation** — Real-time gaze direction with MobileGaze
- **Head Pose Estimation** — 3D head orientation (pitch, yaw, roll) with 6D rotation representation
- **Attribute Analysis** — Age, gender, race (FairFace), and emotion
- **Vector Store** — FAISS-backed embedding store for fast multi-identity search
- **Anti-Spoofing** — Face liveness detection with MiniFASNet
- **Face Anonymization** — 5 blur methods for privacy protection
- **Hardware Acceleration** — ARM64 (Apple Silicon), CUDA (NVIDIA), CPU
---
## Visual Examples
<table>
<tr>
<td align="center"><b>Face Detection</b><br><img src="https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/detection.jpg" width="100%"></td>
<td align="center"><b>Gaze Estimation</b><br><img src="https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/gaze.jpg" width="100%"></td>
</tr>
<tr>
<td align="center"><b>Head Pose Estimation</b><br><img src="https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/headpose.jpg" width="100%"></td>
<td align="center"><b>Age &amp; Gender</b><br><img src="https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/age_gender.jpg" width="100%"></td>
</tr>
<tr>
<td align="center" colspan="2"><b>Face Verification</b><br><img src="https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/verification.jpg" width="80%"></td>
</tr>
<tr>
<td align="center" colspan="2"><b>106-Point Landmarks</b><br><img src="https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/landmarks.jpg" width="36%"></td>
</tr>
<tr>
<td align="center" colspan="2"><b>Face Parsing</b><br><img src="https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/parsing.jpg" width="80%"></td>
</tr>
<tr>
<td align="center" colspan="2"><b>Face Segmentation</b><br><img src="https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/segmentation.jpg" width="80%"></td>
</tr>
<tr>
<td align="center" colspan="2"><b>Portrait Matting</b><br><img src="https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/matting.jpg" width="100%"></td>
</tr>
<tr>
<td align="center" colspan="2"><b>Face Anonymization</b><br><img src="https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/anonymization.jpg" width="100%"></td>
</tr>
</table>
---
## Installation
**Standard installation**
**CPU / Apple Silicon**
```bash
pip install uniface
pip install uniface[cpu]
```
**GPU support (CUDA)**
**GPU support (NVIDIA CUDA)**
```bash
pip install uniface[gpu]
```
> **Why separate extras?** `onnxruntime` and `onnxruntime-gpu` conflict when both are installed — they own the same Python namespace. Installing only the extra you need prevents that conflict entirely.
**From source (latest version)**
```bash
git clone https://github.com/yakhyo/uniface.git
cd uniface && pip install -e .
cd uniface && pip install -e ".[cpu]" # or .[gpu] for CUDA
```
**FAISS vector store**
```bash
pip install faiss-cpu # or faiss-gpu for CUDA
```
**Optional dependencies**
@@ -119,14 +163,10 @@ for face in faces:
```python
import cv2
from uniface.analyzer import FaceAnalyzer
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface import FaceAnalyzer
detector = RetinaFace()
recognizer = ArcFace()
analyzer = FaceAnalyzer(detector, recognizer=recognizer)
# Zero-config: uses SCRFD (500M) + ArcFace (MobileNet) by default
analyzer = FaceAnalyzer()
image = cv2.imread("photo.jpg")
if image is None:
@@ -138,6 +178,79 @@ for face in faces:
print(face.bbox, face.embedding.shape if face.embedding is not None else None)
```
With attributes:
```python
from uniface import FaceAnalyzer, AgeGender
analyzer = FaceAnalyzer(attributes=[AgeGender()])
faces = analyzer.analyze(image)
for face in faces:
print(f"{face.sex}, {face.age}y, embedding={face.embedding.shape}")
```
---
## Example (Portrait Matting)
```python
import cv2
import numpy as np
from uniface.matting import MODNet
matting = MODNet()
image = cv2.imread("portrait.jpg")
matte = matting.predict(image) # (H, W) float32 in [0, 1]
# Transparent PNG
rgba = cv2.cvtColor(image, cv2.COLOR_BGR2BGRA)
rgba[:, :, 3] = (matte * 255).astype(np.uint8)
cv2.imwrite("transparent.png", rgba)
# Green screen
matte_3ch = matte[:, :, np.newaxis]
bg = np.full_like(image, (0, 177, 64), dtype=np.uint8)
result = (image * matte_3ch + bg * (1 - matte_3ch)).astype(np.uint8)
cv2.imwrite("green_screen.jpg", result)
```
---
## Jupyter Notebooks
| Example | Colab | Description |
|---------|:-----:|-------------|
| [01_face_detection.ipynb](examples/01_face_detection.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/01_face_detection.ipynb) | Face detection and landmarks |
| [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/02_face_alignment.ipynb) | Face alignment for recognition |
| [03_face_verification.ipynb](examples/03_face_verification.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/03_face_verification.ipynb) | Compare faces for identity |
| [04_face_search.ipynb](examples/04_face_search.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/04_face_search.ipynb) | Find a person in group photos |
| [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/05_face_analyzer.ipynb) | Unified face analysis |
| [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/06_face_parsing.ipynb) | Semantic face segmentation |
| [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | Privacy-preserving blur |
| [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | Gaze direction estimation |
| [09_face_segmentation.ipynb](examples/09_face_segmentation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/09_face_segmentation.ipynb) | Face segmentation with XSeg |
| [10_face_vector_store.ipynb](examples/10_face_vector_store.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/10_face_vector_store.ipynb) | FAISS-backed face database |
| [11_head_pose_estimation.ipynb](examples/11_head_pose_estimation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/11_head_pose_estimation.ipynb) | Head pose estimation (pitch, yaw, roll) |
| [12_face_recognition.ipynb](examples/12_face_recognition.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/12_face_recognition.ipynb) | Standalone face recognition pipeline |
| [13_portrait_matting.ipynb](examples/13_portrait_matting.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/13_portrait_matting.ipynb) | Portrait matting with MODNet |
---
## Documentation
Full documentation: https://yakhyo.github.io/uniface/
| Resource | Description |
|----------|-------------|
| [Quickstart](https://yakhyo.github.io/uniface/quickstart/) | Get up and running in 5 minutes |
| [Model Zoo](https://yakhyo.github.io/uniface/models/) | All models, benchmarks, and selection guide |
| [API Reference](https://yakhyo.github.io/uniface/modules/detection/) | Detailed module documentation |
| [Tutorials](https://yakhyo.github.io/uniface/recipes/image-pipeline/) | Step-by-step workflow examples |
| [Guides](https://yakhyo.github.io/uniface/concepts/overview/) | Architecture and design principles |
| [Datasets](https://yakhyo.github.io/uniface/datasets/) | Training data and evaluation benchmarks |
---
## Execution Providers (ONNX Runtime)
@@ -154,33 +267,22 @@ https://yakhyo.github.io/uniface/concepts/execution-providers/
---
## Documentation
## Datasets
Full documentation: https://yakhyo.github.io/uniface/
| Task | Training Dataset | Models |
|------|-----------------|--------|
| Detection | WIDER FACE | RetinaFace, SCRFD, YOLOv5-Face, YOLOv8-Face |
| Recognition | MS1MV2 | MobileFace, SphereFace |
| Recognition | WebFace600K | ArcFace |
| Recognition | WebFace4M / 12M | AdaFace |
| Recognition | MS1MV2 | EdgeFace |
| Landmarks | WFLW, 300W+CelebA | PIPNet (98 / 68 pts) |
| Gaze | Gaze360 | MobileGaze |
| Head Pose | 300W-LP | HeadPose (ResNet, MobileNet) |
| Parsing | CelebAMask-HQ | BiSeNet |
| Attributes | CelebA, FairFace, AffectNet | AgeGender, FairFace, Emotion |
| Resource | Description |
|----------|-------------|
| [Quickstart](https://yakhyo.github.io/uniface/quickstart/) | Get up and running in 5 minutes |
| [Model Zoo](https://yakhyo.github.io/uniface/models/) | All models, benchmarks, and selection guide |
| [API Reference](https://yakhyo.github.io/uniface/modules/detection/) | Detailed module documentation |
| [Tutorials](https://yakhyo.github.io/uniface/recipes/image-pipeline/) | Step-by-step workflow examples |
| [Guides](https://yakhyo.github.io/uniface/concepts/overview/) | Architecture and design principles |
---
## Jupyter Notebooks
| Example | Colab | Description |
|---------|:-----:|-------------|
| [01_face_detection.ipynb](examples/01_face_detection.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/01_face_detection.ipynb) | Face detection and landmarks |
| [02_face_alignment.ipynb](examples/02_face_alignment.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/02_face_alignment.ipynb) | Face alignment for recognition |
| [03_face_verification.ipynb](examples/03_face_verification.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/03_face_verification.ipynb) | Compare faces for identity |
| [04_face_search.ipynb](examples/04_face_search.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/04_face_search.ipynb) | Find a person in group photos |
| [05_face_analyzer.ipynb](examples/05_face_analyzer.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/05_face_analyzer.ipynb) | All-in-one analysis |
| [06_face_parsing.ipynb](examples/06_face_parsing.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/06_face_parsing.ipynb) | Semantic face segmentation |
| [07_face_anonymization.ipynb](examples/07_face_anonymization.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | Privacy-preserving blur |
| [08_gaze_estimation.ipynb](examples/08_gaze_estimation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | Gaze direction estimation |
| [09_face_segmentation.ipynb](examples/09_face_segmentation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/09_face_segmentation.ipynb) | Face segmentation with XSeg |
> See [Datasets documentation](https://yakhyo.github.io/uniface/datasets/) for download links, benchmarks, and details.
---
@@ -206,9 +308,13 @@ If you plan commercial use, verify model license compatibility.
| Detection | [yolov8-face-onnx-inference](https://github.com/yakhyo/yolov8-face-onnx-inference) | - | YOLOv8-Face ONNX Inference |
| Tracking | [bytetrack-tracker](https://github.com/yakhyo/bytetrack-tracker) | - | BYTETracker Multi-Object Tracking |
| Recognition | [face-recognition](https://github.com/yakhyo/face-recognition) | ✓ | MobileFace, SphereFace Training |
| Recognition | [edgeface-onnx](https://github.com/yakhyo/edgeface-onnx) | - | EdgeFace ONNX Inference |
| Landmarks | [pipnet-onnx](https://github.com/yakhyo/pipnet-onnx) | - | PIPNet 98 / 68-point ONNX Inference |
| Parsing | [face-parsing](https://github.com/yakhyo/face-parsing) | ✓ | BiSeNet Face Parsing |
| Parsing | [face-segmentation](https://github.com/yakhyo/face-segmentation) | - | XSeg Face Segmentation |
| Gaze | [gaze-estimation](https://github.com/yakhyo/gaze-estimation) | ✓ | MobileGaze Training |
| Head Pose | [head-pose-estimation](https://github.com/yakhyo/head-pose-estimation) | ✓ | Head Pose Training (6DRepNet-style) |
| Matting | [modnet](https://github.com/yakhyo/modnet) | - | MODNet Portrait Matting |
| Anti-Spoofing | [face-anti-spoofing](https://github.com/yakhyo/face-anti-spoofing) | - | MiniFASNet Inference |
| Attributes | [fairface-onnx](https://github.com/yakhyo/fairface-onnx) | - | FairFace ONNX Inference |
@@ -232,3 +338,6 @@ Questions or feedback:
## License
This project is licensed under the [MIT License](LICENSE).
> **Disclaimer:** This project is not affiliated with or related to
> [Uniface](https://uniface.com/) by Rocket Software.

BIN
assets/demos/age_gender.jpg Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 206 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.5 MiB

BIN
assets/demos/detection.jpg Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 341 KiB

BIN
assets/demos/gaze.jpg Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 212 KiB

BIN
assets/demos/headpose.jpg Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 233 KiB

BIN
assets/demos/landmarks.jpg Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 428 KiB

BIN
assets/demos/matting.jpg Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 938 KiB

BIN
assets/demos/parsing.jpg Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 712 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 851 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 171 KiB

BIN
assets/demos/src_man1.jpg Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 63 KiB

BIN
assets/demos/src_man2.jpg Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 220 KiB

BIN
assets/demos/src_man3.jpg Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 146 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 96 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 208 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 121 KiB

BIN
assets/einstein/img_0.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 99 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 5.8 KiB

View File

@@ -110,6 +110,28 @@ landmarks = landmarker.get_landmarks(image, face.bbox)
| 63-86 | Eyes | 24 |
| 87-105 | Mouth | 19 |
### 98 / 68-Point Landmarks (PIPNet)
Returned by `PIPNet`. The variant determines the layout:
```python
from uniface.constants import PIPNetWeights
from uniface.landmark import PIPNet
# 98-point WFLW layout (default)
landmarks = PIPNet().get_landmarks(image, face.bbox)
# Shape: (98, 2)
# 68-point 300W layout
landmarks = PIPNet(model_name=PIPNetWeights.DW300_CELEBA_68).get_landmarks(image, face.bbox)
# Shape: (68, 2)
```
The 98-point output follows the standard [WFLW](https://wywu.github.io/projects/LAB/WFLW.html) layout
(33 face-contour points, eyebrow/eye/nose/mouth groups). The 68-point output follows the standard
[300W / iBUG](https://ibug.doc.ic.ac.uk/resources/300-W/) layout. Coordinates are in original-image
pixel space, identical in convention to `Landmark106`.
---
## Face Crop

View File

@@ -39,16 +39,20 @@ recognizer = ArcFace(providers=['CPUExecutionProvider'])
detector = RetinaFace(providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
```
All model classes accept the `providers` parameter:
All **ONNX-based** model classes accept the `providers` parameter:
- Detection: `RetinaFace`, `SCRFD`, `YOLOv5Face`, `YOLOv8Face`
- Recognition: `ArcFace`, `AdaFace`, `MobileFace`, `SphereFace`
- Landmarks: `Landmark106`
- Landmarks: `Landmark106`, `PIPNet`
- Gaze: `MobileGaze`
- Parsing: `BiSeNet`
- Parsing: `BiSeNet`, `XSeg`
- Attributes: `AgeGender`, `FairFace`
- Anti-Spoofing: `MiniFASNet`
!!! note "Non-ONNX components"
- **Emotion** uses TorchScript and selects its device automatically (`mps` / `cuda` / `cpu`). It does **not** accept the `providers` parameter.
- **BlurFace** is a pure OpenCV utility and does not load any model.
---
## Check Available Providers
@@ -89,7 +93,7 @@ print("Available providers:", providers)
No additional setup required. ARM64 optimizations are built into `onnxruntime`:
```bash
pip install uniface
pip install uniface[cpu]
```
Verify ARM64:
@@ -106,7 +110,7 @@ python -c "import platform; print(platform.machine())"
### NVIDIA GPU (CUDA)
Install with GPU support:
Install with GPU support (this installs `onnxruntime-gpu`, which already includes CPU fallback):
```bash
pip install uniface[gpu]
@@ -136,7 +140,7 @@ else:
CPU execution is always available:
```bash
pip install uniface
pip install uniface[cpu]
```
Works on all platforms without additional configuration.
@@ -211,7 +215,7 @@ for image_path in image_paths:
3. Reinstall with GPU support:
```bash
pip uninstall onnxruntime onnxruntime-gpu
pip uninstall onnxruntime onnxruntime-gpu -y
pip install uniface[gpu]
```

View File

@@ -43,7 +43,7 @@ class Face:
# Required (from detection)
bbox: np.ndarray # [x1, y1, x2, y2]
confidence: float # 0.0 to 1.0
landmarks: np.ndarray # (5, 2) or (106, 2)
landmarks: np.ndarray # (5, 2) from detectors. Dense landmarkers return (106, 2), (98, 2), or (68, 2).
# Optional (enriched by analyzers)
embedding: np.ndarray | None = None
@@ -106,6 +106,27 @@ print(f"Yaw: {np.degrees(result.yaw):.1f}°")
---
### HeadPoseResult
```python
@dataclass(frozen=True)
class HeadPoseResult:
pitch: float # Rotation around X-axis (degrees), + = looking down
yaw: float # Rotation around Y-axis (degrees), + = looking right
roll: float # Rotation around Z-axis (degrees), + = tilting clockwise
```
**Usage:**
```python
result = head_pose.estimate(face_crop)
print(f"Pitch: {result.pitch:.1f}°")
print(f"Yaw: {result.yaw:.1f}°")
print(f"Roll: {result.roll:.1f}°")
```
---
### SpoofingResult
```python
@@ -144,11 +165,11 @@ class AttributeResult:
```python
# AgeGender model
result = age_gender.predict(image, face.bbox)
result = age_gender.predict(image, face)
print(f"{result.sex}, {result.age} years old")
# FairFace model
result = fairface.predict(image, face.bbox)
result = fairface.predict(image, face)
print(f"{result.sex}, {result.age_group}, {result.race}")
```
@@ -171,7 +192,7 @@ Face recognition models return normalized 512-dimensional embeddings:
```python
embedding = recognizer.get_normalized_embedding(image, landmarks)
print(f"Shape: {embedding.shape}") # (1, 512)
print(f"Shape: {embedding.shape}") # (512,)
print(f"Norm: {np.linalg.norm(embedding):.4f}") # ~1.0
```

View File

@@ -194,6 +194,8 @@ If a model fails verification, it's re-downloaded automatically.
| Model | Size | Download |
|-------|------|----------|
| Landmark106 | 14 MB | ✅ |
| PIPNet WFLW-98 | 47 MB | ✅ |
| PIPNet 300W+CelebA-68 | 46 MB | ✅ |
| AgeGender | 8 MB | ✅ |
| FairFace | 44 MB | ✅ |
| Gaze ResNet34 | 82 MB | ✅ |

View File

@@ -23,8 +23,10 @@ graph TB
LMK[Landmarks]
ATTR[Attributes]
GAZE[Gaze]
HPOSE[Head Pose]
PARSE[Parsing]
SPOOF[Anti-Spoofing]
MATT[Matting]
PRIV[Privacy]
end
@@ -32,19 +34,26 @@ graph TB
TRK[BYTETracker]
end
subgraph Stores
IDX[FAISS Vector Store]
end
subgraph Output
FACE[Face Objects]
end
IMG --> DET
IMG --> MATT
DET --> REC
DET --> LMK
DET --> ATTR
DET --> GAZE
DET --> HPOSE
DET --> PARSE
DET --> SPOOF
DET --> PRIV
DET --> TRK
REC --> IDX
REC --> FACE
LMK --> FACE
ATTR --> FACE
@@ -55,9 +64,9 @@ graph TB
## Design Principles
### 1. ONNX-First
### 1. Cross-Platform Inference
All models use ONNX Runtime for inference:
UniFace uses portable model runtimes to provide consistent inference across macOS, Linux, and Windows. Most core components run through ONNX Runtime, while optional components may use PyTorch where appropriate.
- **Cross-platform**: Same models work on macOS, Linux, Windows
- **Hardware acceleration**: Automatic selection of optimal provider
@@ -106,15 +115,18 @@ def detect(self, image: np.ndarray) -> list[Face]:
```
uniface/
├── detection/ # Face detection (RetinaFace, SCRFD, YOLOv5Face, YOLOv8Face)
├── recognition/ # Face recognition (AdaFace, ArcFace, MobileFace, SphereFace)
├── recognition/ # Face recognition (AdaFace, ArcFace, EdgeFace, MobileFace, SphereFace)
├── tracking/ # Multi-object tracking (BYTETracker)
├── landmark/ # 106-point landmarks
├── landmark/ # Dense landmarks (Landmark106 = 106 pts, PIPNet = 98 / 68 pts)
├── attribute/ # Age, gender, emotion, race
├── parsing/ # Face semantic segmentation
├── matting/ # Portrait matting (MODNet)
├── gaze/ # Gaze estimation
├── headpose/ # Head pose estimation
├── spoofing/ # Anti-spoofing
├── privacy/ # Face anonymization
├── types.py # Dataclasses (Face, GazeResult, etc.)
├── stores/ # Vector stores (FAISS)
├── types.py # Dataclasses (Face, GazeResult, HeadPoseResult, etc.)
├── constants.py # Model weights and URLs
├── model_store.py # Model download and caching
├── onnx_utils.py # ONNX Runtime utilities
@@ -150,7 +162,7 @@ for face in faces:
embedding = recognizer.get_normalized_embedding(image, face.landmarks)
# Attributes
attrs = age_gender.predict(image, face.bbox)
attrs = age_gender.predict(image, face)
print(f"Face: {attrs.sex}, {attrs.age} years")
```
@@ -175,8 +187,7 @@ fairface = FairFace()
analyzer = FaceAnalyzer(
detector,
recognizer=recognizer,
age_gender=age_gender,
fairface=fairface,
attributes=[age_gender, fairface],
)
faces = analyzer.analyze(image)

View File

@@ -201,17 +201,11 @@ For drawing detections, filter by confidence:
```python
from uniface.draw import draw_detections
# Only draw high-confidence detections
bboxes = [f.bbox for f in faces if f.confidence > 0.7]
scores = [f.confidence for f in faces if f.confidence > 0.7]
landmarks = [f.landmarks for f in faces if f.confidence > 0.7]
# Only draw high-confidence detections (confidence ≥ vis_threshold)
draw_detections(
image=image,
bboxes=bboxes,
scores=scores,
landmarks=landmarks,
vis_threshold=0.6 # Additional visualization filter
faces=faces,
vis_threshold=0.7,
)
```

View File

@@ -6,16 +6,20 @@ Thank you for contributing to UniFace!
## Quick Start
We use [uv](https://docs.astral.sh/uv/) for reproducible dev installs (lockfile-pinned).
```bash
# Install uv first: https://docs.astral.sh/uv/getting-started/installation/
# Clone
git clone https://github.com/yakhyo/uniface.git
cd uniface
# Install dev dependencies
pip install -e ".[dev]"
# Install runtime + cpu + dev extras from uv.lock (--extra gpu for CUDA)
uv sync --extra cpu --extra dev
# Run tests
pytest
uv run pytest
```
---
@@ -39,10 +43,43 @@ ruff check . --fix
## Pre-commit Hooks
`pre-commit` is included in the `[dev]` extra, so `uv sync` already installs it.
```bash
pip install pre-commit
pre-commit install
pre-commit run --all-files
uv run pre-commit install
uv run pre-commit run --all-files
```
---
## Commit Messages
We follow [Conventional Commits](https://www.conventionalcommits.org/):
```
<type>: <short description>
```
| Type | When to use |
|--------------|--------------------------------------------------|
| **feat** | New feature or capability |
| **fix** | Bug fix |
| **docs** | Documentation changes |
| **style** | Formatting, whitespace (no logic change) |
| **refactor** | Code restructuring without changing behavior |
| **perf** | Performance improvement |
| **test** | Adding or updating tests |
| **build** | Build system or dependencies |
| **ci** | CI/CD and pre-commit configuration |
| **chore** | Routine maintenance and tooling |
**Examples:**
```
feat: Add gaze estimation model
fix: Correct bounding box scaling for non-square images
ci: Add nbstripout pre-commit hook
docs: Update installation instructions
```
---
@@ -67,6 +104,14 @@ pre-commit run --all-files
---
## Releases
Releases are automated via GitHub Actions. Maintainers trigger **Actions → Release Pipeline → Run workflow** with a [PEP 440](https://peps.python.org/pep-0440/) version (e.g. `0.7.0`, `0.7.0rc1`). The pipeline runs tests, bumps `pyproject.toml` + `uniface/__init__.py`, tags the commit, publishes to PyPI, and creates a GitHub Release. Docs redeploy only for stable releases.
See [CONTRIBUTING.md](https://github.com/yakhyo/uniface/blob/main/CONTRIBUTING.md#release-process) for the full process.
---
## Questions?
Open an issue on [GitHub](https://github.com/yakhyo/uniface/issues).

384
docs/datasets.md Normal file
View File

@@ -0,0 +1,384 @@
# Datasets
Overview of all training datasets and evaluation benchmarks used by UniFace models.
---
## Quick Reference
| Task | Dataset | Scale | Models |
| ----------- | ------------------------------------------------ | ---------------------- | ------------------------------------------- |
| Detection | [WIDER FACE](#wider-face) | 32K images | RetinaFace, SCRFD, YOLOv5-Face, YOLOv8-Face |
| Recognition | [MS1MV2](#ms1mv2) | 5.8M images, 85.7K IDs | MobileFace, SphereFace |
| Recognition | [WebFace600K](#webface600k) | 600K images | ArcFace |
| Recognition | [WebFace4M / WebFace12M](#webface4m--webface12m) | 4M / 12M images | AdaFace |
| Landmarks | [WFLW](#wflw) / [300W+CelebA](#300w--celeba) | 10K / 3.8K labeled + 202.6K unlabeled | PIPNet (98 / 68 pts) |
| Gaze | [Gaze360](#gaze360) | 238 subjects | MobileGaze |
| Parsing | [CelebAMask-HQ](#celebamask-hq) | 30K images | BiSeNet |
| Attributes | [CelebA](#celeba) | 200K images | AgeGender |
| Attributes | [FairFace](#fairface) | Balanced demographics | FairFace |
| Attributes | [AffectNet](#affectnet) | Emotion labels | Emotion |
---
## Training Datasets
### Face Detection
#### WIDER FACE
Large-scale face detection benchmark with images across 61 event categories. Contains faces with a high degree of variability in scale, pose, occlusion, expression, and illumination.
| Property | Value |
| -------- | ------------------------------------------- |
| Images | ~32,000 (train/val/test split) |
| Faces | ~394,000 annotated |
| Subsets | Easy, Medium, Hard |
| Used by | RetinaFace, SCRFD, YOLOv5-Face, YOLOv8-Face |
!!! info "Download & References"
**Paper**: [WIDER FACE: A Face Detection Benchmark](https://arxiv.org/abs/1511.06523)
**Download**: [http://shuoyang1213.me/WIDERFACE/](http://shuoyang1213.me/WIDERFACE/)
---
### Face Recognition
#### MS1MV2
Refined version of the MS-Celeb-1M dataset, cleaned by InsightFace. Widely used for training face recognition models.
| Property | Value |
| ---------- | ------------------------------ |
| Identities | 85.7K |
| Images | 5.8M |
| Format | Aligned and cropped to 112x112 |
| Used by | MobileFace, SphereFace |
!!! info "Download"
**Kaggle (aligned 112x112)**: [ms1m-arcface-dataset](https://www.kaggle.com/datasets/yakhyokhuja/ms1m-arcface-dataset) (from InsightFace)
**Training code**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition)
---
#### WebFace600K
Medium-scale face recognition dataset from the WebFace series.
| Property | Value |
| -------- | ------- |
| Images | ~600K |
| Used by | ArcFace |
!!! info "Source"
**Origin**: [InsightFace](https://github.com/deepinsight/insightface)
**Paper**: [ArcFace: Additive Angular Margin Loss for Deep Face Recognition](https://arxiv.org/abs/1801.07698)
---
#### WebFace4M / WebFace12M
Large-scale face recognition datasets from the WebFace260M collection. Used for training AdaFace models with adaptive quality-aware margin.
| Property | WebFace4M | WebFace12M |
| -------- | ------------- | -------------- |
| Images | ~4M | ~12M |
| Used by | AdaFace IR_18 | AdaFace IR_101 |
!!! info "Source"
**Paper**: [AdaFace: Quality Adaptive Margin for Face Recognition](https://arxiv.org/abs/2204.00964)
**Original code**: [mk-minchul/AdaFace](https://github.com/mk-minchul/AdaFace)
---
#### CASIA-WebFace
Smaller-scale face recognition dataset suitable for academic research and lighter training runs.
| Property | Value |
| ---------- | ------------------------------ |
| Identities | 10.6K |
| Images | 491K |
| Format | Aligned and cropped to 112x112 |
| Used by | Alternative training set |
!!! info "Download"
**Kaggle (aligned 112x112)**: [webface-112x112](https://www.kaggle.com/datasets/yakhyokhuja/webface-112x112) (from OpenSphere)
---
#### VGGFace2
Large-scale dataset with wide variations in pose, age, illumination, ethnicity, and profession.
| Property | Value |
| ---------- | ------------------------------ |
| Identities | 8.6K |
| Images | 3.1M |
| Format | Aligned and cropped to 112x112 |
| Used by | Alternative training set |
!!! info "Download"
**Kaggle (aligned 112x112)**: [vggface2-112x112](https://www.kaggle.com/datasets/yakhyokhuja/vggface2-112x112) (from OpenSphere)
---
### Facial Landmarks
#### WFLW
Wider Facial Landmarks in-the-Wild — a 98-point landmark dataset whose images come from
WIDER FACE. Used to train the supervised PIPNet 98-point variant shipped with UniFace.
| Property | Value |
| ---------- | -------------------------------------- |
| Images | 10,000 (7,500 train / 2,500 test) |
| Annotation | 98 manually labeled landmarks per face |
| Used by | PIPNet WFLW-98 |
!!! info "Reference"
**Project page**: [WFLW dataset](https://wywu.github.io/projects/LAB/WFLW.html)
---
#### 300W + CelebA
The 68-point PIPNet variant is trained in a generalizable semi-supervised setting (GSSL):
labeled images come from 300W and unlabeled images come from CelebA.
| Property | Value |
| --------------- | -------------------------------------------------------------------------------- |
| Labeled images | 3,837 (3,148 train: LFPW train + HELEN train + AFW; 689 test: LFPW test + HELEN test + iBUG) |
| Unlabeled | 202,599 (full CelebA; bounding boxes from RetinaFace per the PIPNet paper) |
| Annotation | 68-point iBUG layout |
| Used by | PIPNet 300W+CelebA-68 |
!!! info "Reference"
**Paper**: [PIPNet (Pixel-in-Pixel Net)](https://arxiv.org/abs/2003.03771) (IJCV 2021)
---
### Gaze Estimation
#### Gaze360
Large-scale gaze estimation dataset collected in indoor and outdoor environments with diverse head poses and wide gaze ranges (up to 360 degrees).
| Property | Value |
| ----------- | --------------------- |
| Subjects | 238 |
| Environment | Indoor and outdoor |
| Used by | All MobileGaze models |
!!! info "Download & Preprocessing"
**Download**: [gaze360.csail.mit.edu/download.php](https://gaze360.csail.mit.edu/download.php)
**Preprocessing**: [GazeHub - Gaze360](https://phi-ai.buaa.edu.cn/Gazehub/3D-dataset/#gaze360)
!!! note "UniFace Models"
All MobileGaze models shipped with UniFace are trained exclusively on Gaze360 for 200 epochs.
**Dataset structure:**
```
data/
└── Gaze360/
├── Image/
└── Label/
```
---
#### MPIIFaceGaze
Dataset for appearance-based gaze estimation from laptop webcam images of participants during everyday laptop usage. Supported by the gaze estimation training code but not used for the UniFace pretrained weights.
| Property | Value |
| ----------- | ---------------------------------------- |
| Subjects | 15 |
| Environment | Everyday laptop usage |
| Used by | Supported (not used for UniFace weights) |
!!! info "Download & Preprocessing"
**Download**: [MPIIFaceGaze download page](https://www.mpi-inf.mpg.de/departments/computer-vision-and-machine-learning/research/gaze-based-human-computer-interaction/its-written-all-over-your-face-full-face-appearance-based-gaze-estimation)
**Preprocessing**: [GazeHub - MPIIFaceGaze](https://phi-ai.buaa.edu.cn/Gazehub/3D-dataset/#mpiifacegaze)
**Dataset structure:**
```
data/
└── MPIIFaceGaze/
├── Image/
└── Label/
```
---
### Head Pose Estimation
#### 300W-LP
Large-scale synthesized face dataset with large pose variations, generated from 300W by face profiling. Used for training head pose estimation models.
| Property | Value |
| ----------- | ----------------------------- |
| Images | ~122,000 (synthesized) |
| Source | 300W (profiled) |
| Pose range | ±90° yaw |
| Evaluation | AFLW2000 |
| Used by | All HeadPose models |
!!! info "Download & Reference"
**Paper**: [Face Alignment Across Large Poses: A 3D Solution](https://arxiv.org/abs/1511.07212)
**Training code**: [yakhyo/head-pose-estimation](https://github.com/yakhyo/head-pose-estimation)
!!! note "UniFace Models"
All HeadPose models shipped with UniFace are trained on 300W-LP and evaluated on AFLW2000.
---
### Face Parsing
#### CelebAMask-HQ
High-quality face parsing dataset with pixel-level annotations for 19 facial component classes.
| Property | Value |
| ---------- | ---------------------------- |
| Images | 30,000 |
| Classes | 19 facial components |
| Resolution | High quality |
| Used by | BiSeNet (ResNet18, ResNet34) |
!!! info "Source"
**GitHub**: [switchablenorms/CelebAMask-HQ](https://github.com/switchablenorms/CelebAMask-HQ)
**Training code**: [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing)
**Dataset structure:**
```
dataset/
├── images/ # Input face images
│ ├── image1.jpg
│ └── ...
└── labels/ # Segmentation masks
├── image1.png
└── ...
```
---
### Attribute Analysis
#### CelebA
Large-scale face attributes dataset widely used for training age and gender prediction models.
| Property | Value |
| ---------- | -------------------- |
| Images | ~200K |
| Attributes | 40 binary attributes |
| Used by | AgeGender |
!!! info "Reference"
**Paper**: [Deep Learning Face Attributes in the Wild](https://arxiv.org/abs/1411.7766)
---
#### FairFace
Face attribute dataset designed for balanced representation across race, gender, and age groups. Provides more equitable predictions compared to imbalanced datasets.
| Property | Value |
| ---------- | ----------------------------------- |
| Attributes | Race (7), Gender (2), Age Group (9) |
| Used by | FairFace |
| License | CC BY 4.0 |
!!! info "Reference"
**Paper**: [FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age](https://arxiv.org/abs/1908.04913)
**ONNX inference**: [yakhyo/fairface-onnx](https://github.com/yakhyo/fairface-onnx)
---
#### AffectNet
Large-scale facial expression dataset for emotion recognition training.
| Property | Value |
| -------- | ----------------------------------------------------------------------- |
| Classes | 7 or 8 (Neutral, Happy, Sad, Surprise, Fear, Disgust, Angry + Contempt) |
| Used by | Emotion (AFFECNET7, AFFECNET8) |
!!! info "Reference"
**Paper**: [AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild](https://ieeexplore.ieee.org/document/8013713)
---
## Evaluation Benchmarks
### Face Detection
#### WIDER FACE Validation Set
The standard benchmark for face detection models. Results are reported across three difficulty subsets.
| Subset | Criteria |
| ------ | --------------------------------------------- |
| Easy | Large, clear, unoccluded faces |
| Medium | Moderate scale and occlusion |
| Hard | Small, heavily occluded, or challenging faces |
See [Model Zoo - Detection](models.md#face-detection-models) for per-model accuracy on each subset.
---
### Face Recognition
Recognition models are evaluated across multiple benchmarks. Aligned 112x112 validation datasets are available as a single download.
!!! info "Download"
**Kaggle**: [agedb-30-calfw-cplfw-lfw-aligned-112x112](https://www.kaggle.com/datasets/yakhyokhuja/agedb-30-calfw-cplfw-lfw-aligned-112x112)
| Benchmark | Description | Used by |
| ------------ | ----------------------------------------------------------------- | ------------------------------- |
| **LFW** | Labeled Faces in the Wild - standard face verification benchmark | ArcFace, MobileFace, SphereFace |
| **CALFW** | Cross-Age LFW - face verification across age gaps | MobileFace, SphereFace |
| **CPLFW** | Cross-Pose LFW - face verification across pose variations | MobileFace, SphereFace |
| **AgeDB-30** | Age database with 30-year age gaps | ArcFace, MobileFace, SphereFace |
| **CFP-FP** | Celebrities in Frontal-Profile - frontal vs. profile verification | ArcFace |
| **IJB-B** | IARPA Janus Benchmark B - TAR@FAR=0.01% | AdaFace |
| **IJB-C** | IARPA Janus Benchmark C - TAR@FAR=1e-4 | AdaFace, ArcFace |
See [Model Zoo - Recognition](models.md#face-recognition-models) for per-model accuracy on each benchmark.
---
### Gaze Estimation
| Benchmark | Metric | Description |
| -------------------- | ------------- | -------------------------------------------- |
| **Gaze360 test set** | MAE (degrees) | Mean Absolute Error in gaze angle prediction |
See [Model Zoo - Gaze](models.md#gaze-estimation-models) for per-model MAE scores.
---
## Training Repositories
For training your own models or reproducing results, see the following repositories:
| Task | Repository | Datasets Supported |
| ----------- | ------------------------------------------------------------------------- | ------------------------------- |
| Detection | [yakhyo/retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) | WIDER FACE |
| Recognition | [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) | MS1MV2, CASIA-WebFace, VGGFace2 |
| Gaze | [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation) | Gaze360, MPIIFaceGaze |
| Parsing | [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing) | CelebAMask-HQ |

View File

@@ -10,7 +10,7 @@ template: home.html
# UniFace { .hero-title }
<p class="hero-subtitle">All-in-One Open-Source Face Analysis Library</p>
<p class="hero-subtitle">A Unified Face Analysis Library for Python</p>
[![PyPI Version](https://img.shields.io/pypi/v/uniface.svg?label=Version)](https://pypi.org/project/uniface/)
[![Python Version](https://img.shields.io/badge/Python-3.10%2B-blue)](https://www.python.org/)
@@ -20,7 +20,7 @@ template: home.html
[![Kaggle Badge](https://img.shields.io/badge/Notebooks-Kaggle?label=Kaggle&color=blue)](https://www.kaggle.com/yakhyokhuja/code)
[![Discord](https://img.shields.io/badge/Discord-Join%20Server-5865F2?logo=discord&logoColor=white)](https://discord.gg/wdzrjr7R5j)
<!-- <img src="https://raw.githubusercontent.com/yakhyo/uniface/main/.github/logos/new/uniface_rounded_q80.webp" alt="UniFace - All-in-One Open-Source Face Analysis Library" style="max-width: 70%; margin: 1rem 0;"> -->
<!-- <img src="https://raw.githubusercontent.com/yakhyo/uniface/main/.github/logos/uniface_rounded_q80.webp" alt="UniFace - A Unified Face Analysis Library for Python" style="max-width: 70%; margin: 1rem 0;"> -->
[Get Started](quickstart.md){ .md-button .md-button--primary }
[View on GitHub](https://github.com/yakhyo/uniface){ .md-button }
@@ -31,17 +31,17 @@ template: home.html
<div class="feature-card" markdown>
### :material-face-recognition: Face Detection
ONNX-optimized detectors (RetinaFace, SCRFD, YOLO) with 5-point landmarks.
RetinaFace, SCRFD, and YOLO detectors with 5-point landmarks.
</div>
<div class="feature-card" markdown>
### :material-account-check: Face Recognition
AdaFace, ArcFace, MobileFace, and SphereFace embeddings for identity verification.
AdaFace, ArcFace, EdgeFace, MobileFace, and SphereFace embeddings for identity verification.
</div>
<div class="feature-card" markdown>
### :material-map-marker: Landmarks
Accurate 106-point facial landmark localization for detailed face analysis.
Dense facial landmark localization — 106-point (2d106det) and 98 / 68-point (PIPNet) variants.
</div>
<div class="feature-card" markdown>
@@ -59,6 +59,11 @@ BiSeNet semantic segmentation with 19 facial component classes.
Real-time gaze direction prediction with MobileGaze models.
</div>
<div class="feature-card" markdown>
### :material-axis-arrow: Head Pose
3D head orientation (pitch, yaw, roll) estimation with 6D rotation models.
</div>
<div class="feature-card" markdown>
### :material-motion-play: Tracking
Multi-object tracking with BYTETracker for persistent face IDs across video frames.
@@ -74,31 +79,35 @@ Face liveness detection with MiniFASNet to prevent fraud.
Face anonymization with 5 blur methods for privacy protection.
</div>
<div class="feature-card" markdown>
### :material-database-search: Vector Indexing
FAISS-backed embedding store for fast multi-identity face search.
</div>
</div>
---
## Installation
=== "Standard"
UniFace uses portable model runtimes for consistent inference across macOS, Linux, and Windows. Most core components run through **ONNX Runtime**, while optional components may use **PyTorch** where appropriate.
```bash
pip install uniface
```
**CPU / Apple Silicon**
```bash
pip install uniface[cpu]
```
=== "GPU (CUDA)"
**GPU (NVIDIA CUDA)**
```bash
pip install uniface[gpu]
```
```bash
pip install uniface[gpu]
```
=== "From Source"
```bash
git clone https://github.com/yakhyo/uniface.git
cd uniface
pip install -e .
```
**From Source**
```bash
git clone https://github.com/yakhyo/uniface.git
cd uniface
pip install -e ".[cpu]" # or .[gpu] for CUDA
```
---

View File

@@ -11,15 +11,27 @@ This guide covers all installation options for UniFace.
---
## Why Two Extras?
`onnxruntime` (CPU) and `onnxruntime-gpu` (CUDA) both own the same Python namespace.
Installing both at the same time causes file conflicts and silent provider mismatches.
UniFace exposes them as separate, mutually exclusive extras so you install exactly one.
---
## Quick Install
The simplest way to install UniFace:
=== "CPU / Apple Silicon"
```bash
pip install uniface
```
```bash
pip install uniface[cpu]
```
This installs the CPU version with all core dependencies.
=== "NVIDIA GPU (CUDA)"
```bash
pip install uniface[gpu]
```
---
@@ -27,14 +39,16 @@ This installs the CPU version with all core dependencies.
### macOS (Apple Silicon - M1/M2/M3/M4)
For Apple Silicon Macs, the standard installation automatically includes ARM64 optimizations:
The `[cpu]` extra pulls in the standard `onnxruntime` package, which has native ARM64 support
built in since version 1.13. No additional setup is needed for CoreML acceleration.
```bash
pip install uniface
pip install uniface[cpu]
```
!!! tip "Native Performance"
The base `onnxruntime` package has native Apple Silicon support with ARM64 optimizations built-in since version 1.13+. No additional configuration needed.
`onnxruntime` 1.13+ includes ARM64 optimizations out of the box.
UniFace automatically detects and enables `CoreMLExecutionProvider` on Apple Silicon.
Verify ARM64 installation:
@@ -47,19 +61,22 @@ python -c "import platform; print(platform.machine())"
### Linux/Windows with NVIDIA GPU
For CUDA acceleration on NVIDIA GPUs:
```bash
pip install uniface[gpu]
```
This installs `onnxruntime-gpu`, which includes both `CUDAExecutionProvider` and
`CPUExecutionProvider` — no separate CPU package is needed.
**Requirements:**
- CUDA 11.x or 12.x
- NVIDIA driver compatible with your CUDA version
- CUDA 11.x or 12.x toolkit
- cuDNN 8.x
!!! info "CUDA Compatibility"
See [ONNX Runtime GPU requirements](https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html) for detailed compatibility matrix.
See the [ONNX Runtime GPU compatibility matrix](https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html)
for matching CUDA and cuDNN versions.
Verify GPU installation:
@@ -74,7 +91,7 @@ print("Available providers:", ort.get_available_providers())
### CPU-Only (All Platforms)
```bash
pip install uniface
pip install uniface[cpu]
```
Works on all platforms with automatic CPU fallback.
@@ -88,31 +105,60 @@ For development or the latest features:
```bash
git clone https://github.com/yakhyo/uniface.git
cd uniface
pip install -e .
pip install -e ".[cpu]" # CPU / Apple Silicon
pip install -e ".[gpu]" # NVIDIA GPU
```
With development dependencies:
```bash
pip install -e ".[dev]"
pip install -e ".[cpu,dev]"
```
---
## FAISS Vector Store
For fast multi-identity face search using a FAISS vector store:
```bash
pip install faiss-cpu # CPU
pip install faiss-gpu # NVIDIA GPU (CUDA)
```
See the [Stores module](modules/stores.md) for usage.
---
## Dependencies
UniFace has minimal dependencies:
UniFace has minimal core dependencies:
| Package | Purpose |
|---------|---------|
| `numpy` | Array operations |
| `opencv-python` | Image processing |
| `onnx` | ONNX model format support |
| `onnxruntime` | Model inference |
| `scikit-image` | Geometric transforms |
| `scipy` | Signal processing |
| `requests` | Model download |
| `tqdm` | Progress bars |
**Runtime extras (install exactly one):**
| Extra | Package | Use case |
|-------|---------|---------|
| `uniface[cpu]` | `onnxruntime` | CPU inference, Apple Silicon |
| `uniface[gpu]` | `onnxruntime-gpu` | NVIDIA CUDA inference |
**Other optional packages:**
| Package | Install | Purpose |
|---------|---------|---------|
| `faiss-cpu` / `faiss-gpu` | `pip install faiss-cpu` | FAISS vector store |
| `torch` | `pip install torch` | Emotion model (TorchScript) |
| `torchvision` | `pip install torchvision` | Faster NMS for YOLO detectors |
---
## Verify Installation
@@ -135,17 +181,81 @@ print("Installation successful!")
---
## Upgrading
When upgrading UniFace, stay consistent with your runtime extra:
```bash
pip install --upgrade uniface[cpu] # or uniface[gpu]
```
If you are switching from CPU to GPU (or vice versa):
```bash
pip uninstall onnxruntime onnxruntime-gpu -y
pip install uniface[gpu] # install the one you want
```
---
## Pre-release Versions
UniFace ships release candidates and betas to PyPI ahead of stable releases (versions like `0.7.0rc1`, `0.7.0b1`, `0.7.0a1`). These let you try upcoming features before they're finalized.
`pip install uniface` always installs the latest **stable** release. To opt in to pre-releases:
```bash
# Latest pre-release (if newer than latest stable)
pip install uniface[cpu] --pre
# A specific pre-release
pip install uniface[cpu]==0.7.0rc1
```
Pre-releases are not recommended for production — APIs may still change before the stable release.
---
## Troubleshooting
### onnxruntime Not Found
If you see:
```
ImportError: onnxruntime is not installed. Install it with one of:
pip install uniface[cpu] # CPU / Apple Silicon
pip install uniface[gpu] # NVIDIA GPU (CUDA)
```
You installed uniface without an extra. Run the appropriate command above.
---
### Both onnxruntime and onnxruntime-gpu Installed
If you previously ran `pip install uniface[gpu]` on top of a `pip install uniface[cpu]`
(or vice versa), you may have both packages installed simultaneously, which causes conflicts.
Fix it with:
```bash
pip uninstall onnxruntime onnxruntime-gpu -y
pip install uniface[gpu] # or uniface[cpu]
```
---
### Import Errors
If you encounter import errors, ensure you're using Python 3.10+:
Ensure you're using Python 3.10+:
```bash
python --version
# Should show: Python 3.10.x or higher
```
---
### Model Download Issues
Models are automatically downloaded on first use. If downloads fail:
@@ -159,6 +269,25 @@ model_path = verify_model_weights(RetinaFaceWeights.MNET_V2)
print(f"Model downloaded to: {model_path}")
```
---
### CUDA Not Detected
1. Verify CUDA installation:
```bash
nvidia-smi
```
2. Check CUDA version compatibility with ONNX Runtime.
3. Reinstall the GPU extra cleanly:
```bash
pip uninstall onnxruntime onnxruntime-gpu -y
pip install uniface[gpu]
```
---
### Performance Issues on Mac
Verify you're using the ARM64 build (not x86_64 via Rosetta):

View File

@@ -20,5 +20,7 @@ UniFace is released under the [MIT License](https://opensource.org/licenses/MIT)
| SphereFace | [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) | MIT |
| BiSeNet | [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing) | MIT |
| MobileGaze | [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation) | MIT |
| MODNet | [yakhyo/modnet](https://github.com/yakhyo/modnet) | Apache-2.0 |
| MiniFASNet | [yakhyo/face-anti-spoofing](https://github.com/yakhyo/face-anti-spoofing) | Apache-2.0 |
| FairFace | [yakhyo/fairface-onnx](https://github.com/yakhyo/fairface-onnx) | CC BY 4.0 |
| PIPNet | [yakhyo/pipnet-onnx](https://github.com/yakhyo/pipnet-onnx) — meanface tables vendored from [jhb86253817/PIPNet](https://github.com/jhb86253817/PIPNet) | MIT |

View File

@@ -8,7 +8,7 @@ Complete guide to all available models and their performance characteristics.
### RetinaFace Family
RetinaFace models are trained on the WIDER FACE dataset.
RetinaFace models are trained on the [WIDER FACE](datasets.md#wider-face) dataset.
| Model Name | Params | Size | Easy | Medium | Hard |
| -------------- | ------ | ----- | ------ | ------ | ------ |
@@ -28,7 +28,7 @@ RetinaFace models are trained on the WIDER FACE dataset.
### SCRFD Family
SCRFD (Sample and Computation Redistribution for Efficient Face Detection) models trained on WIDER FACE dataset.
SCRFD (Sample and Computation Redistribution for Efficient Face Detection) models trained on [WIDER FACE](datasets.md#wider-face) dataset.
| Model Name | Params | Size | Easy | Medium | Hard |
| ---------------- | ------ | ----- | ------ | ------ | ------ |
@@ -44,7 +44,7 @@ SCRFD (Sample and Computation Redistribution for Efficient Face Detection) model
### YOLOv5-Face Family
YOLOv5-Face models provide detection with 5-point facial landmarks, trained on WIDER FACE dataset.
YOLOv5-Face models provide detection with 5-point facial landmarks, trained on [WIDER FACE](datasets.md#wider-face) dataset.
| Model Name | Size | Easy | Medium | Hard |
| -------------- | ---- | ------ | ------ | ------ |
@@ -93,7 +93,7 @@ Face recognition using adaptive margin based on image quality.
| `IR_101` | IR-101 | WebFace12M | 249 MB | - | 97.66% |
!!! info "Training Data & Accuracy"
**Dataset**: WebFace4M (4M images) / WebFace12M (12M images)
**Dataset**: [WebFace4M / WebFace12M](datasets.md#webface4m--webface12m) (4M / 12M images)
**Accuracy**: IJB-B and IJB-C benchmarks, TAR@FAR=0.01%
@@ -113,7 +113,7 @@ Face recognition using additive angular margin loss.
| `RESNET` | ResNet50 | 43.6M | 166MB | 99.83% | 99.33% | 98.23% | 97.25% |
!!! info "Training Data"
**Dataset**: Trained on WebFace600K (600K images)
**Dataset**: Trained on [WebFace600K](datasets.md#webface600k) (600K images)
**Accuracy**: IJB-C accuracy reported as TAR@FAR=1e-4
@@ -131,7 +131,7 @@ Lightweight face recognition models with MobileNet backbones.
| `MNET_V3_LARGE` | MobileNetV3-L | 3.52M | 10MB | 99.53% | 94.56% | 86.79% | 95.13% |
!!! info "Training Data"
**Dataset**: Trained on MS1M-V2 (5.8M images, 85K identities)
**Dataset**: Trained on [MS1MV2](datasets.md#ms1mv2) (5.8M images, 85K identities)
**Accuracy**: Evaluated on LFW, CALFW, CPLFW, and AgeDB-30 benchmarks
@@ -147,7 +147,7 @@ Face recognition using angular softmax loss.
| `SPHERE36` | Sphere36 | 34.6M | 92MB | 99.72% | 95.64% | 89.92% | 96.83% |
!!! info "Training Data"
**Dataset**: Trained on MS1M-V2 (5.8M images, 85K identities)
**Dataset**: Trained on [MS1MV2](datasets.md#ms1mv2) (5.8M images, 85K identities)
**Accuracy**: Evaluated on LFW, CALFW, CPLFW, and AgeDB-30 benchmarks
@@ -156,6 +156,24 @@ Face recognition using angular softmax loss.
---
### EdgeFace
Efficient face recognition designed for edge devices, using EdgeNeXt backbone with optional LoRA compression.
| Model Name | Backbone | Params | MFLOPs | Size | LFW | CALFW | CPLFW | CFP-FP | AgeDB-30 |
| --------------- | -------- | ------ | ------ | ----- | ------ | ------ | ------ | ------ | -------- |
| `XXS` :material-check-circle: | EdgeNeXt | 1.24M | 94 | ~5 MB | 99.57% | 94.83% | 90.27% | 93.63% | 94.92% |
| `XS_GAMMA_06` | EdgeNeXt | 1.77M | 154 | ~7 MB | 99.73% | 95.28% | 91.58% | 94.71% | 96.08% |
| `S_GAMMA_05` | EdgeNeXt | 3.65M | 306 | ~14 MB | 99.78% | 95.55% | 92.48% | 95.74% | 97.03% |
| `BASE` | EdgeNeXt | 18.2M | 1399 | ~70 MB | 99.83% | 96.07% | 93.75% | 97.01% | 97.60% |
!!! info "Training Data & Reference"
**Paper**: [EdgeFace: Efficient Face Recognition Model for Edge Devices](https://arxiv.org/abs/2307.01838v2) (IEEE T-BIOM 2024)
**Source**: [github.com/otroshi/edgeface](https://github.com/otroshi/edgeface) | [github.com/yakhyo/edgeface-onnx](https://github.com/yakhyo/edgeface-onnx)
---
## Facial Landmark Models
### 106-Point Landmark Detection
@@ -178,6 +196,26 @@ Facial landmark localization model.
---
### PIPNet (98 / 68 points)
PIPNet (Pixel-in-Pixel Net) facial landmark detector. ResNet-18 backbone, 256×256 input.
| Model Name | Points | Backbone | Dataset | Size |
| ---------- | ------ | -------- | ------- | ---- |
| `WFLW_98` :material-check-circle: | 98 | ResNet-18 | WFLW (supervised) | 47 MB |
| `DW300_CELEBA_68` | 68 | ResNet-18 | 300W+CelebA (GSSL) | 46 MB |
!!! info "Reference"
**Paper**: [PIPNet: Towards Efficient Facial Landmark Detection in the Wild](https://arxiv.org/abs/2003.03771) (IJCV 2021)
**Source**: [yakhyo/pipnet-onnx](https://github.com/yakhyo/pipnet-onnx) — ONNX export from [jhb86253817/PIPNet](https://github.com/jhb86253817/PIPNet)
!!! note "Auto-selected meanface"
Both variants share the same architecture; the number of landmarks (and the matching
meanface table) is inferred from the ONNX output channel count.
---
## Attribute Analysis Models
### Age & Gender Detection
@@ -187,7 +225,7 @@ Facial landmark localization model.
| `AgeGender` | Age, Gender | 2.1M | 8MB |
!!! info "Training Data"
**Dataset**: Trained on CelebA
**Dataset**: Trained on [CelebA](datasets.md#celeba)
!!! warning "Accuracy Note"
Accuracy varies by demographic and image quality. Test on your specific use case.
@@ -201,7 +239,7 @@ Facial landmark localization model.
| `FairFace` | Race, Gender, Age Group | - | 44MB |
!!! info "Training Data"
**Dataset**: Trained on FairFace dataset with balanced demographics
**Dataset**: Trained on [FairFace](datasets.md#fairface) dataset with balanced demographics
!!! tip "Equitable Predictions"
FairFace provides more equitable predictions across different racial and gender groups.
@@ -224,7 +262,7 @@ Facial landmark localization model.
**Classes (8)**: Above + Contempt
!!! info "Training Data"
**Dataset**: Trained on AffectNet
**Dataset**: Trained on [AffectNet](datasets.md#affectnet)
!!! note "Accuracy Note"
Emotion detection accuracy depends heavily on facial expression clarity and cultural context.
@@ -235,7 +273,7 @@ Facial landmark localization model.
### MobileGaze Family
Gaze direction prediction models trained on Gaze360 dataset. Returns pitch (vertical) and yaw (horizontal) angles in radians.
Gaze direction prediction models trained on [Gaze360](datasets.md#gaze360) dataset. Returns pitch (vertical) and yaw (horizontal) angles in radians.
| Model Name | Params | Size | MAE* |
| -------------- | ------ | ------- | ----- |
@@ -248,7 +286,7 @@ Gaze direction prediction models trained on Gaze360 dataset. Returns pitch (vert
*MAE (Mean Absolute Error) in degrees on Gaze360 test set - lower is better
!!! info "Training Data"
**Dataset**: Trained on Gaze360 (indoor/outdoor scenes with diverse head poses)
**Dataset**: Trained on [Gaze360](datasets.md#gaze360) (indoor/outdoor scenes with diverse head poses)
**Training**: 200 epochs with classification-based approach (binned angles)
@@ -257,6 +295,33 @@ Gaze direction prediction models trained on Gaze360 dataset. Returns pitch (vert
---
## Head Pose Estimation Models
### HeadPose Family
Head pose estimation models using 6D rotation representation. Trained on [300W-LP](datasets.md#300w-lp) dataset, evaluated on AFLW2000. Returns pitch, yaw, and roll angles in degrees.
| Model Name | Backbone | Size | MAE* |
| -------------- | -------- | ------- | ----- |
| `RESNET18` :material-check-circle: | ResNet18 | 43 MB | 5.22° |
| `RESNET34` | ResNet34 | 82 MB | 5.07° |
| `RESNET50` | ResNet50 | 91 MB | 4.83° |
| `MOBILENET_V2` | MobileNetV2 | 9.6 MB | 5.72° |
| `MOBILENET_V3_SMALL` | MobileNetV3-Small | 4.8 MB | 6.31° |
| `MOBILENET_V3_LARGE` | MobileNetV3-Large | 16 MB | 5.58° |
*MAE (Mean Absolute Error) in degrees on AFLW2000 test set — lower is better
!!! info "Training Data"
**Dataset**: Trained on [300W-LP](datasets.md#300w-lp) (synthesized large-pose faces from 300W)
**Method**: 6D rotation representation (rotation matrix → Euler angles)
!!! note "Input Requirements"
Requires face crop as input. Use face detection first to obtain bounding boxes.
---
## Face Parsing Models
### BiSeNet Family
@@ -269,7 +334,7 @@ BiSeNet (Bilateral Segmentation Network) models for semantic face parsing. Segme
| `RESNET34` | 24.1M | 89.2 MB | 19 |
!!! info "Training Data"
**Dataset**: Trained on CelebAMask-HQ
**Dataset**: Trained on [CelebAMask-HQ](datasets.md#celebamask-hq)
**Architecture**: BiSeNet with ResNet backbone
@@ -326,6 +391,36 @@ XSeg from DeepFaceLab outputs masks for face regions. Requires 5-point landmarks
---
## Portrait Matting Models
### MODNet
MODNet (Real-Time Trimap-Free Portrait Matting) produces soft alpha mattes from full images without requiring a trimap. Uses MobileNetV2 backbone with low-resolution, high-resolution, and fusion branches.
| Model Name | Variant | Size | Use Case |
| ---------- | ------- | ---- | -------- |
| `PHOTOGRAPHIC` :material-check-circle: | High-quality | 25 MB | Portrait photos |
| `WEBCAM` | Real-time | 25 MB | Webcam feeds |
!!! info "Model Details"
**Paper**: [MODNet: Real-Time Trimap-Free Portrait Matting via Objective Decomposition](https://arxiv.org/abs/2011.11961) (AAAI 2022)
**Source**: [yakhyo/modnet](https://github.com/yakhyo/modnet) — ported weights and clean inference codebase
**Output**: Alpha matte `(H, W)` in `[0, 1]`
**Applications:**
- Background removal / replacement
- Green screen compositing
- Video conferencing virtual backgrounds
- Portrait editing
!!! note "Input Requirements"
Operates on full images (not face crops). No trimap or face detection required.
---
## Anti-Spoofing Models
### MiniFASNet Family
@@ -372,10 +467,13 @@ See [Model Cache & Offline Use](concepts/model-cache-offline.md) for full detail
- **AdaFace ONNX**: [yakhyo/adaface-onnx](https://github.com/yakhyo/adaface-onnx) - ONNX export and inference
- **Face Recognition Training**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) - ArcFace, MobileFace, SphereFace training code
- **Gaze Estimation Training**: [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation) - MobileGaze training code and pretrained weights
- **Head Pose Estimation**: [yakhyo/head-pose-estimation](https://github.com/yakhyo/head-pose-estimation) - 6D rotation head pose estimation training and ONNX models
- **Face Parsing Training**: [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing) - BiSeNet training code and pretrained weights
- **Face Segmentation**: [yakhyo/face-segmentation](https://github.com/yakhyo/face-segmentation) - XSeg ONNX Inference
- **Portrait Matting**: [yakhyo/modnet](https://github.com/yakhyo/modnet) - MODNet ported weights and inference (from [ZHKKKe/MODNet](https://github.com/ZHKKKe/MODNet))
- **Face Anti-Spoofing**: [yakhyo/face-anti-spoofing](https://github.com/yakhyo/face-anti-spoofing) - MiniFASNet ONNX inference (weights from [minivision-ai/Silent-Face-Anti-Spoofing](https://github.com/minivision-ai/Silent-Face-Anti-Spoofing))
- **FairFace**: [yakhyo/fairface-onnx](https://github.com/yakhyo/fairface-onnx) - FairFace ONNX inference for race, gender, age prediction
- **PIPNet**: [yakhyo/pipnet-onnx](https://github.com/yakhyo/pipnet-onnx) - PIPNet ONNX export and inference (from [jhb86253817/PIPNet](https://github.com/jhb86253817/PIPNet))
- **InsightFace**: [deepinsight/insightface](https://github.com/deepinsight/insightface) - Model architectures and pretrained weights
### Papers
@@ -386,4 +484,6 @@ See [Model Cache & Offline Use](concepts/model-cache-offline.md) for full detail
- **AdaFace**: [AdaFace: Quality Adaptive Margin for Face Recognition](https://arxiv.org/abs/2204.00964)
- **ArcFace**: [Additive Angular Margin Loss for Deep Face Recognition](https://arxiv.org/abs/1801.07698)
- **SphereFace**: [Deep Hypersphere Embedding for Face Recognition](https://arxiv.org/abs/1704.08063)
- **MODNet**: [Real-Time Trimap-Free Portrait Matting via Objective Decomposition](https://arxiv.org/abs/2011.11961)
- **BiSeNet**: [Bilateral Segmentation Network for Real-time Semantic Segmentation](https://arxiv.org/abs/1808.00897)
- **PIPNet**: [Towards Efficient Facial Landmark Detection in the Wild](https://arxiv.org/abs/2003.03771)

View File

@@ -2,6 +2,11 @@
Facial attribute analysis for age, gender, race, and emotion detection.
<figure markdown="span">
![Age & Gender Prediction](https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/age_gender.jpg){ width="100%" }
<figcaption>Age and gender prediction with detection bounding boxes</figcaption>
</figure>
---
## Available Models
@@ -30,9 +35,10 @@ age_gender = AgeGender()
faces = detector.detect(image)
for face in faces:
result = age_gender.predict(image, face.bbox)
result = age_gender.predict(image, face)
print(f"Gender: {result.sex}") # "Female" or "Male"
print(f"Age: {result.age} years")
# face.gender and face.age are also set automatically
```
### Output
@@ -64,10 +70,11 @@ fairface = FairFace()
faces = detector.detect(image)
for face in faces:
result = fairface.predict(image, face.bbox)
result = fairface.predict(image, face)
print(f"Gender: {result.sex}")
print(f"Age Group: {result.age_group}")
print(f"Race: {result.race}")
# face.gender, face.age_group, face.race are also set automatically
```
### Output
@@ -132,7 +139,7 @@ emotion = Emotion(model_name=DDAMFNWeights.AFFECNET7)
faces = detector.detect(image)
for face in faces:
result = emotion.predict(image, face.landmarks)
result = emotion.predict(image, face)
print(f"Emotion: {result.emotion}")
print(f"Confidence: {result.confidence:.2%}")
```
@@ -179,6 +186,22 @@ emotion = Emotion(model_name=DDAMFNWeights.AFFECNET8)
---
## Factory Function
Use `create_attribute_predictor()` for dynamic model selection:
```python
from uniface import create_attribute_predictor
age_gender = create_attribute_predictor('age_gender')
fairface = create_attribute_predictor('fairface')
emotion = create_attribute_predictor('emotion')
```
Available model names: `'age_gender'`, `'fairface'`, `'emotion'`.
---
## Combining Models
### Full Attribute Analysis
@@ -195,10 +218,10 @@ faces = detector.detect(image)
for face in faces:
# Get exact age from AgeGender
ag_result = age_gender.predict(image, face.bbox)
ag_result = age_gender.predict(image, face)
# Get race from FairFace
ff_result = fairface.predict(image, face.bbox)
ff_result = fairface.predict(image, face)
print(f"Gender: {ag_result.sex}")
print(f"Exact Age: {ag_result.age}")
@@ -215,7 +238,7 @@ from uniface.detection import RetinaFace
analyzer = FaceAnalyzer(
RetinaFace(),
age_gender=AgeGender(),
attributes=[AgeGender()],
)
faces = analyzer.analyze(image)
@@ -257,7 +280,7 @@ def draw_attributes(image, face, result):
# Usage
for face in faces:
result = age_gender.predict(image, face.bbox)
result = age_gender.predict(image, face)
image = draw_attributes(image, face, result)
cv2.imwrite("attributes.jpg", image)

View File

@@ -2,6 +2,11 @@
Face detection is the first step in any face analysis pipeline. UniFace provides four detection models.
<figure markdown="span">
![Face Detection](https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/detection.jpg){ width="100%" }
<figcaption>SCRFD detection with corner-style bounding boxes and 5-point landmarks</figcaption>
</figure>
---
## Available Models
@@ -264,10 +269,8 @@ from uniface.draw import draw_detections
draw_detections(
image=image,
bboxes=[f.bbox for f in faces],
scores=[f.confidence for f in faces],
landmarks=[f.landmarks for f in faces],
vis_threshold=0.6
faces=faces,
vis_threshold=0.6,
)
cv2.imwrite("result.jpg", image)
@@ -288,6 +291,6 @@ python tools/detect.py --source image.jpg
## See Also
- [Recognition Module](recognition.md) - Extract embeddings from detected faces
- [Landmarks Module](landmarks.md) - Get 106-point landmarks
- [Landmarks Module](landmarks.md) - Get 106 / 98 / 68-point dense landmarks
- [Image Pipeline Recipe](../recipes/image-pipeline.md) - Complete detection workflow
- [Concepts: Thresholds](../concepts/thresholds-calibration.md) - Tuning detection parameters

View File

@@ -2,6 +2,11 @@
Gaze estimation predicts where a person is looking (pitch and yaw angles).
<figure markdown="span">
![Gaze Estimation](https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/gaze.jpg){ width="100%" }
<figcaption>Gaze direction arrows with pitch/yaw angle labels</figcaption>
</figure>
---
## Available Models
@@ -267,6 +272,7 @@ gaze = create_gaze_estimator() # Returns MobileGaze
## Next Steps
- [Head Pose Estimation](headpose.md) - 3D head orientation
- [Anti-Spoofing](spoofing.md) - Face liveness detection
- [Privacy](privacy.md) - Face anonymization
- [Video Recipe](../recipes/video-webcam.md) - Real-time processing

237
docs/modules/headpose.md Normal file
View File

@@ -0,0 +1,237 @@
# Head Pose Estimation
Head pose estimation predicts the 3D orientation of a person's head (pitch, yaw, and roll angles).
<figure markdown="span">
![Head Pose Estimation](https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/headpose.jpg){ width="100%" }
<figcaption>3D head pose visualization with pitch, yaw, and roll angles</figcaption>
</figure>
---
## Available Models
| Model | Backbone | Size | MAE* |
|-------|----------|------|------|
| **ResNet18** :material-check-circle: | ResNet18 | 43 MB | 5.22° |
| ResNet34 | ResNet34 | 82 MB | 5.07° |
| ResNet50 | ResNet50 | 91 MB | 4.83° |
| MobileNetV2 | MobileNetV2 | 9.6 MB | 5.72° |
| MobileNetV3-Small | MobileNetV3 | 4.8 MB | 6.31° |
| MobileNetV3-Large | MobileNetV3 | 16 MB | 5.58° |
*MAE = Mean Absolute Error on AFLW2000 test set (lower is better)
---
## Basic Usage
```python
import cv2
from uniface.detection import RetinaFace
from uniface.headpose import HeadPose
detector = RetinaFace()
head_pose = HeadPose()
image = cv2.imread("photo.jpg")
faces = detector.detect(image)
for face in faces:
# Crop face
x1, y1, x2, y2 = map(int, face.bbox)
face_crop = image[y1:y2, x1:x2]
if face_crop.size > 0:
# Estimate head pose
result = head_pose.estimate(face_crop)
print(f"Pitch: {result.pitch:.1f}°, Yaw: {result.yaw:.1f}°, Roll: {result.roll:.1f}°")
```
---
## Model Variants
```python
from uniface.headpose import HeadPose
from uniface.constants import HeadPoseWeights
# Default (ResNet18, recommended balance of speed and accuracy)
hp = HeadPose()
# Lightweight for mobile/edge
hp = HeadPose(model_name=HeadPoseWeights.MOBILENET_V3_SMALL)
# Higher accuracy
hp = HeadPose(model_name=HeadPoseWeights.RESNET50)
```
---
## Output Format
```python
result = head_pose.estimate(face_crop)
# HeadPoseResult dataclass
result.pitch # Rotation around X-axis in degrees
result.yaw # Rotation around Y-axis in degrees
result.roll # Rotation around Z-axis in degrees
```
### Angle Convention
```
pitch > 0 (looking down)
yaw < 0 ─────┼───── yaw > 0
(looking left) │ (looking right)
pitch < 0 (looking up)
roll > 0 = clockwise tilt
roll < 0 = counter-clockwise tilt
```
- **Pitch**: Rotation around X-axis (positive = looking down)
- **Yaw**: Rotation around Y-axis (positive = looking right)
- **Roll**: Rotation around Z-axis (positive = tilting clockwise)
---
## Visualization
### 3D Cube (default)
The default visualization draws a wireframe cube oriented to match the head pose.
```python
from uniface.draw import draw_head_pose
faces = detector.detect(image)
for face in faces:
x1, y1, x2, y2 = map(int, face.bbox)
face_crop = image[y1:y2, x1:x2]
if face_crop.size > 0:
result = head_pose.estimate(face_crop)
# Draw cube on image (default)
draw_head_pose(image, face.bbox, result.pitch, result.yaw, result.roll)
cv2.imwrite("headpose_output.jpg", image)
```
### Axis Visualization
```python
from uniface.draw import draw_head_pose
# X/Y/Z coordinate axes
draw_head_pose(image, face.bbox, result.pitch, result.yaw, result.roll, draw_type='axis')
```
### Low-Level Drawing Functions
```python
from uniface.draw import draw_head_pose_cube, draw_head_pose_axis
# Draw cube directly
draw_head_pose_cube(image, yaw=10.0, pitch=-5.0, roll=2.0, bbox=[100, 100, 250, 280])
# Draw axes directly
draw_head_pose_axis(image, yaw=10.0, pitch=-5.0, roll=2.0, bbox=[100, 100, 250, 280])
```
---
## Real-Time Head Pose Tracking
```python
import cv2
from uniface.detection import RetinaFace
from uniface.headpose import HeadPose
from uniface.draw import draw_head_pose
detector = RetinaFace()
head_pose = HeadPose()
cap = cv2.VideoCapture(0)
while True:
ret, frame = cap.read()
if not ret:
break
faces = detector.detect(frame)
for face in faces:
x1, y1, x2, y2 = map(int, face.bbox)
face_crop = frame[y1:y2, x1:x2]
if face_crop.size > 0:
result = head_pose.estimate(face_crop)
draw_head_pose(frame, face.bbox, result.pitch, result.yaw, result.roll)
cv2.imshow("Head Pose Estimation", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
```
---
## Use Cases
### Driver Drowsiness Detection
```python
def is_head_drooping(result, pitch_threshold=-15):
"""Check if the head is drooping (looking down significantly)."""
return result.pitch < pitch_threshold
result = head_pose.estimate(face_crop)
if is_head_drooping(result):
print("Warning: Head drooping detected")
```
### Attention Monitoring
```python
def is_facing_forward(result, threshold=20):
"""Check if the person is facing roughly forward."""
return (
abs(result.pitch) < threshold
and abs(result.yaw) < threshold
and abs(result.roll) < threshold
)
result = head_pose.estimate(face_crop)
if is_facing_forward(result):
print("Facing forward")
else:
print("Looking away")
```
---
## Factory Function
```python
from uniface.headpose import create_head_pose_estimator
hp = create_head_pose_estimator() # Returns HeadPose
```
---
## Next Steps
- [Gaze Estimation](gaze.md) - Eye gaze direction
- [Anti-Spoofing](spoofing.md) - Face liveness detection
- [Video Recipe](../recipes/video-webcam.md) - Real-time processing

View File

@@ -2,6 +2,11 @@
Facial landmark detection provides precise localization of facial features.
<figure markdown="span">
![106-Point Landmarks](https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/landmarks.jpg){ width="50%" }
<figcaption>106-point facial landmark localization</figcaption>
</figure>
---
## Available Models
@@ -9,6 +14,8 @@ Facial landmark detection provides precise localization of facial features.
| Model | Points | Size |
|-------|--------|------|
| **Landmark106** | 106 | 14 MB |
| **PIPNet (WFLW-98)** | 98 | 47 MB |
| **PIPNet (300W+CelebA-68)** | 68 | 46 MB |
!!! info "5-Point Landmarks"
Basic 5-point landmarks are included with all detection models (RetinaFace, SCRFD, YOLOv5-Face, YOLOv8-Face).
@@ -74,6 +81,48 @@ mouth = landmarks[87:106]
---
## PIPNet (98 / 68 points)
PIPNet (Pixel-in-Pixel Net) is a high-accuracy facial landmark detector. UniFace ships
two ONNX variants that share a ResNet-18 backbone and 256×256 input — the only difference
is the number of points and the dataset they were trained on.
### Basic Usage
```python
from uniface.detection import RetinaFace
from uniface.landmark import PIPNet
detector = RetinaFace()
landmarker = PIPNet() # Default: 98 points (WFLW)
faces = detector.detect(image)
if faces:
landmarks = landmarker.get_landmarks(image, faces[0].bbox)
print(f"Landmarks shape: {landmarks.shape}") # (98, 2)
```
### 68-Point Variant (300W+CelebA, GSSL)
```python
from uniface.constants import PIPNetWeights
from uniface.landmark import PIPNet
landmarker = PIPNet(model_name=PIPNetWeights.DW300_CELEBA_68)
landmarks = landmarker.get_landmarks(image, face.bbox)
print(landmarks.shape) # (68, 2)
```
### Notes
- The number of landmarks is read from the ONNX output and the matching meanface
table is selected automatically — there is no `num_lms=` argument.
- PIPNet uses an asymmetric crop around the bbox (+10% left / right / bottom,
10% top) and ImageNet normalization. This is handled internally.
- Output landmarks are in original-image pixel coordinates as `float32`.
---
## 5-Point Landmarks (Detection)
All detection models provide 5-point landmarks:
@@ -237,9 +286,17 @@ def estimate_head_pose(landmarks, image_shape):
## Factory Function
```python
from uniface.constants import PIPNetWeights
from uniface.landmark import create_landmarker
landmarker = create_landmarker() # Returns Landmark106
# Default: 106-point InsightFace model
landmarker = create_landmarker()
# 98-point PIPNet (WFLW)
landmarker = create_landmarker('pipnet')
# 68-point PIPNet (300W+CelebA)
landmarker = create_landmarker('pipnet', model_name=PIPNetWeights.DW300_CELEBA_68)
```
---

157
docs/modules/matting.md Normal file
View File

@@ -0,0 +1,157 @@
# Portrait Matting
Portrait matting produces a soft alpha matte separating the foreground (person) from the background — no trimap needed.
<figure markdown="span">
![Portrait Matting](https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/matting.jpg){ width="100%" }
<figcaption>MODNet: Input → Matte → Green Screen</figcaption>
</figure>
---
## Available Models
| Model | Variant | Size | Use Case |
|-------|---------|------|----------|
| **MODNet Photographic** :material-check-circle: | PHOTOGRAPHIC | 25 MB | High-quality portrait photos |
| MODNet Webcam | WEBCAM | 25 MB | Real-time webcam feeds |
---
## Basic Usage
```python
import cv2
from uniface.matting import MODNet
matting = MODNet()
image = cv2.imread("photo.jpg")
matte = matting.predict(image)
print(f"Matte shape: {matte.shape}") # (H, W)
print(f"Matte dtype: {matte.dtype}") # float32
print(f"Matte range: [{matte.min():.2f}, {matte.max():.2f}]") # [0, 1]
```
---
## Model Variants
```python
from uniface.matting import MODNet
from uniface.constants import MODNetWeights
# Photographic (default) — best for photos
matting = MODNet()
# Webcam — optimized for real-time
matting = MODNet(model_name=MODNetWeights.WEBCAM)
# Custom input size
matting = MODNet(input_size=256)
```
| Parameter | Default | Description |
|-----------|---------|-------------|
| `model_name` | `PHOTOGRAPHIC` | Model variant to load |
| `input_size` | `512` | Target shorter-side size for preprocessing |
| `providers` | `None` | ONNX Runtime execution providers |
---
## Applications
### Transparent Background (RGBA)
```python
import cv2
import numpy as np
matting = MODNet()
image = cv2.imread("photo.jpg")
matte = matting.predict(image)
rgba = cv2.cvtColor(image, cv2.COLOR_BGR2BGRA)
rgba[:, :, 3] = (matte * 255).astype(np.uint8)
cv2.imwrite("transparent.png", rgba)
```
### Green Screen
```python
import numpy as np
matte_3ch = matte[:, :, np.newaxis]
bg = np.full_like(image, (0, 177, 64), dtype=np.uint8)
green = (image * matte_3ch + bg * (1 - matte_3ch)).astype(np.uint8)
cv2.imwrite("green_screen.jpg", green)
```
### Custom Background
```python
import cv2
import numpy as np
background = cv2.imread("beach.jpg")
background = cv2.resize(background, (image.shape[1], image.shape[0]))
matte_3ch = matte[:, :, np.newaxis]
result = (image * matte_3ch + background * (1 - matte_3ch)).astype(np.uint8)
cv2.imwrite("custom_bg.jpg", result)
```
### Webcam Matting
```python
import cv2
import numpy as np
from uniface.matting import MODNet
matting = MODNet(model_name="modnet_webcam")
cap = cv2.VideoCapture(0)
while True:
ret, frame = cap.read()
if not ret:
break
matte = matting.predict(frame)
matte_3ch = matte[:, :, np.newaxis]
bg = np.full_like(frame, (0, 177, 64), dtype=np.uint8)
result = (frame * matte_3ch + bg * (1 - matte_3ch)).astype(np.uint8)
cv2.imshow("Matting", np.hstack([frame, result]))
if cv2.waitKey(1) & 0xFF == ord("q"):
break
cap.release()
cv2.destroyAllWindows()
```
---
## Factory Function
```python
from uniface.matting import create_matting_model
from uniface.constants import MODNetWeights
# Default (Photographic)
matting = create_matting_model()
# With enum
matting = create_matting_model(MODNetWeights.WEBCAM)
# With string
matting = create_matting_model("modnet_webcam")
```
---
## Next Steps
- [Parsing](parsing.md) - Face semantic segmentation
- [Privacy](privacy.md) - Face anonymization
- [Detection](detection.md) - Face detection

View File

@@ -2,6 +2,16 @@
Face parsing segments faces into semantic components or face regions.
<figure markdown="span">
![Face Parsing](https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/parsing.jpg){ width="80%" }
<figcaption>BiSeNet face parsing with 19 semantic component classes</figcaption>
</figure>
<figure markdown="span">
![Face Segmentation](https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/segmentation.jpg){ width="80%" }
<figcaption>XSeg face region segmentation mask</figcaption>
</figure>
---
## Available Models

View File

@@ -2,6 +2,11 @@
Face anonymization protects privacy by blurring or obscuring faces in images and videos.
<figure markdown="span">
![Face Anonymization](https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/anonymization.jpg){ width="100%" }
<figcaption>Five anonymization methods: pixelate, gaussian, blackout, elliptical, and median</figcaption>
</figure>
---
## Available Methods

View File

@@ -2,6 +2,11 @@
Face recognition extracts embeddings for identity verification and face search.
<figure markdown="span">
![Face Verification](https://raw.githubusercontent.com/yakhyo/uniface/main/assets/demos/verification.jpg){ width="80%" }
<figcaption>Pairwise face verification with cosine similarity scores</figcaption>
</figure>
---
## Available Models
@@ -10,6 +15,7 @@ Face recognition extracts embeddings for identity verification and face search.
|-------|----------|------|---------------|
| **AdaFace** | IR-18/IR-101 | 92-249 MB | 512 |
| **ArcFace** | MobileNet/ResNet | 8-166 MB | 512 |
| **EdgeFace** | EdgeNeXt/LoRA | 5-70 MB | 512 |
| **MobileFace** | MobileNet V2/V3 | 1-10 MB | 512 |
| **SphereFace** | Sphere20/36 | 50-92 MB | 512 |
@@ -113,6 +119,64 @@ recognizer = ArcFace(providers=['CPUExecutionProvider'])
---
## EdgeFace
Efficient face recognition designed for edge devices, using an EdgeNeXt backbone with optional LoRA low-rank compression. Competition-winning entry (compact track) at EFaR 2023, IJCB.
### Basic Usage
```python
from uniface.detection import RetinaFace
from uniface.recognition import EdgeFace
detector = RetinaFace()
recognizer = EdgeFace()
# Detect face
faces = detector.detect(image)
# Extract embedding
if faces:
embedding = recognizer.get_normalized_embedding(image, faces[0].landmarks)
print(f"Embedding shape: {embedding.shape}") # (512,)
```
### Model Variants
```python
from uniface.recognition import EdgeFace
from uniface.constants import EdgeFaceWeights
# Ultra-compact (default)
recognizer = EdgeFace(model_name=EdgeFaceWeights.XXS)
# Compact with LoRA
recognizer = EdgeFace(model_name=EdgeFaceWeights.XS_GAMMA_06)
# Small with LoRA
recognizer = EdgeFace(model_name=EdgeFaceWeights.S_GAMMA_05)
# Full-size
recognizer = EdgeFace(model_name=EdgeFaceWeights.BASE)
# Force CPU execution
recognizer = EdgeFace(providers=['CPUExecutionProvider'])
```
| Variant | Params | MFLOPs | Size | LFW | CALFW | CPLFW | CFP-FP | AgeDB-30 |
|---------|--------|--------|------|-----|-------|-------|--------|----------|
| **XXS** :material-check-circle: | 1.24M | 94 | ~5 MB | 99.57% | 94.83% | 90.27% | 93.63% | 94.92% |
| XS_GAMMA_06 | 1.77M | 154 | ~7 MB | 99.73% | 95.28% | 91.58% | 94.71% | 96.08% |
| S_GAMMA_05 | 3.65M | 306 | ~14 MB | 99.78% | 95.55% | 92.48% | 95.74% | 97.03% |
| BASE | 18.2M | 1399 | ~70 MB | 99.83% | 96.07% | 93.75% | 97.01% | 97.60% |
!!! info "Reference"
**Paper**: [EdgeFace: Efficient Face Recognition Model for Edge Devices](https://arxiv.org/abs/2307.01838v2) (IEEE T-BIOM 2024)
**Source**: [github.com/otroshi/edgeface](https://github.com/otroshi/edgeface)
---
## MobileFace
Lightweight face recognition models with MobileNet backbones.
@@ -287,9 +351,10 @@ else:
```python
from uniface.recognition import create_recognizer
# Available methods: 'arcface', 'adaface', 'mobileface', 'sphereface'
# Available methods: 'arcface', 'adaface', 'edgeface', 'mobileface', 'sphereface'
recognizer = create_recognizer('arcface')
recognizer = create_recognizer('adaface')
recognizer = create_recognizer('edgeface')
```
---

172
docs/modules/stores.md Normal file
View File

@@ -0,0 +1,172 @@
# Stores
FAISS-backed vector store for fast similarity search over embeddings.
!!! info "Optional dependency"
```bash
pip install faiss-cpu
```
---
## FAISS
```python
from uniface.stores import FAISS
```
A thin wrapper around a FAISS `IndexFlatIP` (inner-product) index. Vectors
**must** be L2-normalised before adding so that inner product equals cosine
similarity. The store does not normalise internally.
Each vector is paired with a metadata `dict` that can carry any
JSON-serialisable payload (person ID, name, source path, etc.).
### Constructor
```python
store = FAISS(embedding_size=512, db_path="./vector_index")
```
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `embedding_size` | `int` | `512` | Dimension of embedding vectors |
| `db_path` | `str` | `"./vector_index"` | Directory for persisting index and metadata |
---
### Methods
#### `add(embedding, metadata)`
Add a single embedding with associated metadata.
```python
store.add(embedding, {"person_id": "alice", "source": "photo.jpg"})
```
| Parameter | Type | Description |
|-----------|------|-------------|
| `embedding` | `np.ndarray` | L2-normalised embedding vector |
| `metadata` | `dict[str, Any]` | Arbitrary JSON-serialisable key-value pairs |
---
#### `search(embedding, threshold=0.4)`
Find the closest match for a query embedding.
```python
result, similarity = store.search(query_embedding, threshold=0.4)
if result:
print(result["person_id"], similarity)
```
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `embedding` | `np.ndarray` | — | L2-normalised query vector |
| `threshold` | `float` | `0.4` | Minimum cosine similarity to accept a match |
**Returns:** `(metadata, similarity)` if a match is found, or `(None, similarity)` when below threshold or the index is empty.
---
#### `remove(key, value)`
Remove all entries where `metadata[key] == value` and rebuild the index.
```python
removed = store.remove("person_id", "bob")
print(f"Removed {removed} entries")
```
| Parameter | Type | Description |
|-----------|------|-------------|
| `key` | `str` | Metadata key to match |
| `value` | `Any` | Value to match |
**Returns:** Number of entries removed.
---
#### `save()`
Persist the FAISS index and metadata to disk.
```python
store.save()
```
Writes two files to `db_path`:
- `faiss_index.bin` — binary FAISS index
- `metadata.json` — JSON array of metadata dicts
---
#### `load()`
Load a previously saved index and metadata.
```python
store = FAISS(db_path="./vector_index")
loaded = store.load() # True if files exist
```
**Returns:** `True` if loaded successfully, `False` if files are missing.
**Raises:** `RuntimeError` if files exist but cannot be read.
---
### Properties
| Property | Type | Description |
|----------|------|-------------|
| `size` | `int` | Number of vectors in the index |
| `len(store)` | `int` | Same as `size` |
---
## Example: End-to-End
```python
import cv2
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.stores import FAISS
detector = RetinaFace()
recognizer = ArcFace()
# Build
store = FAISS(db_path="./my_index")
image = cv2.imread("alice.jpg")
faces = detector.detect(image)
embedding = recognizer.get_normalized_embedding(image, faces[0].landmarks)
store.add(embedding, {"person_id": "alice"})
store.save()
# Search
store2 = FAISS(db_path="./my_index")
store2.load()
query = cv2.imread("unknown.jpg")
faces = detector.detect(query)
emb = recognizer.get_normalized_embedding(query, faces[0].landmarks)
result, sim = store2.search(emb)
if result:
print(f"Matched: {result['person_id']} (similarity: {sim:.3f})")
else:
print(f"No match (similarity: {sim:.3f})")
```
---
## See Also
- [Face Search Recipe](../recipes/face-search.md) - Building and querying indexes
- [Recognition Module](recognition.md) - Embedding extraction
- [Thresholds Guide](../concepts/thresholds-calibration.md) - Tuning similarity thresholds

View File

@@ -12,11 +12,15 @@ Run UniFace examples directly in your browser with Google Colab, or download and
| [Face Alignment](https://github.com/yakhyo/uniface/blob/main/examples/02_face_alignment.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/02_face_alignment.ipynb) | Align faces for recognition |
| [Face Verification](https://github.com/yakhyo/uniface/blob/main/examples/03_face_verification.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/03_face_verification.ipynb) | Compare faces for identity |
| [Face Search](https://github.com/yakhyo/uniface/blob/main/examples/04_face_search.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/04_face_search.ipynb) | Find a person in group photos |
| [Face Analyzer](https://github.com/yakhyo/uniface/blob/main/examples/05_face_analyzer.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/05_face_analyzer.ipynb) | All-in-one face analysis |
| [Face Analyzer](https://github.com/yakhyo/uniface/blob/main/examples/05_face_analyzer.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/05_face_analyzer.ipynb) | Unified face analysis |
| [Face Parsing](https://github.com/yakhyo/uniface/blob/main/examples/06_face_parsing.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/06_face_parsing.ipynb) | Semantic face segmentation |
| [Face Anonymization](https://github.com/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/07_face_anonymization.ipynb) | Privacy-preserving blur |
| [Gaze Estimation](https://github.com/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/08_gaze_estimation.ipynb) | Gaze direction estimation |
| [Face Segmentation](https://github.com/yakhyo/uniface/blob/main/examples/09_face_segmentation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/09_face_segmentation.ipynb) | Face segmentation with XSeg |
| [Face Vector Store](https://github.com/yakhyo/uniface/blob/main/examples/10_face_vector_store.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/10_face_vector_store.ipynb) | FAISS-backed face database |
| [Head Pose Estimation](https://github.com/yakhyo/uniface/blob/main/examples/11_head_pose_estimation.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/11_head_pose_estimation.ipynb) | 3D head orientation estimation |
| [Face Recognition](https://github.com/yakhyo/uniface/blob/main/examples/12_face_recognition.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/12_face_recognition.ipynb) | Standalone face recognition pipeline |
| [Portrait Matting](https://github.com/yakhyo/uniface/blob/main/examples/13_portrait_matting.ipynb) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/yakhyo/uniface/blob/main/examples/13_portrait_matting.ipynb) | Portrait matting with MODNet |
---
@@ -30,7 +34,7 @@ git clone https://github.com/yakhyo/uniface.git
cd uniface
# Install dependencies
pip install uniface jupyter
pip install "uniface[cpu]" jupyter # or uniface[gpu] for CUDA
# Launch Jupyter
jupyter notebook examples/

View File

@@ -54,19 +54,8 @@ detector = RetinaFace()
image = cv2.imread("photo.jpg")
faces = detector.detect(image)
# Extract visualization data
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
# Draw on image
draw_detections(
image=image,
bboxes=bboxes,
scores=scores,
landmarks=landmarks,
vis_threshold=0.6,
)
draw_detections(image=image, faces=faces, vis_threshold=0.6)
# Save result
cv2.imwrite("output.jpg", image)
@@ -80,7 +69,6 @@ Compare two faces:
```python
import cv2
import numpy as np
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
@@ -97,12 +85,13 @@ faces1 = detector.detect(image1)
faces2 = detector.detect(image2)
if faces1 and faces2:
# Extract embeddings
# Extract embeddings (normalized 1-D vectors)
emb1 = recognizer.get_normalized_embedding(image1, faces1[0].landmarks)
emb2 = recognizer.get_normalized_embedding(image2, faces2[0].landmarks)
# Compute similarity (cosine similarity)
similarity = np.dot(emb1, emb2.T)[0][0]
# Compute cosine similarity
from uniface import compute_similarity
similarity = compute_similarity(emb1, emb2, normalized=True)
# Interpret result
if similarity > 0.6:
@@ -135,7 +124,7 @@ faces = detector.detect(image)
# Predict attributes
for i, face in enumerate(faces):
result = age_gender.predict(image, face.bbox)
result = age_gender.predict(image, face)
print(f"Face {i+1}: {result.sex}, {result.age} years old")
```
@@ -164,7 +153,7 @@ image = cv2.imread("photo.jpg")
faces = detector.detect(image)
for i, face in enumerate(faces):
result = fairface.predict(image, face.bbox)
result = fairface.predict(image, face)
print(f"Face {i+1}: {result.sex}, {result.age_group}, {result.race}")
```
@@ -177,7 +166,9 @@ Face 2: Female, 20-29, White
---
## Facial Landmarks (106 Points)
## Facial Landmarks (106 / 98 / 68 Points)
UniFace ships two dense-landmark families. Pick whichever fits your downstream task:
```python
import cv2
@@ -185,14 +176,14 @@ from uniface.detection import RetinaFace
from uniface.landmark import Landmark106
detector = RetinaFace()
landmarker = Landmark106()
landmarker = Landmark106() # 106-point InsightFace 2d106det model
image = cv2.imread("photo.jpg")
faces = detector.detect(image)
if faces:
landmarks = landmarker.get_landmarks(image, faces[0].bbox)
print(f"Detected {len(landmarks)} landmarks")
print(f"Detected {len(landmarks)} landmarks") # 106
# Draw landmarks
for x, y in landmarks.astype(int):
@@ -201,6 +192,21 @@ if faces:
cv2.imwrite("landmarks.jpg", image)
```
**PIPNet (98 / 68 points)** — ResNet-18 backbone trained on WFLW (98 pts) or 300W+CelebA (68 pts):
```python
from uniface.constants import PIPNetWeights
from uniface.landmark import PIPNet
# 98-point WFLW model (default)
landmarker_98 = PIPNet()
# 68-point 300W+CelebA model
landmarker_68 = PIPNet(model_name=PIPNetWeights.DW300_CELEBA_68)
landmarks = landmarker_98.get_landmarks(image, faces[0].bbox) # (98, 2)
```
---
## Gaze Estimation
@@ -234,6 +240,36 @@ cv2.imwrite("gaze_output.jpg", image)
---
## Head Pose Estimation
```python
import cv2
from uniface.detection import RetinaFace
from uniface.headpose import HeadPose
from uniface.draw import draw_head_pose
detector = RetinaFace()
head_pose = HeadPose()
image = cv2.imread("photo.jpg")
faces = detector.detect(image)
for i, face in enumerate(faces):
x1, y1, x2, y2 = map(int, face.bbox[:4])
face_crop = image[y1:y2, x1:x2]
if face_crop.size > 0:
result = head_pose.estimate(face_crop)
print(f"Face {i+1}: pitch={result.pitch:.1f}°, yaw={result.yaw:.1f}°, roll={result.roll:.1f}°")
# Draw 3D cube visualization
draw_head_pose(image, face.bbox, result.pitch, result.yaw, result.roll)
cv2.imwrite("headpose_output.jpg", image)
```
---
## Face Parsing
Segment face into semantic components:
@@ -261,6 +297,34 @@ print(f"Detected {len(np.unique(mask))} facial components")
---
## Portrait Matting
Remove backgrounds without a trimap:
```python
import cv2
import numpy as np
from uniface.matting import MODNet
matting = MODNet()
image = cv2.imread("portrait.jpg")
matte = matting.predict(image) # (H, W) float32 in [0, 1]
# Transparent PNG
rgba = cv2.cvtColor(image, cv2.COLOR_BGR2BGRA)
rgba[:, :, 3] = (matte * 255).astype(np.uint8)
cv2.imwrite("transparent.png", rgba)
# Green screen
matte_3ch = matte[:, :, np.newaxis]
bg = np.full_like(image, (0, 177, 64), dtype=np.uint8)
result = (image * matte_3ch + bg * (1 - matte_3ch)).astype(np.uint8)
cv2.imwrite("green_screen.jpg", result)
```
---
## Face Anonymization
Blur faces for privacy protection:
@@ -342,10 +406,7 @@ while True:
faces = detector.detect(frame)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks)
draw_detections(image=frame, faces=faces)
cv2.imshow("UniFace - Press 'q' to quit", frame)
@@ -421,9 +482,11 @@ For detailed model comparisons and benchmarks, see the [Model Zoo](models.md).
| Task | Available Models |
|------|------------------|
| Detection | `RetinaFace`, `SCRFD`, `YOLOv5Face`, `YOLOv8Face` |
| Recognition | `ArcFace`, `AdaFace`, `MobileFace`, `SphereFace` |
| Recognition | `ArcFace`, `AdaFace`, `EdgeFace`, `MobileFace`, `SphereFace` |
| Landmarks | `Landmark106` (106 pts), `PIPNet` (98 / 68 pts) |
| Tracking | `BYTETracker` |
| Gaze | `MobileGaze` (ResNet18/34/50, MobileNetV2, MobileOneS0) |
| Head Pose | `HeadPose` (ResNet18/34/50, MobileNetV2/V3) |
| Parsing | `BiSeNet` (ResNet18/34) |
| Attributes | `AgeGender`, `FairFace`, `Emotion` |
| Anti-Spoofing | `MiniFASNet` (V1SE, V2) |
@@ -468,13 +531,15 @@ python -c "import platform; print(platform.machine())"
from uniface.detection import RetinaFace, SCRFD
from uniface.recognition import ArcFace, AdaFace
from uniface.attribute import AgeGender, FairFace
from uniface.landmark import Landmark106
from uniface.landmark import Landmark106, PIPNet
from uniface.gaze import MobileGaze
from uniface.headpose import HeadPose
from uniface.parsing import BiSeNet, XSeg
from uniface.privacy import BlurFace
from uniface.spoofing import MiniFASNet
from uniface.tracking import BYTETracker
from uniface.analyzer import FaceAnalyzer
from uniface.stores import FAISS # pip install faiss-cpu
from uniface.draw import draw_detections, draw_tracks
```

View File

@@ -1,179 +1,166 @@
# Face Search
Build a face search system for finding people in images.
Find and identify people in images and video streams.
!!! note "Work in Progress"
This page contains example code patterns. Test thoroughly before using in production.
UniFace supports two search approaches:
| Approach | Use case | Tool |
| -------------------- | ------------------------------------------------ | ----------------------- |
| **Reference search** | "Is this specific person in the video?" | `tools/search.py` |
| **Vector search** | "Who is this?" against a database of known faces | `tools/faiss_search.py` |
---
## Basic Face Database
## Reference Search (single image)
Compare every detected face against a single reference photo:
```python
import cv2
import numpy as np
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.face_utils import compute_similarity
detector = RetinaFace()
recognizer = ArcFace()
ref_image = cv2.imread("reference.jpg")
ref_faces = detector.detect(ref_image)
ref_embedding = recognizer.get_normalized_embedding(ref_image, ref_faces[0].landmarks)
query_image = cv2.imread("group_photo.jpg")
faces = detector.detect(query_image)
for face in faces:
embedding = recognizer.get_normalized_embedding(query_image, face.landmarks)
sim = compute_similarity(ref_embedding, embedding)
label = f"Match ({sim:.2f})" if sim > 0.4 else f"Unknown ({sim:.2f})"
print(label)
```
**CLI tool:**
```bash
python tools/search.py --reference ref.jpg --source video.mp4
python tools/search.py --reference ref.jpg --source 0 # webcam
```
---
## Vector Search (FAISS index)
For identifying faces against a database of many known people, use the
[`FAISS`](../modules/stores.md) vector store.
!!! info "Install extra"
`bash
pip install faiss-cpu
`
### Build an index
Organise face images in person sub-folders:
```
dataset/
├── alice/
│ ├── 001.jpg
│ └── 002.jpg
├── bob/
│ └── 001.jpg
└── charlie/
├── 001.jpg
└── 002.jpg
```
```python
import cv2
from pathlib import Path
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.stores import FAISS
class FaceDatabase:
def __init__(self):
self.detector = RetinaFace()
self.recognizer = ArcFace()
self.embeddings = {}
detector = RetinaFace()
recognizer = ArcFace()
store = FAISS(db_path="./my_index")
def add_face(self, person_id, image):
"""Add a face to the database."""
faces = self.detector.detect(image)
if not faces:
raise ValueError(f"No face found for {person_id}")
for person_dir in sorted(Path("dataset").iterdir()):
if not person_dir.is_dir():
continue
for img_path in person_dir.glob("*.jpg"):
image = cv2.imread(str(img_path))
faces = detector.detect(image)
if faces:
emb = recognizer.get_normalized_embedding(image, faces[0].landmarks)
store.add(emb, {"person_id": person_dir.name, "source": str(img_path)})
face = max(faces, key=lambda f: f.confidence)
embedding = self.recognizer.get_normalized_embedding(image, face.landmarks)
self.embeddings[person_id] = embedding
return True
def search(self, image, threshold=0.6):
"""Search for faces in an image."""
faces = self.detector.detect(image)
results = []
for face in faces:
embedding = self.recognizer.get_normalized_embedding(image, face.landmarks)
best_match = None
best_similarity = -1
for person_id, db_embedding in self.embeddings.items():
similarity = np.dot(embedding, db_embedding.T)[0][0]
if similarity > best_similarity:
best_similarity = similarity
best_match = person_id
results.append({
'bbox': face.bbox,
'match': best_match if best_similarity >= threshold else None,
'similarity': best_similarity
})
return results
def save(self, path):
"""Save database to file."""
np.savez(path, embeddings=dict(self.embeddings))
def load(self, path):
"""Load database from file."""
data = np.load(path, allow_pickle=True)
self.embeddings = data['embeddings'].item()
# Usage
db = FaceDatabase()
# Add faces
for image_path in Path("known_faces/").glob("*.jpg"):
person_id = image_path.stem
image = cv2.imread(str(image_path))
try:
db.add_face(person_id, image)
print(f"Added: {person_id}")
except ValueError as e:
print(f"Skipped: {e}")
# Save database
db.save("face_database.npz")
# Search
query_image = cv2.imread("group_photo.jpg")
results = db.search(query_image)
for r in results:
if r['match']:
print(f"Found: {r['match']} (similarity: {r['similarity']:.3f})")
store.save()
print(f"Index saved: {store}")
```
---
**CLI tool:**
## Visualization
```bash
python tools/faiss_search.py build --faces-dir dataset/ --db-path ./my_index
```
### Search against the index
```python
import cv2
from uniface.detection import RetinaFace
from uniface.recognition import ArcFace
from uniface.stores import FAISS
def visualize_search_results(image, results):
"""Draw search results on image."""
for r in results:
x1, y1, x2, y2 = map(int, r['bbox'])
detector = RetinaFace()
recognizer = ArcFace()
if r['match']:
color = (0, 255, 0) # Green for match
label = f"{r['match']} ({r['similarity']:.2f})"
else:
color = (0, 0, 255) # Red for unknown
label = f"Unknown ({r['similarity']:.2f})"
store = FAISS(db_path="./my_index")
store.load()
cv2.rectangle(image, (x1, y1), (x2, y2), color, 2)
cv2.putText(image, label, (x1, y1 - 10),
cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
image = cv2.imread("query.jpg")
faces = detector.detect(image)
return image
for face in faces:
embedding = recognizer.get_normalized_embedding(image, face.landmarks)
result, similarity = store.search(embedding, threshold=0.4)
# Usage
results = db.search(image)
annotated = visualize_search_results(image.copy(), results)
cv2.imwrite("search_result.jpg", annotated)
if result:
print(f"Matched: {result['person_id']} ({similarity:.2f})")
else:
print(f"Unknown ({similarity:.2f})")
```
---
**CLI tool:**
## Real-Time Search
```bash
python tools/faiss_search.py run --db-path ./my_index --source video.mp4
python tools/faiss_search.py run --db-path ./my_index --source 0 # webcam
```
### Manage the index
```python
import cv2
from uniface.stores import FAISS
def realtime_search(db):
"""Real-time face search from webcam."""
cap = cv2.VideoCapture(0)
store = FAISS(db_path="./my_index")
store.load()
while True:
ret, frame = cap.read()
if not ret:
break
print(f"Total vectors: {len(store)}")
results = db.search(frame, threshold=0.5)
removed = store.remove("person_id", "bob")
print(f"Removed {removed} entries")
for r in results:
x1, y1, x2, y2 = map(int, r['bbox'])
if r['match']:
color = (0, 255, 0)
label = r['match']
else:
color = (0, 0, 255)
label = "Unknown"
cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)
cv2.putText(frame, label, (x1, y1 - 10),
cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)
cv2.imshow("Face Search", frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
# Usage
db = FaceDatabase()
db.load("face_database.npz")
realtime_search(db)
store.save()
```
---
## See Also
- [Stores Module](../modules/stores.md) - Full `FAISS` API reference
- [Recognition Module](../modules/recognition.md) - Face recognition details
- [Batch Processing](batch-processing.md) - Process multiple files
- [Video & Webcam](video-webcam.md) - Real-time processing
- [Concepts: Thresholds](../concepts/thresholds-calibration.md) - Tuning similarity thresholds

View File

@@ -34,7 +34,7 @@ def process_image(image_path):
embedding = recognizer.get_normalized_embedding(image, face.landmarks)
# Step 3: Predict attributes
attrs = age_gender.predict(image, face.bbox)
attrs = age_gender.predict(image, face)
results.append({
'face_id': i,
@@ -48,12 +48,7 @@ def process_image(image_path):
print(f" Face {i+1}: {attrs.sex}, {attrs.age} years old")
# Visualize
draw_detections(
image=image,
bboxes=[f.bbox for f in faces],
scores=[f.confidence for f in faces],
landmarks=[f.landmarks for f in faces]
)
draw_detections(image=image, faces=faces)
return image, results
@@ -83,7 +78,7 @@ age_gender = AgeGender()
analyzer = FaceAnalyzer(
detector,
recognizer=recognizer,
age_gender=age_gender,
attributes=[age_gender],
)
# Process image
@@ -109,11 +104,12 @@ import numpy as np
from uniface.attribute import AgeGender, FairFace
from uniface.detection import RetinaFace
from uniface.gaze import MobileGaze
from uniface.headpose import HeadPose
from uniface.landmark import Landmark106
from uniface.recognition import ArcFace
from uniface.parsing import BiSeNet
from uniface.spoofing import MiniFASNet
from uniface.draw import draw_detections, draw_gaze
from uniface.draw import draw_detections, draw_gaze, draw_head_pose
class FaceAnalysisPipeline:
def __init__(self):
@@ -124,6 +120,7 @@ class FaceAnalysisPipeline:
self.fairface = FairFace()
self.landmarker = Landmark106()
self.gaze = MobileGaze()
self.head_pose = HeadPose()
self.parser = BiSeNet()
self.spoofer = MiniFASNet()
@@ -145,12 +142,12 @@ class FaceAnalysisPipeline:
)
# Attributes
ag_result = self.age_gender.predict(image, face.bbox)
ag_result = self.age_gender.predict(image, face)
result['age'] = ag_result.age
result['gender'] = ag_result.sex
# FairFace attributes
ff_result = self.fairface.predict(image, face.bbox)
ff_result = self.fairface.predict(image, face)
result['age_group'] = ff_result.age_group
result['race'] = ff_result.race
@@ -167,6 +164,13 @@ class FaceAnalysisPipeline:
result['gaze_pitch'] = gaze_result.pitch
result['gaze_yaw'] = gaze_result.yaw
# Head pose estimation
if face_crop.size > 0:
hp_result = self.head_pose.estimate(face_crop)
result['head_pitch'] = hp_result.pitch
result['head_yaw'] = hp_result.yaw
result['head_roll'] = hp_result.roll
# Face parsing
if face_crop.size > 0:
result['parsing_mask'] = self.parser.parse(face_crop)
@@ -189,6 +193,7 @@ for i, r in enumerate(results):
print(f" Gender: {r['gender']}, Age: {r['age']}")
print(f" Race: {r['race']}, Age Group: {r['age_group']}")
print(f" Gaze: pitch={np.degrees(r['gaze_pitch']):.1f}°")
print(f" Head Pose: P={r['head_pitch']:.1f}° Y={r['head_yaw']:.1f}° R={r['head_roll']:.1f}°")
print(f" Real: {r['is_real']} ({r['spoof_confidence']:.1%})")
```
@@ -220,7 +225,7 @@ def visualize_analysis(image_path, output_path):
cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
# Age and gender
attrs = age_gender.predict(image, face.bbox)
attrs = age_gender.predict(image, face)
label = f"{attrs.sex}, {attrs.age}y"
cv2.putText(image, label, (x1, y1 - 10),
cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
@@ -268,6 +273,11 @@ def results_to_json(results):
'gaze': {
'pitch_deg': float(np.degrees(r['gaze_pitch'])) if 'gaze_pitch' in r else None,
'yaw_deg': float(np.degrees(r['gaze_yaw'])) if 'gaze_yaw' in r else None
},
'head_pose': {
'pitch': float(r['head_pitch']) if 'head_pitch' in r else None,
'yaw': float(r['head_yaw']) if 'head_yaw' in r else None,
'roll': float(r['head_roll']) if 'head_roll' in r else None
}
}
output.append(item)
@@ -291,3 +301,4 @@ with open('results.json', 'w') as f:
- [Face Search](face-search.md) - Build a search system
- [Detection Module](../modules/detection.md) - Detection options
- [Recognition Module](../modules/recognition.md) - Recognition details
- [Head Pose Module](../modules/headpose.md) - Head orientation estimation

View File

@@ -26,12 +26,7 @@ while True:
faces = detector.detect(frame)
draw_detections(
image=frame,
bboxes=[f.bbox for f in faces],
scores=[f.confidence for f in faces],
landmarks=[f.landmarks for f in faces]
)
draw_detections(image=frame, faces=faces)
cv2.imshow("Face Detection", frame)
@@ -175,3 +170,4 @@ while True:
- [Batch Processing](batch-processing.md) - Process multiple files
- [Detection Module](../modules/detection.md) - Detection options
- [Gaze Module](../modules/gaze.md) - Gaze estimation
- [Head Pose Module](../modules/headpose.md) - Head orientation estimation

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,291 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Face Vector Store with FAISS\n",
"\n",
"<div style=\"display:flex; flex-wrap:wrap; align-items:center;\">\n",
" <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://pepy.tech/projects/uniface\"><img alt=\"PyPI Downloads\" src=\"https://static.pepy.tech/personalized-badge/uniface?period=total&units=international_system&left_color=grey&right_color=blue&left_text=Downloads\"></a>\n",
" <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://pypi.org/project/uniface/\"><img alt=\"PyPI Version\" src=\"https://img.shields.io/pypi/v/uniface.svg\"></a>\n",
" <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://opensource.org/licenses/MIT\"><img alt=\"License\" src=\"https://img.shields.io/badge/License-MIT-blue.svg\"></a>\n",
" <a style=\"margin-bottom:6px;\" href=\"https://github.com/yakhyo/uniface\"><img alt=\"GitHub Stars\" src=\"https://img.shields.io/github/stars/yakhyo/uniface.svg?style=social\"></a>\n",
"</div>\n",
"\n",
"**UniFace** is a lightweight, production-ready Python library for face detection, recognition, tracking, landmark analysis, face parsing, gaze estimation, and face attributes.\n",
"\n",
"🔗 **GitHub**: [github.com/yakhyo/uniface](https://github.com/yakhyo/uniface) | 📚 **Docs**: [yakhyo.github.io/uniface](https://yakhyo.github.io/uniface)\n",
"\n",
"---\n",
"\n",
"This notebook demonstrates how to build a persistent face database using the **FAISS** vector store in UniFace.\n",
"\n",
"Unlike direct pairwise comparison (see `04_face_search`), a vector store lets you efficiently index\n",
"thousands of face embeddings and retrieve the closest match in sub-millisecond time.\n",
"\n",
"## 1. Install UniFace"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install -q \"uniface[cpu]\" faiss-cpu\n",
"\n",
"# Clone repo for assets (Colab only)\n",
"import os\n",
"if 'COLAB_GPU' in os.environ or 'COLAB_RELEASE_TAG' in os.environ:\n",
" if not os.path.exists('uniface'):\n",
" !git clone --depth 1 https://github.com/yakhyo/uniface.git\n",
" os.chdir('uniface/examples')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Import Libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import cv2\n",
"import matplotlib.pyplot as plt\n",
"import shutil\n",
"\n",
"import uniface\n",
"from uniface.analyzer import FaceAnalyzer\n",
"from uniface.detection import RetinaFace\n",
"from uniface.recognition import ArcFace\n",
"from uniface.stores import FAISS\n",
"\n",
"print(f'UniFace version: {uniface.__version__}')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Initialize Models and Vector Store"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"analyzer = FaceAnalyzer(\n",
" detector=RetinaFace(confidence_threshold=0.5),\n",
" recognizer=ArcFace(),\n",
")\n",
"\n",
"DB_PATH = './demo_face_index'\n",
"store = FAISS(embedding_size=512, db_path=DB_PATH)\n",
"print(store)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Enroll Faces into the Store\n",
"\n",
"We detect faces in the test images and add each embedding with metadata."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"enrollment_images = {\n",
" '../assets/test_images/image0.jpg': 'person_0',\n",
" '../assets/test_images/image1.jpg': 'person_1',\n",
" '../assets/test_images/image2.jpg': 'person_2',\n",
" '../assets/test_images/image3.jpg': 'person_3',\n",
" '../assets/test_images/image4.jpg': 'person_4',\n",
"}\n",
"\n",
"for path, label in enrollment_images.items():\n",
" image = cv2.imread(path)\n",
" faces = analyzer.analyze(image)\n",
" if faces:\n",
" store.add(\n",
" embedding=faces[0].embedding,\n",
" metadata={'label': label, 'source': path},\n",
" )\n",
" print(f'Enrolled {label} from {path}')\n",
"\n",
"print(f'\\nStore size: {store.size} vectors')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Search the Store\n",
"\n",
"Use a query image to find the closest match in the database."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"query_image = cv2.imread('../assets/test_images/image0.jpg')\n",
"query_faces = analyzer.analyze(query_image)\n",
"\n",
"if query_faces:\n",
" result, similarity = store.search(query_faces[0].embedding, threshold=0.4)\n",
"\n",
" if result:\n",
" print(f'Match found: {result[\"label\"]} (similarity: {similarity:.4f})')\n",
" print(f'Source: {result[\"source\"]}')\n",
" else:\n",
" print(f'No match above threshold (best similarity: {similarity:.4f})')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if query_faces and result:\n",
" matched_image = cv2.imread(result['source'])\n",
"\n",
" fig, axes = plt.subplots(1, 2, figsize=(10, 4))\n",
" axes[0].imshow(cv2.cvtColor(query_image, cv2.COLOR_BGR2RGB))\n",
" axes[0].set_title('Query', fontsize=12)\n",
" axes[1].imshow(cv2.cvtColor(matched_image, cv2.COLOR_BGR2RGB))\n",
" axes[1].set_title(f'Match: {result[\"label\"]} ({similarity:.3f})', fontsize=12)\n",
" for ax in axes:\n",
" ax.axis('off')\n",
" plt.tight_layout()\n",
" plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Save and Reload the Index\n",
"\n",
"The index and metadata can be persisted to disk and loaded later."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"store.save()\n",
"\n",
"# Create a fresh store and load the saved data\n",
"store_reloaded = FAISS(embedding_size=512, db_path=DB_PATH)\n",
"loaded = store_reloaded.load()\n",
"print(f'Load successful: {loaded}')\n",
"print(f'Reloaded store size: {store_reloaded.size} vectors')\n",
"\n",
"# Verify search still works after reload\n",
"if query_faces:\n",
" result, similarity = store_reloaded.search(query_faces[0].embedding, threshold=0.4)\n",
" if result:\n",
" print(f'Search after reload: {result[\"label\"]} ({similarity:.4f})')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. Remove Entries\n",
"\n",
"Remove all entries matching a metadata key-value pair."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(f'Before removal: {store.size} vectors')\n",
"\n",
"removed = store.remove(key='label', value='person_0')\n",
"print(f'Removed {removed} entry')\n",
"print(f'After removal: {store.size} vectors')\n",
"\n",
"# Searching for the removed person should now return a different (lower) match\n",
"if query_faces:\n",
" result, similarity = store.search(query_faces[0].embedding, threshold=0.4)\n",
" if result:\n",
" print(f'\\nClosest remaining match: {result[\"label\"]} ({similarity:.4f})')\n",
" else:\n",
" print(f'\\nNo match above threshold (best similarity: {similarity:.4f})')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 8. Cleanup"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"shutil.rmtree(DB_PATH, ignore_errors=True)\n",
"print('Cleaned up demo index.')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Notes\n",
"\n",
"- Embeddings **must** be L2-normalised before adding (ArcFace already produces normalised embeddings)\n",
"- The default threshold of `0.4` works for most cases; raise it for stricter matching\n",
"- `save()` / `load()` persist the FAISS index and metadata as files in `db_path`\n",
"- For GPU-accelerated search install `faiss-gpu` instead of `faiss-cpu`\n",
"- The store uses `IndexFlatIP` (inner product = cosine similarity for normalised vectors)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -0,0 +1,223 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Head Pose Estimation with UniFace\n",
"\n",
"<div style=\"display:flex; flex-wrap:wrap; align-items:center;\">\n",
" <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://pepy.tech/projects/uniface\"><img alt=\"PyPI Downloads\" src=\"https://static.pepy.tech/personalized-badge/uniface?period=total&units=international_system&left_color=grey&right_color=blue&left_text=Downloads\"></a>\n",
" <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://pypi.org/project/uniface/\"><img alt=\"PyPI Version\" src=\"https://img.shields.io/pypi/v/uniface.svg\"></a>\n",
" <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://opensource.org/licenses/MIT\"><img alt=\"License\" src=\"https://img.shields.io/badge/License-MIT-blue.svg\"></a>\n",
" <a style=\"margin-bottom:6px;\" href=\"https://github.com/yakhyo/uniface\"><img alt=\"GitHub Stars\" src=\"https://img.shields.io/github/stars/yakhyo/uniface.svg?style=social\"></a>\n",
"</div>\n",
"\n",
"**UniFace** is a lightweight, production-ready Python library for face detection, recognition, tracking, landmark analysis, face parsing, gaze estimation, and face attributes.\n",
"\n",
"🔗 **GitHub**: [github.com/yakhyo/uniface](https://github.com/yakhyo/uniface) | 📚 **Docs**: [yakhyo.github.io/uniface](https://yakhyo.github.io/uniface)\n",
"\n",
"---\n",
"\n",
"This notebook demonstrates head pose estimation using the **UniFace** library.\n",
"\n",
"## 1. Install UniFace"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install -q \"uniface[cpu]\"\n",
"\n",
"# Clone repo for assets (Colab only)\n",
"import os\n",
"if 'COLAB_GPU' in os.environ or 'COLAB_RELEASE_TAG' in os.environ:\n",
" if not os.path.exists('uniface'):\n",
" !git clone --depth 1 https://github.com/yakhyo/uniface.git\n",
" os.chdir('uniface/examples')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Import Libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import cv2\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"from pathlib import Path\n",
"from PIL import Image\n",
"\n",
"import uniface\n",
"from uniface.detection import RetinaFace\n",
"from uniface.headpose import HeadPose\n",
"from uniface.draw import draw_head_pose\n",
"\n",
"print(f\"UniFace version: {uniface.__version__}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Initialize Models"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Initialize face detector\n",
"detector = RetinaFace(confidence_threshold=0.5)\n",
"\n",
"# Initialize head pose estimator (default: ResNet18 backbone)\n",
"head_pose = HeadPose()\n",
"\n",
"print(\"Models initialized successfully!\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Process All Test Images\n",
"\n",
"Display original images in the first row and head-pose-annotated images in the second row."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Get all test images\n",
"test_images_dir = Path('../assets/test_images')\n",
"test_images = sorted(test_images_dir.glob('*.jpg'))\n",
"\n",
"original_images = []\n",
"annotated_images = []\n",
"\n",
"for img_path in test_images:\n",
" image = cv2.imread(str(img_path))\n",
" if image is None:\n",
" continue\n",
"\n",
" # Store original (BGR -> RGB for display)\n",
" original_images.append(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))\n",
"\n",
" # Detect faces and estimate head pose\n",
" faces = detector.detect(image)\n",
"\n",
" for face in faces:\n",
" x1, y1, x2, y2 = map(int, face.bbox)\n",
" face_crop = image[y1:y2, x1:x2]\n",
"\n",
" if face_crop.size == 0:\n",
" continue\n",
"\n",
" result = head_pose.estimate(face_crop)\n",
" draw_head_pose(image, face.bbox, result.pitch, result.yaw, result.roll)\n",
"\n",
" print(f\"{img_path.name}: pitch={result.pitch:.1f}°, yaw={result.yaw:.1f}°, roll={result.roll:.1f}°\")\n",
"\n",
" annotated_images.append(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))\n",
"\n",
"print(f\"\\nProcessed {len(original_images)} images\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Visualize Results\n",
"\n",
"**First row**: Original images \n",
"**Second row**: Images with head pose 3D cube overlay"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"num_images = len(original_images)\n",
"\n",
"# Create figure with 2 rows\n",
"fig, axes = plt.subplots(2, num_images, figsize=(5 * num_images, 10))\n",
"\n",
"if num_images == 1:\n",
" axes = axes.reshape(2, 1)\n",
"\n",
"for i in range(num_images):\n",
" axes[0, i].imshow(original_images[i])\n",
" axes[0, i].set_title('Original', fontsize=12)\n",
" axes[0, i].axis('off')\n",
"\n",
" axes[1, i].imshow(annotated_images[i])\n",
" axes[1, i].set_title('Head Pose', fontsize=12)\n",
" axes[1, i].axis('off')\n",
"\n",
"plt.tight_layout()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Notes\n",
"\n",
"- **Input**: Head pose estimation requires a face crop (obtained from face detection)\n",
"- **Output**: `HeadPoseResult` with pitch, yaw, and roll angles in **degrees**\n",
"- **Visualization**: Two modes available — `'cube'` (3D wireframe) and `'axis'` (X/Y/Z coordinate axes)\n",
"- **Models**: 6 backbone variants available via `HeadPoseWeights` enum\n",
"- **Method**: Uses 6D rotation representation converted to Euler angles\n",
"\n",
"### Available Backbones\n",
"\n",
"```python\n",
"from uniface.constants import HeadPoseWeights\n",
"\n",
"# Options: RESNET18, RESNET34, RESNET50, MOBILENET_V2, MOBILENET_V3_SMALL, MOBILENET_V3_LARGE\n",
"head_pose = HeadPose(model_name=HeadPoseWeights.RESNET50)\n",
"```"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -0,0 +1,356 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "0",
"metadata": {},
"source": [
"# Face Recognition: RetinaFace → Align → ArcFace\n",
"\n",
"<div style=\"display:flex; flex-wrap:wrap; align-items:center;\">\n",
" <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://pepy.tech/projects/uniface\"><img alt=\"PyPI Downloads\" src=\"https://static.pepy.tech/personalized-badge/uniface?period=total&units=international_system&left_color=grey&right_color=blue&left_text=Downloads\"></a>\n",
" <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://pypi.org/project/uniface/\"><img alt=\"PyPI Version\" src=\"https://img.shields.io/pypi/v/uniface.svg\"></a>\n",
" <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://opensource.org/licenses/MIT\"><img alt=\"License\" src=\"https://img.shields.io/badge/License-MIT-blue.svg\"></a>\n",
" <a style=\"margin-bottom:6px;\" href=\"https://github.com/yakhyo/uniface\"><img alt=\"GitHub Stars\" src=\"https://img.shields.io/github/stars/yakhyo/uniface.svg?style=social\"></a>\n",
"</div>\n",
"\n",
"**UniFace** is a lightweight, production-ready Python library for face detection, recognition, tracking, landmark analysis, face parsing, gaze estimation, and face attributes.\n",
"\n",
"🔗 **GitHub**: [github.com/yakhyo/uniface](https://github.com/yakhyo/uniface) | 📚 **Docs**: [yakhyo.github.io/uniface](https://yakhyo.github.io/uniface)\n",
"\n",
"---\n",
"\n",
"This notebook demonstrates face recognition **without** the high-level `FaceAnalyzer` wrapper. Each step is handled manually:\n",
"\n",
"1. **RetinaFace**: Detects faces and extracts 5-point landmarks.\n",
"2. **Face Alignment**: Warps each face into a standardized 112x112 crop using the landmarks.\n",
"3. **ArcFace**: Generates a 512-D L2-normalized embedding from the aligned crop.\n",
"\n",
"We compare three test images: `image0.jpg`, `image1.jpg`, and `image5.jpg`."
]
},
{
"cell_type": "markdown",
"id": "1",
"metadata": {},
"source": [
"## 1. Install UniFace"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2",
"metadata": {},
"outputs": [],
"source": [
"%pip install -q \"uniface[cpu]\"\n",
"\n",
"# Clone repo for assets (Colab only)\n",
"import os\n",
"if 'COLAB_GPU' in os.environ or 'COLAB_RELEASE_TAG' in os.environ:\n",
" if not os.path.exists('uniface'):\n",
" !git clone --depth 1 https://github.com/yakhyo/uniface.git\n",
" os.chdir('uniface/examples')"
]
},
{
"cell_type": "markdown",
"id": "3",
"metadata": {},
"source": [
"## 2. Import Libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4",
"metadata": {},
"outputs": [],
"source": [
"import cv2\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib.patches as patches\n",
"\n",
"import uniface\n",
"from uniface.detection import RetinaFace\n",
"from uniface.recognition import ArcFace\n",
"from uniface.face_utils import face_alignment\n",
"\n",
"print(f\"UniFace version: {uniface.__version__}\")"
]
},
{
"cell_type": "markdown",
"id": "5",
"metadata": {},
"source": [
"## 3. Configuration"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6",
"metadata": {},
"outputs": [],
"source": [
"IMAGE_PATHS = {\n",
" \"image0\": \"../assets/test_images/image0.jpg\",\n",
" \"image1\": \"../assets/test_images/image1.jpg\",\n",
" \"image5\": \"../assets/test_images/image5.jpg\",\n",
"}\n",
"THRESHOLD = 0.4 # Cosine similarity threshold for \"same person\""
]
},
{
"cell_type": "markdown",
"id": "7",
"metadata": {},
"source": [
"## 4. Initialize Models"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8",
"metadata": {},
"outputs": [],
"source": [
"detector = RetinaFace(confidence_threshold=0.5)\n",
"recognizer = ArcFace()"
]
},
{
"cell_type": "markdown",
"id": "9",
"metadata": {},
"source": [
"## 5. Load Images & Detect Faces\n",
"\n",
"We use the detector to find faces and their landmarks in each image."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "10",
"metadata": {},
"outputs": [],
"source": [
"images = {}\n",
"faces = {}\n",
"\n",
"for name, path in IMAGE_PATHS.items():\n",
" img = cv2.imread(path)\n",
" if img is None:\n",
" raise FileNotFoundError(f\"Cannot read: {path}\")\n",
"\n",
" detected = detector.detect(img)\n",
" if not detected:\n",
" raise RuntimeError(f\"No face detected in: {path}\")\n",
"\n",
" images[name] = img\n",
" faces[name] = detected[0] # Keep highest-confidence face\n",
" print(f\"{name:8s} | {len(detected)} face(s) detected | confidence={faces[name].confidence:.3f}\")"
]
},
{
"cell_type": "markdown",
"id": "11",
"metadata": {},
"source": [
"## 6. Visualize Detections"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "12",
"metadata": {},
"outputs": [],
"source": [
"LM_COLORS = [\"red\", \"blue\", \"green\", \"cyan\", \"magenta\"]\n",
"\n",
"fig, axes = plt.subplots(1, 3, figsize=(15, 5))\n",
"fig.suptitle(\"Detected Faces & 5-Point Landmarks\", fontweight=\"bold\", fontsize=16)\n",
"\n",
"for ax, (name, img) in zip(axes, images.items()):\n",
" face = faces[name]\n",
" ax.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))\n",
" ax.set_title(f\"{name}\\nconf={face.confidence:.3f}\", fontsize=12)\n",
" ax.axis(\"off\")\n",
"\n",
" # Bounding box\n",
" x1, y1, x2, y2 = face.bbox.astype(int)\n",
" ax.add_patch(patches.Rectangle(\n",
" (x1, y1), x2 - x1, y2 - y1,\n",
" linewidth=2, edgecolor=\"lime\", facecolor=\"none\"))\n",
"\n",
" # Landmarks\n",
" for (lx, ly), c in zip(face.landmarks, LM_COLORS):\n",
" ax.plot(lx, ly, \"o\", color=c, markersize=6)\n",
"\n",
"plt.tight_layout()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "13",
"metadata": {},
"source": [
"## 7. Face Alignment\n",
"\n",
"We warp the detected faces into a standardized 112x112 size. This improves recognition accuracy."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "14",
"metadata": {},
"outputs": [],
"source": [
"aligned = {}\n",
"\n",
"for name, img in images.items():\n",
" lm = faces[name].landmarks\n",
" crop, _ = face_alignment(img, lm, image_size=(112, 112))\n",
" aligned[name] = crop\n",
"\n",
"fig, axes = plt.subplots(1, 3, figsize=(12, 4))\n",
"fig.suptitle(\"Aligned Face Crops (112x112)\", fontweight=\"bold\", fontsize=14)\n",
"\n",
"for ax, (name, crop) in zip(axes, aligned.items()):\n",
" ax.imshow(cv2.cvtColor(crop, cv2.COLOR_BGR2RGB))\n",
" ax.set_title(name, fontsize=12)\n",
" ax.axis(\"off\")\n",
"\n",
"plt.tight_layout()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "15",
"metadata": {},
"source": [
"## 8. Extract Embeddings\n",
"\n",
"We pass the aligned crops to ArcFace to get the 512-D vectors."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "16",
"metadata": {},
"outputs": [],
"source": [
"embeddings = {}\n",
"\n",
"for name, crop in aligned.items():\n",
" # landmarks=None because image is already aligned\n",
" emb = recognizer.get_normalized_embedding(crop, landmarks=None)\n",
" embeddings[name] = emb\n",
" print(f\"{name:8s} | embedding shape={emb.shape} | L2-norm={np.linalg.norm(emb):.4f}\")"
]
},
{
"cell_type": "markdown",
"id": "17",
"metadata": {},
"source": [
"## 9. Pairwise Cosine Similarity\n",
"\n",
"Since embeddings are normalized, cosine similarity is just the dot product."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "18",
"metadata": {},
"outputs": [],
"source": [
"names = list(embeddings.keys())\n",
"n = len(names)\n",
"sim_matrix = np.zeros((n, n))\n",
"\n",
"for i, ni in enumerate(names):\n",
" for j, nj in enumerate(names):\n",
" # Use squeeze() to handle (1, 512) shapes if present\n",
" sim_matrix[i, j] = float(np.dot(embeddings[ni].squeeze(), embeddings[nj].squeeze()))\n",
"\n",
"# Print comparison results\n",
"pairs = [(names[i], names[j]) for i in range(n) for j in range(i + 1, n)]\n",
"for a, b in pairs:\n",
" s = float(np.dot(embeddings[a].squeeze(), embeddings[b].squeeze()))\n",
" verdict = \"✓ Same person\" if s >= THRESHOLD else \"✗ Different people\"\n",
" print(f\"{a} vs {b}: similarity={s:.4f} → {verdict}\")"
]
},
{
"cell_type": "markdown",
"id": "19",
"metadata": {},
"source": [
"## 10. Similarity Heatmap"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "20",
"metadata": {},
"outputs": [],
"source": [
"fig, ax = plt.subplots(figsize=(8, 6))\n",
"im = ax.imshow(sim_matrix, vmin=0, vmax=1, cmap=\"viridis\")\n",
"plt.colorbar(im, ax=ax, label=\"Cosine similarity\")\n",
"\n",
"ax.set_xticks(range(n))\n",
"ax.set_yticks(range(n))\n",
"ax.set_xticklabels(names, rotation=30, ha=\"right\")\n",
"ax.set_yticklabels(names)\n",
"ax.set_title(\"Pairwise Face Similarity (ArcFace)\", fontweight=\"bold\")\n",
"\n",
"for i in range(n):\n",
" for j in range(n):\n",
" val = sim_matrix[i, j]\n",
" ax.text(j, i, f\"{val:.2f}\",\n",
" ha=\"center\", va=\"center\",\n",
" color=\"black\" if val >= 0.6 else \"white\",\n",
" fontsize=12, fontweight=\"bold\")\n",
"\n",
"plt.tight_layout()\n",
"plt.show()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,265 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Portrait Matting with MODNet\n",
"\n",
"<div style=\"display:flex; flex-wrap:wrap; align-items:center;\">\n",
" <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://pepy.tech/projects/uniface\"><img alt=\"PyPI Downloads\" src=\"https://static.pepy.tech/personalized-badge/uniface?period=total&units=international_system&left_color=grey&right_color=blue&left_text=Downloads\"></a>\n",
" <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://pypi.org/project/uniface/\"><img alt=\"PyPI Version\" src=\"https://img.shields.io/pypi/v/uniface.svg\"></a>\n",
" <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://opensource.org/licenses/MIT\"><img alt=\"License\" src=\"https://img.shields.io/badge/License-MIT-blue.svg\"></a>\n",
" <a style=\"margin-bottom:6px;\" href=\"https://github.com/yakhyo/uniface\"><img alt=\"GitHub Stars\" src=\"https://img.shields.io/github/stars/yakhyo/uniface.svg?style=social\"></a>\n",
"</div>\n",
"\n",
"**UniFace** is a lightweight, production-ready Python library for face detection, recognition, tracking, landmark analysis, face parsing, gaze estimation, and face attributes.\n",
"\n",
"🔗 **GitHub**: [github.com/yakhyo/uniface](https://github.com/yakhyo/uniface) | 📚 **Docs**: [yakhyo.github.io/uniface](https://yakhyo.github.io/uniface)\n",
"\n",
"---\n",
"\n",
"This notebook demonstrates portrait matting using **MODNet** — a trimap-free model that produces soft alpha mattes from full images. No face detection or cropping required.\n",
"\n",
"## 1. Install UniFace"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install -q \"uniface[cpu]\"\n",
"\n",
"# Clone repo for assets (Colab only)\n",
"import os\n",
"if 'COLAB_GPU' in os.environ or 'COLAB_RELEASE_TAG' in os.environ:\n",
" if not os.path.exists('uniface'):\n",
" !git clone --depth 1 https://github.com/yakhyo/uniface.git\n",
" os.chdir('uniface/examples')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Import Libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import cv2\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"import uniface\n",
"from uniface.matting import MODNet\n",
"\n",
"print(f\"UniFace version: {uniface.__version__}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Initialize Model\n",
"\n",
"MODNet has two variants:\n",
"- **PHOTOGRAPHIC** (default): optimized for high-quality portrait photos\n",
"- **WEBCAM**: optimized for real-time webcam feeds"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"matting = MODNet()\n",
"\n",
"print(f\"Input size: {matting.input_size}\")\n",
"print(f\"Input name: {matting.input_name}\")\n",
"print(f\"Output names: {matting.output_names}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Helper Functions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def compose(image, matte, background=None):\n",
" \"\"\"Composite foreground over a background using the alpha matte.\"\"\"\n",
" h, w = image.shape[:2]\n",
" matte_3ch = matte[:, :, np.newaxis]\n",
"\n",
" if background is None:\n",
" bg = np.full_like(image, (0, 177, 64), dtype=np.uint8)\n",
" else:\n",
" bg = cv2.resize(background, (w, h), interpolation=cv2.INTER_AREA)\n",
"\n",
" return (image * matte_3ch + bg * (1 - matte_3ch)).astype(np.uint8)\n",
"\n",
"\n",
"def show_results(image, matte):\n",
" \"\"\"Display original, matte, and green screen as a single merged image.\"\"\"\n",
" matte_vis = cv2.cvtColor((matte * 255).astype(np.uint8), cv2.COLOR_GRAY2BGR)\n",
" green = compose(image, matte)\n",
" merged = np.hstack([image, matte_vis, green])\n",
"\n",
" plt.figure(figsize=(18, 6))\n",
" plt.imshow(cv2.cvtColor(merged, cv2.COLOR_BGR2RGB))\n",
" plt.axis(\"off\")\n",
" plt.tight_layout()\n",
" plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Basic Matting"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"image = cv2.imread(\"../assets/demos/src_portrait1.jpg\")\n",
"print(f\"Image shape: {image.shape}\")\n",
"\n",
"matte = matting.predict(image)\n",
"print(f\"Matte shape: {matte.shape}\")\n",
"print(f\"Matte dtype: {matte.dtype}\")\n",
"print(f\"Matte range: [{matte.min():.3f}, {matte.max():.3f}]\")\n",
"\n",
"show_results(image, matte)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Transparent Background (RGBA)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"alpha = (matte * 255).astype(np.uint8)\n",
"rgba = cv2.cvtColor(image, cv2.COLOR_BGR2BGRA)\n",
"rgba[:, :, 3] = alpha\n",
"\n",
"# Checkerboard background to visualize transparency\n",
"h, w = image.shape[:2]\n",
"checker = np.zeros((h, w, 3), dtype=np.uint8)\n",
"block = 20\n",
"for y in range(0, h, block):\n",
" for x in range(0, w, block):\n",
" if (y // block + x // block) % 2 == 0:\n",
" checker[y:y+block, x:x+block] = 200\n",
" else:\n",
" checker[y:y+block, x:x+block] = 255\n",
"\n",
"matte_3ch = matte[:, :, np.newaxis]\n",
"rgba_vis = (image * matte_3ch + checker * (1 - matte_3ch)).astype(np.uint8)\n",
"\n",
"merged = np.hstack([image, rgba_vis])\n",
"\n",
"plt.figure(figsize=(16, 5))\n",
"plt.imshow(cv2.cvtColor(merged, cv2.COLOR_BGR2RGB))\n",
"plt.axis(\"off\")\n",
"plt.tight_layout()\n",
"plt.show()\n",
"\n",
"print(f\"RGBA shape: {rgba.shape}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. Custom Background"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create a gradient background\n",
"h, w = image.shape[:2]\n",
"gradient = np.zeros((h, w, 3), dtype=np.uint8)\n",
"for y in range(h):\n",
" ratio = y / h\n",
" gradient[y, :] = [int(180 * (1 - ratio)), int(100 + 80 * ratio), int(220 * ratio)]\n",
"\n",
"custom_bg = compose(image, matte, gradient)\n",
"green_bg = compose(image, matte)\n",
"\n",
"merged = np.hstack([image, green_bg, custom_bg])\n",
"\n",
"plt.figure(figsize=(18, 6))\n",
"plt.imshow(cv2.cvtColor(merged, cv2.COLOR_BGR2RGB))\n",
"plt.axis(\"off\")\n",
"plt.tight_layout()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summary\n",
"\n",
"MODNet provides trimap-free portrait matting:\n",
"\n",
"- **`predict(image)`** — returns `(H, W)` float32 alpha matte in `[0, 1]`\n",
"- **No face detection needed** — works on full images directly\n",
"- **Two variants** — `PHOTOGRAPHIC` for photos, `WEBCAM` for real-time\n",
"- **Compositing** — use the matte for transparent PNGs, green screen, or custom backgrounds\n",
"\n",
"For more details, see the [Matting docs](https://yakhyo.github.io/uniface/modules/matting/)."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -1,5 +1,5 @@
site_name: UniFace
site_description: All-in-One Face Analysis Library with ONNX Runtime
site_description: A Unified Face Analysis Library for Python
site_author: Yakhyokhuja Valikhujaev
site_url: https://yakhyo.github.io/uniface
@@ -135,6 +135,7 @@ nav:
- Quickstart: quickstart.md
- Notebooks: notebooks.md
- Model Zoo: models.md
- Datasets: datasets.md
- Tutorials:
- Image Pipeline: recipes/image-pipeline.md
- Video & Webcam: recipes/video-webcam.md
@@ -149,9 +150,12 @@ nav:
- Landmarks: modules/landmarks.md
- Attributes: modules/attributes.md
- Parsing: modules/parsing.md
- Matting: modules/matting.md
- Gaze: modules/gaze.md
- Head Pose: modules/headpose.md
- Anti-Spoofing: modules/spoofing.md
- Privacy: modules/privacy.md
- Stores: modules/stores.md
- Guides:
- Overview: concepts/overview.md
- Inputs & Outputs: concepts/inputs-outputs.md

View File

@@ -1,7 +1,7 @@
[project]
name = "uniface"
version = "3.0.0"
description = "UniFace: A Comprehensive Library for Face Detection, Recognition, Tracking, Landmark Analysis, Face Parsing, Gaze Estimation, Age, and Gender Detection"
version = "3.6.0"
description = "UniFace: A Unified Face Analysis Library for Python"
readme = "README.md"
license = "MIT"
authors = [{ name = "Yakhyokhuja Valikhujaev", email = "yakhyo9696@gmail.com" }]
@@ -9,7 +9,7 @@ maintainers = [
{ name = "Yakhyokhuja Valikhujaev", email = "yakhyo9696@gmail.com" },
]
requires-python = ">=3.10,<3.14"
requires-python = ">=3.10,<3.15"
keywords = [
"face-detection",
"face-recognition",
@@ -29,7 +29,7 @@ keywords = [
]
classifiers = [
"Development Status :: 4 - Beta",
"Development Status :: 5 - Production/Stable",
"Intended Audience :: Developers",
"Intended Audience :: Science/Research",
"Operating System :: OS Independent",
@@ -38,21 +38,34 @@ classifiers = [
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Programming Language :: Python :: 3.14",
]
dependencies = [
"numpy>=1.21.0",
"opencv-python>=4.5.0",
"onnxruntime>=1.16.0",
"scikit-image>=0.19.0",
"scikit-image>=0.22.0",
"scipy>=1.7.0",
"requests>=2.28.0",
"tqdm>=4.64.0",
]
[project.optional-dependencies]
dev = ["pytest>=7.0.0", "ruff>=0.4.0"]
gpu = ["onnxruntime-gpu>=1.16.0"]
cpu = [
"onnxruntime>=1.16.0; python_version >= '3.11'",
"onnxruntime>=1.16.0,<1.24; python_version < '3.11'",
]
gpu = [
"onnxruntime-gpu>=1.16.0; python_version >= '3.11'",
"onnxruntime-gpu>=1.16.0,<1.24; python_version < '3.11'",
]
dev = ["pytest>=7.0.0", "ruff>=0.4.0", "pre-commit>=3.0.0"]
docs = [
"mkdocs-material",
"pymdown-extensions",
"mkdocs-git-committers-plugin-2",
"mkdocs-git-revision-date-localized-plugin",
]
[project.urls]
Homepage = "https://github.com/yakhyo/uniface"

View File

@@ -1,7 +0,0 @@
numpy>=1.21.0
opencv-python>=4.5.0
onnxruntime>=1.16.0
scikit-image>=0.19.0
scipy>=1.7.0
requests>=2.28.0
tqdm>=4.64.0

View File

@@ -9,6 +9,14 @@ import numpy as np
import pytest
from uniface.attribute import AgeGender, AttributeResult
from uniface.types import Face
def _make_face(bbox: list[int] | np.ndarray) -> Face:
"""Helper: build a minimal Face from a bounding box."""
bbox = np.asarray(bbox)
landmarks = np.zeros((5, 2), dtype=np.float32)
return Face(bbox=bbox, confidence=0.99, landmarks=landmarks)
@pytest.fixture
@@ -22,30 +30,30 @@ def mock_image():
@pytest.fixture
def mock_bbox():
return [100, 100, 300, 300]
def mock_face():
return _make_face([100, 100, 300, 300])
def test_model_initialization(age_gender_model):
assert age_gender_model is not None, 'AgeGender model initialization failed.'
def test_prediction_output_format(age_gender_model, mock_image, mock_bbox):
result = age_gender_model.predict(mock_image, mock_bbox)
def test_prediction_output_format(age_gender_model, mock_image, mock_face):
result = age_gender_model.predict(mock_image, mock_face)
assert isinstance(result, AttributeResult), f'Result should be AttributeResult, got {type(result)}'
assert isinstance(result.gender, int), f'Gender should be int, got {type(result.gender)}'
assert isinstance(result.age, int), f'Age should be int, got {type(result.age)}'
assert isinstance(result.sex, str), f'Sex should be str, got {type(result.sex)}'
def test_gender_values(age_gender_model, mock_image, mock_bbox):
result = age_gender_model.predict(mock_image, mock_bbox)
def test_gender_values(age_gender_model, mock_image, mock_face):
result = age_gender_model.predict(mock_image, mock_face)
assert result.gender in [0, 1], f'Gender should be 0 (Female) or 1 (Male), got {result.gender}'
assert result.sex in ['Female', 'Male'], f'Sex should be Female or Male, got {result.sex}'
def test_age_range(age_gender_model, mock_image, mock_bbox):
result = age_gender_model.predict(mock_image, mock_bbox)
def test_age_range(age_gender_model, mock_image, mock_face):
result = age_gender_model.predict(mock_image, mock_face)
assert 0 <= result.age <= 120, f'Age should be between 0 and 120, got {result.age}'
@@ -57,39 +65,52 @@ def test_different_bbox_sizes(age_gender_model, mock_image):
]
for bbox in test_bboxes:
result = age_gender_model.predict(mock_image, bbox)
face = _make_face(bbox)
result = age_gender_model.predict(mock_image, face)
assert result.gender in [0, 1], f'Failed for bbox {bbox}'
assert 0 <= result.age <= 120, f'Age out of range for bbox {bbox}'
def test_different_image_sizes(age_gender_model, mock_bbox):
def test_different_image_sizes(age_gender_model):
test_sizes = [(480, 640, 3), (720, 1280, 3), (1080, 1920, 3)]
face = _make_face([100, 100, 300, 300])
for size in test_sizes:
mock_image = np.random.randint(0, 255, size, dtype=np.uint8)
result = age_gender_model.predict(mock_image, mock_bbox)
result = age_gender_model.predict(mock_image, face)
assert result.gender in [0, 1], f'Failed for image size {size}'
assert 0 <= result.age <= 120, f'Age out of range for image size {size}'
def test_consistency(age_gender_model, mock_image, mock_bbox):
result1 = age_gender_model.predict(mock_image, mock_bbox)
result2 = age_gender_model.predict(mock_image, mock_bbox)
def test_consistency(age_gender_model, mock_image, mock_face):
result1 = age_gender_model.predict(mock_image, mock_face)
result2 = age_gender_model.predict(mock_image, mock_face)
assert result1.gender == result2.gender, 'Same input should produce same gender prediction'
assert result1.age == result2.age, 'Same input should produce same age prediction'
def test_face_enrichment(age_gender_model, mock_image, mock_face):
"""predict() must write gender & age back to the Face object."""
assert mock_face.gender is None
assert mock_face.age is None
result = age_gender_model.predict(mock_image, mock_face)
assert mock_face.gender == result.gender
assert mock_face.age == result.age
def test_bbox_list_format(age_gender_model, mock_image):
bbox_list = [100, 100, 300, 300]
result = age_gender_model.predict(mock_image, bbox_list)
face = _make_face([100, 100, 300, 300])
result = age_gender_model.predict(mock_image, face)
assert result.gender in [0, 1], 'Should work with bbox as list'
assert 0 <= result.age <= 120, 'Age should be in valid range'
def test_bbox_array_format(age_gender_model, mock_image):
bbox_array = np.array([100, 100, 300, 300])
result = age_gender_model.predict(mock_image, bbox_array)
face = _make_face(np.array([100, 100, 300, 300]))
result = age_gender_model.predict(mock_image, face)
assert result.gender in [0, 1], 'Should work with bbox as numpy array'
assert 0 <= result.age <= 120, 'Age should be in valid range'
@@ -103,7 +124,8 @@ def test_multiple_predictions(age_gender_model, mock_image):
results = []
for bbox in bboxes:
result = age_gender_model.predict(mock_image, bbox)
face = _make_face(bbox)
result = age_gender_model.predict(mock_image, face)
results.append(result)
assert len(results) == 3, 'Should have 3 predictions'
@@ -112,28 +134,26 @@ def test_multiple_predictions(age_gender_model, mock_image):
assert 0 <= result.age <= 120
def test_age_is_positive(age_gender_model, mock_image, mock_bbox):
def test_age_is_positive(age_gender_model, mock_image, mock_face):
for _ in range(5):
result = age_gender_model.predict(mock_image, mock_bbox)
result = age_gender_model.predict(mock_image, mock_face)
assert result.age >= 0, f'Age should be non-negative, got {result.age}'
def test_output_format_for_visualization(age_gender_model, mock_image, mock_bbox):
result = age_gender_model.predict(mock_image, mock_bbox)
def test_output_format_for_visualization(age_gender_model, mock_image, mock_face):
result = age_gender_model.predict(mock_image, mock_face)
text = f'{result.sex}, {result.age}y'
assert isinstance(text, str), 'Should be able to format as string'
assert 'Male' in text or 'Female' in text, 'Text should contain gender'
assert 'y' in text, "Text should contain 'y' for years"
def test_attribute_result_fields(age_gender_model, mock_image, mock_bbox):
def test_attribute_result_fields(age_gender_model, mock_image, mock_face):
"""Test that AttributeResult has correct fields for AgeGender model."""
result = age_gender_model.predict(mock_image, mock_bbox)
result = age_gender_model.predict(mock_image, mock_face)
# AgeGender should set gender and age
assert result.gender is not None
assert result.age is not None
# AgeGender should NOT set race and age_group (FairFace only)
assert result.race is None
assert result.age_group is None

View File

@@ -1,61 +0,0 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
import numpy as np
from uniface.draw import draw_gaze
def _compute_gaze_delta(bbox: np.ndarray, pitch: float, yaw: float) -> tuple[int, int]:
"""Replicate draw_gaze dx/dy math for verification."""
x_min, _, x_max, _ = map(int, bbox[:4])
length = x_max - x_min
dx = int(-length * np.sin(yaw) * np.cos(pitch))
dy = int(-length * np.sin(pitch))
return dx, dy
def test_draw_gaze_yaw_only_moves_horizontally():
"""Yaw-only input (pitch=0) should produce horizontal displacement only."""
image = np.zeros((200, 200, 3), dtype=np.uint8)
bbox = np.array([50, 50, 150, 150], dtype=np.float32)
yaw = 0.5
pitch = 0.0
dx, dy = _compute_gaze_delta(bbox, pitch, yaw)
assert dx != 0, 'Yaw-only should produce horizontal displacement'
assert dy == 0, 'Yaw-only should produce zero vertical displacement'
# Should not raise
draw_gaze(image, bbox, pitch, yaw, draw_bbox=False, draw_angles=False)
def test_draw_gaze_pitch_only_moves_vertically():
"""Pitch-only input (yaw=0) should produce vertical displacement only."""
image = np.zeros((200, 200, 3), dtype=np.uint8)
bbox = np.array([50, 50, 150, 150], dtype=np.float32)
yaw = 0.0
pitch = 0.5
dx, dy = _compute_gaze_delta(bbox, pitch, yaw)
assert dx == 0, 'Pitch-only should produce zero horizontal displacement'
assert dy != 0, 'Pitch-only should produce vertical displacement'
# Should not raise
draw_gaze(image, bbox, pitch, yaw, draw_bbox=False, draw_angles=False)
def test_draw_gaze_modifies_image():
"""draw_gaze should modify the image in place."""
image = np.zeros((200, 200, 3), dtype=np.uint8)
bbox = np.array([50, 50, 150, 150], dtype=np.float32)
original = image.copy()
draw_gaze(image, bbox, 0.3, 0.3)
assert not np.array_equal(image, original), 'draw_gaze should modify the image'

View File

@@ -9,12 +9,14 @@ import numpy as np
import pytest
from uniface import (
create_attribute_predictor,
create_detector,
create_landmarker,
create_recognizer,
list_available_detectors,
)
from uniface.constants import RetinaFaceWeights, SCRFDWeights
from uniface.attribute import AgeGender, FairFace
from uniface.constants import AgeGenderWeights, FairFaceWeights, RetinaFaceWeights, SCRFDWeights
from uniface.spoofing import MiniFASNet, create_spoofer
@@ -89,6 +91,12 @@ def test_create_recognizer_sphereface():
assert recognizer is not None, 'Failed to create SphereFace recognizer'
def test_create_recognizer_edgeface():
"""Test creating an EdgeFace recognizer using factory function."""
recognizer = create_recognizer('edgeface')
assert recognizer is not None, 'Failed to create EdgeFace recognizer'
def test_create_recognizer_invalid_method():
"""
Test that invalid recognizer method raises an error.
@@ -122,6 +130,25 @@ def test_create_landmarker_invalid_method():
create_landmarker('invalid_method')
def test_create_landmarker_pipnet_default():
"""create_landmarker('pipnet') returns a PIPNet (98 points by default)."""
from uniface.landmark import PIPNet
landmarker = create_landmarker('pipnet')
assert isinstance(landmarker, PIPNet), 'Should return PIPNet instance'
assert landmarker.num_lms == 98
def test_create_landmarker_pipnet_68():
"""create_landmarker('pipnet', model_name=...) selects the 68-point variant."""
from uniface.constants import PIPNetWeights
from uniface.landmark import PIPNet
landmarker = create_landmarker('pipnet', model_name=PIPNetWeights.DW300_CELEBA_68)
assert isinstance(landmarker, PIPNet), 'Should return PIPNet instance'
assert landmarker.num_lms == 68
# list_available_detectors tests
def test_list_available_detectors():
"""
@@ -165,7 +192,7 @@ def test_recognizer_inference_from_factory():
embedding = recognizer.get_embedding(mock_image)
assert embedding is not None, 'Recognizer should return embedding'
assert embedding.shape[1] == 512, 'Should return 512-dimensional embedding'
assert embedding.shape == (1, 512), 'get_embedding should return (1, 512) with batch dimension'
def test_landmarker_inference_from_factory():
@@ -181,6 +208,17 @@ def test_landmarker_inference_from_factory():
assert landmarks.shape == (106, 2), 'Should return 106 landmarks'
def test_pipnet_landmarker_inference_from_factory():
"""PIPNet landmarker created from factory can perform inference."""
landmarker = create_landmarker('pipnet')
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
mock_bbox = [100, 100, 300, 300]
landmarks = landmarker.get_landmarks(mock_image, mock_bbox)
assert landmarks is not None, 'Landmarker should return landmarks'
assert landmarks.shape == (98, 2), 'Should return 98 landmarks'
def test_multiple_detector_creation():
"""
Test that multiple detectors can be created independently.
@@ -236,3 +274,19 @@ def test_create_spoofer_with_providers():
"""Test that create_spoofer forwards providers kwarg without TypeError."""
spoofer = create_spoofer(providers=['CPUExecutionProvider'])
assert isinstance(spoofer, MiniFASNet), 'Should return MiniFASNet instance'
# create_attribute_predictor tests
def test_create_attribute_predictor_age_gender():
predictor = create_attribute_predictor(AgeGenderWeights.DEFAULT)
assert isinstance(predictor, AgeGender), 'Should return AgeGender instance'
def test_create_attribute_predictor_fairface():
predictor = create_attribute_predictor(FairFaceWeights.DEFAULT)
assert isinstance(predictor, FairFace), 'Should return FairFace instance'
def test_create_attribute_predictor_invalid():
with pytest.raises(ValueError, match='Unsupported attribute model'):
create_attribute_predictor('invalid_model')

115
tests/test_headpose.py Normal file
View File

@@ -0,0 +1,115 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
import numpy as np
import pytest
from uniface import HeadPose, HeadPoseResult, create_head_pose_estimator
from uniface.headpose import BaseHeadPoseEstimator
from uniface.headpose.models import HeadPose as HeadPoseModel
def test_create_head_pose_estimator_default():
"""Test creating a head pose estimator with default parameters."""
estimator = create_head_pose_estimator()
assert isinstance(estimator, HeadPose), 'Should return HeadPose instance'
def test_create_head_pose_estimator_aliases():
"""Test that factory accepts all documented aliases."""
for alias in ('headpose', 'head_pose', '6drepnet'):
estimator = create_head_pose_estimator(alias)
assert isinstance(estimator, HeadPose), f"Alias '{alias}' should return HeadPose"
def test_create_head_pose_estimator_invalid():
"""Test that invalid method raises ValueError."""
with pytest.raises(ValueError, match='Unsupported head pose estimation method'):
create_head_pose_estimator('invalid_method')
def test_head_pose_inference():
"""Test that HeadPose can run inference on a mock image."""
estimator = HeadPose()
mock_image = np.random.randint(0, 255, (224, 224, 3), dtype=np.uint8)
result = estimator.estimate(mock_image)
assert isinstance(result, HeadPoseResult), 'Should return HeadPoseResult'
assert isinstance(result.pitch, float), 'pitch should be float'
assert isinstance(result.yaw, float), 'yaw should be float'
assert isinstance(result.roll, float), 'roll should be float'
def test_head_pose_callable():
"""Test that HeadPose is callable via __call__."""
estimator = HeadPose()
mock_image = np.random.randint(0, 255, (224, 224, 3), dtype=np.uint8)
result = estimator(mock_image)
assert isinstance(result, HeadPoseResult), '__call__ should return HeadPoseResult'
def test_head_pose_result_repr():
"""Test HeadPoseResult repr formatting."""
result = HeadPoseResult(pitch=10.5, yaw=-20.3, roll=5.1)
repr_str = repr(result)
assert 'HeadPoseResult' in repr_str
assert '10.5' in repr_str
assert '-20.3' in repr_str
assert '5.1' in repr_str
def test_head_pose_result_frozen():
"""Test that HeadPoseResult is immutable."""
result = HeadPoseResult(pitch=1.0, yaw=2.0, roll=3.0)
with pytest.raises(AttributeError):
result.pitch = 99.0 # type: ignore[misc]
def test_rotation_matrix_to_euler_identity():
"""Test that identity rotation matrix gives zero angles."""
identity = np.eye(3).reshape(1, 3, 3)
euler = HeadPoseModel.rotation_matrix_to_euler(identity)
assert euler.shape == (1, 3), 'Should return (1, 3) shaped array'
np.testing.assert_allclose(euler[0], [0.0, 0.0, 0.0], atol=1e-5)
def test_rotation_matrix_to_euler_90deg_yaw():
"""Test 90-degree yaw rotation."""
angle = np.radians(90)
R = np.array(
[
[np.cos(angle), 0, np.sin(angle)],
[0, 1, 0],
[-np.sin(angle), 0, np.cos(angle)],
]
).reshape(1, 3, 3)
euler = HeadPoseModel.rotation_matrix_to_euler(R)
np.testing.assert_allclose(euler[0, 1], 90.0, atol=1e-3)
def test_rotation_matrix_to_euler_batch():
"""Test batch processing of rotation matrices."""
batch = np.stack([np.eye(3), np.eye(3), np.eye(3)], axis=0)
euler = HeadPoseModel.rotation_matrix_to_euler(batch)
assert euler.shape == (3, 3), 'Batch of 3 should return (3, 3)'
np.testing.assert_allclose(euler, 0.0, atol=1e-5)
def test_factory_returns_correct_type():
"""Test that factory function returns BaseHeadPoseEstimator subclass."""
estimator = create_head_pose_estimator()
assert isinstance(estimator, BaseHeadPoseEstimator), 'Should be BaseHeadPoseEstimator subclass'
def test_head_pose_with_providers():
"""Test that HeadPose accepts providers kwarg."""
estimator = HeadPose(providers=['CPUExecutionProvider'])
assert isinstance(estimator, HeadPose), 'Should create with explicit providers'

158
tests/test_matting.py Normal file
View File

@@ -0,0 +1,158 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
import numpy as np
import pytest
from uniface.constants import MODNetWeights
from uniface.matting import MODNet, create_matting_model
def test_modnet_initialization():
"""Test MODNet initialization with default weights."""
matting = MODNet()
assert matting is not None
assert matting.input_size == 512
def test_modnet_with_webcam_weights():
"""Test MODNet initialization with webcam variant."""
matting = MODNet(model_name=MODNetWeights.WEBCAM)
assert matting is not None
assert matting.input_size == 512
def test_modnet_custom_input_size():
"""Test MODNet with custom input size."""
matting = MODNet(input_size=256)
assert matting.input_size == 256
def test_modnet_preprocess():
"""Test preprocessing produces correct tensor shape and dtype."""
matting = MODNet()
image = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
tensor, orig_h, orig_w = matting.preprocess(image)
assert tensor.dtype == np.float32
assert tensor.ndim == 4
assert tensor.shape[0] == 1
assert tensor.shape[1] == 3
assert tensor.shape[2] % 32 == 0
assert tensor.shape[3] % 32 == 0
assert orig_h == 480
assert orig_w == 640
def test_modnet_preprocess_small_image():
"""Test preprocessing with image smaller than input_size."""
matting = MODNet(input_size=512)
image = np.random.randint(0, 255, (128, 128, 3), dtype=np.uint8)
tensor, orig_h, orig_w = matting.preprocess(image)
assert tensor.shape[2] % 32 == 0
assert tensor.shape[3] % 32 == 0
assert orig_h == 128
assert orig_w == 128
def test_modnet_preprocess_large_image():
"""Test preprocessing with image larger than input_size."""
matting = MODNet(input_size=512)
image = np.random.randint(0, 255, (1080, 1920, 3), dtype=np.uint8)
tensor, orig_h, orig_w = matting.preprocess(image)
assert tensor.shape[2] % 32 == 0
assert tensor.shape[3] % 32 == 0
assert orig_h == 1080
assert orig_w == 1920
def test_modnet_postprocess():
"""Test postprocessing resizes matte to original dimensions."""
matting = MODNet()
dummy_output = np.random.rand(1, 1, 512, 672).astype(np.float32)
matte = matting.postprocess(dummy_output, original_size=(640, 480))
assert matte.shape == (480, 640)
assert matte.dtype == np.float32
def test_modnet_predict():
"""Test end-to-end prediction."""
matting = MODNet()
image = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
matte = matting.predict(image)
assert matte.shape == (480, 640)
assert matte.dtype == np.float32
assert matte.min() >= 0.0
assert matte.max() <= 1.0
def test_modnet_callable():
"""Test that MODNet is callable via __call__."""
matting = MODNet()
image = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
matte = matting(image)
assert matte.shape == (256, 256)
assert matte.dtype == np.float32
def test_modnet_different_input_sizes():
"""Test prediction with various image dimensions."""
matting = MODNet()
sizes = [(256, 256), (480, 640), (720, 1280), (300, 500)]
for h, w in sizes:
image = np.random.randint(0, 255, (h, w, 3), dtype=np.uint8)
matte = matting.predict(image)
assert matte.shape == (h, w), f'Failed for size {h}x{w}'
assert matte.dtype == np.float32
# Factory tests
def test_create_matting_model_default():
"""Test factory with default parameters."""
matting = create_matting_model()
assert matting is not None
assert isinstance(matting, MODNet)
def test_create_matting_model_with_enum():
"""Test factory with enum."""
matting = create_matting_model(MODNetWeights.WEBCAM)
assert isinstance(matting, MODNet)
def test_create_matting_model_with_string():
"""Test factory with string model name."""
matting = create_matting_model('modnet_photographic')
assert isinstance(matting, MODNet)
def test_create_matting_model_webcam_string():
"""Test factory with webcam string model name."""
matting = create_matting_model('modnet_webcam')
assert isinstance(matting, MODNet)
def test_create_matting_model_invalid():
"""Test factory with invalid model name."""
with pytest.raises(ValueError, match='Unknown matting model'):
create_matting_model('invalid_model')

View File

@@ -0,0 +1,132 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from __future__ import annotations
import numpy as np
import pytest
from uniface.constants import PIPNetWeights
from uniface.landmark import PIPNet
@pytest.fixture(scope='module', params=[PIPNetWeights.WFLW_98, PIPNetWeights.DW300_CELEBA_68])
def pipnet_model(request):
return PIPNet(model_name=request.param)
@pytest.fixture
def mock_image():
return np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
@pytest.fixture
def mock_bbox():
return [100, 100, 300, 300]
def _expected_n_lms(model: PIPNet) -> int:
return 98 if model.num_lms == 98 else 68
def test_model_initialization(pipnet_model):
assert pipnet_model is not None, 'PIPNet model initialization failed.'
assert pipnet_model.num_lms in (68, 98), f'Unexpected num_lms: {pipnet_model.num_lms}'
assert pipnet_model.input_h == pipnet_model.input_w == 256
def test_landmark_detection(pipnet_model, mock_image, mock_bbox):
landmarks = pipnet_model.get_landmarks(mock_image, mock_bbox)
n = _expected_n_lms(pipnet_model)
assert landmarks.shape == (n, 2), f'Expected shape ({n}, 2), got {landmarks.shape}'
def test_landmark_dtype(pipnet_model, mock_image, mock_bbox):
landmarks = pipnet_model.get_landmarks(mock_image, mock_bbox)
assert landmarks.dtype == np.float32, f'Expected float32, got {landmarks.dtype}'
def test_landmark_coordinates_within_image(pipnet_model, mock_image, mock_bbox):
landmarks = pipnet_model.get_landmarks(mock_image, mock_bbox)
n = _expected_n_lms(pipnet_model)
x_coords = landmarks[:, 0]
y_coords = landmarks[:, 1]
x1, y1, x2, y2 = mock_bbox
margin = 50
x_in_bounds = int(np.sum((x_coords >= x1 - margin) & (x_coords <= x2 + margin)))
y_in_bounds = int(np.sum((y_coords >= y1 - margin) & (y_coords <= y2 + margin)))
threshold = max(int(0.9 * n), n - 5)
assert x_in_bounds >= threshold, f'Only {x_in_bounds}/{n} x-coordinates within bounds'
assert y_in_bounds >= threshold, f'Only {y_in_bounds}/{n} y-coordinates within bounds'
def test_different_bbox_sizes(pipnet_model, mock_image):
n = _expected_n_lms(pipnet_model)
test_bboxes = [
[50, 50, 150, 150],
[100, 100, 300, 300],
[50, 50, 400, 400],
]
for bbox in test_bboxes:
landmarks = pipnet_model.get_landmarks(mock_image, bbox)
assert landmarks.shape == (n, 2), f'Failed for bbox {bbox}'
def test_consistency(pipnet_model, mock_image, mock_bbox):
landmarks1 = pipnet_model.get_landmarks(mock_image, mock_bbox)
landmarks2 = pipnet_model.get_landmarks(mock_image, mock_bbox)
assert np.allclose(landmarks1, landmarks2), 'Same input should produce same landmarks'
def test_different_image_sizes(pipnet_model, mock_bbox):
n = _expected_n_lms(pipnet_model)
test_sizes = [(480, 640, 3), (720, 1280, 3), (1080, 1920, 3)]
for size in test_sizes:
mock_image = np.random.randint(0, 255, size, dtype=np.uint8)
landmarks = pipnet_model.get_landmarks(mock_image, mock_bbox)
assert landmarks.shape == (n, 2), f'Failed for image size {size}'
def test_bbox_list_format(pipnet_model, mock_image):
n = _expected_n_lms(pipnet_model)
landmarks = pipnet_model.get_landmarks(mock_image, [100, 100, 300, 300])
assert landmarks.shape == (n, 2), 'Should work with bbox as list'
def test_bbox_array_format(pipnet_model, mock_image):
n = _expected_n_lms(pipnet_model)
bbox_array = np.array([100, 100, 300, 300])
landmarks = pipnet_model.get_landmarks(mock_image, bbox_array)
assert landmarks.shape == (n, 2), 'Should work with bbox as numpy array'
def test_landmark_distribution(pipnet_model, mock_image, mock_bbox):
landmarks = pipnet_model.get_landmarks(mock_image, mock_bbox)
x_variance = np.var(landmarks[:, 0])
y_variance = np.var(landmarks[:, 1])
assert x_variance > 0, 'Landmarks should have variation in x-coordinates'
assert y_variance > 0, 'Landmarks should have variation in y-coordinates'
def test_default_model_is_wflw_98():
"""PIPNet() with no args should default to the 98-point WFLW model."""
model = PIPNet()
assert model.num_lms == 98
def test_meanface_lookup_invalid_num_lms():
"""get_meanface_info should reject unsupported landmark counts."""
from uniface.landmark._meanface import get_meanface_info
with pytest.raises(ValueError, match='No meanface table'):
get_meanface_info(num_lms=42)

View File

@@ -8,7 +8,7 @@ from __future__ import annotations
import numpy as np
import pytest
from uniface.recognition import ArcFace, MobileFace, SphereFace
from uniface.recognition import ArcFace, EdgeFace, MobileFace, SphereFace
@pytest.fixture
@@ -35,6 +35,12 @@ def sphereface_model():
return SphereFace()
@pytest.fixture
def edgeface_model():
"""Fixture to initialize the EdgeFace model for testing."""
return EdgeFace()
@pytest.fixture
def mock_aligned_face():
"""
@@ -74,7 +80,7 @@ def test_arcface_embedding_shape(arcface_model, mock_aligned_face):
"""
embedding = arcface_model.get_embedding(mock_aligned_face)
# ArcFace typically produces 512-dimensional embeddings
# ArcFace get_embedding returns raw ONNX output with batch dimension
assert embedding.shape[1] == 512, f'Expected 512-dim embedding, got {embedding.shape[1]}'
assert embedding.shape[0] == 1, 'Embedding should have batch dimension of 1'
@@ -88,7 +94,8 @@ def test_arcface_normalized_embedding(arcface_model, mock_landmarks):
embedding = arcface_model.get_normalized_embedding(mock_image, mock_landmarks)
# Check that embedding is normalized (L2 norm ≈ 1.0)
# Check shape and normalization
assert embedding.shape == (512,), f'Expected shape (512,), got {embedding.shape}'
norm = np.linalg.norm(embedding)
assert np.isclose(norm, 1.0, atol=1e-5), f'Normalized embedding should have norm 1.0, got {norm}'
@@ -125,7 +132,7 @@ def test_mobileface_embedding_shape(mobileface_model, mock_aligned_face):
"""
embedding = mobileface_model.get_embedding(mock_aligned_face)
# MobileFace typically produces 512-dimensional embeddings
# MobileFace get_embedding returns raw ONNX output with batch dimension
assert embedding.shape[1] == 512, f'Expected 512-dim embedding, got {embedding.shape[1]}'
assert embedding.shape[0] == 1, 'Embedding should have batch dimension of 1'
@@ -138,6 +145,7 @@ def test_mobileface_normalized_embedding(mobileface_model, mock_landmarks):
embedding = mobileface_model.get_normalized_embedding(mock_image, mock_landmarks)
assert embedding.shape == (512,), f'Expected shape (512,), got {embedding.shape}'
norm = np.linalg.norm(embedding)
assert np.isclose(norm, 1.0, atol=1e-5), f'Normalized embedding should have norm 1.0, got {norm}'
@@ -156,7 +164,7 @@ def test_sphereface_embedding_shape(sphereface_model, mock_aligned_face):
"""
embedding = sphereface_model.get_embedding(mock_aligned_face)
# SphereFace typically produces 512-dimensional embeddings
# SphereFace get_embedding returns raw ONNX output with batch dimension
assert embedding.shape[1] == 512, f'Expected 512-dim embedding, got {embedding.shape[1]}'
assert embedding.shape[0] == 1, 'Embedding should have batch dimension of 1'
@@ -169,10 +177,50 @@ def test_sphereface_normalized_embedding(sphereface_model, mock_landmarks):
embedding = sphereface_model.get_normalized_embedding(mock_image, mock_landmarks)
assert embedding.shape == (512,), f'Expected shape (512,), got {embedding.shape}'
norm = np.linalg.norm(embedding)
assert np.isclose(norm, 1.0, atol=1e-5), f'Normalized embedding should have norm 1.0, got {norm}'
# EdgeFace Tests
def test_edgeface_initialization(edgeface_model):
"""Test that the EdgeFace model initializes correctly."""
assert edgeface_model is not None, 'EdgeFace model initialization failed.'
def test_edgeface_embedding_shape(edgeface_model, mock_aligned_face):
"""Test that EdgeFace produces embeddings with the correct shape."""
embedding = edgeface_model.get_embedding(mock_aligned_face)
assert embedding.shape[1] == 512, f'Expected 512-dim embedding, got {embedding.shape[1]}'
assert embedding.shape[0] == 1, 'Embedding should have batch dimension of 1'
def test_edgeface_normalized_embedding(edgeface_model, mock_landmarks):
"""Test that EdgeFace normalized embeddings have unit length."""
mock_image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
embedding = edgeface_model.get_normalized_embedding(mock_image, mock_landmarks)
assert embedding.shape == (512,), f'Expected shape (512,), got {embedding.shape}'
norm = np.linalg.norm(embedding)
assert np.isclose(norm, 1.0, atol=1e-5), f'Normalized embedding should have norm 1.0, got {norm}'
def test_edgeface_embedding_dtype(edgeface_model, mock_aligned_face):
"""Test that EdgeFace embeddings have the correct data type."""
embedding = edgeface_model.get_embedding(mock_aligned_face)
assert embedding.dtype == np.float32, f'Expected float32, got {embedding.dtype}'
def test_edgeface_consistency(edgeface_model, mock_aligned_face):
"""Test that the same input produces the same EdgeFace embedding."""
embedding1 = edgeface_model.get_embedding(mock_aligned_face)
embedding2 = edgeface_model.get_embedding(mock_aligned_face)
assert np.allclose(embedding1, embedding2), 'Same input should produce same embedding'
# Cross-model comparison tests
def test_different_models_different_embeddings(arcface_model, mobileface_model, mock_aligned_face):
"""

View File

@@ -12,9 +12,11 @@ CLI utilities for testing and running UniFace features.
| `anonymize.py` | Face anonymization/blurring for privacy |
| `emotion.py` | Emotion detection (7 or 8 emotions) |
| `gaze.py` | Gaze direction estimation |
| `headpose.py` | Head pose estimation (pitch, yaw, roll) |
| `landmarks.py` | 106-point facial landmark detection |
| `recognize.py` | Face embedding extraction and comparison |
| `search.py` | Real-time face matching against reference |
| `faiss_search.py` | FAISS index build and multi-identity face search |
| `fairface.py` | FairFace attribute prediction (race, gender, age) |
| `attribute.py` | Age and gender prediction |
| `spoofing.py` | Face anti-spoofing detection |
@@ -61,6 +63,11 @@ python tools/emotion.py --source 0
python tools/gaze.py --source assets/test.jpg
python tools/gaze.py --source 0
# Head pose estimation
python tools/headpose.py --source assets/test.jpg
python tools/headpose.py --source 0
python tools/headpose.py --source 0 --draw-type axis
# Landmarks
python tools/landmarks.py --source assets/test.jpg
python tools/landmarks.py --source 0
@@ -108,7 +115,7 @@ python tools/download_model.py # downloads all
| Option | Description |
|--------|-------------|
| `--source` | Input source: image/video path or camera ID (0, 1, ...) |
| `--detector` | Choose detector: `retinaface`, `scrfd`, `yolov5face` |
| `--detector` | Choose detector: `retinaface`, `scrfd`, `yolov5face`, `yolov8face` |
| `--threshold` | Visualization confidence threshold (default: varies) |
| `--save-dir` | Output directory (default: `outputs`) |

View File

@@ -27,12 +27,17 @@ from uniface.draw import draw_detections
from uniface.recognition import ArcFace
def draw_face_info(image, face, face_id):
"""Draw face ID and attributes above bounding box."""
def draw_face_info(image, face):
"""Draw face attributes above bounding box."""
x1, y1, _x2, y2 = map(int, face.bbox)
lines = [f'ID: {face_id}', f'Conf: {face.confidence:.2f}']
if face.age and face.sex:
lines = []
if face.age is not None and face.sex is not None:
lines.append(f'{face.sex}, {face.age}y')
if face.emotion is not None:
lines.append(face.emotion)
if not lines:
return
for i, line in enumerate(lines):
y_pos = y1 - 10 - (len(lines) - 1 - i) * 25
@@ -95,13 +100,10 @@ def process_image(analyzer, image_path: str, save_dir: str = 'outputs', show_sim
status = 'Same' if sim > 0.4 else 'Different'
print(f' Face {i + 1} ↔ Face {j + 1}: {sim:.3f} ({status})')
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, corner_bbox=True)
draw_detections(image=image, faces=faces, corner_bbox=True)
for i, face in enumerate(faces, 1):
draw_face_info(image, face, i)
for face in faces:
draw_face_info(image, face)
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(image_path).stem}_analysis.jpg')
@@ -137,13 +139,10 @@ def process_video(analyzer, video_path: str, save_dir: str = 'outputs'):
frame_count += 1
faces = analyzer.analyze(frame)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, corner_bbox=True)
draw_detections(image=frame, faces=faces, corner_bbox=True)
for i, face in enumerate(faces, 1):
draw_face_info(frame, face, i)
for face in faces:
draw_face_info(frame, face)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
out.write(frame)
@@ -167,19 +166,16 @@ def run_camera(analyzer, camera_id: int = 0):
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
frame = cv2.flip(frame, 1)
faces = analyzer.analyze(frame)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, corner_bbox=True)
draw_detections(image=frame, faces=faces, corner_bbox=True)
for i, face in enumerate(faces, 1):
draw_face_info(frame, face, i)
for face in faces:
draw_face_info(frame, face)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.imshow('Face Analyzer', frame)
@@ -201,7 +197,7 @@ def main():
detector = RetinaFace()
recognizer = ArcFace()
age_gender = AgeGender()
analyzer = FaceAnalyzer(detector, recognizer, age_gender)
analyzer = FaceAnalyzer(detector, recognizer=recognizer, attributes=[age_gender])
source_type = get_source_type(args.source)

View File

@@ -43,10 +43,7 @@ def process_image(
from uniface.draw import draw_detections
preview = image.copy()
bboxes = [face.bbox for face in faces]
scores = [face.confidence for face in faces]
landmarks = [face.landmarks for face in faces]
draw_detections(preview, bboxes, scores, landmarks)
draw_detections(image=preview, faces=faces)
cv2.imshow('Detections (Press any key to continue)', preview)
cv2.waitKey(0)
@@ -121,9 +118,9 @@ def run_camera(detector, blurrer: BlurFace, camera_id: int = 0):
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)
if faces:

View File

@@ -52,15 +52,10 @@ def process_image(
if not faces:
return
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
draw_detections(image=image, faces=faces, vis_threshold=threshold, corner_bbox=True)
for i, face in enumerate(faces):
result = age_gender.predict(image, face.bbox)
result = age_gender.predict(image, face)
print(f' Face {i + 1}: {result.sex}, {result.age} years old')
draw_age_gender_label(image, face.bbox, result.sex, result.age)
@@ -104,15 +99,10 @@ def process_video(
frame_count += 1
faces = detector.detect(frame)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
draw_detections(image=frame, faces=faces, vis_threshold=threshold, corner_bbox=True)
for face in faces:
result = age_gender.predict(frame, face.bbox)
result = age_gender.predict(frame, face)
draw_age_gender_label(frame, face.bbox, result.sex, result.age)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
@@ -137,21 +127,16 @@ def run_camera(detector, age_gender, camera_id: int = 0, threshold: float = 0.6)
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
draw_detections(image=frame, faces=faces, vis_threshold=threshold, corner_bbox=True)
for face in faces:
result = age_gender.predict(frame, face.bbox)
result = age_gender.predict(frame, face)
draw_age_gender_label(frame, face.bbox, result.sex, result.age)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

View File

@@ -34,13 +34,7 @@ def process_image(detector, image_path: Path, output_path: Path, threshold: floa
faces = detector.detect(image)
# unpack face data for visualization
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
draw_detections(image=image, faces=faces, vis_threshold=threshold, corner_bbox=True)
cv2.putText(
image,

View File

@@ -15,6 +15,7 @@ from __future__ import annotations
import argparse
import os
from pathlib import Path
import time
from _common import get_source_type
import cv2
@@ -34,10 +35,7 @@ def process_image(detector, image_path: str, threshold: float = 0.6, save_dir: s
faces = detector.detect(image)
if faces:
bboxes = [face.bbox for face in faces]
scores = [face.confidence for face in faces]
landmarks = [face.landmarks for face in faces]
draw_detections(image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold)
draw_detections(image=image, faces=faces, vis_threshold=threshold)
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{os.path.splitext(os.path.basename(image_path))[0]}_out.jpg')
@@ -83,24 +81,22 @@ def process_video(
if not ret:
break
t0 = time.perf_counter()
frame_count += 1
faces = detector.detect(frame)
total_faces += len(faces)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame,
bboxes=bboxes,
scores=scores,
landmarks=landmarks,
faces=faces,
vis_threshold=threshold,
draw_score=True,
corner_bbox=True,
)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
inference_fps = 1.0 / max(time.perf_counter() - t0, 1e-9)
cv2.putText(frame, f'FPS: {inference_fps:.1f}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 65), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
out.write(frame)
if show_preview:
@@ -128,28 +124,28 @@ def run_camera(detector, camera_id: int = 0, threshold: float = 0.6):
print("Press 'q' to quit")
prev_time = time.perf_counter()
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame,
bboxes=bboxes,
scores=scores,
landmarks=landmarks,
faces=faces,
vis_threshold=threshold,
draw_score=True,
corner_bbox=True,
)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
curr_time = time.perf_counter()
fps = 1.0 / max(curr_time - prev_time, 1e-9)
prev_time = curr_time
cv2.putText(frame, f'FPS: {fps:.1f}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 65), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.imshow('Face Detection', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):

View File

@@ -1,9 +1,12 @@
import argparse
from uniface.constants import (
AdaFaceWeights,
AgeGenderWeights,
ArcFaceWeights,
DDAMFNWeights,
EdgeFaceWeights,
HeadPoseWeights,
LandmarkWeights,
MobileFaceWeights,
RetinaFaceWeights,
@@ -14,13 +17,16 @@ from uniface.model_store import verify_model_weights
MODEL_TYPES = {
'retinaface': RetinaFaceWeights,
'sphereface': SphereFaceWeights,
'mobileface': MobileFaceWeights,
'adaface': AdaFaceWeights,
'arcface': ArcFaceWeights,
'edgeface': EdgeFaceWeights,
'mobileface': MobileFaceWeights,
'sphereface': SphereFaceWeights,
'scrfd': SCRFDWeights,
'ddamfn': DDAMFNWeights,
'agegender': AgeGenderWeights,
'landmark': LandmarkWeights,
'headpose': HeadPoseWeights,
}

View File

@@ -52,15 +52,10 @@ def process_image(
if not faces:
return
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
draw_detections(image=image, faces=faces, vis_threshold=threshold, corner_bbox=True)
for i, face in enumerate(faces):
result = emotion_predictor.predict(image, face.landmarks)
result = emotion_predictor.predict(image, face)
print(f' Face {i + 1}: {result.emotion} (confidence: {result.confidence:.3f})')
draw_emotion_label(image, face.bbox, result.emotion, result.confidence)
@@ -104,15 +99,10 @@ def process_video(
frame_count += 1
faces = detector.detect(frame)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
draw_detections(image=frame, faces=faces, vis_threshold=threshold, corner_bbox=True)
for face in faces:
result = emotion_predictor.predict(frame, face.landmarks)
result = emotion_predictor.predict(frame, face)
draw_emotion_label(frame, face.bbox, result.emotion, result.confidence)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
@@ -137,21 +127,16 @@ def run_camera(detector, emotion_predictor, camera_id: int = 0, threshold: float
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
draw_detections(image=frame, faces=faces, vis_threshold=threshold, corner_bbox=True)
for face in faces:
result = emotion_predictor.predict(frame, face.landmarks)
result = emotion_predictor.predict(frame, face)
draw_emotion_label(frame, face.bbox, result.emotion, result.confidence)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

View File

@@ -52,15 +52,10 @@ def process_image(
if not faces:
return
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=image, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
draw_detections(image=image, faces=faces, vis_threshold=threshold, corner_bbox=True)
for i, face in enumerate(faces):
result = fairface.predict(image, face.bbox)
result = fairface.predict(image, face)
print(f' Face {i + 1}: {result.sex}, {result.age_group}, {result.race}')
draw_fairface_label(image, face.bbox, result.sex, result.age_group, result.race)
@@ -104,15 +99,10 @@ def process_video(
frame_count += 1
faces = detector.detect(frame)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
draw_detections(image=frame, faces=faces, vis_threshold=threshold, corner_bbox=True)
for face in faces:
result = fairface.predict(frame, face.bbox)
result = fairface.predict(frame, face)
draw_fairface_label(frame, face.bbox, result.sex, result.age_group, result.race)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
@@ -137,21 +127,16 @@ def run_camera(detector, fairface, camera_id: int = 0, threshold: float = 0.6):
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)
bboxes = [f.bbox for f in faces]
scores = [f.confidence for f in faces]
landmarks = [f.landmarks for f in faces]
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, corner_bbox=True
)
draw_detections(image=frame, faces=faces, vis_threshold=threshold, corner_bbox=True)
for face in faces:
result = fairface.predict(frame, face.bbox)
result = fairface.predict(frame, face)
draw_fairface_label(frame, face.bbox, result.sex, result.age_group, result.race)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

208
tools/faiss_search.py Normal file
View File

@@ -0,0 +1,208 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""FAISS index build and multi-identity face search.
Build a vector index from a directory of person sub-folders, then search
against it in a video or webcam stream.
Usage:
python tools/faiss_search.py build --faces-dir dataset/ --db-path ./vector_index
python tools/faiss_search.py run --db-path ./vector_index --source video.mp4
python tools/faiss_search.py run --db-path ./vector_index --source 0 # webcam
"""
from __future__ import annotations
import argparse
import os
from pathlib import Path
from _common import IMAGE_EXTENSIONS, get_source_type
import cv2
from uniface import create_detector, create_recognizer
from uniface.draw import draw_corner_bbox, draw_text_label
from uniface.stores import FAISS
def _draw_face(image, bbox, text: str, color: tuple[int, int, int]) -> None:
x1, y1, x2, y2 = map(int, bbox[:4])
thickness = max(round(sum(image.shape[:2]) / 2 * 0.003), 2)
font_scale = max(0.4, min(0.7, (y2 - y1) / 200))
draw_corner_bbox(image, (x1, y1, x2, y2), color=color, thickness=thickness)
draw_text_label(image, text, x1, y1, bg_color=color, font_scale=font_scale)
def process_frame(frame, detector, recognizer, store: FAISS, threshold: float = 0.4):
faces = detector.detect(frame)
if not faces:
return frame
for face in faces:
embedding = recognizer.get_normalized_embedding(frame, face.landmarks)
result, sim = store.search(embedding, threshold=threshold)
text = f'{result["person_id"]} ({sim:.2f})' if result else f'Unknown ({sim:.2f})'
color = (0, 255, 0) if result else (0, 0, 255)
_draw_face(frame, face.bbox, text, color)
return frame
def process_video(detector, recognizer, store: FAISS, video_path: str, save_dir: str, threshold: float = 0.4):
cap = cv2.VideoCapture(video_path)
if not cap.isOpened():
print(f"Error: Cannot open video file '{video_path}'")
return
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(video_path).stem}_faiss_search.mp4')
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
print(f'Processing video: {video_path} ({total_frames} frames)')
frame_count = 0
while True:
ret, frame = cap.read()
if not ret:
break
frame_count += 1
frame = process_frame(frame, detector, recognizer, store, threshold)
out.write(frame)
if frame_count % 100 == 0:
print(f' Processed {frame_count}/{total_frames} frames...')
cap.release()
out.release()
print(f'Done! Output saved: {output_path}')
def run_camera(detector, recognizer, store: FAISS, camera_id: int = 0, threshold: float = 0.4):
cap = cv2.VideoCapture(camera_id)
if not cap.isOpened():
print(f'Cannot open camera {camera_id}')
return
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
if not ret:
break
frame = cv2.flip(frame, 1)
frame = process_frame(frame, detector, recognizer, store, threshold)
cv2.imshow('Vector Search', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
def build(args: argparse.Namespace) -> None:
faces_dir = Path(args.faces_dir)
if not faces_dir.is_dir():
print(f"Error: '{faces_dir}' is not a directory")
return
detector = create_detector()
recognizer = create_recognizer()
store = FAISS(db_path=args.db_path)
persons = sorted(p.name for p in faces_dir.iterdir() if p.is_dir())
if not persons:
print(f"Error: No sub-folders found in '{faces_dir}'")
return
print(f'Found {len(persons)} persons: {", ".join(persons)}')
total_added = 0
for person_id in persons:
person_dir = faces_dir / person_id
images = [f for f in person_dir.iterdir() if f.suffix.lower() in IMAGE_EXTENSIONS]
added = 0
for img_path in images:
image = cv2.imread(str(img_path))
if image is None:
print(f' Warning: Failed to read {img_path}, skipping')
continue
faces = detector.detect(image)
if not faces:
print(f' Warning: No face detected in {img_path}, skipping')
continue
embedding = recognizer.get_normalized_embedding(image, faces[0].landmarks)
store.add(embedding, {'person_id': person_id, 'source': str(img_path)})
added += 1
total_added += added
if added:
print(f' {person_id}: {added} embeddings added')
else:
print(f' {person_id}: no valid faces found')
store.save()
print(f'\nIndex saved to {args.db_path} ({total_added} vectors, {len(persons)} persons)')
def run(args: argparse.Namespace) -> None:
detector = create_detector()
recognizer = create_recognizer()
store = FAISS(db_path=args.db_path)
if not store.load():
print(f"Error: No index found at '{args.db_path}'")
return
print(f'Loaded FAISS index: {store}')
source_type = get_source_type(args.source)
if source_type == 'camera':
run_camera(detector, recognizer, store, int(args.source), args.threshold)
elif source_type == 'video':
if not os.path.exists(args.source):
print(f'Error: Video not found: {args.source}')
return
process_video(detector, recognizer, store, args.source, args.save_dir, args.threshold)
else:
print(f"Error: Source must be a video file or camera ID, not '{args.source}'")
def main():
parser = argparse.ArgumentParser(description='FAISS vector search')
sub = parser.add_subparsers(dest='command', required=True)
build_p = sub.add_parser('build', help='Build a FAISS index from person sub-folders')
build_p.add_argument('--faces-dir', type=str, required=True, help='Directory with person sub-folders')
build_p.add_argument('--db-path', type=str, default='./vector_index', help='Where to save the index')
run_p = sub.add_parser('run', help='Search faces against a FAISS index')
run_p.add_argument('--db-path', type=str, required=True, help='Path to saved FAISS index')
run_p.add_argument('--source', type=str, required=True, help='Video path or camera ID')
run_p.add_argument('--threshold', type=float, default=0.4, help='Similarity threshold')
run_p.add_argument('--save-dir', type=str, default='outputs', help='Output directory')
args = parser.parse_args()
if args.command == 'build':
build(args)
elif args.command == 'run':
run(args)
if __name__ == '__main__':
main()

181
tools/headpose.py Normal file
View File

@@ -0,0 +1,181 @@
# Copyright 2025-2026 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
"""Head pose estimation on detected faces.
Usage:
python tools/headpose.py --source path/to/image.jpg
python tools/headpose.py --source path/to/video.mp4
python tools/headpose.py --source 0 # webcam
python tools/headpose.py --source path/to/image.jpg --draw-type axis
"""
from __future__ import annotations
import argparse
import os
from pathlib import Path
from _common import get_source_type
import cv2
from uniface.detection import RetinaFace
from uniface.draw import draw_head_pose
from uniface.headpose import HeadPose
def process_image(detector, head_pose_estimator, image_path: str, save_dir: str = 'outputs', draw_type: str = 'cube'):
"""Process a single image."""
image = cv2.imread(image_path)
if image is None:
print(f"Error: Failed to load image from '{image_path}'")
return
faces = detector.detect(image)
print(f'Detected {len(faces)} face(s)')
for i, face in enumerate(faces):
bbox = face.bbox
x1, y1, x2, y2 = map(int, bbox[:4])
face_crop = image[y1:y2, x1:x2]
if face_crop.size == 0:
continue
result = head_pose_estimator.estimate(face_crop)
print(f' Face {i + 1}: pitch={result.pitch:.1f}°, yaw={result.yaw:.1f}°, roll={result.roll:.1f}°')
draw_head_pose(image, bbox, result.pitch, result.yaw, result.roll, draw_type=draw_type)
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(image_path).stem}_headpose.jpg')
cv2.imwrite(output_path, image)
print(f'Output saved: {output_path}')
def process_video(detector, head_pose_estimator, video_path: str, save_dir: str = 'outputs', draw_type: str = 'cube'):
"""Process a video file."""
cap = cv2.VideoCapture(video_path)
if not cap.isOpened():
print(f"Error: Cannot open video file '{video_path}'")
return
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(video_path).stem}_headpose.mp4')
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
print(f'Processing video: {video_path} ({total_frames} frames)')
frame_count = 0
while True:
ret, frame = cap.read()
if not ret:
break
frame_count += 1
faces = detector.detect(frame)
for face in faces:
bbox = face.bbox
x1, y1, x2, y2 = map(int, bbox[:4])
face_crop = frame[y1:y2, x1:x2]
if face_crop.size == 0:
continue
result = head_pose_estimator.estimate(face_crop)
draw_head_pose(frame, bbox, result.pitch, result.yaw, result.roll, draw_type=draw_type)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
out.write(frame)
if frame_count % 100 == 0:
print(f' Processed {frame_count}/{total_frames} frames...')
cap.release()
out.release()
print(f'Done! Output saved: {output_path}')
def run_camera(detector, head_pose_estimator, camera_id: int = 0, draw_type: str = 'cube'):
"""Run real-time detection on webcam."""
cap = cv2.VideoCapture(camera_id)
if not cap.isOpened():
print(f'Cannot open camera {camera_id}')
return
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)
for face in faces:
bbox = face.bbox
x1, y1, x2, y2 = map(int, bbox[:4])
face_crop = frame[y1:y2, x1:x2]
if face_crop.size == 0:
continue
result = head_pose_estimator.estimate(face_crop)
draw_head_pose(frame, bbox, result.pitch, result.yaw, result.roll, draw_type=draw_type)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.imshow('Head Pose Estimation', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
def main():
parser = argparse.ArgumentParser(description='Run head pose estimation')
parser.add_argument('--source', type=str, required=True, help='Image/video path or camera ID (0, 1, ...)')
parser.add_argument('--save-dir', type=str, default='outputs', help='Output directory')
parser.add_argument(
'--draw-type',
type=str,
default='cube',
choices=['cube', 'axis'],
help='Visualization type: cube (default) or axis',
)
args = parser.parse_args()
detector = RetinaFace()
head_pose_estimator = HeadPose()
source_type = get_source_type(args.source)
if source_type == 'camera':
run_camera(detector, head_pose_estimator, int(args.source), args.draw_type)
elif source_type == 'image':
if not os.path.exists(args.source):
print(f'Error: Image not found: {args.source}')
return
process_image(detector, head_pose_estimator, args.source, args.save_dir, args.draw_type)
elif source_type == 'video':
if not os.path.exists(args.source):
print(f'Error: Video not found: {args.source}')
return
process_video(detector, head_pose_estimator, args.source, args.save_dir, args.draw_type)
else:
print(f"Error: Unknown source type for '{args.source}'")
print('Supported formats: images (.jpg, .png, ...), videos (.mp4, .avi, ...), or camera ID (0, 1, ...)')
if __name__ == '__main__':
main()

View File

@@ -114,9 +114,9 @@ def run_camera(detector, landmarker, camera_id: int = 0):
while True:
ret, frame = cap.read()
frame = cv2.flip(frame, 1)
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)

Some files were not shown because too many files have changed in this diff Show More