ZF/uniface

mirror of https://github.com/yakhyo/uniface.git synced 2026-05-15 12:57:55 +00:00

Yakhyokhuja Valikhujaev 13c4ac83d8 feat: Update the release workflow and package installation command (#110)

* fix: Fix installation conflict between onnxruntime and onnxruntime-gpu

* fix: Fix CI, notebooks, type hints, and packaging issues found in audit

* feat: Add new release config

* ci: Automate release pipeline and document release process

2026-04-25 23:59:00 +09:00

Execution Providers

UniFace uses ONNX Runtime for model inference, which supports multiple hardware acceleration backends.

Automatic Provider Selection

UniFace automatically selects the highest-priority execution provider that is available on the current machine:

from uniface.detection import RetinaFace

# Automatically uses best available provider
detector = RetinaFace()

Priority order:

CoreMLExecutionProvider - Apple Silicon
CUDAExecutionProvider - NVIDIA GPU
CPUExecutionProvider - Fallback
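
The priority order above can be sketched as a small pure function. This is an illustrative assumption about the selection logic, not UniFace's actual implementation: `PRIORITY` and `select_providers` are hypothetical names, and the only real ONNX Runtime call used is `ort.get_available_providers()`.

```python
# Hypothetical sketch of priority-based provider selection.
# PRIORITY mirrors the order documented above; it is not copied from UniFace's source.
PRIORITY = [
    "CoreMLExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]


def select_providers(available):
    """Return the documented providers present in `available`, in priority order.

    Falls back to CPU if none of the documented providers is reported.
    """
    chosen = [name for name in PRIORITY if name in available]
    return chosen or ["CPUExecutionProvider"]


if __name__ == "__main__":
    try:
        import onnxruntime as ort  # real API: ort.get_available_providers()
        print(select_providers(ort.get_available_providers()))
    except ImportError:
        print("onnxruntime is not installed")
```

Returning the full ordered list, rather than a single name, matches the way a `providers` list with a CPU fallback is passed elsewhere on this page.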

Explicit Provider Selection

You can specify which execution provider to use by passing the providers parameter:

from uniface.detection import RetinaFace
from uniface.recognition import ArcFace

# Force CPU execution (even if GPU is available)
detector = RetinaFace(providers=['CPUExecutionProvider'])
recognizer = ArcFace(providers=['CPUExecutionProvider'])

# Use CUDA with CPU fallback
detector = RetinaFace(providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])

All ONNX-based model classes accept the providers parameter:

Detection: RetinaFace, SCRFD, YOLOv5Face, YOLOv8Face
Recognition: ArcFace, AdaFace, MobileFace, SphereFace
Landmarks: Landmark106
Gaze: MobileGaze
Parsing: BiSeNet, XSeg
Attributes: AgeGender, FairFace
Anti-Spoofing: MiniFASNet

!!! note "Non-ONNX components"
    - Emotion uses TorchScript and selects its device automatically (mps / cuda / cpu). It does not accept the providers parameter.
    - BlurFace is a pure OpenCV utility and does not load any model.

Check Available Providers

import onnxruntime as ort

providers = ort.get_available_providers()
print("Available providers:", providers)

Example outputs:

=== "macOS (Apple Silicon)"

    ```
    ['CoreMLExecutionProvider', 'CPUExecutionProvider']
    ```

=== "Linux (NVIDIA GPU)"

    ```
    ['CUDAExecutionProvider', 'CPUExecutionProvider']
    ```

=== "Windows (CPU)"

    ```
    ['CPUExecutionProvider']
    ```

Platform-Specific Setup

Apple Silicon (M1/M2/M3/M4)

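No additional setup is required. ARM64 optimizations are built into onnxruntime. Quote the extra so that zsh, the default macOS shell, does not treat the square brackets as a glob pattern:

pip install "uniface[cpu]"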

Verify ARM64:

python -c "import platform; print(platform.machine())"
# Should show: arm64

!!! tip "Performance"
    Apple Silicon Macs use CoreML acceleration automatically, providing excellent performance for face analysis tasks.

NVIDIA GPU (CUDA)

Install with GPU support (this installs onnxruntime-gpu, which already includes CPU fallback):

pip install uniface[gpu]

Requirements:

CUDA 11.x or 12.x
cuDNN 8.x or 9.x, whichever your installed ONNX Runtime build requires
Compatible NVIDIA driver

Verify CUDA:

import onnxruntime as ort

if 'CUDAExecutionProvider' in ort.get_available_providers():
    print("CUDA is available!")
else:
    print("CUDA not available, using CPU")
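
A provider appearing in `get_available_providers()` only shows that the installed ONNX Runtime build supports it. If the CUDA or cuDNN libraries fail to load, a session can still end up running on CPU. The sketch below checks what a session actually uses, relying on ONNX Runtime's `InferenceSession.get_providers()`. The helper name `uses_cuda` and the model path in the usage comment are hypothetical.

```python
def uses_cuda(session) -> bool:
    """Return True if the session's active providers include CUDA.

    `session` is expected to be an onnxruntime.InferenceSession (or any
    object exposing a compatible get_providers() method).
    """
    return "CUDAExecutionProvider" in session.get_providers()


# Usage (requires onnxruntime-gpu and a model file; the path is a placeholder):
#
#   import onnxruntime as ort
#   session = ort.InferenceSession(
#       "model.onnx",
#       providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
#   )
#   print(uses_cuda(session), session.get_providers())
```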

CPU Fallback

CPU execution is always available:

pip install uniface[cpu]

Works on all platforms without additional configuration.

Internal API

For advanced use cases, you can access the provider utilities:

from uniface.onnx_utils import get_available_providers, create_onnx_session

# Check available providers
providers = get_available_providers()
print(f"Available: {providers}")

# Models use create_onnx_session() internally
# which auto-selects the best provider

Performance Tips

1. Use GPU When Available

For batch processing or real-time applications, GPU acceleration provides significant speedups:

pip install uniface[gpu]

2. Optimize Input Size

Smaller input sizes are faster but may reduce accuracy:

from uniface.detection import RetinaFace

# Faster, lower accuracy
detector = RetinaFace(input_size=(320, 320))

# Balanced (default)
detector = RetinaFace(input_size=(640, 640))
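
As a rough worked example of why input size matters: the number of pixels the network processes grows with the product of width and height. Assuming compute cost scales roughly with pixel count (a simplification that ignores fixed overheads and pre/post-processing), halving each side cuts the work to about a quarter. `pixel_ratio` is an illustrative helper, not part of UniFace.

```python
def pixel_ratio(size_a, size_b):
    """Return how many times more pixels size_a has than size_b."""
    (wa, ha), (wb, hb) = size_a, size_b
    return (wa * ha) / (wb * hb)


print(pixel_ratio((640, 640), (320, 320)))  # 4.0
```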

3. Batch Processing

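When processing many images, create the detector once and reuse it, so the model is loaded and the inference session initialized only one time:

import cv2
from uniface.detection import RetinaFace

detector = RetinaFace()  # create once, reuse for every image

for image_path in image_paths:  # image_paths: your list of file paths
    image = cv2.imread(image_path)
    if image is None:  # skip files OpenCV could not read
        continue
    faces = detector.detect(image)
    # ...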
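
To check whether a change such as a different provider or input size actually helps, it is worth measuring rather than guessing. The following is a minimal timing sketch using only the standard library. `benchmark` is a hypothetical helper, and the untimed warm-up calls are there because the first inferences commonly include one-time initialization cost.

```python
import time


def benchmark(fn, inputs, warmup=2):
    """Return the mean seconds per call of fn over inputs.

    The first `warmup` inputs are run but not timed.
    """
    inputs = list(inputs)
    timed = inputs[warmup:]
    if not timed:
        raise ValueError("need more inputs than warm-up runs")
    for item in inputs[:warmup]:
        fn(item)  # warm-up, excluded from the measurement
    start = time.perf_counter()
    for item in timed:
        fn(item)
    return (time.perf_counter() - start) / len(timed)


# Usage with a detector (assumes `detector` and a list of loaded `images`):
#   print(f"{benchmark(detector.detect, images) * 1000:.1f} ms per image")
```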

Troubleshooting

CUDA Not Detected

Verify CUDA installation:

```
nvidia-smi
```

Check CUDA version compatibility with ONNX Runtime.

Reinstall with GPU support:

pip uninstall onnxruntime onnxruntime-gpu -y
pip install uniface[gpu]
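
Before reinstalling, it can help to see which ONNX Runtime distributions are currently installed, since the uninstall step above targets both `onnxruntime` and `onnxruntime-gpu`. This is a small sketch using the standard library's `importlib.metadata`; `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(names=("onnxruntime", "onnxruntime-gpu")):
    """Map each installed distribution in `names` to its version string."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this distribution is not installed
    return found


print(installed_versions())
```

If both names appear in the output, the two packages are installed side by side, and a clean reinstall as shown above is a reasonable next step.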

Slow Performance on Mac

Verify you're using ARM64 Python (not Rosetta):

python -c "import platform; print(platform.machine())"
# Should show: arm64 (not x86_64)

Next Steps

Model Cache & Offline - Model management
Thresholds & Calibration - Tuning parameters
