
# Execution Providers

UniFace uses ONNX Runtime for model inference, which supports multiple hardware acceleration backends.


## Automatic Provider Selection

UniFace automatically selects the optimal execution provider based on available hardware:

```python
from uniface import RetinaFace

# Automatically uses the best available provider
detector = RetinaFace()
```

Priority order:

1. `CUDAExecutionProvider` (NVIDIA GPU)
2. `CoreMLExecutionProvider` (Apple Silicon)
3. `CPUExecutionProvider` (fallback)
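The selection logic amounts to a first-match scan over this priority list. A minimal illustrative sketch (`select_provider` is a hypothetical helper for explanation, not part of the UniFace API):

```python
def select_provider(available, priority=("CUDAExecutionProvider",
                                         "CoreMLExecutionProvider",
                                         "CPUExecutionProvider")):
    """Return the first preferred provider that is actually available."""
    for provider in priority:
        if provider in available:
            return provider
    raise RuntimeError("No supported execution provider found")

# CPUExecutionProvider is always present, so selection never fails:
print(select_provider(["CoreMLExecutionProvider", "CPUExecutionProvider"]))
# → CoreMLExecutionProvider
```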

## Check Available Providers

```python
import onnxruntime as ort

providers = ort.get_available_providers()
print("Available providers:", providers)
```

Example outputs:

=== "macOS (Apple Silicon)"

    ```
    ['CoreMLExecutionProvider', 'CPUExecutionProvider']
    ```

=== "Linux (NVIDIA GPU)"

    ```
    ['CUDAExecutionProvider', 'CPUExecutionProvider']
    ```

=== "Windows (CPU)"

    ```
    ['CPUExecutionProvider']
    ```
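When a workload should warn (or bail out) if no acceleration is present, a tiny yes/no check over the provider list is enough. An illustrative helper, not part of UniFace:

```python
def has_acceleration(available):
    """True if any provider other than the CPU fallback is available."""
    return any(p != "CPUExecutionProvider" for p in available)

print(has_acceleration(["CPUExecutionProvider"]))                          # → False
print(has_acceleration(["CUDAExecutionProvider", "CPUExecutionProvider"]))  # → True
```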

## Platform-Specific Setup

### Apple Silicon (M1/M2/M3/M4)

No additional setup required. ARM64 optimizations are built into onnxruntime:

```bash
pip install uniface
```

Verify ARM64:

```bash
python -c "import platform; print(platform.machine())"
# Should show: arm64
```

!!! tip "Performance"

    Apple Silicon Macs use CoreML acceleration automatically, providing excellent performance for face analysis tasks.


### NVIDIA GPU (CUDA)

Install with GPU support:

```bash
pip install uniface[gpu]
```

Requirements:

- CUDA 11.x or 12.x
- cuDNN 8.x
- A compatible NVIDIA driver

Verify CUDA:

```python
import onnxruntime as ort

if 'CUDAExecutionProvider' in ort.get_available_providers():
    print("CUDA is available!")
else:
    print("CUDA not available, using CPU")
```

### CPU Fallback

CPU execution is always available:

```bash
pip install uniface
```

Works on all platforms without additional configuration.


## Internal API

For advanced use cases, you can access the provider utilities:

```python
from uniface.onnx_utils import get_available_providers, create_onnx_session

# Check available providers
providers = get_available_providers()
print(f"Available: {providers}")

# Models call create_onnx_session() internally,
# which auto-selects the best provider
```

## Performance Tips

### 1. Use GPU When Available

For batch processing or real-time applications, GPU acceleration provides significant speedups:

```bash
pip install uniface[gpu]
```

### 2. Optimize Input Size

Smaller input sizes are faster but may reduce accuracy:

```python
from uniface import RetinaFace

# Faster, lower accuracy
detector = RetinaFace(input_size=(320, 320))

# Balanced (default)
detector = RetinaFace(input_size=(640, 640))
```
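As a rough rule of thumb, inference cost scales approximately with the number of input pixels, so halving both dimensions cuts work by about 4x. An illustrative back-of-the-envelope helper (actual speedups depend on the model and provider):

```python
def relative_cost(size, baseline=(640, 640)):
    """Approximate inference cost relative to the default input size."""
    return (size[0] * size[1]) / (baseline[0] * baseline[1])

print(relative_cost((320, 320)))  # → 0.25, i.e. roughly 4x less work
```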

### 3. Batch Processing

When processing many images, reuse a single detector instance so the model is loaded only once:

```python
import cv2

# One detector, many images: the model is loaded once, not per image
for image_path in image_paths:
    image = cv2.imread(image_path)
    faces = detector.detect(image)
    # ...
```
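The loop above reads and infers sequentially, so the accelerator sits idle during disk I/O. One way to keep it busy is to load the next images on background threads while inference runs; a minimal stdlib sketch, where the `load` and `detect` callables stand in for `cv2.imread` and `detector.detect` (assumptions about your pipeline, not UniFace API):

```python
from concurrent.futures import ThreadPoolExecutor

def process_all(image_paths, load, detect, workers=4):
    """Load images on worker threads while inference runs on the main thread."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields loaded images in path order as they become ready
        for image in pool.map(load, image_paths):
            results.append(detect(image))
    return results
```

This overlap is effective because ONNX Runtime releases the GIL while a `run` call executes, so Python threads can read files concurrently with inference.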

## Troubleshooting

### CUDA Not Detected

1. Verify the CUDA installation:

    ```bash
    nvidia-smi
    ```

2. Check CUDA version compatibility with ONNX Runtime.

3. Reinstall with GPU support:

    ```bash
    pip uninstall onnxruntime onnxruntime-gpu
    pip install uniface[gpu]
    ```

### Slow Performance on Mac

Verify you're using ARM64 Python (not Rosetta):

```bash
python -c "import platform; print(platform.machine())"
# Should show: arm64 (not x86_64)
```

## Next Steps