uniface/examples/11_head_pose_estimation.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Head Pose Estimation with UniFace\n",
    "\n",
    "<div style=\"display:flex; flex-wrap:wrap; align-items:center;\">\n",
    "  <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://pepy.tech/projects/uniface\"><img alt=\"PyPI Downloads\" src=\"https://static.pepy.tech/personalized-badge/uniface?period=total&units=international_system&left_color=grey&right_color=blue&left_text=Downloads\"></a>\n",
    "  <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://pypi.org/project/uniface/\"><img alt=\"PyPI Version\" src=\"https://img.shields.io/pypi/v/uniface.svg\"></a>\n",
    "  <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://opensource.org/licenses/MIT\"><img alt=\"License\" src=\"https://img.shields.io/badge/License-MIT-blue.svg\"></a>\n",
    "  <a style=\"margin-bottom:6px;\" href=\"https://github.com/yakhyo/uniface\"><img alt=\"GitHub Stars\" src=\"https://img.shields.io/github/stars/yakhyo/uniface.svg?style=social\"></a>\n",
    "</div>\n",
    "\n",
    "**UniFace** is a lightweight, production-ready Python library for face detection, recognition, tracking, landmark analysis, face parsing, gaze estimation, and face attributes.\n",
    "\n",
    "🔗 **GitHub**: [github.com/yakhyo/uniface](https://github.com/yakhyo/uniface) | 📚 **Docs**: [yakhyo.github.io/uniface](https://yakhyo.github.io/uniface)\n",
    "\n",
    "---\n",
    "\n",
    "This notebook demonstrates head pose estimation using the **UniFace** library.\n",
    "\n",
    "## 1. Install UniFace"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install -q \"uniface[cpu]\"\n",
    "\n",
    "# Clone repo for assets (Colab only)\n",
    "import os\n",
    "if 'COLAB_GPU' in os.environ or 'COLAB_RELEASE_TAG' in os.environ:\n",
    "    if not os.path.exists('uniface'):\n",
    "        !git clone --depth 1 https://github.com/yakhyo/uniface.git\n",
    "    os.chdir('uniface/examples')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Import Libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import cv2\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "from pathlib import Path\n",
    "from PIL import Image\n",
    "\n",
    "import uniface\n",
    "from uniface.detection import RetinaFace\n",
    "from uniface.headpose import HeadPose\n",
    "from uniface.draw import draw_head_pose\n",
    "\n",
    "print(f\"UniFace version: {uniface.__version__}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Initialize Models"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Initialize face detector\n",
    "detector = RetinaFace(confidence_threshold=0.5)\n",
    "\n",
    "# Initialize head pose estimator (default: ResNet18 backbone)\n",
    "head_pose = HeadPose()\n",
    "\n",
    "print(\"Models initialized successfully!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Process All Test Images\n",
    "\n",
    "Display original images in the first row and head-pose-annotated images in the second row."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get all test images\n",
    "test_images_dir = Path('../assets/test_images')\n",
    "test_images = sorted(test_images_dir.glob('*.jpg'))\n",
    "\n",
    "original_images = []\n",
    "annotated_images = []\n",
    "\n",
    "for img_path in test_images:\n",
    "    image = cv2.imread(str(img_path))\n",
    "    if image is None:\n",
    "        continue\n",
    "\n",
    "    # Store original (BGR -> RGB for display)\n",
    "    original_images.append(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))\n",
    "\n",
    "    # Detect faces and estimate head pose\n",
    "    faces = detector.detect(image)\n",
    "\n",
    "    for face in faces:\n",
    "        x1, y1, x2, y2 = map(int, face.bbox)\n",
    "        face_crop = image[y1:y2, x1:x2]\n",
    "\n",
    "        if face_crop.size == 0:\n",
    "            continue\n",
    "\n",
    "        result = head_pose.estimate(face_crop)\n",
    "        draw_head_pose(image, face.bbox, result.pitch, result.yaw, result.roll)\n",
    "\n",
    "        print(f\"{img_path.name}: pitch={result.pitch:.1f}°, yaw={result.yaw:.1f}°, roll={result.roll:.1f}°\")\n",
    "\n",
    "    annotated_images.append(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))\n",
    "\n",
    "print(f\"\\nProcessed {len(original_images)} images\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Visualize Results\n",
    "\n",
    "**First row**: Original images  \n",
    "**Second row**: Images with head pose 3D cube overlay"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "num_images = len(original_images)\n",
    "\n",
    "# Create figure with 2 rows\n",
    "fig, axes = plt.subplots(2, num_images, figsize=(5 * num_images, 10))\n",
    "\n",
    "if num_images == 1:\n",
    "    axes = axes.reshape(2, 1)\n",
    "\n",
    "for i in range(num_images):\n",
    "    axes[0, i].imshow(original_images[i])\n",
    "    axes[0, i].set_title('Original', fontsize=12)\n",
    "    axes[0, i].axis('off')\n",
    "\n",
    "    axes[1, i].imshow(annotated_images[i])\n",
    "    axes[1, i].set_title('Head Pose', fontsize=12)\n",
    "    axes[1, i].axis('off')\n",
    "\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Notes\n",
    "\n",
    "- **Input**: Head pose estimation requires a face crop (obtained from face detection)\n",
    "- **Output**: `HeadPoseResult` with pitch, yaw, and roll angles in **degrees**\n",
    "- **Visualization**: Two modes available — `'cube'` (3D wireframe) and `'axis'` (X/Y/Z coordinate axes)\n",
    "- **Models**: 6 backbone variants available via `HeadPoseWeights` enum\n",
    "- **Method**: Uses 6D rotation representation converted to Euler angles\n",
    "\n",
    "### Available Backbones\n",
    "\n",
    "```python\n",
    "from uniface.constants import HeadPoseWeights\n",
    "\n",
    "# Options: RESNET18, RESNET34, RESNET50, MOBILENET_V2, MOBILENET_V3_SMALL, MOBILENET_V3_LARGE\n",
    "head_pose = HeadPose(model_name=HeadPoseWeights.RESNET50)\n",
    "```"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "base",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}