{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# XSeg Face Segmentation\n", "\n", "**UniFace** is a lightweight, production-ready Python library for face detection, recognition, tracking, landmark analysis, face parsing, gaze estimation, and face attributes.\n", "\n", "🔗 **GitHub**: [github.com/yakhyo/uniface](https://github.com/yakhyo/uniface) | 📚 **Docs**: [yakhyo.github.io/uniface](https://yakhyo.github.io/uniface)\n", "\n", "---\n", "\n", "This notebook demonstrates face segmentation using the **XSeg** model from DeepFaceLab.\n", "\n", "XSeg outputs a mask with values in [0, 1] that marks the face region. Unlike BiSeNet, which works on bbox crops, XSeg requires 5-point landmarks for face alignment.\n", "\n", "## 1. Install UniFace" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install -q \"uniface[cpu]\"\n", "\n", "# Clone repo for assets (Colab only)\n", "import os\n", "if 'COLAB_GPU' in os.environ or 'COLAB_RELEASE_TAG' in os.environ:\n", "    if not os.path.exists('uniface'):\n", "        !git clone --depth 1 https://github.com/yakhyo/uniface.git\n", "    os.chdir('uniface/examples')" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Import Libraries" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import cv2\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "from pathlib import Path\n", "\n", "import uniface\n", "from uniface.detection import RetinaFace\n", "from uniface.parsing import XSeg\n", "\n", "print(f\"UniFace version: {uniface.__version__}\")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Initialize Models\n", "\n", "XSeg requires face detection with landmarks. We use RetinaFace for detection and XSeg for segmentation."
] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Initialize detector and parser\n", "detector = RetinaFace()\n", "parser = XSeg()\n", "\n", "print(f\"XSeg input size: {parser.input_size}\")\n", "print(f\"Align size: {parser.align_size}\")\n", "print(f\"Blur sigma: {parser.blur_sigma}\")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Helper Functions" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def apply_mask_overlay(image, mask, color=(0, 255, 0), alpha=0.5):\n", "    \"\"\"Apply a colored mask overlay on a BGR image; mask is float in [0, 1].\"\"\"\n", "    overlay = image.copy().astype(np.float32)\n", "\n", "    # Create colored overlay where mask is positive\n", "    color_overlay = np.zeros_like(image, dtype=np.float32)\n", "    color_overlay[:] = color\n", "\n", "    # Add a channel axis so the (H, W) mask broadcasts over the color channels\n", "    mask_3ch = mask[..., np.newaxis]\n", "    overlay = overlay * (1 - mask_3ch * alpha) + color_overlay * mask_3ch * alpha\n", "\n", "    return overlay.clip(0, 255).astype(np.uint8)\n", "\n", "\n", "def show_results(original, mask, result, title=\"XSeg Result\"):\n", "    \"\"\"Display original, mask, and result side by side.\"\"\"\n", "    fig, axes = plt.subplots(1, 3, figsize=(15, 5))\n", "\n", "    axes[0].imshow(cv2.cvtColor(original, cv2.COLOR_BGR2RGB))\n", "    axes[0].set_title(\"Original\")\n", "    axes[0].axis(\"off\")\n", "\n", "    axes[1].imshow(mask, cmap=\"gray\")\n", "    axes[1].set_title(\"Mask\")\n", "    axes[1].axis(\"off\")\n", "\n", "    axes[2].imshow(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))\n", "    axes[2].set_title(\"Overlay\")\n", "    axes[2].axis(\"off\")\n", "\n", "    plt.suptitle(title)\n", "    plt.tight_layout()\n", "    plt.show()" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 5. Process Single Image" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Load image\n", "image_path = \"../assets/einstien.png\"\n", "image = cv2.imread(image_path)\n", "if image is None:\n", "    raise FileNotFoundError(f\"Could not read image: {image_path}\")\n", "print(f\"Image shape: {image.shape}\")\n", "\n", "# Detect faces\n", "faces = detector.detect(image)\n", "print(f\"Detected {len(faces)} face(s)\")\n", "\n", "# Parse first face\n", "if len(faces) > 0 and faces[0].landmarks is not None:\n", "    face = faces[0]\n", "    mask = parser.parse(image, landmarks=face.landmarks)\n", "\n", "    print(f\"Mask shape: {mask.shape}\")\n", "    print(f\"Mask range: [{mask.min():.3f}, {mask.max():.3f}]\")\n", "\n", "    # Visualize\n", "    result = apply_mask_overlay(image, mask)\n", "    show_results(image, mask, result, \"Single Face Segmentation\")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 6. Configurable Parameters\n", "\n", "XSeg has two main parameters:\n", "- `align_size`: Face alignment output size (default: 256)\n", "- `blur_sigma`: Gaussian blur sigma for mask smoothing (default: 0, which returns the raw output)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Load image\n", "image_path = \"../assets/einstien.png\"\n", "image = cv2.imread(image_path)\n", "\n", "# Detect face\n", "faces = detector.detect(image)\n", "landmarks = faces[0].landmarks\n", "\n", "# Compare different blur settings\n", "blur_values = [0, 3, 5]\n", "\n", "fig, axes = plt.subplots(1, len(blur_values), figsize=(15, 5))\n", "\n", "for i, blur in enumerate(blur_values):\n", "    parser_test = XSeg(blur_sigma=blur)\n", "    mask = parser_test.parse(image, landmarks=landmarks)\n", "\n", "    axes[i].imshow(mask, cmap=\"gray\")\n", "    axes[i].set_title(f\"blur_sigma={blur}\")\n", "    axes[i].axis(\"off\")\n", "\n", "plt.suptitle(\"Effect of blur_sigma\")\n", "plt.tight_layout()\n", "plt.show()" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 7. Using parse_aligned\n", "\n", "If you already have aligned face crops, use `parse_aligned()` directly." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from uniface.face_utils import face_alignment\n", "\n", "# Load and detect\n", "image = cv2.imread(\"../assets/einstien.png\")\n", "faces = detector.detect(image)\n", "landmarks = faces[0].landmarks\n", "\n", "# Align face manually (256 matches the default align_size)\n", "aligned_face, inverse_matrix = face_alignment(image, landmarks, image_size=256)\n", "print(f\"Aligned face shape: {aligned_face.shape}\")\n", "\n", "# Parse aligned crop directly\n", "mask = parser.parse_aligned(aligned_face)\n", "print(f\"Mask shape: {mask.shape}\")\n", "\n", "# Visualize\n", "result = apply_mask_overlay(aligned_face, mask)\n", "\n", "fig, axes = plt.subplots(1, 3, figsize=(12, 4))\n", "axes[0].imshow(cv2.cvtColor(aligned_face, cv2.COLOR_BGR2RGB))\n", "axes[0].set_title(\"Aligned Face\")\n", "axes[0].axis(\"off\")\n", "\n", "axes[1].imshow(mask, cmap=\"gray\")\n", "axes[1].set_title(\"Mask\")\n", "axes[1].axis(\"off\")\n", "\n", "axes[2].imshow(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))\n", "axes[2].set_title(\"Overlay\")\n", "axes[2].axis(\"off\")\n", "\n", "plt.suptitle(\"parse_aligned() on pre-aligned crop\")\n", "plt.tight_layout()\n", "plt.show()" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 8. XSeg vs BiSeNet\n", "\n", "| Feature | XSeg | BiSeNet |\n", "|---------|------|---------|\n", "| Output | Mask with values in [0, 1] | 19 class labels |\n", "| Input | Requires landmarks | Works on bbox crops |\n", "| Use case | Face region extraction | Facial component parsing |\n", "| Origin | DeepFaceLab | CelebAMask-HQ |" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from uniface.parsing import BiSeNet\n", "from uniface.draw import vis_parsing_maps\n", "\n", "# Load image and detect\n", "image = cv2.imread(\"../assets/einstien.png\")\n", "faces = detector.detect(image)\n", "face = faces[0]\n", "\n", "# XSeg: requires landmarks\n", "xseg_mask = parser.parse(image, landmarks=face.landmarks)\n", "\n", "# BiSeNet: works on bbox crop\n", "bisenet = BiSeNet()\n", "x1, y1, x2, y2 = map(int, face.bbox[:4])\n", "x1, y1 = max(x1, 0), max(y1, 0)  # clamp so negative coordinates do not wrap around when slicing\n", "face_crop = image[y1:y2, x1:x2]\n", "bisenet_mask = bisenet.parse(face_crop)\n", "\n", "# Visualize comparison\n", "fig, axes = plt.subplots(1, 3, figsize=(15, 5))\n", "\n", "axes[0].imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))\n", "axes[0].set_title(\"Original\")\n", "axes[0].axis(\"off\")\n", "\n", "axes[1].imshow(xseg_mask, cmap=\"gray\")\n", "axes[1].set_title(\"XSeg\")\n", "axes[1].axis(\"off\")\n", "\n", "face_rgb = cv2.cvtColor(face_crop, cv2.COLOR_BGR2RGB)\n", "bisenet_vis = vis_parsing_maps(face_rgb, bisenet_mask, save_image=False)\n", "axes[2].imshow(bisenet_vis)\n", "axes[2].set_title(\"BiSeNet (19 classes)\")\n", "axes[2].axis(\"off\")\n", "\n", "plt.suptitle(\"XSeg vs BiSeNet\")\n", "plt.tight_layout()\n", "plt.show()" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 9. Application: Face Masking\n", "\n", "Use the XSeg mask to extract the face region or replace the background."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Load image\n", "image = cv2.imread(\"../assets/einstien.png\")\n", "faces = detector.detect(image)\n", "mask = parser.parse(image, landmarks=faces[0].landmarks)\n", "\n", "# Extract face only\n", "mask_3ch = np.stack([mask] * 3, axis=-1)\n", "face_only = (image * mask_3ch).astype(np.uint8)\n", "\n", "# Replace background with white\n", "white_bg = np.ones_like(image) * 255\n", "face_on_white = (image * mask_3ch + white_bg * (1 - mask_3ch)).astype(np.uint8)\n", "\n", "# Visualize\n", "fig, axes = plt.subplots(1, 3, figsize=(15, 5))\n", "\n", "axes[0].imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))\n", "axes[0].set_title(\"Original\")\n", "axes[0].axis(\"off\")\n", "\n", "axes[1].imshow(cv2.cvtColor(face_only, cv2.COLOR_BGR2RGB))\n", "axes[1].set_title(\"Face Extracted\")\n", "axes[1].axis(\"off\")\n", "\n", "axes[2].imshow(cv2.cvtColor(face_on_white, cv2.COLOR_BGR2RGB))\n", "axes[2].set_title(\"White Background\")\n", "axes[2].axis(\"off\")\n", "\n", "plt.suptitle(\"Face Masking Applications\")\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary\n", "\n", "XSeg provides face segmentation using landmark-based alignment:\n", "\n", "- **`parse(image, landmarks=landmarks)`** - Full pipeline: align, segment, warp back\n", "- **`parse_aligned(face_crop)`** - For pre-aligned crops\n", "- **`parse_with_inverse(image, landmarks)`** - Returns mask + crop + inverse matrix\n", "\n", "Parameters:\n", "- `align_size` - Face alignment size (default: 256)\n", "- `blur_sigma` - Mask smoothing (default: 0 = raw)" ] } ], "metadata": { "kernelspec": { "display_name": "base", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", 
"version": "3.13.5" } }, "nbformat": 4, "nbformat_minor": 4 }