mirror of
https://github.com/yakhyo/uniface.git
synced 2026-05-15 04:37:49 +00:00
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# XSeg Face Segmentation\n",
"\n",
"<div style=\"display:flex; flex-wrap:wrap; align-items:center;\">\n",
" <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://pepy.tech/projects/uniface\"><img alt=\"PyPI Downloads\" src=\"https://static.pepy.tech/personalized-badge/uniface?period=total&units=international_system&left_color=grey&right_color=blue&left_text=Downloads\"></a>\n",
" <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://pypi.org/project/uniface/\"><img alt=\"PyPI Version\" src=\"https://img.shields.io/pypi/v/uniface.svg\"></a>\n",
" <a style=\"margin-right:10px; margin-bottom:6px;\" href=\"https://opensource.org/licenses/MIT\"><img alt=\"License\" src=\"https://img.shields.io/badge/License-MIT-blue.svg\"></a>\n",
" <a style=\"margin-bottom:6px;\" href=\"https://github.com/yakhyo/uniface\"><img alt=\"GitHub Stars\" src=\"https://img.shields.io/github/stars/yakhyo/uniface.svg?style=social\"></a>\n",
"</div>\n",
"\n",
"**UniFace** is a lightweight, production-ready Python library for face detection, recognition, tracking, landmark analysis, face parsing, gaze estimation, and face attribute analysis.\n",
"\n",
"🔗 **GitHub**: [github.com/yakhyo/uniface](https://github.com/yakhyo/uniface) | 📚 **Docs**: [yakhyo.github.io/uniface](https://yakhyo.github.io/uniface)\n",
"\n",
"---\n",
"\n",
"This notebook demonstrates face segmentation using the **XSeg** model from DeepFaceLab.\n",
"\n",
"XSeg outputs a single soft mask in [0, 1] for the face region. Unlike BiSeNet, which works on bounding-box crops, XSeg requires 5-point landmarks for face alignment.\n",
"\n",
"## 1. Install UniFace"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install -q \"uniface[cpu]\"\n",
"\n",
"# Clone repo for assets (Colab only)\n",
"import os\n",
"if 'COLAB_GPU' in os.environ or 'COLAB_RELEASE_TAG' in os.environ:\n",
"    if not os.path.exists('uniface'):\n",
"        !git clone --depth 1 https://github.com/yakhyo/uniface.git\n",
"    os.chdir('uniface/examples')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Import Libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import cv2\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"from pathlib import Path\n",
"\n",
"import uniface\n",
"from uniface.detection import RetinaFace\n",
"from uniface.parsing import XSeg\n",
"\n",
"print(f\"UniFace version: {uniface.__version__}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Initialize Models\n",
"\n",
"XSeg requires face detection with landmarks. We use RetinaFace for detection and XSeg for segmentation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Initialize detector and parser\n",
"detector = RetinaFace()\n",
"parser = XSeg()\n",
"\n",
"print(f\"XSeg input size: {parser.input_size}\")\n",
"print(f\"Align size: {parser.align_size}\")\n",
"print(f\"Blur sigma: {parser.blur_sigma}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Helper Functions"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def apply_mask_overlay(image, mask, color=(0, 255, 0), alpha=0.5):\n",
"    \"\"\"Apply colored mask overlay on image.\"\"\"\n",
"    overlay = image.copy().astype(np.float32)\n",
"\n",
"    # Create colored overlay where mask is positive\n",
"    color_overlay = np.zeros_like(image, dtype=np.float32)\n",
"    color_overlay[:] = color\n",
"\n",
"    mask_3ch = mask[..., np.newaxis]\n",
"    overlay = overlay * (1 - mask_3ch * alpha) + color_overlay * mask_3ch * alpha\n",
"\n",
"    return overlay.clip(0, 255).astype(np.uint8)\n",
"\n",
"\n",
"def show_results(original, mask, result, title=\"XSeg Result\"):\n",
"    \"\"\"Display original, mask, and result side by side.\"\"\"\n",
"    fig, axes = plt.subplots(1, 3, figsize=(15, 5))\n",
"\n",
"    axes[0].imshow(cv2.cvtColor(original, cv2.COLOR_BGR2RGB))\n",
"    axes[0].set_title(\"Original\")\n",
"    axes[0].axis(\"off\")\n",
"\n",
"    axes[1].imshow(mask, cmap=\"gray\")\n",
"    axes[1].set_title(\"Mask\")\n",
"    axes[1].axis(\"off\")\n",
"\n",
"    axes[2].imshow(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))\n",
"    axes[2].set_title(\"Overlay\")\n",
"    axes[2].axis(\"off\")\n",
"\n",
"    plt.suptitle(title)\n",
"    plt.tight_layout()\n",
"    plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Process Single Image"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Load image\n",
"image_path = \"../assets/einstien.png\"\n",
"image = cv2.imread(image_path)\n",
"print(f\"Image shape: {image.shape}\")\n",
"\n",
"# Detect faces\n",
"faces = detector.detect(image)\n",
"print(f\"Detected {len(faces)} face(s)\")\n",
"\n",
"# Parse first face\n",
"if len(faces) > 0 and faces[0].landmarks is not None:\n",
"    face = faces[0]\n",
"    mask = parser.parse(image, landmarks=face.landmarks)\n",
"\n",
"    print(f\"Mask shape: {mask.shape}\")\n",
"    print(f\"Mask range: [{mask.min():.3f}, {mask.max():.3f}]\")\n",
"\n",
"    # Visualize\n",
"    result = apply_mask_overlay(image, mask)\n",
"    show_results(image, mask, result, \"Single Face Segmentation\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Configurable Parameters\n",
"\n",
"XSeg has two main parameters:\n",
"- `align_size`: Face alignment output size (default: 256)\n",
"- `blur_sigma`: Gaussian blur for mask smoothing (default: 0 = raw output)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Load image\n",
"image_path = \"../assets/einstien.png\"\n",
"image = cv2.imread(image_path)\n",
"\n",
"# Detect face\n",
"faces = detector.detect(image)\n",
"landmarks = faces[0].landmarks\n",
"\n",
"# Compare different blur settings\n",
"blur_values = [0, 3, 5]\n",
"\n",
"fig, axes = plt.subplots(1, len(blur_values), figsize=(15, 5))\n",
"\n",
"for i, blur in enumerate(blur_values):\n",
"    parser_test = XSeg(blur_sigma=blur)\n",
"    mask = parser_test.parse(image, landmarks=landmarks)\n",
"\n",
"    axes[i].imshow(mask, cmap=\"gray\")\n",
"    axes[i].set_title(f\"blur_sigma={blur}\")\n",
"    axes[i].axis(\"off\")\n",
"\n",
"plt.suptitle(\"Effect of blur_sigma\")\n",
"plt.tight_layout()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. Using parse_aligned\n",
"\n",
"If you already have aligned face crops, use `parse_aligned()` directly."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from uniface.face_utils import face_alignment\n",
"\n",
"# Load and detect\n",
"image = cv2.imread(\"../assets/einstien.png\")\n",
"faces = detector.detect(image)\n",
"landmarks = faces[0].landmarks\n",
"\n",
"# Align face manually\n",
"aligned_face, inverse_matrix = face_alignment(image, landmarks, image_size=256)\n",
"print(f\"Aligned face shape: {aligned_face.shape}\")\n",
"\n",
"# Parse aligned crop directly\n",
"mask = parser.parse_aligned(aligned_face)\n",
"print(f\"Mask shape: {mask.shape}\")\n",
"\n",
"# Visualize\n",
"result = apply_mask_overlay(aligned_face, mask)\n",
"\n",
"fig, axes = plt.subplots(1, 3, figsize=(12, 4))\n",
"axes[0].imshow(cv2.cvtColor(aligned_face, cv2.COLOR_BGR2RGB))\n",
"axes[0].set_title(\"Aligned Face\")\n",
"axes[0].axis(\"off\")\n",
"\n",
"axes[1].imshow(mask, cmap=\"gray\")\n",
"axes[1].set_title(\"Mask\")\n",
"axes[1].axis(\"off\")\n",
"\n",
"axes[2].imshow(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))\n",
"axes[2].set_title(\"Overlay\")\n",
"axes[2].axis(\"off\")\n",
"\n",
"plt.suptitle(\"parse_aligned() on pre-aligned crop\")\n",
"plt.tight_layout()\n",
"plt.show()"
]
},
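{
"cell_type": "markdown",
"metadata": {},
"source": [
"There is also `parse_with_inverse()`, which returns the mask together with the aligned crop and the inverse warp matrix, for cases where you want to warp the mask back to the original frame yourself. The cell below is a minimal sketch; the return order `(mask, aligned_crop, inverse_matrix)` is an assumption based on the method's description, so check the API docs before relying on it."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch (assumed return order): mask on the aligned crop, the crop itself,\n",
"# and the inverse affine matrix for warping back to the original frame\n",
"mask_aligned, aligned_crop, inv_matrix = parser.parse_with_inverse(image, landmarks=landmarks)\n",
"\n",
"# Warp the aligned-space mask back onto the original image grid\n",
"h, w = image.shape[:2]\n",
"mask_full = cv2.warpAffine(mask_aligned, inv_matrix, (w, h))\n",
"print(f\"Aligned mask: {mask_aligned.shape}, full-frame mask: {mask_full.shape}\")"
]
},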
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 8. XSeg vs BiSeNet\n",
"\n",
"| Feature | XSeg | BiSeNet |\n",
"|---------|------|--------|\n",
"| Output | Mask [0, 1] | 19 class labels |\n",
"| Input | Requires landmarks | Works on bbox crops |\n",
"| Use case | Face region extraction | Facial component parsing |\n",
"| Origin | DeepFaceLab | CelebAMask-HQ |"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from uniface.parsing import BiSeNet\n",
"from uniface.draw import vis_parsing_maps\n",
"\n",
"# Load image and detect\n",
"image = cv2.imread(\"../assets/einstien.png\")\n",
"faces = detector.detect(image)\n",
"face = faces[0]\n",
"\n",
"# XSeg: requires landmarks\n",
"xseg_mask = parser.parse(image, landmarks=face.landmarks)\n",
"\n",
"# BiSeNet: works on bbox crop\n",
"bisenet = BiSeNet()\n",
"x1, y1, x2, y2 = map(int, face.bbox[:4])\n",
"face_crop = image[y1:y2, x1:x2]\n",
"bisenet_mask = bisenet.parse(face_crop)\n",
"\n",
"# Visualize comparison\n",
"fig, axes = plt.subplots(1, 3, figsize=(15, 5))\n",
"\n",
"axes[0].imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))\n",
"axes[0].set_title(\"Original\")\n",
"axes[0].axis(\"off\")\n",
"\n",
"axes[1].imshow(xseg_mask, cmap=\"gray\")\n",
"axes[1].set_title(\"XSeg\")\n",
"axes[1].axis(\"off\")\n",
"\n",
"face_rgb = cv2.cvtColor(face_crop, cv2.COLOR_BGR2RGB)\n",
"bisenet_vis = vis_parsing_maps(face_rgb, bisenet_mask, save_image=False)\n",
"axes[2].imshow(bisenet_vis)\n",
"axes[2].set_title(\"BiSeNet (19 classes)\")\n",
"axes[2].axis(\"off\")\n",
"\n",
"plt.suptitle(\"XSeg vs BiSeNet\")\n",
"plt.tight_layout()\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 9. Application: Face Masking\n",
"\n",
"Use the XSeg mask to extract or replace face regions."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Load image\n",
"image = cv2.imread(\"../assets/einstien.png\")\n",
"faces = detector.detect(image)\n",
"mask = parser.parse(image, landmarks=faces[0].landmarks)\n",
"\n",
"# Extract face only\n",
"mask_3ch = np.stack([mask] * 3, axis=-1)\n",
"face_only = (image * mask_3ch).astype(np.uint8)\n",
"\n",
"# Replace background with white\n",
"white_bg = np.ones_like(image) * 255\n",
"face_on_white = (image * mask_3ch + white_bg * (1 - mask_3ch)).astype(np.uint8)\n",
"\n",
"# Visualize\n",
"fig, axes = plt.subplots(1, 3, figsize=(15, 5))\n",
"\n",
"axes[0].imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))\n",
"axes[0].set_title(\"Original\")\n",
"axes[0].axis(\"off\")\n",
"\n",
"axes[1].imshow(cv2.cvtColor(face_only, cv2.COLOR_BGR2RGB))\n",
"axes[1].set_title(\"Face Extracted\")\n",
"axes[1].axis(\"off\")\n",
"\n",
"axes[2].imshow(cv2.cvtColor(face_on_white, cv2.COLOR_BGR2RGB))\n",
"axes[2].set_title(\"White Background\")\n",
"axes[2].axis(\"off\")\n",
"\n",
"plt.suptitle(\"Face Masking Applications\")\n",
"plt.tight_layout()\n",
"plt.show()"
]
},
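{
"cell_type": "markdown",
"metadata": {},
"source": [
"A hard mask edge can make composites look cut out. As a small sketch using only the parameters shown earlier, the same white-background composite can be rebuilt with a blurred mask (`blur_sigma=5`) so the face blends smoothly into the background:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Same composite as above, but with a soft (blurred) mask for smooth edges\n",
"soft_parser = XSeg(blur_sigma=5)\n",
"image = cv2.imread(\"../assets/einstien.png\")\n",
"faces = detector.detect(image)\n",
"soft_mask = soft_parser.parse(image, landmarks=faces[0].landmarks)\n",
"\n",
"# Blend face over white using the soft mask as per-pixel alpha\n",
"soft_3ch = np.stack([soft_mask] * 3, axis=-1)\n",
"white_bg = np.ones_like(image) * 255\n",
"soft_composite = (image * soft_3ch + white_bg * (1 - soft_3ch)).astype(np.uint8)\n",
"\n",
"plt.imshow(cv2.cvtColor(soft_composite, cv2.COLOR_BGR2RGB))\n",
"plt.title(\"Soft-edge composite (blur_sigma=5)\")\n",
"plt.axis(\"off\")\n",
"plt.show()"
]
},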
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summary\n",
"\n",
"XSeg provides face segmentation using landmark-based alignment:\n",
"\n",
"- **`parse(image, landmarks=landmarks)`** - Full pipeline: align, segment, warp back\n",
"- **`parse_aligned(face_crop)`** - For pre-aligned crops\n",
"- **`parse_with_inverse(image, landmarks)`** - Returns mask + crop + inverse matrix\n",
"\n",
"Parameters:\n",
"- `align_size` - Face alignment size (default: 256)\n",
"- `blur_sigma` - Mask smoothing (default: 0 = raw)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}