5 Commits

Author SHA1 Message Date
yakhyo
13b518e96d chore: Upgrade version to v1.5.3 2025-12-15 15:09:54 +09:00
yakhyo
1b877bc9fc fix: Fix the version 2025-12-15 14:53:36 +09:00
Yakhyokhuja Valikhujaev
bb1d209f3b feat: Add BiSeNet face parsing model (#36)
* Add BiSeNet face parsing implementation

* Add parsing model weights configuration

* Export BiSeNet in main package

* Add face parsing tests

* Add face parsing examples and script

* Bump version to 1.5.0

* Update documentation for face parsing

* Fix face parsing notebook to use lips instead of mouth

* chore: Update the face parsing example

* fix: Fix model argument to use Enum

* ref: Move vis_parsing_map function into visualization.py

* docs: Update README.md
2025-12-15 14:50:15 +09:00
Yakhyokhuja Valikhujaev
54b769c0f1 feat: Add Face Parsing model BiSeNet model trained on CelebMask dataset (#35)
* Add BiSeNet face parsing implementation

* Add parsing model weights configuration

* Export BiSeNet in main package

* Add face parsing tests

* Add face parsing examples and script

* Bump version to 1.5.0

* Update documentation for face parsing

* Fix face parsing notebook to use lips instead of mouth

* chore: Update the face parsing example

* fix: Fix model argument to use Enum

* ref: Move vis_parsing_map function into visualization.py

* docs: Update README.md
2025-12-14 21:13:53 +09:00
Yakhyokhuja Valikhujaev
4d1921e531 feat: Add 2D Gaze estimation models (#34)
* feat: Add Gaze Estimation, update docs and Add example notebook, inference code

* docs: Update README.md
2025-12-14 14:07:46 +09:00
22 changed files with 2266 additions and 9 deletions

MODELS.md

@@ -291,6 +291,119 @@ emotion, confidence = predictor.predict(image, landmarks)
---
## Gaze Estimation Models
### MobileGaze Family
Real-time gaze direction prediction models trained on the Gaze360 dataset. Each model returns pitch (vertical) and yaw (horizontal) angles in radians.
| Model Name | Params | Size | MAE* | Use Case |
| -------------- | ------ | ------- | ----- | ----------------------------- |
| `RESNET18` | 11.7M | 43 MB | 12.84 | Balanced accuracy/speed |
| `RESNET34` ⭐ | 24.8M | 81.6 MB | 11.33 | **Recommended default** |
| `RESNET50` | 25.6M | 91.3 MB | 11.34 | High accuracy |
| `MOBILENET_V2` | 3.5M | 9.59 MB | 13.07 | Mobile/Edge devices |
| `MOBILEONE_S0` | 2.1M | 4.8 MB | 12.58 | Lightweight/Real-time |
*MAE (Mean Absolute Error) in degrees on the Gaze360 test set (lower is better)
**Dataset**: Trained on Gaze360 (indoor/outdoor scenes with diverse head poses)
**Training**: 200 epochs with a classification-based approach (binned angles); the binned logits are decoded to continuous angles as sketched below
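The bundled `MobileGaze` decodes its binned classification output with a soft-argmax (90 bins of 4°, offset by 180°, then converted to radians). A minimal standalone sketch of that decoding; the helper name is illustrative and the parameter values are taken from the implementation in this release:

```python
import numpy as np

def decode_gaze_logits(logits: np.ndarray, bins: int = 90, binwidth: float = 4.0, offset: float = 180.0) -> float:
    """Soft-argmax decoding of binned gaze logits into a continuous angle (radians)."""
    probs = np.exp(logits - logits.max())         # numerically stable softmax
    probs /= probs.sum()
    expected_bin = float(np.sum(probs * np.arange(bins)))
    angle_deg = expected_bin * binwidth - offset  # map expected bin index to degrees
    return float(np.radians(angle_deg))

# Real pitch/yaw logits come from the ONNX session; random values only illustrate the shape.
pitch = decode_gaze_logits(np.random.randn(90).astype(np.float32))
yaw = decode_gaze_logits(np.random.randn(90).astype(np.float32))
```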
#### Usage
```python
from uniface import MobileGaze
from uniface.constants import GazeWeights
import numpy as np
# Default (recommended)
gaze_estimator = MobileGaze() # Uses RESNET34
# Lightweight model
gaze_estimator = MobileGaze(model_name=GazeWeights.MOBILEONE_S0)
# Estimate gaze from face crop
pitch, yaw = gaze_estimator.estimate(face_crop)
print(f"Pitch: {np.degrees(pitch):.1f}°, Yaw: {np.degrees(yaw):.1f}°")
```
**Note**: Requires a face crop as input. Run face detection first to obtain bounding boxes, as in the sketch below.
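For reference, a minimal detection-and-crop step using the bundled `RetinaFace` detector (the same pattern as the README and quick-start examples; `photo.jpg` is a placeholder path):

```python
import cv2
from uniface import RetinaFace, MobileGaze

detector = RetinaFace()
gaze_estimator = MobileGaze()

image = cv2.imread("photo.jpg")
for face in detector.detect(image):
    x1, y1, x2, y2 = map(int, face['bbox'][:4])
    face_crop = image[y1:y2, x1:x2]
    if face_crop.size > 0:
        pitch, yaw = gaze_estimator.estimate(face_crop)
```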
---
## Face Parsing Models
### BiSeNet Family
BiSeNet (Bilateral Segmentation Network) models for semantic face parsing, segmenting face images into 19 facial component classes.
| Model Name | Params | Size | Classes | Use Case |
| -------------- | ------ | ------- | ------- | ----------------------------- |
| `RESNET18` ⭐ | 13.3M | 50.7 MB | 19 | **Recommended default** |
| `RESNET34` | 24.1M | 89.2 MB | 19 | Higher accuracy |
**19 Facial Component Classes** (mask values 0-18, in the order listed):
1. Background
2. Skin
3. Left Eyebrow
4. Right Eyebrow
5. Left Eye
6. Right Eye
7. Eye Glasses
8. Left Ear
9. Right Ear
10. Ear Ring
11. Nose
12. Mouth
13. Upper Lip
14. Lower Lip
15. Neck
16. Neck Lace
17. Cloth
18. Hair
19. Hat
**Dataset**: Trained on CelebAMask-HQ
**Architecture**: BiSeNet with ResNet backbone
**Input Size**: 512×512 (automatically resized)
#### Usage
```python
from uniface.parsing import BiSeNet
from uniface.constants import ParsingWeights
from uniface.visualization import vis_parsing_maps
import cv2
import numpy as np
# Default (recommended)
parser = BiSeNet() # Uses RESNET18
# Higher accuracy model
parser = BiSeNet(model_name=ParsingWeights.RESNET34)
# Parse face image (already cropped)
mask = parser.parse(face_image)
# Visualize with overlay
face_rgb = cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)
vis_result = vis_parsing_maps(face_rgb, mask, save_image=False)
# mask shape: (H, W) with values 0-18 representing classes
print(f"Detected {len(np.unique(mask))} facial components")
```
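Since the mask stores class indices 0-18 in the order listed above (matching `FACE_PARSING_LABELS` in `uniface.visualization`), individual components can be isolated with a simple comparison. A small sketch, assuming the `mask` from the usage example; the index constants are derived from the class list:

```python
import numpy as np

SKIN, HAIR = 1, 17  # indices follow the 19-class list above (background = 0)

skin_mask = (mask == SKIN)   # boolean mask of skin pixels
hair_mask = (mask == HAIR)
print(f"Skin covers {skin_mask.mean():.1%} of the crop")
```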
**Applications:**
- Face makeup and beauty applications
- Virtual try-on systems
- Face editing and manipulation
- Facial feature extraction
- Portrait segmentation
**Note**: Input should be a cropped face image. For the full pipeline, run face detection first to obtain face crops; a condensed version of the bundled script is sketched below.
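A shortened version of the detect-crop-parse loop used in `scripts/run_face_parsing.py` from this release (the overlay is pasted back onto the original image):

```python
import cv2
from uniface import RetinaFace
from uniface.parsing import BiSeNet
from uniface.visualization import vis_parsing_maps

detector = RetinaFace()
parser = BiSeNet()

image = cv2.imread("photo.jpg")
for face in detector.detect(image):
    x1, y1, x2, y2 = map(int, face['bbox'][:4])
    face_crop = image[y1:y2, x1:x2]
    if face_crop.size == 0:
        continue
    mask = parser.parse(face_crop)
    face_rgb = cv2.cvtColor(face_crop, cv2.COLOR_BGR2RGB)
    image[y1:y2, x1:x2] = vis_parsing_maps(face_rgb, mask, save_image=False)

cv2.imwrite("parsed_faces.jpg", image)
```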
---
## Model Updates
Models are automatically downloaded and cached on first use. Cache location: `~/.uniface/models/`
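To warm the cache ahead of time (for example in a container build), the exported `verify_model_weights` helper fetches a model if it is not cached and returns the local path; a brief sketch using the new gaze and parsing weights:

```python
from uniface import verify_model_weights
from uniface.constants import GazeWeights, ParsingWeights

# Downloads on first call, then reuses the cached file and returns its path
gaze_path = verify_model_weights(GazeWeights.RESNET34)
parsing_path = verify_model_weights(ParsingWeights.RESNET18)
print(gaze_path, parsing_path)
```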
@@ -330,6 +443,8 @@ python scripts/download_model.py --model MNET_V2
- **YOLOv5-Face Original**: [deepcam-cn/yolov5-face](https://github.com/deepcam-cn/yolov5-face) - Original PyTorch implementation
- **YOLOv5-Face ONNX**: [yakhyo/yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference) - ONNX inference implementation
- **Face Recognition Training**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) - ArcFace, MobileFace, SphereFace training code
- **Gaze Estimation Training**: [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation) - MobileGaze training code and pretrained weights
- **Face Parsing Training**: [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing) - BiSeNet training code and pretrained weights
- **InsightFace**: [deepinsight/insightface](https://github.com/deepinsight/insightface) - Model architectures and pretrained weights
### Papers
@@ -339,3 +454,4 @@ python scripts/download_model.py --model MNET_V2
- **YOLOv5-Face**: [YOLO5Face: Why Reinventing a Face Detector](https://arxiv.org/abs/2105.12931)
- **ArcFace**: [Additive Angular Margin Loss for Deep Face Recognition](https://arxiv.org/abs/1801.07698)
- **SphereFace**: [Deep Hypersphere Embedding for Face Recognition](https://arxiv.org/abs/1704.08063)
- **BiSeNet**: [Bilateral Segmentation Network for Real-time Semantic Segmentation](https://arxiv.org/abs/1808.00897)


@@ -242,7 +242,93 @@ if faces:
---
## 7. Batch Processing (3 minutes)
## 7. Gaze Estimation (2 minutes)
Estimate where a person is looking:
```python
import cv2
import numpy as np
from uniface import RetinaFace, MobileGaze
from uniface.visualization import draw_gaze
# Initialize models
detector = RetinaFace()
gaze_estimator = MobileGaze()
# Load image
image = cv2.imread("photo.jpg")
faces = detector.detect(image)
# Estimate gaze for each face
for i, face in enumerate(faces):
    bbox = face['bbox']
    x1, y1, x2, y2 = map(int, bbox[:4])
    face_crop = image[y1:y2, x1:x2]

    if face_crop.size > 0:
        pitch, yaw = gaze_estimator.estimate(face_crop)
        print(f"Face {i+1}: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°")

        # Draw gaze direction
        draw_gaze(image, bbox, pitch, yaw)

cv2.imwrite("gaze_output.jpg", image)
```
**Output:**
```
Face 1: pitch=5.2°, yaw=-12.3°
Face 2: pitch=-8.1°, yaw=15.7°
```
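Pitch and yaw follow the convention documented in the gaze estimator (positive pitch = looking up, positive yaw = looking right). A small helper for turning the angles into a coarse textual label; the 10° threshold is an arbitrary choice for illustration:

```python
import numpy as np

def coarse_direction(pitch: float, yaw: float, threshold_deg: float = 10.0) -> str:
    """Map gaze angles in radians to a rough direction label."""
    pitch_deg, yaw_deg = np.degrees(pitch), np.degrees(yaw)
    vertical = "up" if pitch_deg > threshold_deg else "down" if pitch_deg < -threshold_deg else ""
    horizontal = "right" if yaw_deg > threshold_deg else "left" if yaw_deg < -threshold_deg else ""
    return (vertical + " " + horizontal).strip() or "center"

print(coarse_direction(np.radians(5.2), np.radians(-12.3)))  # -> "left"
```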
---
## 8. Face Parsing (2 minutes)
Segment face into semantic components (skin, eyes, nose, mouth, hair, etc.):
```python
import cv2
import numpy as np
from uniface.parsing import BiSeNet
from uniface.visualization import vis_parsing_maps
# Initialize parser
parser = BiSeNet() # Uses ResNet18 by default
# Load face image (already cropped)
face_image = cv2.imread("face.jpg")
# Parse face into 19 components
mask = parser.parse(face_image)
# Visualize with overlay
face_rgb = cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)
vis_result = vis_parsing_maps(face_rgb, mask, save_image=False)
# vis_parsing_maps returns a BGR image, ready to save with OpenCV
cv2.imwrite("parsed_face.jpg", vis_result)
print(f"Detected {len(np.unique(mask))} facial components")
```
**Output:**
```
Detected 12 facial components
```
**19 Facial Component Classes** (mask values 0-18; see the sketch below for mapping indices to names):
- Background, Skin, Eyebrows (L/R), Eyes (L/R), Eye Glasses
- Ears (L/R), Ear Ring, Nose, Mouth, Lips (Upper/Lower)
- Neck, Neck Lace, Cloth, Hair, Hat
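The mask stores class indices 0-18 in the order above, and `FACE_PARSING_LABELS` in `uniface.visualization` carries the same ordering, so per-component statistics are straightforward. A small sketch, assuming the `mask` from the example above:

```python
import numpy as np
from uniface.visualization import FACE_PARSING_LABELS

# Report how much of the crop each detected component covers
for class_id in np.unique(mask):
    name = FACE_PARSING_LABELS[class_id]
    coverage = (mask == class_id).mean()
    print(f"{name:10s}: {coverage:.1%}")
```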
---
## 9. Batch Processing (3 minutes)
Process multiple images:
@@ -275,7 +361,7 @@ print("Done!")
---
## 8. Model Selection
## 10. Model Selection
Choose the right model for your use case:
@@ -326,6 +412,35 @@ recognizer = MobileFace(model_name=MobileFaceWeights.MNET_V2) # Fast, small siz
recognizer = SphereFace(model_name=SphereFaceWeights.SPHERE20) # Alternative method
```
### Gaze Estimation Models
```python
from uniface import MobileGaze
from uniface.constants import GazeWeights
# Default (recommended)
gaze_estimator = MobileGaze() # Uses RESNET34
# Lightweight (mobile/edge devices)
gaze_estimator = MobileGaze(model_name=GazeWeights.MOBILEONE_S0)
# High accuracy
gaze_estimator = MobileGaze(model_name=GazeWeights.RESNET50)
```
### Face Parsing Models
```python
from uniface.parsing import BiSeNet
from uniface.constants import ParsingWeights
# Default (recommended, 50.7 MB)
parser = BiSeNet() # Uses RESNET18
# Higher accuracy (89.2 MB)
parser = BiSeNet(model_name=ParsingWeights.RESNET34)
```
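Both families are also exposed through factory helpers (`create_gaze_estimator`, `create_face_parser`) exported from the package root; a brief sketch of the equivalent calls:

```python
from uniface import create_face_parser, create_gaze_estimator
from uniface.constants import GazeWeights, ParsingWeights

gaze_estimator = create_gaze_estimator('mobilegaze', model_name=GazeWeights.MOBILEONE_S0)
parser = create_face_parser(ParsingWeights.RESNET34)
```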
---
## Common Issues
@@ -387,6 +502,8 @@ Explore interactive examples for common tasks:
| **Face Recognition** | Extract face embeddings and compare faces | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
| **Face Verification** | Compare two faces to verify identity | [face_verification.ipynb](examples/face_verification.ipynb) |
| **Face Search** | Find a person in a group photo | [face_search.ipynb](examples/face_search.ipynb) |
| **Face Parsing** | Segment face into semantic components | [face_parsing.ipynb](examples/face_parsing.ipynb) |
| **Gaze Estimation** | Estimate gaze direction | [gaze_estimation.ipynb](examples/gaze_estimation.ipynb) |
### Additional Resources
@@ -400,4 +517,6 @@ Explore interactive examples for common tasks:
- **RetinaFace Training**: [yakhyo/retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch)
- **YOLOv5-Face ONNX**: [yakhyo/yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference)
- **Face Recognition Training**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition)
- **Gaze Estimation Training**: [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation)
- **Face Parsing Training**: [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing)
- **InsightFace**: [deepinsight/insightface](https://github.com/deepinsight/insightface)


@@ -11,7 +11,7 @@
<img src=".github/logos/logo_web.webp" width=75%>
</div>
**UniFace** is a lightweight, production-ready face analysis library built on ONNX Runtime. It provides high-performance face detection, recognition, landmark detection, and attribute analysis with hardware acceleration support across platforms.
**UniFace** is a lightweight, production-ready face analysis library built on ONNX Runtime. It provides high-performance face detection, recognition, landmark detection, face parsing, gaze estimation, and attribute analysis with hardware acceleration support across platforms.
---
@@ -20,6 +20,8 @@
- **High-Speed Face Detection**: ONNX-optimized RetinaFace, SCRFD, and YOLOv5-Face models
- **Facial Landmark Detection**: Accurate 106-point landmark localization
- **Face Recognition**: ArcFace, MobileFace, and SphereFace embeddings
- **Face Parsing**: BiSeNet-based semantic segmentation with 19 facial component classes
- **Gaze Estimation**: Real-time gaze direction prediction with MobileGaze
- **Attribute Analysis**: Age, gender, and emotion detection
- **Face Alignment**: Precise alignment for downstream tasks
- **Hardware Acceleration**: ARM64 optimizations (Apple Silicon), CUDA (NVIDIA), CPU fallback
@@ -152,6 +154,50 @@ gender_str = 'Female' if gender == 0 else 'Male'
print(f"{gender_str}, {age} years old")
```
### Gaze Estimation
```python
from uniface import RetinaFace, MobileGaze
from uniface.visualization import draw_gaze
import numpy as np
detector = RetinaFace()
gaze_estimator = MobileGaze()
faces = detector.detect(image)
for face in faces:
    bbox = face['bbox']
    x1, y1, x2, y2 = map(int, bbox[:4])
    face_crop = image[y1:y2, x1:x2]

    pitch, yaw = gaze_estimator.estimate(face_crop)
    print(f"Gaze: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°")

    # Visualize
    draw_gaze(image, bbox, pitch, yaw)
```
### Face Parsing
```python
import cv2
import numpy as np

from uniface.parsing import BiSeNet
from uniface.visualization import vis_parsing_maps
# Initialize parser
parser = BiSeNet() # Uses ResNet18 by default
# Parse face image (already cropped)
mask = parser.parse(face_image)
# Visualize with overlay
face_rgb = cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)
vis_result = vis_parsing_maps(face_rgb, mask, save_image=False)
# mask contains 19 classes: skin, eyes, nose, mouth, hair, etc.
print(f"Unique classes: {len(np.unique(mask))}")
```
---
## Documentation
@@ -252,6 +298,18 @@ faces = detect_faces(image, method='retinaface', conf_thresh=0.8) # methods: re
| `AgeGender` | `model_name=AgeGenderWeights.DEFAULT`; `input_size` auto-detected | Requires bbox; ONNXRuntime |
| `Emotion` | `model_weights=DDAMFNWeights.AFFECNET7`, `input_size=(112, 112)` | Requires 5-point landmarks; TorchScript |
**Gaze Estimation**
| Class | Key params (defaults) | Notes |
| ------------- | ------------------------------------------ | ------------------------------------ |
| `MobileGaze` | `model_name=GazeWeights.RESNET34` | Returns (pitch, yaw) angles in radians; trained on Gaze360 |
**Face Parsing**
| Class | Key params (defaults) | Notes |
| ---------- | ---------------------------------------- | ------------------------------------ |
| `BiSeNet` | `model_name=ParsingWeights.RESNET18`, `input_size=(512, 512)` | 19 facial component classes; BiSeNet architecture with ResNet backbone |
---
## Model Performance
@@ -298,6 +356,8 @@ Interactive examples covering common face analysis tasks:
| **Face Recognition** | Extract face embeddings and compare faces | [face_analyzer.ipynb](examples/face_analyzer.ipynb) |
| **Face Verification** | Compare two faces to verify identity | [face_verification.ipynb](examples/face_verification.ipynb) |
| **Face Search** | Find a person in a group photo | [face_search.ipynb](examples/face_search.ipynb) |
| **Face Parsing** | Segment face into semantic components | [face_parsing.ipynb](examples/face_parsing.ipynb) |
| **Gaze Estimation** | Estimate gaze direction from face images | [gaze_estimation.ipynb](examples/gaze_estimation.ipynb) |
### Webcam Face Detection
@@ -488,6 +548,8 @@ uniface/
│ ├── detection/ # Face detection models
│ ├── recognition/ # Face recognition models
│ ├── landmark/ # Landmark detection
│ ├── parsing/ # Face parsing
│ ├── gaze/ # Gaze estimation
│ ├── attribute/ # Age, gender, emotion
│ ├── onnx_utils.py # ONNX Runtime utilities
│ ├── model_store.py # Model download & caching
@@ -504,6 +566,8 @@ uniface/
- **RetinaFace Training**: [yakhyo/retinaface-pytorch](https://github.com/yakhyo/retinaface-pytorch) - PyTorch implementation and training code
- **YOLOv5-Face ONNX**: [yakhyo/yolov5-face-onnx-inference](https://github.com/yakhyo/yolov5-face-onnx-inference) - ONNX inference implementation
- **Face Recognition Training**: [yakhyo/face-recognition](https://github.com/yakhyo/face-recognition) - ArcFace, MobileFace, SphereFace training code
- **Face Parsing Training**: [yakhyo/face-parsing](https://github.com/yakhyo/face-parsing) - BiSeNet face parsing training code and pretrained weights
- **Gaze Estimation Training**: [yakhyo/gaze-estimation](https://github.com/yakhyo/gaze-estimation) - MobileGaze training code and pretrained weights
- **InsightFace**: [deepinsight/insightface](https://github.com/deepinsight/insightface) - Model architectures and pretrained weights
## Contributing

examples/face_parsing.ipynb Normal file

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long


@@ -1,7 +1,7 @@
[project]
name = "uniface"
version = "1.3.2"
description = "UniFace: A Comprehensive Library for Face Detection, Recognition, Landmark Analysis, Age, and Gender Detection"
version = "1.5.3"
description = "UniFace: A Comprehensive Library for Face Detection, Recognition, Landmark Analysis, Face Parsing, Gaze Estimation, Age, and Gender Detection"
readme = "README.md"
license = { text = "MIT" }
authors = [{ name = "Yakhyokhuja Valikhujaev", email = "yakhyo9696@gmail.com" }]
@@ -14,6 +14,9 @@ keywords = [
"face-detection",
"face-recognition",
"facial-landmarks",
"face-parsing",
"face-segmentation",
"gaze-estimation",
"age-detection",
"gender-detection",
"computer-vision",
@@ -21,6 +24,7 @@ keywords = [
"onnx",
"onnxruntime",
"face-analysis",
"bisenet",
]
classifiers = [


@@ -9,6 +9,7 @@ Scripts for testing UniFace features.
| `run_detection.py` | Face detection on image or webcam |
| `run_age_gender.py` | Age and gender prediction |
| `run_emotion.py` | Emotion detection (7 or 8 emotions) |
| `run_gaze_estimation.py` | Gaze direction estimation |
| `run_landmarks.py` | 106-point facial landmark detection |
| `run_recognition.py` | Face embedding extraction and comparison |
| `run_face_analyzer.py` | Complete face analysis (detection + recognition + attributes) |
@@ -33,6 +34,10 @@ python scripts/run_age_gender.py --webcam
python scripts/run_emotion.py --image assets/test.jpg
python scripts/run_emotion.py --webcam
# Gaze estimation
python scripts/run_gaze_estimation.py --image assets/test.jpg
python scripts/run_gaze_estimation.py --webcam
# Landmarks
python scripts/run_landmarks.py --image assets/test.jpg
python scripts/run_landmarks.py --webcam


@@ -79,7 +79,9 @@ def run_webcam(detector, age_gender, threshold: float = 0.6):
bboxes = [f['bbox'] for f in faces]
scores = [f['confidence'] for f in faces]
landmarks = [f['landmarks'] for f in faces]
draw_detections(frame, bboxes, scores, landmarks, vis_threshold=threshold)
draw_detections(
image=frame, bboxes=bboxes, scores=scores, landmarks=landmarks, vis_threshold=threshold, fancy_bbox=True
)
for face in faces:
gender_id, age = age_gender.predict(frame, face['bbox']) # predict per face


@@ -98,7 +98,7 @@ def main():
else:
from uniface.constants import YOLOv5FaceWeights
detector = YOLOv5Face(model_name=YOLOv5FaceWeights.YOLOV5N)
detector = YOLOv5Face(model_name=YOLOv5FaceWeights.YOLOV5M)
if args.webcam:
run_webcam(detector, args.threshold)

scripts/run_face_parsing.py Normal file

@@ -0,0 +1,126 @@
# Face parsing on detected faces
# Usage: python run_face_parsing.py --image path/to/image.jpg
# python run_face_parsing.py --webcam
import argparse
import os
from pathlib import Path
import cv2
from uniface import RetinaFace
from uniface.constants import ParsingWeights
from uniface.parsing import BiSeNet
from uniface.visualization import vis_parsing_maps
def process_image(detector, parser, image_path: str, save_dir: str = 'outputs'):
image = cv2.imread(image_path)
if image is None:
print(f"Error: Failed to load image from '{image_path}'")
return
faces = detector.detect(image)
print(f'Detected {len(faces)} face(s)')
result_image = image.copy()
for i, face in enumerate(faces):
bbox = face['bbox']
x1, y1, x2, y2 = map(int, bbox[:4])
face_crop = image[y1:y2, x1:x2]
if face_crop.size == 0:
continue
# Parse the face
mask = parser.parse(face_crop)
print(f' Face {i + 1}: parsed with {len(set(mask.flatten()))} unique classes')
# Visualize the parsing result
face_crop_rgb = cv2.cvtColor(face_crop, cv2.COLOR_BGR2RGB)
vis_result = vis_parsing_maps(face_crop_rgb, mask, save_image=False)
# Place the visualization back on the original image
result_image[y1:y2, x1:x2] = vis_result
# Draw bounding box
cv2.rectangle(result_image, (x1, y1), (x2, y2), (0, 255, 0), 2)
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(image_path).stem}_parsing.jpg')
cv2.imwrite(output_path, result_image)
print(f'Output saved: {output_path}')
def run_webcam(detector, parser):
cap = cv2.VideoCapture(0)
if not cap.isOpened():
print('Cannot open webcam')
return
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)
for face in faces:
bbox = face['bbox']
x1, y1, x2, y2 = map(int, bbox[:4])
face_crop = frame[y1:y2, x1:x2]
if face_crop.size == 0:
continue
# Parse the face
mask = parser.parse(face_crop)
# Visualize the parsing result
face_crop_rgb = cv2.cvtColor(face_crop, cv2.COLOR_BGR2RGB)
vis_result = vis_parsing_maps(face_crop_rgb, mask, save_image=False)
# Place the visualization back on the frame
frame[y1:y2, x1:x2] = vis_result
# Draw bounding box
cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.imshow('Face Parsing', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
def main():
parser_arg = argparse.ArgumentParser(description='Run face parsing')
parser_arg.add_argument('--image', type=str, help='Path to input image')
parser_arg.add_argument('--webcam', action='store_true', help='Use webcam')
parser_arg.add_argument('--save_dir', type=str, default='outputs')
parser_arg.add_argument(
'--model', type=str, default=ParsingWeights.RESNET18, choices=[ParsingWeights.RESNET18, ParsingWeights.RESNET34]
)
args = parser_arg.parse_args()
if not args.image and not args.webcam:
parser_arg.error('Either --image or --webcam must be specified')
detector = RetinaFace()
parser = BiSeNet(model_name=ParsingWeights(args.model))  # respect the --model choice
if args.webcam:
run_webcam(detector, parser)
else:
process_image(detector, parser, args.image, args.save_dir)
if __name__ == '__main__':
main()


@@ -0,0 +1,104 @@
# Gaze estimation on detected faces
# Usage: python run_gaze_estimation.py --image path/to/image.jpg
# python run_gaze_estimation.py --webcam
import argparse
import os
from pathlib import Path
import cv2
import numpy as np
from uniface import RetinaFace
from uniface.gaze import MobileGaze
from uniface.visualization import draw_gaze
def process_image(detector, gaze_estimator, image_path: str, save_dir: str = 'outputs'):
image = cv2.imread(image_path)
if image is None:
print(f"Error: Failed to load image from '{image_path}'")
return
faces = detector.detect(image)
print(f'Detected {len(faces)} face(s)')
for i, face in enumerate(faces):
bbox = face['bbox']
x1, y1, x2, y2 = map(int, bbox[:4])
face_crop = image[y1:y2, x1:x2]
if face_crop.size == 0:
continue
pitch, yaw = gaze_estimator.estimate(face_crop)
print(f' Face {i + 1}: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°')
# Draw both bbox and gaze arrow with angle text
draw_gaze(image, bbox, pitch, yaw, draw_angles=True)
os.makedirs(save_dir, exist_ok=True)
output_path = os.path.join(save_dir, f'{Path(image_path).stem}_gaze.jpg')
cv2.imwrite(output_path, image)
print(f'Output saved: {output_path}')
def run_webcam(detector, gaze_estimator):
cap = cv2.VideoCapture(0)
if not cap.isOpened():
print('Cannot open webcam')
return
print("Press 'q' to quit")
while True:
ret, frame = cap.read()
if not ret:
break
frame = cv2.flip(frame, 1)
faces = detector.detect(frame)
for face in faces:
bbox = face['bbox']
x1, y1, x2, y2 = map(int, bbox[:4])
face_crop = frame[y1:y2, x1:x2]
if face_crop.size == 0:
continue
pitch, yaw = gaze_estimator.estimate(face_crop)
# Draw both bbox and gaze arrow
draw_gaze(frame, bbox, pitch, yaw)
cv2.putText(frame, f'Faces: {len(faces)}', (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
cv2.imshow('Gaze Estimation', frame)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
cap.release()
cv2.destroyAllWindows()
def main():
parser = argparse.ArgumentParser(description='Run gaze estimation')
parser.add_argument('--image', type=str, help='Path to input image')
parser.add_argument('--webcam', action='store_true', help='Use webcam')
parser.add_argument('--save_dir', type=str, default='outputs')
args = parser.parse_args()
if not args.image and not args.webcam:
parser.error('Either --image or --webcam must be specified')
detector = RetinaFace()
gaze_estimator = MobileGaze()
if args.webcam:
run_webcam(detector, gaze_estimator)
else:
process_image(detector, gaze_estimator, args.image, args.save_dir)
if __name__ == '__main__':
main()

tests/test_parsing.py Normal file

@@ -0,0 +1,118 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
import numpy as np
import pytest
from uniface.constants import ParsingWeights
from uniface.parsing import BiSeNet, create_face_parser
def test_bisenet_initialization():
"""Test BiSeNet initialization."""
parser = BiSeNet()
assert parser is not None
assert parser.input_size == (512, 512)
def test_bisenet_with_different_models():
"""Test BiSeNet with different model weights."""
parser_resnet18 = BiSeNet(model_name=ParsingWeights.RESNET18)
parser_resnet34 = BiSeNet(model_name=ParsingWeights.RESNET34)
assert parser_resnet18 is not None
assert parser_resnet34 is not None
def test_bisenet_preprocess():
"""Test preprocessing."""
parser = BiSeNet()
# Create a dummy face image
face_image = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
# Preprocess
preprocessed = parser.preprocess(face_image)
assert preprocessed.shape == (1, 3, 512, 512)
assert preprocessed.dtype == np.float32
def test_bisenet_postprocess():
"""Test postprocessing."""
parser = BiSeNet()
# Create dummy model output (batch_size=1, num_classes=19, H=512, W=512)
dummy_output = np.random.randn(1, 19, 512, 512).astype(np.float32)
# Postprocess
mask = parser.postprocess(dummy_output, original_size=(256, 256))
assert mask.shape == (256, 256)
assert mask.dtype == np.uint8
assert mask.min() >= 0
assert mask.max() < 19 # 19 classes (0-18)
def test_bisenet_parse():
"""Test end-to-end parsing."""
parser = BiSeNet()
# Create a dummy face image
face_image = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
# Parse
mask = parser.parse(face_image)
assert mask.shape == (256, 256)
assert mask.dtype == np.uint8
assert mask.min() >= 0
assert mask.max() < 19
def test_bisenet_callable():
"""Test that BiSeNet is callable."""
parser = BiSeNet()
face_image = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
# Should work as callable
mask = parser(face_image)
assert mask.shape == (256, 256)
assert mask.dtype == np.uint8
def test_create_face_parser_with_enum():
"""Test factory function with enum."""
parser = create_face_parser(ParsingWeights.RESNET18)
assert parser is not None
assert isinstance(parser, BiSeNet)
def test_create_face_parser_with_string():
"""Test factory function with string."""
parser = create_face_parser('parsing_resnet18')
assert parser is not None
assert isinstance(parser, BiSeNet)
def test_create_face_parser_invalid_model():
"""Test factory function with invalid model name."""
with pytest.raises(ValueError, match='Unknown face parsing model'):
create_face_parser('invalid_model')
def test_bisenet_different_input_sizes():
"""Test parsing with different input image sizes."""
parser = BiSeNet()
# Test with different sizes
sizes = [(128, 128), (256, 256), (512, 512), (640, 480)]
for h, w in sizes:
face_image = np.random.randint(0, 255, (h, w, 3), dtype=np.uint8)
mask = parser.parse(face_image)
assert mask.shape == (h, w), f'Failed for size {h}x{w}'
assert mask.dtype == np.uint8


@@ -13,13 +13,13 @@
__license__ = 'MIT'
__author__ = 'Yakhyokhuja Valikhujaev'
__version__ = '1.3.2'
__version__ = '1.5.3'
from uniface.face_utils import compute_similarity, face_alignment
from uniface.log import Logger, enable_logging
from uniface.model_store import verify_model_weights
from uniface.visualization import draw_detections
from uniface.visualization import draw_detections, vis_parsing_maps
from .analyzer import FaceAnalyzer
from .attribute import AgeGender
@@ -37,7 +37,9 @@ from .detection import (
detect_faces,
list_available_detectors,
)
from .gaze import MobileGaze, create_gaze_estimator
from .landmark import Landmark106, create_landmarker
from .parsing import BiSeNet, create_face_parser
from .recognition import ArcFace, MobileFace, SphereFace, create_recognizer
__all__ = [
@@ -49,6 +51,8 @@ __all__ = [
'FaceAnalyzer',
# Factory functions
'create_detector',
'create_face_parser',
'create_gaze_estimator',
'create_landmarker',
'create_recognizer',
'detect_faces',
@@ -63,12 +67,17 @@ __all__ = [
'SphereFace',
# Landmark models
'Landmark106',
# Gaze models
'MobileGaze',
# Parsing models
'BiSeNet',
# Attribute models
'AgeGender',
'Emotion',
# Utilities
'compute_similarity',
'draw_detections',
'vis_parsing_maps',
'face_alignment',
'verify_model_weights',
'Logger',

View File

@@ -96,6 +96,29 @@ class LandmarkWeights(str, Enum):
DEFAULT = "2d_106"
class GazeWeights(str, Enum):
"""
MobileGaze: Real-Time Gaze Estimation models.
Trained on Gaze360 dataset.
https://github.com/yakhyo/gaze-estimation
"""
RESNET18 = "gaze_resnet18"
RESNET34 = "gaze_resnet34"
RESNET50 = "gaze_resnet50"
MOBILENET_V2 = "gaze_mobilenetv2"
MOBILEONE_S0 = "gaze_mobileone_s0"
class ParsingWeights(str, Enum):
"""
Face Parsing: Semantic Segmentation of Facial Components.
Trained on CelebAMask-HQ dataset.
https://github.com/yakhyo/face-parsing
"""
RESNET18 = "parsing_resnet18"
RESNET34 = "parsing_resnet34"
MODEL_URLS: Dict[Enum, str] = {
# RetinaFace
RetinaFaceWeights.MNET_025: 'https://github.com/yakhyo/uniface/releases/download/weights/retinaface_mv1_0.25.onnx',
@@ -129,6 +152,15 @@ MODEL_URLS: Dict[Enum, str] = {
AgeGenderWeights.DEFAULT: 'https://github.com/yakhyo/uniface/releases/download/weights/genderage.onnx',
# Landmarks
LandmarkWeights.DEFAULT: 'https://github.com/yakhyo/uniface/releases/download/weights/2d106det.onnx',
# Gaze (MobileGaze)
GazeWeights.RESNET18: 'https://github.com/yakhyo/gaze-estimation/releases/download/weights/resnet18_gaze.onnx',
GazeWeights.RESNET34: 'https://github.com/yakhyo/gaze-estimation/releases/download/weights/resnet34_gaze.onnx',
GazeWeights.RESNET50: 'https://github.com/yakhyo/gaze-estimation/releases/download/weights/resnet50_gaze.onnx',
GazeWeights.MOBILENET_V2: 'https://github.com/yakhyo/gaze-estimation/releases/download/weights/mobilenetv2_gaze.onnx',
GazeWeights.MOBILEONE_S0: 'https://github.com/yakhyo/gaze-estimation/releases/download/weights/mobileone_s0_gaze.onnx',
# Parsing
ParsingWeights.RESNET18: 'https://github.com/yakhyo/face-parsing/releases/download/weights/resnet18.onnx',
ParsingWeights.RESNET34: 'https://github.com/yakhyo/face-parsing/releases/download/weights/resnet34.onnx',
}
MODEL_SHA256: Dict[Enum, str] = {
@@ -164,6 +196,15 @@ MODEL_SHA256: Dict[Enum, str] = {
AgeGenderWeights.DEFAULT: '4fde69b1c810857b88c64a335084f1c3fe8f01246c9a191b48c7bb756d6652fb',
# Landmark
LandmarkWeights.DEFAULT: 'f001b856447c413801ef5c42091ed0cd516fcd21f2d6b79635b1e733a7109dbf',
# MobileGaze (trained on Gaze360)
GazeWeights.RESNET18: '23d5d7e4f6f40dce8c35274ce9d08b45b9e22cbaaf5af73182f473229d713d31',
GazeWeights.RESNET34: '4457ee5f7acd1a5ab02da4b61f02fc3a0b17adbf3844dd0ba3cd4288f2b5e1de',
GazeWeights.RESNET50: 'e1eaf98f5ec7c89c6abe7cfe39f7be83e747163f98d1ff945c0603b3c521be22',
GazeWeights.MOBILENET_V2: 'fdcdb84e3e6421b5a79e8f95139f249fc258d7f387eed5ddac2b80a9a15ce076',
GazeWeights.MOBILEONE_S0: 'c0b5a4f4a0ffd24f76ab3c1452354bb2f60110899fd9a88b464c75bafec0fde8',
# Face Parsing
ParsingWeights.RESNET18: '0d9bd318e46987c3bdbfacae9e2c0f461cae1c6ac6ea6d43bbe541a91727e33f',
ParsingWeights.RESNET34: '5b805bba7b5660ab7070b5a381dcf75e5b3e04199f1e9387232a77a00095102e',
}
CHUNK_SIZE = 8192

uniface/gaze/__init__.py Normal file

@@ -0,0 +1,58 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from .base import BaseGazeEstimator
from .models import MobileGaze
def create_gaze_estimator(method: str = 'mobilegaze', **kwargs) -> BaseGazeEstimator:
"""
Factory function to create gaze estimators.
This function initializes and returns a gaze estimator instance based on the
specified method. It acts as a high-level interface to the underlying
model classes.
Args:
method (str): The gaze estimation method to use.
Options: 'mobilegaze' (default).
**kwargs: Model-specific parameters passed to the estimator's constructor.
For example, `model_name` can be used to select a specific
backbone from `GazeWeights` enum (RESNET18, RESNET34, RESNET50,
MOBILENET_V2, MOBILEONE_S0).
Returns:
BaseGazeEstimator: An initialized gaze estimator instance ready for use.
Raises:
ValueError: If the specified `method` is not supported.
Examples:
>>> # Create the default MobileGaze estimator (ResNet34 backbone)
>>> estimator = create_gaze_estimator()
>>> # Create with MobileNetV2 backbone
>>> from uniface.constants import GazeWeights
>>> estimator = create_gaze_estimator(
... 'mobilegaze',
... model_name=GazeWeights.MOBILENET_V2
... )
>>> # Use the estimator
>>> pitch, yaw = estimator.estimate(face_crop)
"""
method = method.lower()
if method in ('mobilegaze', 'mobile_gaze', 'gaze'):
return MobileGaze(**kwargs)
else:
available = ['mobilegaze']
raise ValueError(f"Unsupported gaze estimation method: '{method}'. Available: {available}")
__all__ = [
'create_gaze_estimator',
'MobileGaze',
'BaseGazeEstimator',
]

uniface/gaze/base.py Normal file

@@ -0,0 +1,108 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from abc import ABC, abstractmethod
from typing import Tuple
import numpy as np
class BaseGazeEstimator(ABC):
"""
Abstract base class for all gaze estimation models.
This class defines the common interface that all gaze estimators must implement,
ensuring consistency across different gaze estimation methods. Gaze estimation
predicts the direction a person is looking based on their face image.
The gaze direction is represented as pitch and yaw angles in radians:
- Pitch: Vertical angle (positive = looking up, negative = looking down)
- Yaw: Horizontal angle (positive = looking right, negative = looking left)
"""
@abstractmethod
def _initialize_model(self) -> None:
"""
Initialize the underlying model for inference.
This method should handle loading model weights, creating the
inference session (e.g., ONNX Runtime), and any necessary
setup procedures to prepare the model for prediction.
Raises:
RuntimeError: If the model fails to load or initialize.
"""
raise NotImplementedError('Subclasses must implement the _initialize_model method.')
@abstractmethod
def preprocess(self, face_image: np.ndarray) -> np.ndarray:
"""
Preprocess the input face image for model inference.
This method should take a raw face crop and convert it into the format
expected by the model's inference engine (e.g., normalized tensor).
Args:
face_image (np.ndarray): A cropped face image in BGR format with
shape (H, W, C).
Returns:
np.ndarray: The preprocessed image tensor ready for inference,
typically with shape (1, C, H, W).
"""
raise NotImplementedError('Subclasses must implement the preprocess method.')
@abstractmethod
def postprocess(self, outputs: Tuple[np.ndarray, np.ndarray]) -> Tuple[float, float]:
"""
Postprocess raw model outputs into gaze angles.
This method takes the raw output from the model's inference and
converts it into pitch and yaw angles in radians.
Args:
outputs: Raw outputs from the model inference. The format depends
on the specific model architecture.
Returns:
Tuple[float, float]: A tuple of (pitch, yaw) angles in radians.
"""
raise NotImplementedError('Subclasses must implement the postprocess method.')
@abstractmethod
def estimate(self, face_image: np.ndarray) -> Tuple[float, float]:
"""
Perform end-to-end gaze estimation on a face image.
This method orchestrates the full pipeline: preprocessing the input,
running inference, and postprocessing to return the gaze direction.
Args:
face_image (np.ndarray): A cropped face image in BGR format.
The face should be roughly centered and
well-framed within the image.
Returns:
Tuple[float, float]: A tuple of (pitch, yaw) angles in radians:
- pitch: Vertical gaze angle (positive = up, negative = down)
- yaw: Horizontal gaze angle (positive = right, negative = left)
Example:
>>> estimator = create_gaze_estimator()
>>> pitch, yaw = estimator.estimate(face_crop)
>>> print(f"Looking: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°")
"""
raise NotImplementedError('Subclasses must implement the estimate method.')
def __call__(self, face_image: np.ndarray) -> Tuple[float, float]:
"""
Provides a convenient, callable shortcut for the `estimate` method.
Args:
face_image (np.ndarray): A cropped face image in BGR format.
Returns:
Tuple[float, float]: A tuple of (pitch, yaw) angles in radians.
"""
return self.estimate(face_image)

uniface/gaze/models.py Normal file

@@ -0,0 +1,187 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Tuple
import cv2
import numpy as np
from uniface.constants import GazeWeights
from uniface.log import Logger
from uniface.model_store import verify_model_weights
from uniface.onnx_utils import create_onnx_session
from .base import BaseGazeEstimator
__all__ = ['MobileGaze']
class MobileGaze(BaseGazeEstimator):
"""
MobileGaze: Real-Time Gaze Estimation with ONNX Runtime.
MobileGaze is a gaze estimation model that predicts gaze direction from a single
face image. It supports multiple backbone architectures including ResNet 18/34/50,
MobileNetV2, and MobileOne S0. The model uses a classification approach with binned
angles, which are then decoded to continuous pitch and yaw values.
The model outputs gaze direction as pitch (vertical) and yaw (horizontal) angles
in radians.
Reference:
https://github.com/yakhyo/gaze-estimation
Args:
model_name (GazeWeights): The enum specifying the gaze model backbone to load.
Options: RESNET18, RESNET34, RESNET50, MOBILENET_V2, MOBILEONE_S0.
Defaults to `GazeWeights.RESNET34`.
input_size (Tuple[int, int]): The resolution (width, height) for the model's
input. Defaults to (448, 448).
Attributes:
input_size (Tuple[int, int]): Model input dimensions.
input_mean (list): Per-channel mean values for normalization (ImageNet).
input_std (list): Per-channel std values for normalization (ImageNet).
Example:
>>> from uniface.gaze import MobileGaze
>>> from uniface import RetinaFace
>>>
>>> detector = RetinaFace()
>>> gaze_estimator = MobileGaze()
>>>
>>> # Detect faces and estimate gaze for each
>>> faces = detector.detect(image)
>>> for face in faces:
... bbox = face['bbox']
... x1, y1, x2, y2 = map(int, bbox[:4])
... face_crop = image[y1:y2, x1:x2]
... pitch, yaw = gaze_estimator.estimate(face_crop)
... print(f"Gaze: pitch={np.degrees(pitch):.1f}°, yaw={np.degrees(yaw):.1f}°")
"""
def __init__(
self,
model_name: GazeWeights = GazeWeights.RESNET34,
input_size: Tuple[int, int] = (448, 448),
) -> None:
Logger.info(f'Initializing MobileGaze with model={model_name}, input_size={input_size}')
self.input_size = input_size
self.input_mean = [0.485, 0.456, 0.406]
self.input_std = [0.229, 0.224, 0.225]
# Model specific parameters for bin-based classification (Gaze360 config)
self._bins = 90
self._binwidth = 4
self._angle_offset = 180
self._idx_tensor = np.arange(self._bins, dtype=np.float32)
self.model_path = verify_model_weights(model_name)
self._initialize_model()
def _initialize_model(self) -> None:
"""
Initialize the ONNX model from the stored model path.
Raises:
RuntimeError: If the model fails to load or initialize.
"""
try:
self.session = create_onnx_session(self.model_path)
# Get input configuration
input_cfg = self.session.get_inputs()[0]
input_shape = input_cfg.shape
self.input_name = input_cfg.name
self.input_size = tuple(input_shape[2:4][::-1]) # Update from model
# Get output configuration
outputs = self.session.get_outputs()
self.output_names = [output.name for output in outputs]
if len(self.output_names) != 2:
raise ValueError(f'Expected 2 output nodes (pitch, yaw), got {len(self.output_names)}')
Logger.info(f'MobileGaze initialized with input size {self.input_size}')
except Exception as e:
Logger.error(f"Failed to load gaze model from '{self.model_path}'", exc_info=True)
raise RuntimeError(f'Failed to initialize gaze model: {e}') from e
def preprocess(self, face_image: np.ndarray) -> np.ndarray:
"""
Preprocess a face crop for gaze estimation.
Args:
face_image (np.ndarray): A cropped face image in BGR format.
Returns:
np.ndarray: Preprocessed image tensor with shape (1, 3, H, W).
"""
# Convert BGR to RGB
image = cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)
# Resize to model input size
image = cv2.resize(image, self.input_size)
# Normalize to [0, 1] and apply normalization
image = image.astype(np.float32) / 255.0
mean = np.array(self.input_mean, dtype=np.float32)
std = np.array(self.input_std, dtype=np.float32)
image = (image - mean) / std
# HWC -> CHW -> NCHW
image = np.transpose(image, (2, 0, 1))
image = np.expand_dims(image, axis=0).astype(np.float32)
return image
def _softmax(self, x: np.ndarray) -> np.ndarray:
"""Apply softmax along axis 1."""
e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
return e_x / e_x.sum(axis=1, keepdims=True)
def postprocess(self, outputs: Tuple[np.ndarray, np.ndarray]) -> Tuple[np.ndarray, np.ndarray]:
"""
Postprocess raw model outputs into gaze angles.
This method takes the raw output from the model's inference and
converts it into pitch and yaw angles in radians.
Args:
outputs: Raw outputs from the model inference. The format depends
on the specific model architecture.
Returns:
Tuple[np.ndarray, np.ndarray]: A tuple of (pitch, yaw) angles in radians.
"""
pitch_logits, yaw_logits = outputs
# Convert logits to probabilities
pitch_probs = self._softmax(pitch_logits)
yaw_probs = self._softmax(yaw_logits)
# Compute expected bin index (soft-argmax)
pitch_deg = np.sum(pitch_probs * self._idx_tensor, axis=1) * self._binwidth - self._angle_offset
yaw_deg = np.sum(yaw_probs * self._idx_tensor, axis=1) * self._binwidth - self._angle_offset
# Convert degrees to radians
pitch = np.radians(pitch_deg[0])
yaw = np.radians(yaw_deg[0])
return pitch, yaw
def estimate(self, face_image: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
"""
Perform end-to-end gaze estimation on a face image.
This method orchestrates the full pipeline: preprocessing the input,
running inference, and postprocessing to return the gaze direction.
"""
input_tensor = self.preprocess(face_image)
outputs = self.session.run(self.output_names, {self.input_name: input_tensor})
pitch, yaw = self.postprocess((outputs[0], outputs[1]))
return pitch, yaw


@@ -0,0 +1,61 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Union
from uniface.constants import ParsingWeights
from .base import BaseFaceParser
from .bisenet import BiSeNet
__all__ = ['BaseFaceParser', 'BiSeNet', 'create_face_parser']
def create_face_parser(
model_name: Union[str, ParsingWeights] = ParsingWeights.RESNET18,
) -> BaseFaceParser:
"""
Factory function to create a face parsing model instance.
This function provides a convenient way to instantiate face parsing models
without directly importing the specific model classes. It supports both
string-based and enum-based model selection.
Args:
model_name (Union[str, ParsingWeights]): The face parsing model to create.
Can be either a string or a ParsingWeights enum value.
Available options:
- 'parsing_resnet18' or ParsingWeights.RESNET18 (default)
- 'parsing_resnet34' or ParsingWeights.RESNET34
Returns:
BaseFaceParser: An instance of the requested face parsing model.
Raises:
ValueError: If the model_name is not recognized.
Examples:
>>> # Using enum
>>> from uniface.parsing import create_face_parser
>>> from uniface.constants import ParsingWeights
>>> parser = create_face_parser(ParsingWeights.RESNET18)
>>>
>>> # Using string
>>> parser = create_face_parser('parsing_resnet18')
>>>
>>> # Parse a face image
>>> mask = parser.parse(face_crop)
"""
# Convert string to enum if necessary
if isinstance(model_name, str):
try:
model_name = ParsingWeights(model_name)
except ValueError as e:
valid_models = [e.value for e in ParsingWeights]
raise ValueError(
f"Unknown face parsing model: '{model_name}'. Valid options are: {', '.join(valid_models)}"
) from e
# All parsing models use the same BiSeNet class
return BiSeNet(model_name=model_name)

uniface/parsing/base.py Normal file

@@ -0,0 +1,106 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from abc import ABC, abstractmethod
from typing import Tuple
import numpy as np
class BaseFaceParser(ABC):
"""
Abstract base class for all face parsing models.
This class defines the common interface that all face parsing models must implement,
ensuring consistency across different parsing methods. Face parsing segments a face
image into semantic regions such as skin, eyes, nose, mouth, hair, etc.
The output is a segmentation mask where each pixel is assigned a class label
representing a facial component.
"""
@abstractmethod
def _initialize_model(self) -> None:
"""
Initialize the underlying model for inference.
This method should handle loading model weights, creating the
inference session (e.g., ONNX Runtime), and any necessary
setup procedures to prepare the model for prediction.
Raises:
RuntimeError: If the model fails to load or initialize.
"""
raise NotImplementedError('Subclasses must implement the _initialize_model method.')
@abstractmethod
def preprocess(self, face_image: np.ndarray) -> np.ndarray:
"""
Preprocess the input face image for model inference.
This method should take a raw face crop and convert it into the format
expected by the model's inference engine (e.g., normalized tensor).
Args:
face_image (np.ndarray): A face image in BGR format with
shape (H, W, C).
Returns:
np.ndarray: The preprocessed image tensor ready for inference,
typically with shape (1, C, H, W).
"""
raise NotImplementedError('Subclasses must implement the preprocess method.')
@abstractmethod
def postprocess(self, outputs: np.ndarray, original_size: Tuple[int, int]) -> np.ndarray:
"""
Postprocess raw model outputs into a segmentation mask.
This method takes the raw output from the model's inference and
converts it into a segmentation mask at the original image size.
Args:
outputs (np.ndarray): Raw outputs from the model inference.
original_size (Tuple[int, int]): Original image size (width, height).
Returns:
np.ndarray: Segmentation mask with the same size as the original image.
"""
raise NotImplementedError('Subclasses must implement the postprocess method.')
@abstractmethod
def parse(self, face_image: np.ndarray) -> np.ndarray:
"""
Perform end-to-end face parsing on a face image.
This method orchestrates the full pipeline: preprocessing the input,
running inference, and postprocessing to return the segmentation mask.
Args:
face_image (np.ndarray): A face image in BGR format.
The face should be roughly centered and
well-framed within the image.
Returns:
np.ndarray: Segmentation mask with the same size as input image,
where each pixel value represents a facial component class.
Example:
>>> parser = create_face_parser()
>>> mask = parser.parse(face_crop)
>>> print(f"Mask shape: {mask.shape}, unique classes: {np.unique(mask)}")
"""
raise NotImplementedError('Subclasses must implement the parse method.')
def __call__(self, face_image: np.ndarray) -> np.ndarray:
"""
Provides a convenient, callable shortcut for the `parse` method.
Args:
face_image (np.ndarray): A face image in BGR format.
Returns:
np.ndarray: Segmentation mask with the same size as input image.
"""
return self.parse(face_image)

uniface/parsing/bisenet.py Normal file

@@ -0,0 +1,166 @@
# Copyright 2025 Yakhyokhuja Valikhujaev
# Author: Yakhyokhuja Valikhujaev
# GitHub: https://github.com/yakhyo
from typing import Tuple
import cv2
import numpy as np
from uniface.constants import ParsingWeights
from uniface.log import Logger
from uniface.model_store import verify_model_weights
from uniface.onnx_utils import create_onnx_session
from .base import BaseFaceParser
__all__ = ['BiSeNet']
class BiSeNet(BaseFaceParser):
"""
BiSeNet: Bilateral Segmentation Network for Face Parsing with ONNX Runtime.
BiSeNet is a semantic segmentation model that segments a face image into
different facial components such as skin, eyes, nose, mouth, hair, etc. The model
uses a BiSeNet architecture with ResNet backbone and outputs a segmentation mask
where each pixel is assigned a class label.
The model supports 19 facial component classes including:
- Background, skin, eyebrows, eyes, nose, mouth, lips, ears, hair, etc.
Reference:
https://github.com/yakhyo/face-parsing
Args:
model_name (ParsingWeights): The enum specifying the parsing model to load.
Options: RESNET18, RESNET34.
Defaults to `ParsingWeights.RESNET18`.
input_size (Tuple[int, int]): The resolution (width, height) for the model's
input. Defaults to (512, 512).
Attributes:
input_size (Tuple[int, int]): Model input dimensions.
input_mean (np.ndarray): Per-channel mean values for normalization (ImageNet).
input_std (np.ndarray): Per-channel std values for normalization (ImageNet).
Example:
>>> from uniface.parsing import BiSeNet
>>> from uniface import RetinaFace
>>>
>>> detector = RetinaFace()
>>> parser = BiSeNet()
>>>
>>> # Detect faces and parse each face
>>> faces = detector.detect(image)
>>> for face in faces:
... bbox = face['bbox']
... x1, y1, x2, y2 = map(int, bbox[:4])
... face_crop = image[y1:y2, x1:x2]
... mask = parser.parse(face_crop)
... print(f"Mask shape: {mask.shape}, unique classes: {np.unique(mask)}")
"""
def __init__(
self,
model_name: ParsingWeights = ParsingWeights.RESNET18,
input_size: Tuple[int, int] = (512, 512),
) -> None:
Logger.info(f'Initializing BiSeNet with model={model_name}, input_size={input_size}')
self.input_size = input_size
self.input_mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
self.input_std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
self.model_path = verify_model_weights(model_name)
self._initialize_model()
def _initialize_model(self) -> None:
"""
Initialize the ONNX model from the stored model path.
Raises:
RuntimeError: If the model fails to load or initialize.
"""
try:
self.session = create_onnx_session(self.model_path)
# Get input configuration
input_cfg = self.session.get_inputs()[0]
input_shape = input_cfg.shape
self.input_name = input_cfg.name
self.input_size = tuple(input_shape[2:4][::-1]) # Update from model
# Get output configuration
outputs = self.session.get_outputs()
self.output_names = [output.name for output in outputs]
Logger.info(f'BiSeNet initialized with input size {self.input_size}')
except Exception as e:
Logger.error(f"Failed to load parsing model from '{self.model_path}'", exc_info=True)
raise RuntimeError(f'Failed to initialize parsing model: {e}') from e
def preprocess(self, face_image: np.ndarray) -> np.ndarray:
"""
Preprocess a face image for parsing.
Args:
face_image (np.ndarray): A face image in BGR format.
Returns:
np.ndarray: Preprocessed image tensor with shape (1, 3, H, W).
"""
# Convert BGR to RGB
image = cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)
# Resize to model input size
image = cv2.resize(image, self.input_size, interpolation=cv2.INTER_LINEAR)
# Normalize to [0, 1] and apply normalization
image = image.astype(np.float32) / 255.0
image = (image - self.input_mean) / self.input_std
# HWC -> CHW -> NCHW
image = np.transpose(image, (2, 0, 1))
image = np.expand_dims(image, axis=0).astype(np.float32)
return image
def postprocess(self, outputs: np.ndarray, original_size: Tuple[int, int]) -> np.ndarray:
"""
Postprocess model output to segmentation mask.
Args:
outputs (np.ndarray): Raw model output.
original_size (Tuple[int, int]): Original image size (width, height).
Returns:
np.ndarray: Segmentation mask resized to original dimensions.
"""
# Get the class with highest probability for each pixel
predicted_mask = outputs.squeeze(0).argmax(0).astype(np.uint8)
# Resize back to original size
restored_mask = cv2.resize(predicted_mask, original_size, interpolation=cv2.INTER_NEAREST)
return restored_mask
def parse(self, face_image: np.ndarray) -> np.ndarray:
"""
Perform end-to-end face parsing on a face image.
This method orchestrates the full pipeline: preprocessing the input,
running inference, and postprocessing to return the segmentation mask.
Args:
face_image (np.ndarray): A face image in BGR format.
Returns:
np.ndarray: Segmentation mask with the same size as input image.
"""
original_size = (face_image.shape[1], face_image.shape[0]) # (width, height)
input_tensor = self.preprocess(face_image)
outputs = self.session.run(self.output_names, {self.input_name: input_tensor})
return self.postprocess(outputs[0], original_size)


@@ -7,6 +7,52 @@ from typing import List, Tuple, Union
import cv2
import numpy as np
# Face parsing component names (19 classes)
FACE_PARSING_LABELS = [
'background',
'skin',
'l_brow',
'r_brow',
'l_eye',
'r_eye',
'eye_g',
'l_ear',
'r_ear',
'ear_r',
'nose',
'mouth',
'u_lip',
'l_lip',
'neck',
'neck_l',
'cloth',
'hair',
'hat',
]
# Color palette for face parsing visualization
FACE_PARSING_COLORS = [
[0, 0, 0],
[255, 85, 0],
[255, 170, 0],
[255, 0, 85],
[255, 0, 170],
[0, 255, 0],
[85, 255, 0],
[170, 255, 0],
[0, 255, 85],
[0, 255, 170],
[0, 0, 255],
[85, 0, 255],
[170, 0, 255],
[0, 85, 255],
[0, 170, 255],
[255, 255, 0],
[255, 255, 85],
[255, 255, 170],
[255, 0, 255],
]
def draw_detections(
*,
@@ -126,3 +172,159 @@ def draw_fancy_bbox(
# Bottom-right corner
cv2.line(image, (x2, y2), (x2, y2 - corner_length), color, thickness)
cv2.line(image, (x2, y2), (x2 - corner_length, y2), color, thickness)
def draw_gaze(
image: np.ndarray,
bbox: np.ndarray,
pitch: np.ndarray,
yaw: np.ndarray,
*,
draw_bbox: bool = True,
fancy_bbox: bool = True,
draw_angles: bool = True,
):
"""
Draws gaze direction with optional bounding box on an image.
Args:
image: Input image to draw on (modified in-place).
bbox: Face bounding box [x1, y1, x2, y2].
pitch: Vertical gaze angle in radians.
yaw: Horizontal gaze angle in radians.
draw_bbox: Whether to draw the bounding box. Defaults to True.
fancy_bbox: Use fancy corner-style bbox. Defaults to True.
draw_angles: Whether to display pitch/yaw values as text. Defaults to True.
"""
x_min, y_min, x_max, y_max = map(int, bbox[:4])
# Calculate dynamic line thickness based on image size (same as draw_detections)
line_thickness = max(round(sum(image.shape[:2]) / 2 * 0.003), 2)
# Calculate dynamic font scale based on bbox height (same as draw_detections)
bbox_h = y_max - y_min
font_scale = max(0.4, min(0.7, bbox_h / 200))
font_thickness = 2
# Draw bounding box if requested
if draw_bbox:
if fancy_bbox:
draw_fancy_bbox(image, bbox, color=(0, 255, 0), thickness=line_thickness)
else:
cv2.rectangle(image, (x_min, y_min), (x_max, y_max), (0, 255, 0), line_thickness)
# Calculate center of the bounding box
x_center = (x_min + x_max) // 2
y_center = (y_min + y_max) // 2
# Calculate the direction of the gaze
length = x_max - x_min
dx = int(-length * np.sin(pitch) * np.cos(yaw))
dy = int(-length * np.sin(yaw))
point1 = (x_center, y_center)
point2 = (x_center + dx, y_center + dy)
# Calculate dynamic center point radius based on line thickness
center_radius = max(line_thickness + 1, 4)
# Draw gaze direction
cv2.circle(image, (x_center, y_center), radius=center_radius, color=(0, 0, 255), thickness=-1)
cv2.arrowedLine(
image,
point1,
point2,
color=(0, 0, 255),
thickness=line_thickness,
line_type=cv2.LINE_AA,
tipLength=0.25,
)
# Draw angle values
if draw_angles:
text = f'P:{np.degrees(pitch):.0f}deg Y:{np.degrees(yaw):.0f}deg'
(text_width, text_height), baseline = cv2.getTextSize(
text, cv2.FONT_HERSHEY_SIMPLEX, font_scale, font_thickness
)
# Draw background rectangle for text
cv2.rectangle(
image,
(x_min, y_min - text_height - baseline - 10),
(x_min + text_width + 10, y_min),
(0, 0, 255),
-1,
)
# Draw text
cv2.putText(
image,
text,
(x_min + 5, y_min - 5),
cv2.FONT_HERSHEY_SIMPLEX,
font_scale,
(255, 255, 255),
font_thickness,
)
def vis_parsing_maps(
image: np.ndarray,
segmentation_mask: np.ndarray,
*,
save_image: bool = False,
save_path: str = 'result.png',
) -> np.ndarray:
"""
Visualizes face parsing segmentation mask by overlaying colored regions on the image.
Args:
image: Input face image in RGB format with shape (H, W, 3).
segmentation_mask: Segmentation mask with shape (H, W) where each pixel
value represents a facial component class (0-18).
save_image: Whether to save the visualization to disk. Defaults to False.
save_path: Path to save the visualization if save_image is True.
Returns:
np.ndarray: Blended image with segmentation overlay in BGR format.
Example:
>>> import cv2
>>> from uniface.parsing import BiSeNet
>>> from uniface.visualization import vis_parsing_maps
>>>
>>> parser = BiSeNet()
>>> face_image = cv2.imread('face.jpg')
>>> mask = parser.parse(face_image)
>>>
>>> # Visualize
>>> face_rgb = cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)
>>> result = vis_parsing_maps(face_rgb, mask)
>>> cv2.imwrite('parsed_face.jpg', result)
"""
# Create numpy arrays for image and segmentation mask
image = np.array(image).copy().astype(np.uint8)
segmentation_mask = segmentation_mask.copy().astype(np.uint8)
# Create a color mask
segmentation_mask_color = np.zeros((segmentation_mask.shape[0], segmentation_mask.shape[1], 3))
num_classes = np.max(segmentation_mask)
for class_index in range(1, num_classes + 1):
class_pixels = np.where(segmentation_mask == class_index)
segmentation_mask_color[class_pixels[0], class_pixels[1], :] = FACE_PARSING_COLORS[class_index]
segmentation_mask_color = segmentation_mask_color.astype(np.uint8)
# Convert image to BGR format for blending
bgr_image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
# Blend the image with the segmentation mask
blended_image = cv2.addWeighted(bgr_image, 0.6, segmentation_mask_color, 0.4, 0)
# Save the result if required
if save_image:
cv2.imwrite(save_path, blended_image, [int(cv2.IMWRITE_JPEG_QUALITY), 100])
return blended_image