Facial Keypoints Detection

Dual-framework deep learning system for detecting 15 facial keypoints on 96×96 grayscale images

Built for the Kaggle Facial Keypoints Detection competition. Implements a PyTorch ResNet with a custom NaN-aware loss and a Keras CNN with two-phase training. The ResNet scores best, with a leaderboard RMSE of 2.10, against 2.55 for the CNN.

ResNet predictions: detected keypoints (red) on unseen test faces

Results

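| Model  | Framework        | Kaggle RMSE | Parameters | Strategy                      |
|--------|------------------|-------------|------------|-------------------------------|
| ResNet | PyTorch          | 2.10        | ~4.2M      | Adam + StepLR + NaN-aware MSE |
| CNN    | TensorFlow/Keras | 2.55        | ~1.5M      | Two-phase: Adam → SGD + Huber |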

Training curves — Left: ResNet (Adam + StepLR) | Right: CNN (Two-Phase Adam → SGD)

Key Innovation: NaN-Aware Loss

Only ~2,140 of 7,049 training samples have all 15 keypoints labeled. Rather than discarding ~70% of the data, the custom MSELossIgnoreNan masks missing targets and computes gradients only on available labels:

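import torch
import torch.nn as nn


class MSELossIgnoreNan(nn.Module):
    """MSE over labeled targets only; missing (NaN) keypoints are masked out."""

    def forward(self, pred, target):
        mask = torch.isfinite(target)  # True where a label exists
        count = mask.sum()
        if count == 0:
            # No labels in this batch: return a zero on the same device/dtype as pred
            return pred.new_zeros((), requires_grad=True)
        return ((pred[mask] - target[mask]) ** 2).sum() / count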
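
To make the masking arithmetic concrete, here is a framework-free sketch of the same computation on plain Python lists. `masked_mse` is a hypothetical helper written for illustration, not part of the repository; it only mirrors the finite-target mask described above.

```python
import math


def masked_mse(pred, target):
    """Mean squared error over the positions where the target is finite."""
    pairs = [(p, t) for p, t in zip(pred, target) if math.isfinite(t)]
    if not pairs:
        return 0.0  # nothing labeled, nothing to penalize
    return sum((p - t) ** 2 for p, t in pairs) / len(pairs)


nan = float("nan")
pred = [10.0, 20.0, 30.0, 40.0]
target = [12.0, nan, 27.0, nan]  # two of four coordinates are unlabeled

loss = masked_mse(pred, target)
print(loss)  # (2**2 + 3**2) / 2 = 6.5
```

The two unlabeled coordinates contribute nothing to the sum or to the divisor, so a partially labeled face still yields a well-defined loss.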

Features

Dual-framework architecture — PyTorch ResNet and Keras CNN sharing data loading and configuration
NaN-aware loss — Custom MSELossIgnoreNan trains on all 7,049 samples including partially-labeled ones
Two-phase CNN training — Adam for fast convergence, then SGD with ReduceLROnPlateau for fine-tuning
Deep residual learning — 6-stage ResNet (32→512 channels) with batch normalization and skip connections
Centralized YAML config — All hyperparameters in config/default.yaml with typed frozen dataclass validation
Unit tests — pytest suite covering models, loss functions, dataset utilities, and config loading

Project Structure

├── config/default.yaml         # Hyperparameters for both models
├── keypoints/
│   ├── config.py               # Frozen dataclasses + YAML loader
│   ├── models/
│   │   ├── cnn.py              # Keras CNN (3 conv blocks + 2 FC layers)
│   │   └── resnet.py           # PyTorch ResNet (12 residual blocks)
│   └── utils/
│       ├── dataset.py          # Data loading, preprocessing, PyTorch Dataset
│       ├── losses.py           # MSELossIgnoreNan
│       └── visualization.py    # Keypoint plotting, training curves
├── tests/
│   └── test_models.py          # Unit tests for models, loss, dataset, config
├── train_cnn.py                # CNN training entry point
├── train_resnet.py             # ResNet training entry point
├── predict.py                  # Inference + Kaggle submission
└── assets/                     # Sample output visualizations

Quick Start

Installation

git clone https://github.com/allureking/FacialKeypointsDetection.git
cd FacialKeypointsDetection

# Install with both frameworks
pip install -e ".[all]"

# Or install for one framework only
pip install -e ".[pytorch]"    # PyTorch ResNet only
pip install -e ".[tensorflow]" # Keras CNN only

Download Data

Download the Kaggle dataset and place files in data/:

data/
├── training.csv
├── test.csv
└── IdLookupTable.csv
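
Each row of training.csv holds 30 keypoint coordinate columns plus an Image column of 9,216 space-separated pixel intensities, with empty cells where a keypoint was not labeled. The sketch below parses one such row using only the standard library. `parse_row` is a hypothetical helper, and scaling pixels to [0, 1] is an assumption; keypoints/utils/dataset.py may handle this differently.

```python
import csv
import io
import math

IMAGE_SIZE = 96


def parse_row(row):
    """Split one CSV row into a 96x96 image in [0, 1] and a keypoint dict."""
    pixels = [int(v) / 255.0 for v in row["Image"].split()]
    if len(pixels) != IMAGE_SIZE * IMAGE_SIZE:
        raise ValueError(f"expected {IMAGE_SIZE * IMAGE_SIZE} pixels, got {len(pixels)}")
    image = [pixels[r * IMAGE_SIZE:(r + 1) * IMAGE_SIZE] for r in range(IMAGE_SIZE)]
    # Empty cells mean the keypoint was not labeled; keep them as NaN.
    keypoints = {
        name: float(value) if value.strip() else math.nan
        for name, value in row.items()
        if name != "Image"
    }
    return image, keypoints


# Tiny synthetic file with two of the 30 keypoint columns, for illustration only.
fake_csv = (
    "left_eye_center_x,nose_tip_y,Image\n"
    "66.0,," + " ".join(["128"] * (IMAGE_SIZE * IMAGE_SIZE)) + "\n"
)
image, keypoints = parse_row(next(csv.DictReader(io.StringIO(fake_csv))))
print(len(image), len(image[0]), keypoints)
```

Keeping missing labels as NaN rather than dropping the row is what lets a NaN-aware loss use the partially labeled samples.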

Train

# Train ResNet (PyTorch)
python train_resnet.py --config config/default.yaml

# Train CNN (Keras/TensorFlow)
python train_cnn.py --config config/default.yaml

Predict

# Generate Kaggle submission
python predict.py --model resnet --weights best_model.pth
python predict.py --model cnn --weights sgd_best.h5
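
For context, IdLookupTable.csv lists the (ImageId, FeatureName) pairs that Kaggle scores, and the submission file needs one RowId,Location line per pair. A minimal sketch of that mapping follows. `build_submission` is a hypothetical helper and predict.py may be organized differently; clamping coordinates to the 0 to 96 pixel range is a common precaution assumed here, not something taken from the repository.

```python
import csv
import io


def build_submission(lookup_rows, predictions, image_size=96.0):
    """Map per-image predictions onto the RowIds requested by the lookup table.

    lookup_rows: dicts with RowId, ImageId, FeatureName (e.g. from csv.DictReader).
    predictions: {image_id: {feature_name: coordinate}}.
    """
    out = []
    for row in lookup_rows:
        value = predictions[int(row["ImageId"])][row["FeatureName"]]
        value = min(max(value, 0.0), image_size)  # keep coordinates inside the image
        out.append({"RowId": int(row["RowId"]), "Location": value})
    return out


# Two synthetic lookup rows; the real table covers every test image.
lookup_csv = (
    "RowId,ImageId,FeatureName,Location\n"
    "1,1,left_eye_center_x,\n"
    "2,1,nose_tip_y,\n"
)
predictions = {1: {"left_eye_center_x": 66.3, "nose_tip_y": 97.2}}
rows = build_submission(csv.DictReader(io.StringIO(lookup_csv)), predictions)

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["RowId", "Location"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```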

Run Tests

pytest tests/ -v

Architecture

ResNet (PyTorch)

Input (1×96×96)
  └── Stem: Conv(1→32) + BN + ReLU + MaxPool
      └── Stage 1: ResBlock×2 (32→32)
          └── Stage 2: ResBlock×2 (32→64)
              └── Stage 3: ResBlock×2 (64→128)
                  └── Stage 4: ResBlock×2 (128→256)
                      └── Stage 5: ResBlock×2 (256→512)
                          └── AdaptiveAvgPool → Linear(512→30)

Each residual block: x → Conv3×3 → BN → ReLU → Conv3×3 → BN → (+shortcut) → ReLU
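
The block description above can be sketched in PyTorch as follows. This is an illustrative reading, not the code in keypoints/models/resnet.py: the class name `ResidualBlock`, the stride argument, and the 1×1 projection shortcut used when the shape changes are all assumptions.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Conv3x3 -> BN -> ReLU -> Conv3x3 -> BN -> (+shortcut) -> ReLU."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU()
        if stride != 1 or in_ch != out_ch:
            # Assumed: project the input so it can be added to the main path.
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )
        else:
            self.shortcut = nn.Identity()

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + self.shortcut(x))


x = torch.randn(2, 32, 24, 24)
out = ResidualBlock(32, 64, stride=2)(x)
print(out.shape)  # torch.Size([2, 64, 12, 12])
```

The skip connection adds the (possibly projected) input back before the final ReLU, so each block only has to learn a correction to its input.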

CNN (Keras)

Input (96×96×1)
  └── Conv2D(32) → LeakyReLU → MaxPool → Dropout(0.05)
      └── Conv2D(64) → LeakyReLU → MaxPool → Dropout(0.01)
          └── Conv2D(128) → LeakyReLU → MaxPool → Dropout(0.15)
              └── Dense(500) → Dense(500) → Dense(30)

Two-phase training: Adam (lr=5e-4) with Huber loss → SGD (lr=1e-3) with ReduceLROnPlateau.
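
As a reminder of what the Huber loss does, the sketch below evaluates it for a single residual in plain Python. The threshold `delta=1.0` matches the Keras default but is an assumption here; the value used in this project is not stated.

```python
def huber(error, delta=1.0):
    """Huber loss for one residual: quadratic near zero, linear in the tails."""
    abs_error = abs(error)
    if abs_error <= delta:
        return 0.5 * error ** 2
    return delta * (abs_error - 0.5 * delta)


# Compare against the half squared error for a small and a large residual.
for e in (0.5, 10.0):
    print(e, huber(e), 0.5 * e ** 2)
```

For a residual of 10 pixels the Huber value is 9.5 where half the squared error would be 50, so a badly mislabeled keypoint pulls on the weights far less than it would under MSE.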

Configuration

All hyperparameters are centralized in config/default.yaml and loaded into frozen dataclasses:

resnet:
  batch_size: 16
  learning_rate: 0.0001
  epochs: 100
  patience: 5
  step_lr_size: 5
  step_lr_gamma: 0.1
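
A minimal sketch of how such a section might map onto a frozen dataclass, using the values above. `ResNetConfig` and `lr_at` are hypothetical names, not the definitions in keypoints/config.py, and the schedule assumes step_lr_size and step_lr_gamma are passed to a step scheduler such as PyTorch's StepLR.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class ResNetConfig:
    batch_size: int
    learning_rate: float
    epochs: int
    patience: int
    step_lr_size: int
    step_lr_gamma: float

    def lr_at(self, epoch):
        """Step schedule: multiply the rate by gamma every step_lr_size epochs."""
        return self.learning_rate * self.step_lr_gamma ** (epoch // self.step_lr_size)


# The dict a YAML loader would return for the resnet section.
raw = {"batch_size": 16, "learning_rate": 0.0001, "epochs": 100,
       "patience": 5, "step_lr_size": 5, "step_lr_gamma": 0.1}
cfg = ResNetConfig(**raw)

print([cfg.lr_at(e) for e in (0, 4, 5, 10)])  # drops tenfold at epochs 5 and 10

try:
    cfg.batch_size = 32  # frozen dataclasses reject mutation
except FrozenInstanceError:
    print("config is read-only")
```

With these values the learning rate falls from 1e-4 to about 1e-5 at epoch 5 and about 1e-6 at epoch 10. Note that a plain dataclass does not check field types at runtime; any validation has to be added separately.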

References

He, K. et al. (2016). Deep Residual Learning for Image Recognition. CVPR. arXiv:1512.03385
Kaggle. Facial Keypoints Detection. Competition Page

Acknowledgments

CNN implementation developed in collaboration with Felix-hyy for the USF 276DS course final project. ResNet architecture inspired by the Advanced Machine Learning course at USF.

License

MIT — see LICENSE for details.
