sharmarnavv/FER

An Improved Deep Attention Network for Facial Expression Recognition


This repository contains a PyTorch implementation of an enhanced Deep Attention Network (DAN) for facial expression recognition. While many approaches leverage attention for this task, this implementation introduces several key architectural improvements to boost performance.


Architectural Enhancements

Our model builds upon the common paradigm of using attention mechanisms for facial expression recognition but introduces a more robust and effective architecture. The key upgrades in this implementation are:

  • Upgraded Backbone: The previous ResNet-18 backbone has been replaced with a modern ConvNeXt Tiny backbone, providing a significant boost in feature extraction capabilities.
  • Multi-Head Cross-Attention: Instead of a single attention mechanism, our model uses multiple cross-attention heads. This allows the network to focus on different facial regions simultaneously, capturing a richer set of features for more accurate classification.
  • Composite Loss Function: To further enhance performance, we employ a composite loss function:
    • PartitionLoss: This novel loss function encourages diversity among the attention heads, ensuring that each head learns unique and complementary features.
    • CenterLoss: This loss function improves the discriminative power of the learned features by minimizing intra-class variations.
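The two auxiliary losses above can be sketched in a few lines of PyTorch. This is a minimal illustration of the idea, not the exact code in `networks/dan.py` (the repo's head shapes, variance formulation, and center initialization may differ): PartitionLoss penalizes low variance across the head dimension so that heads stay diverse, while CenterLoss pulls each feature toward a learnable per-class center.

```python
import torch
import torch.nn as nn

class PartitionLoss(nn.Module):
    """Sketch: penalize low variance across attention heads so each
    head learns complementary features (details are assumptions)."""
    def forward(self, head_feats):
        # head_feats: (batch, num_heads, feat_dim)
        num_heads = head_feats.size(1)
        if num_heads > 1:
            var = head_feats.var(dim=1).mean()  # variance across heads
            return torch.log(1.0 + num_heads / var)
        return head_feats.new_zeros(())

class CenterLoss(nn.Module):
    """Sketch: minimize intra-class variation by pulling features
    toward a learnable center for their class."""
    def __init__(self, num_classes, feat_dim):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, feats, labels):
        # feats: (batch, feat_dim), labels: (batch,)
        return ((feats - self.centers[labels]) ** 2).sum(dim=1).mean()
```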

Results

These architectural enhancements yield a 12% accuracy improvement over the baseline DAN implementation.

Installation

  1. Clone the repository:

    git clone https://github.com/yourusername/dan-improvement.git
    cd dan-improvement
  2. Install the required dependencies:

    pip install torch torchvision pandas numpy tqdm Pillow

Project Structure

dan-improvement/
├── datasets/           # Dataset directory
├── networks/           # Model architecture definitions
│   └── dan.py          # DAN model with CenterLoss and PartitionLoss
├── utils/              # Utility scripts
├── affectnet.py        # Main training script for AffectNet
├── demo.py             # Inference script for single images
├── evaluate_backbones.py # Script to compare different backbones
├── run_grad_cam.py     # Grad-CAM visualization script
├── verify_dan.py       # Verification script for the model
└── README.md           # Project documentation

How to Use

Training

To train the model, you will need the AffectNet dataset. Once the dataset is in place, you can run the training script:

python affectnet.py --aff_path datasets/AffectNet/ --epochs 40 --batch_size 128
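The training objective combines standard cross-entropy with the two auxiliary losses described above. Below is a hypothetical sketch of a single training step; the model's forward signature (returning logits, pooled features, and per-head features) and the loss weights are assumptions, not the exact code in `affectnet.py`.

```python
import torch
import torch.nn as nn

def train_step(model, images, labels, center_loss, partition_loss,
               optimizer, lam_center=1.0, lam_partition=1.0):
    """One optimization step with the composite loss (a sketch;
    forward signature and lambda weights are assumptions)."""
    ce = nn.CrossEntropyLoss()
    logits, feats, heads = model(images)  # assumed output triple
    loss = (ce(logits, labels)
            + lam_center * center_loss(feats, labels)
            + lam_partition * partition_loss(heads))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```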

Inference

To run inference on a single image, use the demo.py script. Note that a trained model checkpoint is required.

python demo.py --image <path_to_your_image>

Future Improvements

While the current implementation is robust, there are several areas for future improvement:

  • Command-Line Arguments: The current scripts use hardcoded paths. These will be replaced with command-line arguments for better flexibility.
  • Modern Face Detector: The face detector in demo.py will be upgraded to a more modern, deep learning-based detector.
  • Unified Data Loading: The data loading logic will be unified and better documented.
