
Carvana Image Masking Challenge

Carvana Logo

Welcome to the Carvana Image Masking Challenge repository. This project focuses on semantic segmentation of cars as part of the Carvana Image Masking Challenge on Kaggle. The goal is to generate precise masks for cars in images.


Table of Contents

  • Objective
  • Evaluation
  • Model
  • Data
  • Installation
  • Getting Started (Training)
  • Inference
  • Results
  • Code
  • References
  • Folder Structure
  • License

Objective

The Carvana Image Masking Challenge aims to generate highly precise masks for cars in images.
Semantic segmentation is used to identify the boundaries of cars, contributing to applications such as autonomous driving and object detection.

Input image

Predicted mask


Evaluation

The main evaluation metric is the Dice coefficient (equivalent to the F1-score in binary segmentation):

Dice = 2 · |X ∩ Y| / (|X| + |Y|)

  • Dice = 1 -> perfect overlap between predicted pixels (X) and ground truth (Y)
  • Dice = 0 -> no overlap

Our aim is to maximize Dice by improving overlap between prediction and ground truth.
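
For reference, a minimal NumPy sketch of the Dice score on binary masks (illustrative only; the metric code used for this project lives in the notebook):

```python
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice coefficient between two binary masks (1 = car, 0 = background)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    # 2|X ∩ Y| / (|X| + |Y|); eps avoids division by zero on empty masks
    return float(2.0 * intersection / (pred.sum() + target.sum() + eps))
```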


Model

I developed a custom encoder–decoder architecture, inspired by both SegNet and U-Net:

  • Encoder: SegNet-style downsampling (Conv2D → BatchNorm → ReLU → MaxPool)
  • Decoder: U-Net-style upsampling with skip connections from encoder layers
  • Final layer: Sigmoid (binary mask output)

Architecture Summary:

  • 7 encoder layers
  • 2 center convolutional layers
  • 7 decoder layers
  • 1 final classification layer

Model Architecture
Encoder–Decoder with skip connections (final activation: sigmoid)

Note: Hardware limitations (NVIDIA RTX 3060, 6GB VRAM) influenced design choices.
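
As an illustration of the block pattern described above (hypothetical PyTorch sketch; layer names and channel counts are assumptions, the real definitions are in notebooks/model.ipynb):

```python
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    """SegNet-style downsampling: Conv2D -> BatchNorm -> ReLU -> MaxPool."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        skip = self.conv(x)              # kept for the decoder's skip connection
        return self.pool(skip), skip

class DecoderBlock(nn.Module):
    """U-Net-style upsampling followed by concatenation with the encoder skip."""
    def __init__(self, in_ch: int, skip_ch: int, out_ch: int):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch + skip_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, skip):
        x = self.up(x)
        x = torch.cat([x, skip], dim=1)  # skip connection from the encoder
        return self.conv(x)

# Final classification layer: 1x1 conv to a single channel, then sigmoid
# (the input channel count of 64 here is illustrative).
final = nn.Sequential(nn.Conv2d(64, 1, kernel_size=1), nn.Sigmoid())
```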


Data

The dataset is provided by Kaggle:
Carvana Image Masking Challenge Data

Expected folder structure:

```
data/
├── raw/
│   ├── train/          # input images
│   └── train_masks/    # ground truth masks
└── processed/          # preprocessed data
```

Data Augmentation

To improve generalization, I applied minor augmentations:

  • Random shifts
  • Scaling
  • Rotations

These help the model perform better on unseen data.
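
A minimal sketch of paired image/mask augmentation using torchvision (the library choice and parameter ranges here are illustrative assumptions, not the values used in training):

```python
import random
import torchvision.transforms.functional as TF

def augment(image, mask):
    """Apply the same random shift / scale / rotation to an image and its mask."""
    angle = random.uniform(-10, 10)                                  # small rotation (degrees)
    translate = [random.randint(-20, 20), random.randint(-20, 20)]   # pixel shift
    scale = random.uniform(0.9, 1.1)                                 # mild zoom in/out
    # Default nearest-neighbor interpolation keeps the mask binary.
    image = TF.affine(image, angle=angle, translate=translate, scale=scale, shear=0)
    mask = TF.affine(mask, angle=angle, translate=translate, scale=scale, shear=0)
    return image, mask
```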


Installation

Clone the repository and set up the environment:

```bash
git clone https://github.com/chaitanyapeshin/segmentation_for_color_change.git
cd segmentation_for_color_change
```

Conda:

```bash
conda env create -f environment.yml
conda activate carvana
```


Getting Started (Training)

Run the training notebook:

```bash
jupyter notebook notebooks/model.ipynb
```

This will train the model and log progress to TensorBoard (assets/tensorboard/).


Inference

Use the inference script to predict masks for new images:

```bash
python infer.py --input path/to/image.jpg --output outputs/mask.png
```
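
As a rough illustration of what such an inference step involves (this is not the actual contents of infer.py; the input size and 0.5 threshold are assumptions):

```python
import numpy as np
import torch
from PIL import Image

def predict_mask(model, image_path, threshold=0.5, size=(1024, 1024)):
    """Run the network on one image and binarize the sigmoid output."""
    model.eval()
    img = Image.open(image_path).convert("RGB").resize(size)
    x = torch.from_numpy(np.array(img)).float().permute(2, 0, 1).unsqueeze(0) / 255.0
    with torch.no_grad():
        prob = model(x)                          # sigmoid output in [0, 1]
    mask = (prob.squeeze().numpy() > threshold).astype(np.uint8) * 255
    return Image.fromarray(mask)                 # e.g. .save("outputs/mask.png")
```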


Results

The model was trained with the Adam optimizer and a custom loss, BCE + (1 - Dice), sketched after the table below.
Validation performance after ~13 epochs:

| Metric | Value |
|---|---|
| Dice | 0.9956 |
| IoU (Jaccard) | 0.9912 |
| Pixel Accuracy | 0.9971 |
| Precision | 0.9965 |
| Recall | 0.9948 |
| Dice (5th / 50th / 95th percentile) | 0.992 / 0.996 / 0.998 |
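
A minimal PyTorch sketch of a BCE + (1 - Dice) loss of this kind (illustrative; the exact formulation used for training is in notebooks/model.ipynb):

```python
import torch
import torch.nn as nn

class BCEDiceLoss(nn.Module):
    """Combined loss: binary cross-entropy plus (1 - soft Dice)."""
    def __init__(self, eps: float = 1e-7):
        super().__init__()
        self.bce = nn.BCELoss()   # predictions are already sigmoid outputs
        self.eps = eps

    def forward(self, pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        bce = self.bce(pred, target)
        intersection = (pred * target).sum()
        dice = (2.0 * intersection + self.eps) / (pred.sum() + target.sum() + self.eps)
        return bce + (1.0 - dice)
```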

Post Analysis

  • The model segments the main body of cars very well.
  • It struggles with fine details:
    • Dark shadows near wheels
    • Cars painted in colors similar to the background
    • Thin structures (antennas, roof racks)

Despite these challenges, performance is human-level or better on most images.


Code

The supporting source code lives in src/, and the full training pipeline is in notebooks/model.ipynb.


References

The papers included under references/ are:

  • Long, Shelhamer, Darrell. "Fully Convolutional Networks for Semantic Segmentation" (arXiv:1411.4038)
  • Noh, Hong, Han. "Learning Deconvolution Network for Semantic Segmentation" (arXiv:1505.04366)
  • Badrinarayanan, Kendall, Cipolla. "SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation" (arXiv:1511.00561)


Folder Structure

```
.
├── 29bb3ece3180_11.jpg
├── assets/
│   └── tensorboard/
├── data/
│   ├── processed/
│   └── raw/
├── LICENSE
├── notebooks/
│   └── model.ipynb
├── README.md
├── references/
│   ├── 1411.4038.pdf
│   ├── 1505.04366.pdf
│   └── 1511.00561.pdf
├── environment.yml
├── requirements.txt
├── sample_submission.csv
└── src/
    └── data/
```

License

This project is licensed under the MIT License – see the LICENSE file for details.
