Carvana-Segmentation-UNet

This repository contains a from-scratch PyTorch implementation of the UNet architecture for image segmentation, specifically binary segmentation. The code is designed to train a UNet model, evaluate its performance, and make predictions on new images.

Table of Contents

  1. Overview
  2. UNet Architecture
  3. Dataset
  4. Repository Structure
  5. Setup and Installation
  6. Training the Model
  7. Inference and Making Predictions
  8. Results
  9. References

Overview

What is UNet?

UNet is a convolutional neural network architecture primarily used for biomedical image segmentation. It was first introduced by Olaf Ronneberger et al. in their 2015 paper "U-Net: Convolutional Networks for Biomedical Image Segmentation". The key feature of UNet is its U-shaped architecture, consisting of a contracting path (encoder) and an expansive path (decoder), which makes it highly effective for precise localization and segmentation tasks.

Use Cases

UNet is widely used in various fields, including:

  • Medical Imaging: Segmentation of organs, tumors, and other structures in CT, MRI, and ultrasound images.
  • Satellite Image Analysis: Land cover classification, road detection, and urban planning.
  • Autonomous Vehicles: Identifying objects and boundaries on the road for navigation.
  • Agriculture: Crop and soil segmentation from aerial or satellite images.

UNet Architecture

Below is a visual representation of the UNet architecture:

[Figure: UNet architecture diagram]

The architecture consists of the following components (a minimal PyTorch sketch follows this list):

  • Contracting Path (Encoder): A sequence of convolutional layers followed by max pooling to downsample the input image, capturing the context.
  • Bottleneck: The bottom of the U, where the feature maps are the smallest in spatial dimensions but have the deepest representation.
  • Expansive Path (Decoder): A sequence of transposed convolutions that upsample the feature maps and concatenate them with corresponding feature maps from the contracting path, allowing precise localization.
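
As a rough illustration of how these pieces fit together, here is a condensed PyTorch sketch of a UNet for binary segmentation. It is not necessarily identical to the implementation in models.py (the channel sizes, use of BatchNorm, and constructor arguments are assumptions), but it shows the encoder, bottleneck, decoder, and skip connections described above.

    import torch
    import torch.nn as nn

    class DoubleConv(nn.Module):
        """Two 3x3 convolutions, each followed by BatchNorm and ReLU."""
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.block = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
                nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )

        def forward(self, x):
            return self.block(x)

    class UNet(nn.Module):
        def __init__(self, in_ch=3, out_ch=1, features=(64, 128, 256, 512)):
            super().__init__()
            self.pool = nn.MaxPool2d(2)
            # Contracting path (encoder)
            self.downs = nn.ModuleList()
            ch = in_ch
            for f in features:
                self.downs.append(DoubleConv(ch, f))
                ch = f
            # Bottleneck: smallest spatial size, deepest representation
            self.bottleneck = DoubleConv(features[-1], features[-1] * 2)
            # Expansive path (decoder): transposed conv, then DoubleConv after the skip concatenation
            self.ups = nn.ModuleList()
            for f in reversed(features):
                self.ups.append(nn.ConvTranspose2d(f * 2, f, kernel_size=2, stride=2))
                self.ups.append(DoubleConv(f * 2, f))
            self.head = nn.Conv2d(features[0], out_ch, kernel_size=1)

        def forward(self, x):
            # Assumes the input height and width are divisible by 16 so skip shapes match
            skips = []
            for down in self.downs:
                x = down(x)
                skips.append(x)
                x = self.pool(x)
            x = self.bottleneck(x)
            for i in range(0, len(self.ups), 2):
                x = self.ups[i](x)                   # upsample
                skip = skips[-(i // 2 + 1)]          # matching encoder feature map
                x = torch.cat([skip, x], dim=1)      # skip connection
                x = self.ups[i + 1](x)               # fuse
            return self.head(x)                      # raw logits; apply sigmoid + threshold for a mask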

Dataset

This project utilizes the Carvana Image Masking Challenge dataset, which is hosted on Kaggle. The dataset consists of high-resolution images of cars, along with corresponding binary masks that outline the car's silhouette.

Dataset Details

  • Images: The training set contains 5,088 images of cars, with each car photographed from 16 fixed angles.
  • Masks: Each image has an associated binary mask that highlights the car in the image. The masks serve as ground truth for training the segmentation model (a minimal loading sketch follows this list).
  • Challenge: The goal is to accurately predict the car mask for each image, essentially segmenting the car from the background.
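
For example, one of the mask GIFs can be read and binarised to a 0/1 float array along these lines (a minimal sketch with a hypothetical file name; the repository's dataset class may preprocess masks differently):

    import numpy as np
    from PIL import Image

    # Load one Carvana mask (a GIF) and binarise it: car pixels -> 1.0, background -> 0.0
    mask = np.array(Image.open('data/train_masks/example_mask.gif').convert('L'))
    binary_mask = (mask > 0).astype(np.float32)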

Downloading the Dataset

  1. Sign up or log in to Kaggle.
  2. Visit the Carvana Image Masking Challenge Dataset page.
  3. Download the dataset and extract it into the data/ directory of this repository, maintaining the structure below (a small helper for carving out the validation split follows the tree):
data/
├── train_images/
├── train_masks/
├── val_images/
└── val_masks/
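
The Kaggle archives provide training images and masks but no separate validation folders, so one way to fill val_images/ and val_masks/ is to hold out a random subset of the training pairs. The sketch below assumes the folder layout above and the competition's <id>_mask.gif naming; the 10% ratio is an arbitrary choice.

    import random
    import shutil
    from pathlib import Path

    random.seed(0)
    train_images, train_masks = Path('data/train_images'), Path('data/train_masks')
    val_images, val_masks = Path('data/val_images'), Path('data/val_masks')
    val_images.mkdir(parents=True, exist_ok=True)
    val_masks.mkdir(parents=True, exist_ok=True)

    images = sorted(train_images.glob('*.jpg'))
    for img in random.sample(images, k=int(0.1 * len(images))):  # hold out ~10% for validation
        mask_name = img.stem + '_mask.gif'                       # Carvana mask naming convention
        shutil.move(str(img), str(val_images / img.name))
        shutil.move(str(train_masks / mask_name), str(val_masks / mask_name))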

Repository Structure

Here's a brief overview of the files in this repository:

  • models.py: Contains the implementation of the UNet architecture.
  • train.py: Script to train the UNet model on a given dataset.
  • utils.py: Utility functions for saving/loading model checkpoints, calculating accuracy, and saving prediction images.
  • config.py: Configuration file containing hyperparameters, file paths, and other settings.
  • inference.py: Script for running inference on new images using a trained UNet model.

Setup and Installation

  1. Clone the Repository:

    git clone https://github.com/matin-ghorbani/Carvana-Segmentation-UNet
    cd Carvana-Segmentation-UNet
  2. Install the Required Packages: Ensure you have Python 3.8+ and PyTorch installed. Install the dependencies using pip:

    pip install -r requirements.txt

Training the Model

To train the model, run the train.py script. Make sure the dataset is correctly placed in the data/ directory as mentioned above.

python train.py

During training, the script will (a simplified sketch of the core loop follows this list):

  • Load the training and validation datasets.
  • Train the UNet model for the specified number of epochs.
  • Save the trained model checkpoints.
  • Evaluate the model's performance on the validation set.
  • Save sample predictions as images.
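
In broad strokes, the training step looks like the following sketch. It assumes masks are delivered as (N, H, W) tensors and uses BCE-with-logits loss with Adam; the actual loss, optimizer, and UNet constructor arguments in train.py may differ.

    import torch
    from torch import nn, optim
    from models import UNet  # the repository's model; constructor arguments below are assumptions

    def train_one_epoch(model, loader, optimizer, criterion, device):
        model.train()
        for images, masks in loader:
            images = images.to(device)
            masks = masks.float().unsqueeze(1).to(device)  # assumed shape (N, H, W) -> (N, 1, H, W)
            logits = model(images)
            loss = criterion(logits, masks)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    model = UNet(in_ch=3, out_ch=1).to(device)
    optimizer = optim.Adam(model.parameters(), lr=1e-4)   # placeholder hyperparameters
    criterion = nn.BCEWithLogitsLoss()                    # one logit per pixel for binary masks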

You can adjust the training parameters (e.g., learning rate, batch size, number of epochs) in the config.py file.
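
A config.py for a project like this typically looks something like the sketch below; the names and values here are placeholders, not necessarily those used in this repository.

    # Hypothetical config.py layout; adjust names and values to match the actual file.
    LEARNING_RATE = 1e-4
    BATCH_SIZE = 16
    NUM_EPOCHS = 3
    IMAGE_HEIGHT = 160
    IMAGE_WIDTH = 240
    TRAIN_IMG_DIR = 'data/train_images/'
    TRAIN_MASK_DIR = 'data/train_masks/'
    VAL_IMG_DIR = 'data/val_images/'
    VAL_MASK_DIR = 'data/val_masks/'
    DEVICE = 'cuda'
    LOAD_MODEL = False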

Inference and Making Predictions

To run inference on a new image and overlay the prediction on the original image, use the inference.py script:

python inference.py --model path/to/checkpoint.pth.tar --img path/to/your/image.jpg --save

You can download my trained weights from here.
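
Under the hood, the prediction-and-overlay step can be sketched roughly as follows. The resize dimensions, checkpoint layout (a 'state_dict' key), UNet constructor arguments, and the hard-coded paths standing in for the --model and --img flags are all assumptions; inference.py may do this differently.

    import numpy as np
    import torch
    from PIL import Image
    from models import UNet  # the repository's model; constructor arguments are assumptions

    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    model = UNet(in_ch=3, out_ch=1).to(device)
    checkpoint = torch.load('checkpoint.pth.tar', map_location=device)
    model.load_state_dict(checkpoint['state_dict'])  # assumes the checkpoint stores a 'state_dict' key
    model.eval()

    image = Image.open('image.jpg').convert('RGB')
    resized = image.resize((240, 160))                # (width, height); placeholder size
    x = torch.from_numpy(np.array(resized)).permute(2, 0, 1).float().unsqueeze(0) / 255.0

    with torch.no_grad():
        mask = (torch.sigmoid(model(x.to(device))) > 0.5).squeeze().cpu().numpy()

    # Blend the predicted car pixels with red and save the overlay
    overlay = np.array(resized).astype(np.float32)
    overlay[mask] = 0.5 * overlay[mask] + 0.5 * np.array([255.0, 0.0, 0.0])
    Image.fromarray(overlay.astype(np.uint8)).save('prediction_overlay.png')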

Results

I got these results after only 3 epochs:

[Figure: sample test images, the predicted masks, and the predictions overlaid on the original images]
  • Training loss: 0.0772
  • Testing accuracy: 0.9756
  • Testing dice score: 0.9455
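
The accuracy and dice score reported above can be computed for binary masks roughly as in the sketch below; utils.py may implement these metrics differently.

    import torch

    @torch.no_grad()
    def binary_metrics(logits, targets, eps=1e-8):
        """Pixel accuracy and dice score for binary segmentation logits vs. 0/1 targets."""
        preds = (torch.sigmoid(logits) > 0.5).float()
        targets = targets.float()
        accuracy = (preds == targets).float().mean().item()
        intersection = (preds * targets).sum()
        dice = (2 * intersection / (preds.sum() + targets.sum() + eps)).item()
        return accuracy, dice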

References