# Garbage Image Classification (CNN)
## Overview
This project builds image classification models that sort waste images into 6 categories.
The goal is to explore how transfer learning and data augmentation improve classification accuracy on a relatively small dataset (~2,500 images).
## Dataset
- Source: Garbage classification dataset (not included here)
- Size: ~2,500 images
- Classes: 6 categories (cardboard, glass, metal, paper, plastic, trash)
- Note: The dataset does not provide a separate test set, so a portion of the training data was held out for evaluation.
## Preprocessing Steps
- Applied data augmentation with `torchvision.transforms`:
  resize, random rotation, color jitter, random crop, and normalization
- Built a custom `Dataset` and `DataLoader`
- Split the dataset into training and test sets (see the sketch below)
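The exact transform parameters and split ratio are not recorded in this README, so the following is only a minimal sketch of the pipeline described above; the image sizes, augmentation strengths, the 80/20 split, and the `data/Garbage classification` path are illustrative assumptions, and `ImageFolder` stands in for the notebook's custom `Dataset`.

```python
import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

IMAGENET_MEAN, IMAGENET_STD = [0.485, 0.456, 0.406], [0.229, 0.224, 0.225]

# Augmentation for training images (parameters are illustrative).
train_tf = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.RandomRotation(15),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.RandomCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),
])

# Deterministic preprocessing for the held-out test images.
test_tf = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),
])

# The Kaggle dataset ships as one folder per class, so ImageFolder can stand
# in for the custom Dataset used in the notebook.
root = "data/Garbage classification"  # assumed location of the downloaded data
n = len(datasets.ImageFolder(root))

# Reproducible 80/20 train/test split over the image indices.
perm = torch.randperm(n, generator=torch.Generator().manual_seed(42)).tolist()
split = int(0.8 * n)
train_ds = Subset(datasets.ImageFolder(root, transform=train_tf), perm[:split])
test_ds = Subset(datasets.ImageFolder(root, transform=test_tf), perm[split:])

train_loader = DataLoader(train_ds, batch_size=32, shuffle=True)
test_loader = DataLoader(test_ds, batch_size=32, shuffle=False)
```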
## Models & Methods
- Compared several pre-trained CNN architectures:
  ResNet18, ResNet50, EfficientNet-B0, MobileNet-V2, VGG16
- Added dropout to the fully connected layers to evaluate its effect on generalization
- Used weighted cross-entropy loss to address class imbalance
- Compared optimizers (Adam, RMSprop, SGD); a fine-tuning sketch follows this list
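The sketch below shows how one of the compared backbones might be adapted under these settings, using ResNet18 as the example. The dropout rate, learning rates, and class weights are placeholders (in the notebook the weights would come from the actual class frequencies), not the values used to obtain the reported results.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 6

# ImageNet-pre-trained backbone (ResNet18 shown; the project also compares
# ResNet50, EfficientNet-B0, MobileNet-V2, and VGG16).
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Replace the classifier head, adding dropout to study its effect on generalization.
model.fc = nn.Sequential(
    nn.Dropout(p=0.5),  # dropout rate chosen for illustration
    nn.Linear(model.fc.in_features, NUM_CLASSES),
)

# Weighted cross-entropy to counter class imbalance; these weights are
# placeholders standing in for, e.g., inverse class frequencies.
class_weights = torch.tensor([1.0, 0.9, 1.1, 0.8, 1.0, 2.0])
criterion = nn.CrossEntropyLoss(weight=class_weights)

# Any of the compared optimizers can be swapped in here.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
# optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-4)
# optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
```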
## Results
- Best test accuracy: ~77.5%
- Analyzed misclassifications with a confusion matrix and classification report (see the sketch below)
- Data augmentation contributed to improved robustness
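Assuming the `model` and `test_loader` from the sketches above, the scikit-learn evaluation might look like this; the class names follow the dataset's alphabetically ordered folder names.

```python
import torch
from sklearn.metrics import classification_report, confusion_matrix

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device).eval()

# Collect predictions over the held-out test set.
y_true, y_pred = [], []
with torch.no_grad():
    for images, labels in test_loader:
        logits = model(images.to(device))
        y_pred.extend(logits.argmax(dim=1).cpu().tolist())
        y_true.extend(labels.tolist())

print(confusion_matrix(y_true, y_pred))
print(classification_report(
    y_true, y_pred,
    target_names=["cardboard", "glass", "metal", "paper", "plastic", "trash"],
))
```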
## Technologies Used
- Python, Pandas
- PyTorch, Torchvision
- Pre-trained CNNs (ResNet, EfficientNet, MobileNet, VGG)
- scikit-learn (evaluation)
- Matplotlib
- Jupyter Notebook
## Repository Structure
```
garbage-classifier/
├── garbage_classifier.ipynb   # Main notebook
├── README.md                  # Project description
└── data/                      # Dataset (not included, see below)
```
## About the Dataset
The dataset is not included in this repository due to license restrictions. Please download it directly from Kaggle.
https://www.kaggle.com/datasets/asdasdasasdas/garbage-classification
## Note
This notebook was originally developed and executed in a local Jupyter/Colab environment.
Because it relies on a custom folder structure (e.g., data/, notebook/, model/), it may not run directly without path adjustments.
The main purpose of this repository is to showcase the analysis process and results, rather than to provide a fully reproducible environment.