Use Keras to train a neural network for the binary classification of muffins and Chihuahuas.
- Python version is: 3.10.5 (click on the badge)
- requirements.txt contains all the necessary Python packages (use the command below to install them)
pip install -r requirements.txt
- The file called "SMML_Project_Report" is the document describing the project
The project is organized into four blocks. The first two handle preprocessing and data preparation, while the last two cover model construction: classification and evaluation.
In the preprocessing phase, the emphasis is on refining the dataset. The process involves systematically addressing corrupted files, detecting and managing duplicates through image hashing, and conducting a thorough dataset check.
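As a rough illustration of the duplicate-detection step, the sketch below groups files by a hash of their raw bytes. It is not the project's actual code: the function names `file_digest` and `find_duplicates` are hypothetical, and SHA-256 over file contents is an assumption that only catches byte-identical copies (the project's image hashing may instead be perceptual).

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_duplicates(root):
    """Group files under `root` whose contents are byte-identical.

    Returns a list of groups; each group is a list of paths (sorted by
    path) that share the same digest. Files with a unique digest are omitted.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

A typical follow-up would be to keep the first path of each group and remove or ignore the rest before loading the dataset.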
In the data preparation phase, the primary focus is on loading and enhancing the dataset. This involves using Keras and TensorFlow to load training, validation, and test datasets, applying data augmentation techniques such as flip, rotation, and zoom, and normalizing pixel values. The goal is to ensure the dataset is well-prepared and suitable for subsequent steps.
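The augmentation and normalization steps could look roughly like the following sketch, which uses Keras preprocessing layers on a `tf.data` pipeline. The image size, the augmentation factors, and the helper names `make_augmenter` and `prepare` are assumptions chosen for illustration, not values taken from the project.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

IMG_SIZE = (128, 128)  # assumed input resolution, not taken from the project


def make_augmenter():
    """Random flip, rotation, and zoom; the factors are illustrative only."""
    return keras.Sequential(
        [
            layers.RandomFlip("horizontal"),
            layers.RandomRotation(0.1),
            layers.RandomZoom(0.1),
        ],
        name="augmenter",
    )


# Maps pixel values from [0, 255] to [0, 1].
normalizer = layers.Rescaling(1.0 / 255)


def prepare(ds, augment=False):
    """Optionally augment, then normalize, a dataset of (image, label) batches."""
    if augment:
        augmenter = make_augmenter()
        ds = ds.map(lambda x, y: (augmenter(x, training=True), y))
    return ds.map(lambda x, y: (normalizer(x), y))


# Hypothetical usage (the directory path is an assumption):
# train_ds = keras.utils.image_dataset_from_directory(
#     "data/train", image_size=IMG_SIZE, batch_size=32, label_mode="binary"
# )
# train_ds = prepare(train_ds, augment=True)
```

Augmentation is applied only when `augment=True`, so validation and test data can go through the same function with normalization alone.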
In the classification phase, an image classification pipeline is built with Keras and TensorFlow. It provides three configurable models: a Multilayer Perceptron (MLP), a Convolutional Neural Network (CNN), and MobileNet. Hyperparameter tuning and K-fold cross-validation are integrated into the workflow for model optimization.
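To make the K-fold idea concrete, here is a minimal, library-free sketch of how sample indices can be split into training and validation folds. The function name `k_fold_indices` and the seed are hypothetical; the project may well rely on an existing utility instead.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation.

    Indices are shuffled once with a fixed seed, then dealt into k folds.
    Each fold serves as the validation set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val
```

In a tuning loop, each candidate hyperparameter configuration would be trained once per fold and scored by its average validation metric.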
In the evaluation phase, each model's performance is measured with metrics such as loss and accuracy. The module also generates classification reports, confusion matrices, and plots for analyzing predictions.
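The metrics behind a binary confusion matrix and classification report can be sketched in plain Python as follows. The 0.5 threshold, the label encoding (0 for one class, 1 for the other), and the function names are assumptions for illustration.

```python
def binarize(probabilities, threshold=0.5):
    """Turn predicted probabilities into 0/1 labels (threshold is assumed)."""
    return [1 if p >= threshold else 0 for p in probabilities]


def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels in {0, 1}."""
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def summarize(y_true, y_pred):
    """Compute accuracy, precision, and recall for the positive class (1)."""
    (tn, fp), (fn, tp) = confusion_matrix(y_true, y_pred)
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```

Rows of the matrix correspond to true labels and columns to predicted labels, so off-diagonal entries count the misclassified images.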
| | MLP | CNN | MobileNet |
|---|---|---|---|
| Accuracy (%) | 71.537 | 94.510 | 99.493 |
| Loss | 0.573 | 0.222 | 0.019 |
Performance varies considerably across the models. MobileNet is the clear standout, with near-perfect accuracy (99.493%) and the lowest loss (0.019). The CNN also performs well (94.510% accuracy). The MLP performs worst: its higher loss (0.573) and lower accuracy (71.537%) reflect a notable misclassification rate, consistent with underfitting.