From 0b91dad911bf7d5ed6ca9facf078be4e4b443a80 Mon Sep 17 00:00:00 2001
From: Diego Carpintero <6709785+dcarpintero@users.noreply.github.com>
Date: Thu, 11 Apr 2024 12:45:08 +0200
Subject: [PATCH] update image formats

---
 README.md | 102 ++++++++++++++++++++++++++++++------------------------
 1 file changed, 56 insertions(+), 46 deletions(-)

diff --git a/README.md b/README.md
index bf2bfc0..294b4dd 100644
--- a/README.md
+++ b/README.md
@@ -17,31 +17,34 @@ In this article, we will implement in Python the essential modules required to b
 By the end of this guide, you will be able to construct the building blocks of a neural network from scratch, understand how it learns, and deploy it to [HuggingFace Spaces](https://huggingface.co/spaces/dcarpintero/fashion-image-recognition) to classify real-world garment images.
-
- -
Garment Classifier deployed to HuggingFace Spaces
-
+

+ +

+ +

Garment Classifier deployed to HuggingFace Spaces

 ## Table of Contents
-- [The Intuition behind our Neural Network](#the-intuition-behind-our-neuralnetwork)
-- [Architecture](#architecture)
-  - [Linear Transformation](#linear-transformation)
-  - [Introducing non-linearity](#introducing-non-linearity)
-  - [Regularization](#regularization)
-  - [Flatten Transformation](#flatten-transformation)
-  - [Sequential Layer](#sequential-layer)
-  - [Classifier Model](#classifier-model)
-  - [Gradient Descent Optimizer](#gradient-descent-optimizer)
-  - [Backpropagation](#backpropagation)
-- [Training](#training)
-  - [The Fashion Dataset](#the-fashion-dataset)
-  - [Data Loaders for Mini-Batches](#data-loaders-for-mini-batches)
-  - [Fitting the Model](#fitting-the-model)
-- [Model Assessment](#model-assessment)
-- [Inference](#inference)
-- [Resources](#resources)
-- [References](#references)
+- [Building a Neural Network Classifier from the Ground Up: A Step-by-Step Guide](#building-a-neural-network-classifier-from-the-ground-up-a-step-by-step-guide)
+  - [Table of Contents](#table-of-contents)
+  - [The Intuition behind our Neural Network](#the-intuition-behind-our-neuralnetwork)
+  - [Architecture](#architecture)
+    - [Linear Transformation](#linear-transformation)
+    - [Introducing non-linearity](#introducing-non-linearity)
+    - [Regularization](#regularization)
+    - [Flatten Transformation](#flatten-transformation)
+    - [Sequential Layer](#sequential-layer)
+    - [Classifier Model](#classifier-model)
+    - [Gradient Descent Optimizer](#gradient-descent-optimizer)
+    - [Backpropagation](#backpropagation)
+  - [Training](#training)
+    - [The Fashion Dataset](#the-fashion-dataset)
+    - [Data Loaders for Mini-Batches](#data-loaders-for-mini-batches)
+    - [Fitting the Model](#fitting-the-model)
+  - [Model Assessment](#model-assessment)
+  - [Inference](#inference)
+  - [Resources](#resources)
+  - [References](#references)
@@ -241,10 +244,12 @@ class Classifier(nn.Module):
 The research paper *Visualizing and Understanding Convolutional Networks* [7] offers insights into a concept akin to hierarchical progressive learning, specifically applied to convolutional layers. This provides a comparable intuition to understand how stacked layers are capable of automatically learning features within images:
-
- -
Visualization of features in a convolutional neural network - https://arxiv.org/pdf/1311.2901.pdf
-
+

+ +

+ +

Visualization of features in a convolutional neural network - https://arxiv.org/pdf/1311.2901.pdf

+
 ### Gradient Descent Optimizer
@@ -406,31 +411,35 @@ class Learner:
 After 25 epochs, our model reaches an accuracy of 0.868, close to the [benchmark results](http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/#) (0.874 for an MLP Classifier using ReLU as the activation function).
-
- -
Model Assessment (epochs=25, lr=0.005, batch_size=32, SGD, CrossEntropyLoss) w/ self-implemented modules
-
+

+ +

+ +

Model Assessment (epochs=25, lr=0.005, batch_size=32, SGD, CrossEntropyLoss) w/ self-implemented modules

 We observe comparable accuracy levels between our self-implemented modules and a standard PyTorch implementation with the same hyperparameters (`epochs=25, lr=0.005, batch_size=32`). Notably, the PyTorch model shows a slightly smaller gap between validation and training losses, suggesting better generalization:
-
- -
Model Assessment (epochs=25, lr=0.005, batch_size=32, SGD, CrossEntropyLoss) w/ PyTorch modules
-
+

+ +

+ +

Model Assessment (epochs=25, lr=0.005, batch_size=32, SGD, CrossEntropyLoss) w/ PyTorch modules

 Furthermore, a basic analysis of precision (accuracy of the positive predictions for a specific class), recall (ability to detect all relevant instances of a specific class), and F1-score (harmonic mean of precision and recall) reveals that our model excels in categories with distinctive features such as Trouser/Jeans, Sandal, Bag, and Ankle-Boot. However, it performs below average with Shirts, Pullovers, and Coats.
-
- -
Precision, Recall and F1-Scores across Categories
-
+

+ +

+ +

Precision, Recall and F1-Scores across Categories

 The confusion matrix confirms that the Shirt category is frequently confused with the T-Shirt/Top, Pullover, and Coat classes, whereas Coat is confused with Shirt and Pullover. This suggests that, at a resolution of 28x28 pixels, upper-body garment categories are hard to tell apart visually.
-
- -
Confusion Matrix
-
+

+ +

+ +

Confusion Matrix
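
The per-class metrics discussed above can all be derived from the confusion matrix. The following is a minimal sketch in plain Python, not the evaluation code used for the figures: the helper names `confusion_matrix` and `per_class_metrics` and the toy labels are assumptions introduced for illustration. Rows index the true class and columns the predicted class.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count predictions: rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_metrics(matrix):
    """Return one (precision, recall, f1) tuple per class."""
    n = len(matrix)
    results = []
    for k in range(n):
        tp = matrix[k][k]                                  # correctly predicted as class k
        predicted_k = sum(matrix[i][k] for i in range(n))  # column sum: all predictions of k
        actual_k = sum(matrix[k])                          # row sum: all true instances of k
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        # F1 is the harmonic mean of precision and recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append((precision, recall, f1))
    return results


# Toy labels for three hypothetical classes (not Fashion-MNIST results)
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
metrics = per_class_metrics(cm)

for k, (precision, recall, f1) in enumerate(metrics):
    print(f"class {k}: precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Off-diagonal entries such as `cm[0][1]` count how often class 0 was predicted as class 1, which is the kind of evidence behind statements like "Shirt is frequently confused with Coat".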

 ## Inference
@@ -456,10 +465,11 @@ transform = transforms.Compose(
 This can be easily integrated into a Gradio App, and then deployed to [HuggingFace Spaces](https://huggingface.co/spaces/dcarpintero/fashion-image-recognition):
-
- -
Garment Classifier deployed to HuggingFace Spaces
-
+

+ +

+ +

Garment Classifier deployed to HuggingFace Spaces

## Resources