DigitVision 🧠✍️

A full-stack AI web application that recognizes handwritten digits (0–9) in real time.

Built from scratch — including model training, REST API, and interactive frontend.

Demo · How it works · Quick Start · Experiments

What it does

Draw any digit on the canvas and click Predict. The model returns the predicted digit, its confidence score, and the full probability distribution across all 10 classes.

No external AI APIs. The model is trained and served entirely locally.
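As a rough illustration of what a prediction contains, the sketch below turns a 10-element probability vector into the three values described above. The helper name `summarize_prediction` and the dictionary keys are hypothetical, chosen only for this example; the actual response format produced by backend/predict.py may differ.

```python
def summarize_prediction(probabilities):
    """Turn 10 class probabilities into digit, confidence, and distribution.

    Hypothetical helper: the key names are assumptions for illustration.
    """
    if len(probabilities) != 10:
        raise ValueError("expected one probability per digit 0-9")
    # The predicted digit is the index of the largest probability.
    digit = max(range(10), key=lambda i: probabilities[i])
    return {
        "digit": digit,
        "confidence": probabilities[digit],
        "probabilities": list(probabilities),
    }


example = [0.01, 0.02, 0.01, 0.85, 0.01, 0.03, 0.02, 0.02, 0.02, 0.01]
result = summarize_prediction(example)
print(result["digit"], result["confidence"])
```

The confidence score here is simply the probability assigned to the winning class, which is one common convention.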

Demo

Draw → Predict → See confidence scores across all 10 digits

To try it locally, follow the Quick Start below, open frontend/index.html, draw a digit, and click Predict. The result appears with a probability bar chart covering all 10 digits.

How it works

User draws on canvas
      ↓
canvas.toDataURL() → base64 PNG
      ↓
POST /api/predict  (Flask)
      ↓
Pillow: resize to 28×28, normalize to [0,1]
      ↓
CNN model: 10 softmax probabilities
      ↓
JSON response → UI update
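The first server-side step in the flow above, turning the string from canvas.toDataURL() back into PNG bytes, can be sketched with the standard library alone. This is a minimal sketch, not the code in backend/preprocess.py: the function names `decode_data_url` and `normalize_pixels` are hypothetical. In the real pipeline the decoded bytes would then be opened and resized to 28×28 with Pillow before normalization.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def decode_data_url(data_url):
    """Return the raw PNG bytes from a 'data:image/png;base64,...' string."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:image/png") or "base64" not in header:
        raise ValueError("expected a base64-encoded PNG data URL")
    png_bytes = base64.b64decode(payload)
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("payload is not a PNG image")
    return png_bytes


def normalize_pixels(gray_values):
    """Scale 8-bit grayscale values (0-255) to floats in [0, 1]."""
    return [v / 255.0 for v in gray_values]


# Round-trip check with a stand-in payload (signature only, not a full image).
fake_url = "data:image/png;base64," + base64.b64encode(PNG_SIGNATURE).decode("ascii")
png = decode_data_url(fake_url)
print(len(png))
print(normalize_pixels([0, 255]))
```

Rejecting anything that is not a PNG data URL before handing bytes to the image library keeps malformed requests from reaching the model.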

Model Architecture

Input (28×28×1)
  → Conv2D(32, 3×3, relu) + MaxPool
  → Conv2D(64, 3×3, relu) + MaxPool
  → Flatten
  → Dense(128, relu) + Dropout(0.5)
  → Dense(10, softmax)

Trained on the MNIST training set (60,000 images). Final accuracy on the 10,000-image test set: 99.27% (73 misclassified).
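To see how the tensor shrinks through the layers listed above, the following sketch traces the spatial size and counts trainable parameters. It assumes stride-1 convolutions without padding and 2×2 pooling (the Keras defaults); model/cnn.py may use different settings, in which case the numbers change.

```python
def conv_out(size, kernel=3):
    """Output width/height of a stride-1 convolution with no padding."""
    return size - kernel + 1


def pool_out(size, window=2):
    """Output width/height of non-overlapping max pooling (floor division)."""
    return size // window


def conv_params(in_channels, filters, kernel=3):
    return kernel * kernel * in_channels * filters + filters  # weights + biases


def dense_params(inputs, units):
    return inputs * units + units  # weights + biases


side = 28
side = pool_out(conv_out(side))   # Conv2D(32) + MaxPool: 28 -> 26 -> 13
side = pool_out(conv_out(side))   # Conv2D(64) + MaxPool: 13 -> 11 -> 5
flat = side * side * 64           # Flatten: 5 * 5 * 64 = 1600

total = (
    conv_params(1, 32)            # 320
    + conv_params(32, 64)         # 18,496
    + dense_params(flat, 128)     # 204,928
    + dense_params(128, 10)       # 1,290
)
print(flat, total)
```

Under these assumptions most of the parameters sit in the first dense layer, which is also where dropout is applied.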

Project Structure

DigitVision/
├── frontend/
│   ├── index.html          # Main page
│   ├── css/style.css       # Dark theme UI
│   └── js/
│       ├── app.js          # Entry point — event wiring
│       ├── canvas.js       # HTML5 Canvas drawing logic
│       ├── predict.js      # fetch() → Flask API
│       └── ui.js           # DOM updates, probability bars
├── backend/
│   ├── app.py              # Flask server + /api/predict route
│   ├── predict.py          # Inference pipeline
│   ├── preprocess.py       # base64 PNG → 28×28 float32 tensor
│   ├── model_loader.py     # Singleton model cache
│   └── requirements.txt
├── model/
│   ├── cnn.py              # CNN architecture definition
│   ├── train.py            # Interactive training script
│   ├── evaluate.py         # Confusion matrix + sample predictions
│   ├── visualize.py        # Training curve plots
│   └── checkpoints/        # Saved model weights (.keras)
└── docs/
    └── experiment_log.md   # Hyperparameter experiment results

Quick Start

# 1. Clone
git clone https://github.com/murodovdev/digit-vision.git
cd digit-vision

# 2. Create virtual environment
python -m venv venv
venv\Scripts\activate        # Windows
# source venv/bin/activate   # macOS/Linux

# 3. Install dependencies
pip install -r backend/requirements.txt

# 4. Train the model (interactive — you set the hyperparameters)
cd model
python train.py

# 5. Start the backend
cd ../backend
python app.py

# 6. Open frontend/index.html in your browser

Experiments

Part of this project was running controlled experiments to understand how each hyperparameter affects training. All results are logged in docs/experiment_log.md.

| Run | Learning Rate | Batch Size | Dropout | Accuracy | Key Finding |
|---|---|---|---|---|---|
| Baseline | 0.001 | 64 | 0.5 | 99.27% | Best balance of speed and accuracy. Reproducible with `SEED=42` |
| High LR | 0.05 | 64 | 0.5 | 10.28% | Loss stuck at ln(10): pure random guessing |
| No dropout | 0.001 | 64 | 0.0 | 98.94% | Overfitting: train loss → 0.008, val plateaued |
| Tiny batch | 0.001 | 8 | 0.5 | 99.28% | Same accuracy, 3.7× slower (621s vs 169s) |
| + Augmentation | 0.001 | 64 | 0.4 | 98.97% | Augmentation hurt: MNIST is already large enough |
| + BatchNorm | 0.0005 | 64 | 0.4 | 99.18% | Architecture change broke old hyperparameters |
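The ln(10) figure in the High LR row has a simple origin: if the network assigns every digit the same probability of 1/10, the cross-entropy loss for any example is -ln(0.1) = ln(10) ≈ 2.303, and accuracy sits near 10%. The sketch below checks that arithmetic, assuming the training loss is categorical cross-entropy (the usual choice for a softmax classifier, though not stated explicitly here).

```python
import math


def cross_entropy(probabilities, true_class):
    """Categorical cross-entropy for a single example."""
    return -math.log(probabilities[true_class])


uniform = [0.1] * 10                     # a model that has learned nothing
loss = cross_entropy(uniform, true_class=7)
print(round(loss, 4), round(math.log(10), 4))
```

The result does not depend on which class is the true one, so a loss that sits at this value suggests the outputs have collapsed to a uniform guess.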

Tech Stack

| Layer | Technology |
|---|---|
| Frontend | HTML5 Canvas, Vanilla JS (no frameworks) |
| Backend | Python, Flask, flask-cors |
| AI Model | TensorFlow 2.16, Keras |
| Dataset | MNIST (70,000 handwritten digit images) |
| Image processing | Pillow (resize, normalize) |
| Visualization | Matplotlib |

Key Concepts Implemented

Convolutional Neural Network (CNN) with 2 conv blocks
Dropout regularization to prevent overfitting
Early stopping with weight restoration to best epoch
Preprocessing pipeline — canvas PNG → normalized 28×28 tensor
REST API — browser communicates with model over HTTP/JSON
Singleton model loader — model loaded once, cached for all requests
Hyperparameter experiments — documented with hypothesis and results
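The singleton model loader idea from the list above can be illustrated with `functools.lru_cache`. This is one possible way to get load-once behaviour and not necessarily how backend/model_loader.py does it: `_load_from_disk`, `get_model`, and the checkpoint file name are hypothetical, and a plain dictionary stands in for the real model object.

```python
from functools import lru_cache

load_calls = 0  # counts how often the expensive load actually runs


def _load_from_disk(path):
    """Stand-in for the real loader; hypothetical, returns a placeholder."""
    global load_calls
    load_calls += 1
    return {"path": path}  # placeholder object instead of a real model


@lru_cache(maxsize=1)
def get_model(path="model/checkpoints/model.keras"):  # file name is assumed
    """Load the model on first use and return the cached object afterwards."""
    return _load_from_disk(path)


first = get_model()
second = get_model()
print(first is second, load_calls)
```

Because every request handler calls `get_model()` rather than loading the file itself, the cost of reading the checkpoint is paid only on the first request.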

License

MIT — feel free to use, modify, and build on this project.
