A full-stack AI web application that recognizes handwritten digits (0–9) in real time.
Built from scratch — including model training, REST API, and interactive frontend.
Draw any digit on the canvas. The AI predicts it instantly — showing the predicted digit, confidence score, and a full probability distribution across all 10 classes.
No external AI APIs. The model is trained and served entirely locally.
Draw → Predict → See confidence scores across all 10 digits
How to run it locally: follow the Quick Start below, open frontend/index.html, draw any digit and click Predict. Results appear instantly with a full probability bar chart.
User draws on canvas
↓
canvas.toDataURL() → base64 PNG
↓
POST /api/predict (Flask)
↓
Pillow: resize to 28×28, normalize to [0,1]
↓
CNN model: 10 softmax probabilities
↓
JSON response → UI update
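The two ends of this pipeline that do not depend on the model can be sketched with the standard library alone. The snippet below is a minimal illustration, not the project's actual `preprocess.py` or `predict.py`: the function names `decode_data_url` and `build_response` and the JSON keys `digit`, `confidence`, and `probabilities` are assumptions made for the example. The Pillow resize step and the CNN call are deliberately left out.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def decode_data_url(data_url: str) -> bytes:
    """Turn the string from canvas.toDataURL() into raw PNG bytes."""
    # A data URL looks like "data:image/png;base64,<payload>".
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    png_bytes = base64.b64decode(payload)
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("payload is not a PNG image")
    return png_bytes


def build_response(probabilities: list[float]) -> dict:
    """Shape 10 softmax probabilities into a JSON-serializable dict."""
    if len(probabilities) != 10:
        raise ValueError("expected one probability per digit 0-9")
    # The predicted digit is the index of the largest probability.
    digit = max(range(10), key=lambda i: probabilities[i])
    return {
        "digit": digit,                      # assumed key name
        "confidence": probabilities[digit],  # assumed key name
        "probabilities": probabilities,      # assumed key name
    }
```

Validating the data URL header and the PNG signature before handing bytes to an image library gives the API a clear place to reject malformed requests.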
Input (28×28×1)
→ Conv2D(32, 3×3, relu) + MaxPool
→ Conv2D(64, 3×3, relu) + MaxPool
→ Flatten
→ Dense(128, relu) + Dropout(0.5)
→ Dense(10, softmax)
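To see where the model's capacity sits, the layer shapes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults that the listing does not spell out: stride 1 with `valid` padding for the convolutions and a 2×2 pool. If `cnn.py` uses different settings, the numbers will differ. Dropout is omitted because it has no parameters and does not change the shape.

```python
def conv2d(shape, filters, kernel=3):
    """Output shape and parameter count of a stride-1, 'valid' convolution."""
    h, w, c = shape
    out = (h - kernel + 1, w - kernel + 1, filters)
    params = kernel * kernel * c * filters + filters  # weights + biases
    return out, params


def max_pool(shape, size=2):
    """Output shape of a non-overlapping max pool (no parameters)."""
    h, w, c = shape
    return (h // size, w // size, c), 0


def dense(units_in, units_out):
    """Parameter count of a fully connected layer (weights + biases)."""
    return units_in * units_out + units_out


def summarize():
    """Return (layer name, output shape, parameter count) for each layer."""
    rows = []
    shape = (28, 28, 1)
    for name, filters in (("conv1", 32), ("conv2", 64)):
        shape, params = conv2d(shape, filters)
        rows.append((name, shape, params))
        shape, _ = max_pool(shape)
        rows.append((name + "_pool", shape, 0))
    flat = shape[0] * shape[1] * shape[2]
    rows.append(("flatten", (flat,), 0))
    rows.append(("dense1", (128,), dense(flat, 128)))
    rows.append(("output", (10,), dense(128, 10)))
    return rows


if __name__ == "__main__":
    rows = summarize()
    for name, shape, params in rows:
        print(f"{name:<11} {str(shape):<14} {params:>8,}")
    print("total parameters:", sum(p for _, _, p in rows))
```

Under those assumptions the total comes to 225,034 parameters, of which 204,928 sit in the first dense layer.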
Trained on the MNIST training split (60,000 images). Final accuracy on the 10,000-image test split: 99.27% (73 misclassified).
DigitVision/
├── frontend/
│   ├── index.html              # Main page
│   ├── css/style.css           # Dark theme UI
│   └── js/
│       ├── app.js              # Entry point: event wiring
│       ├── canvas.js           # HTML5 Canvas drawing logic
│       ├── predict.js          # fetch() → Flask API
│       └── ui.js               # DOM updates, probability bars
├── backend/
│   ├── app.py                  # Flask server + /api/predict route
│   ├── predict.py              # Inference pipeline
│   ├── preprocess.py           # base64 PNG → 28×28 float32 tensor
│   ├── model_loader.py         # Singleton model cache
│   └── requirements.txt
├── model/
│   ├── cnn.py                  # CNN architecture definition
│   ├── train.py                # Interactive training script
│   ├── evaluate.py             # Confusion matrix + sample predictions
│   ├── visualize.py            # Training curve plots
│   └── checkpoints/            # Saved model weights (.keras)
└── docs/
    └── experiment_log.md       # Hyperparameter experiment results
# 1. Clone
git clone https://github.com/murodovdev/digit-vision.git
cd digit-vision
# 2. Create virtual environment
python -m venv venv
venv\Scripts\activate # Windows
# source venv/bin/activate # macOS/Linux
# 3. Install dependencies
pip install -r backend/requirements.txt
# 4. Train the model (interactive — you set the hyperparameters)
cd model
python train.py
# 5. Start the backend
cd ../backend
python app.py
# 6. Open frontend/index.html in your browser
Part of this project was running controlled experiments to understand how each hyperparameter affects training. All results are logged in docs/experiment_log.md.
| Run | Learning Rate | Batch Size | Dropout | Accuracy | Key Finding |
|---|---|---|---|---|---|
| Baseline | 0.001 | 64 | 0.5 | 99.27% | Best balance of speed and accuracy. Reproducible with SEED=42 |
| High LR | 0.05 | 64 | 0.5 | 10.28% | Loss stuck at ln(10) — pure random guessing |
| No dropout | 0.001 | 64 | 0.0 | 98.94% | Overfitting: train loss → 0.008, val plateaued |
| Tiny batch | 0.001 | 8 | 0.5 | 99.28% | Same accuracy, 3.7× slower (621s vs 169s) |
| + Augmentation | 0.001 | 64 | 0.4 | 98.97% | Augmentation hurt — MNIST is already large enough |
| + BatchNorm | 0.0005 | 64 | 0.4 | 99.18% | Architecture change broke old hyperparameters |
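Two numbers in the table can be checked by hand. A classifier that assigns equal probability to all 10 digits has a cross-entropy loss of ln(10) ≈ 2.303 and, on a roughly balanced test set, about 10% accuracy, which is what the High LR row shows. The sketch below verifies that value and the tiny-batch slowdown ratio; the helper name `cross_entropy` is chosen for illustration only.

```python
import math


def cross_entropy(probabilities, true_class):
    """Categorical cross-entropy for a single example."""
    return -math.log(probabilities[true_class])


uniform = [0.1] * 10  # a model that has learned nothing
loss = cross_entropy(uniform, true_class=3)

# Training times from the table: tiny batch vs. baseline, in seconds.
slowdown = 621 / 169

print(f"uniform loss = {loss:.4f}, ln(10) = {math.log(10):.4f}")
print(f"slowdown = {slowdown:.1f}x")
```

A training loss that sits at this value for several epochs is a quick signal that the optimizer has diverged rather than that the model needs more time.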
| Layer | Technology |
|---|---|
| Frontend | HTML5 Canvas, Vanilla JS (no frameworks) |
| Backend | Python, Flask, flask-cors |
| AI Model | TensorFlow 2.16, Keras |
| Dataset | MNIST (70,000 handwritten digit images) |
| Image processing | Pillow (resize, normalize) |
| Visualization | Matplotlib |
- Convolutional Neural Network (CNN) with 2 conv blocks
- Dropout regularization to reduce overfitting
- Early stopping with weight restoration to best epoch
- Preprocessing pipeline — canvas PNG → normalized 28×28 tensor
- REST API — browser communicates with model over HTTP/JSON
- Singleton model loader — model loaded once, cached for all requests
- Hyperparameter experiments — documented with hypothesis and results
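The singleton loader idea from the list above can be sketched with `functools.lru_cache`. This is an illustration under assumptions, not the contents of `model_loader.py`: `_load_from_disk` is a hypothetical stand-in for the real Keras loading call, and the checkpoint path is made up.

```python
from functools import lru_cache

load_calls = 0  # counts how often the expensive load actually runs


def _load_from_disk(path: str) -> dict:
    """Hypothetical stand-in for an expensive model load."""
    global load_calls
    load_calls += 1
    return {"path": path}  # a real loader would return a model object


@lru_cache(maxsize=1)
def get_model(path: str = "checkpoints/model.keras") -> dict:  # made-up path
    """Load the model on first use, then return the cached instance."""
    return _load_from_disk(path)


first = get_model()
second = get_model()
print(first is second, load_calls)
```

Every request handler can call `get_model()` freely; only the first call pays the loading cost. In a multi-threaded server the first load may still run more than once if two requests arrive before the result is cached, so loading the model once at startup is a common alternative.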
MIT — feel free to use, modify, and build on this project.
