DonaldSimpson/mlops_minilm_demo
MiniLM + MLflow Demo (MLOps for DevOps Engineers)

This demo shows how DevOps engineers can get started with MLOps by building a simple text classification system. It demonstrates the complete ML pipeline from data preparation to model deployment tracking.

What This Demo Does

This demo builds a vehicle inspection classifier that predicts whether a vehicle will pass or fail based on text descriptions of issues. It's like a simplified version of what you might see in MOT (Ministry of Transport) inspection systems.

Example:

  • Input: "brakes worn and squealing" → Output: FAIL (0)
  • Input: "tyres in good condition" → Output: PASS (1)

Technologies Used

MiniLM for Text Embeddings

  • What it is: A lightweight transformer model that converts text into numerical vectors
  • Why we use it: Text can't be fed directly into machine learning algorithms; we first need to convert words into numbers
  • What it does: Takes phrases like "brakes worn" and converts them into a 384-dimensional vector of numbers
  • In this demo: Converts our vehicle inspection text into embeddings that the classifier can understand

scikit-learn for Classification

  • What it is: A popular Python library for machine learning
  • Why we use it: Provides simple, reliable algorithms for classification tasks
  • What it does: Takes our text embeddings and learns patterns to predict pass/fail
  • In this demo: Uses Logistic Regression to classify vehicles as pass (1) or fail (0)
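A minimal sketch of the classification step, using random vectors in place of real MiniLM embeddings so it runs standalone (the labels and dimensions mirror this demo's setup):

```python
# Sketch: Logistic Regression over embedding vectors.
# Random vectors stand in for real MiniLM embeddings here.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 384))   # 10 "embeddings", 384 dims like MiniLM
y = np.array([0, 1] * 5)         # 0 = fail, 1 = pass

clf = LogisticRegression(max_iter=1000).fit(X, y)
preds = clf.predict(X)           # one 0/1 prediction per vehicle
```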

MLflow for Experiment Tracking

  • What it is: An open-source platform for managing the ML lifecycle
  • Why we use it: Tracks experiments, parameters, metrics, and models for reproducibility
  • What it does: Logs everything about our training runs so we can compare different approaches
  • In this demo: Records model accuracy, parameters, and saves the trained model for future use

Prerequisites

Required Software

For Docker approach (recommended):

  • Docker (version 20.10+)
  • Docker Compose (version 2.0+)
  • Make (for convenience commands)

For local Python approach:

  • Python 3.8+ (3.11 recommended)
  • pip (Python package manager)
  • Git (for MLflow experiment tracking)

System Requirements

  • RAM: 4GB minimum (8GB recommended)
  • Disk Space: 2GB free space
  • Network: Internet connection for model downloads (~90MB first run)

Platform Support

Fully Supported:

  • macOS (Intel and Apple Silicon)
  • Linux (Ubuntu, CentOS, Debian, etc.)
  • Windows (with WSL2 or Docker Desktop)

⚠️ Windows Notes:

  • Use Docker Desktop for best experience
  • For local Python: Use WSL2 or Git Bash
  • Make commands: Use Git Bash or PowerShell with Make installed

Quick Platform Check

Check if you have the basics:

# Check Docker
docker --version
docker-compose --version

# Check Python (if running locally)
python3 --version
pip --version

# Check Make (optional, for convenience)
make --version

Need help installing prerequisites? See INSTALL.md for detailed installation instructions for Windows, macOS, and Linux.

Quick Start

The easiest way to run this demo:

make run

This will:

  • ✅ Run the complete training pipeline with rich progress feedback
  • ✅ Start MLflow web interface automatically
  • ✅ Keep running until you stop it (Ctrl+C)
  • ✅ Serve results at http://localhost:5001

Model Caching: The first run downloads the MiniLM model (~90MB). For faster subsequent runs, you can pre-download the model with make download-model.

What Happens When Training Completes

When you see "Starting MLflow UI server...":

  • Training is completely finished
  • Model has been saved
  • Only the web server is running
  • You can now explore results in the browser

Understanding the Results

What to Look for in MLflow UI

When you run make run and open http://localhost:5001, you'll see:

1. Experiments Page

  • MiniLM_Light_Demo: Your experiment containing all training runs
  • Run History: Each time you run the demo, it creates a new run with timestamp

2. Individual Run Details

Click on any run to see:

Parameters (What we configured):

  • model_type: "LogisticRegression"
  • embedding_model: "all-MiniLM-L6-v2"
  • dataset_size: 10 (number of training examples)
  • test_size: 0.3 (30% of data used for testing)

Metrics (How well it performed):

  • accuracy: Model accuracy on test data (e.g., 0.33 = 33%)
  • training_samples: Number of examples used for training
  • test_samples: Number of examples used for testing

Artifacts (What we saved):

  • model/: The trained classifier you can use for predictions
  • sample_predictions.json: Example predictions on test data

3. Model Registry

  • Saved Models: Your trained classifier ready for deployment
  • Model Signatures: Input/output format for the model
  • Sample Input: Example of what the model expects

Understanding the Performance

The demo uses a very small dataset (10 examples) for speed, so:

  • Low accuracy is expected (33% in this case)
  • Real-world systems would use thousands of examples
  • The point is the process, not perfect accuracy
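To see why 33% is such a coarse number: with 10 examples and a 0.3 test split, only 3 examples land in the test set, so accuracy can only be 0%, 33%, 67%, or 100%. A quick check:

```python
# Sketch: how many examples end up in the test split.
from sklearn.model_selection import train_test_split

examples = list(range(10))                     # stand-in for the 10 demo examples
train, test = train_test_split(examples, test_size=0.3, random_state=42)
print(len(train), len(test))                   # 7 train, 3 test
```

So 33% accuracy here means exactly 1 of 3 test examples was classified correctly.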


Troubleshooting

Common Issues

ImportError: No module named 'google'

  • This happens when MLflow dependencies aren't properly installed
  • Solution: Use the updated requirements.txt or Docker approach

CUDA/GPU issues

  • The demo works fine on CPU
  • If you have GPU issues, the Docker approach handles this automatically

Port 5000 already in use (macOS Control Center)

  • Docker version automatically uses port 5001: http://localhost:5001
  • For local MLflow: mlflow ui --port 5001
  • Or stop other services using port 5000

Platform-Specific Issues

Windows:

  • "make is not recognized": Install Make via Chocolatey (choco install make) or use Git Bash
  • Docker Desktop not starting: Ensure WSL2 is enabled and updated
  • Permission denied: Run PowerShell as Administrator for Docker commands
  • Path issues: Use forward slashes in paths, or use WSL2

macOS:

  • Apple Silicon (M1/M2): Docker works natively, no special setup needed
  • Intel Macs: Full compatibility with Docker Desktop
  • Port conflicts: macOS Control Center uses port 5000, demo uses 5001

Linux:

  • Docker permission denied: Add user to docker group: sudo usermod -aG docker $USER
  • Make not found: Install via package manager (sudo apt install make on Ubuntu)
  • Python version: Ensure Python 3.8+ is installed (python3 --version)

Docker Issues:

  • "Cannot connect to Docker daemon": Start Docker Desktop or Docker service
  • Out of disk space: Clean up with docker system prune -a
  • Build failures: Try docker-compose build --no-cache

Health Check

Run the health check script to verify your environment:

python health_check.py

This will verify all dependencies are properly installed.
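The contents of health_check.py aren't shown here, but a minimal check along these lines would verify that the key imports resolve (the module names are assumptions based on the stack described above):

```python
# Sketch: verify that the demo's dependencies are importable.
import importlib.util

required = ["numpy", "sklearn", "sentence_transformers", "mlflow"]
missing = [name for name in required if importlib.util.find_spec(name) is None]

if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All dependencies found")
```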

Additional Commands

Other Make Targets

make help          # Show all available commands
make health-check  # Verify your environment
make clean         # Clean up Docker containers
make demo          # Quick demo (training only, exits cleanly)
make train         # Training only (no UI)
make download-model # Pre-download model to cache (run once for faster runs)

Note: If make is not available on your system, you can run the Docker commands directly (see "Direct Docker Commands" section below).

Direct Docker Commands (Alternative to make)

# Full demo with UI (same as 'make run')
docker-compose up --build

# Quick demo (exits cleanly)
docker-compose --profile demo up --build demo

# Training only
docker-compose --profile train up --build train-only

Local Python (Without Docker)

Linux/macOS:

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run demo
python train_light.py

# Start MLflow UI (in separate terminal)
mlflow ui

Windows:

# Create virtual environment
python -m venv venv
venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run demo
python train_light.py

# Start MLflow UI (in separate terminal)
mlflow ui

Windows (WSL2):

# Same as Linux/macOS commands
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python train_light.py
mlflow ui

Demo Versions

Lightweight Demo (train_light.py) - Recommended

  • Fast: Completes in under 30 seconds
  • Rich feedback: Clear progress indicators and step-by-step output
  • Reliable: Uses smaller model, less likely to fail
  • Better logging: Includes model signatures and sample predictions
  • Fixed warnings: No Git or MLflow signature warnings

Original Demo (train.py)

  • Educational: Shows basic MLOps concepts
  • Slower: Takes longer due to larger model downloads
  • Basic feedback: Minimal progress indicators

Takeaways

  • CI/CD concepts apply directly to ML: track everything, automate everything.
  • MLflow acts like Jenkins for ML experiments — keeping runs reproducible and comparable.
  • This example uses a toy dataset, but you can extend it with real data (e.g. DVLA MOT datasets).
