This demo shows how DevOps engineers can get started with MLOps by building a simple text classification system. It demonstrates the complete ML pipeline from data preparation to model deployment tracking.
This demo builds a vehicle inspection classifier that predicts whether a vehicle will pass or fail based on text descriptions of issues. It's like a simplified version of what you might see in MOT (Ministry of Transport) inspection systems.
Example:
- Input: "brakes worn and squealing" → Output: FAIL (0)
- Input: "tyres in good condition" → Output: PASS (1)
MiniLM sentence transformer (all-MiniLM-L6-v2):
- What it is: A lightweight transformer model that converts text into numerical vectors
- Why we use it: Text can't be directly fed into machine learning algorithms - we need to convert words into numbers
- What it does: Takes phrases like "brakes worn" and converts them into a 384-dimensional vector of numbers
- In this demo: Converts our vehicle inspection text into embeddings that the classifier can understand
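To make the "text → fixed-size vector" idea concrete without downloading the ~90MB MiniLM model, here is a minimal offline sketch. It swaps in scikit-learn's `HashingVectorizer` as a stand-in for the real embedder — hashed token counts, not semantic embeddings — but it shows the same shape contract the classifier relies on:

```python
from sklearn.feature_extraction.text import HashingVectorizer

# Stand-in for the MiniLM embedder: maps any phrase to a fixed-size
# numeric vector. MiniLM produces dense *semantic* embeddings; this
# hashing trick only illustrates the "text -> 384 numbers" idea.
vectorizer = HashingVectorizer(n_features=384)

vec = vectorizer.transform(["brakes worn and squealing"])
print(vec.shape)  # (1, 384) -- one row per input phrase, 384 dimensions
```

In the demo itself the equivalent call is `SentenceTransformer("all-MiniLM-L6-v2").encode(texts)` from the sentence-transformers library, which returns one 384-dimensional vector per input string.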
scikit-learn:
- What it is: A popular Python library for machine learning
- Why we use it: Provides simple, reliable algorithms for classification tasks
- What it does: Takes our text embeddings and learns patterns to predict pass/fail
- In this demo: Uses Logistic Regression to classify vehicles as pass (1) or fail (0)
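The embed-then-classify step can be sketched end to end on toy data. The four example texts and labels below are illustrative (not the repo's dataset), and TF-IDF features stand in for the MiniLM embeddings so the sketch runs offline:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy data in the same spirit as the demo (labels: 1 = pass, 0 = fail).
texts = [
    "brakes worn and squealing", "exhaust leaking badly",
    "tyres in good condition", "all lights working correctly",
]
labels = [0, 0, 1, 1]

# TF-IDF stands in for the MiniLM embeddings used by the real demo.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)

# Learn pass/fail patterns from the numeric features.
clf = LogisticRegression()
clf.fit(X, labels)

# Predict on a new description.
pred = clf.predict(vectorizer.transform(["brakes worn"]))
print(pred[0])  # 0 (fail) -- "brakes worn" only overlaps the failing examples
```

The real pipeline is identical in shape: replace the TF-IDF matrix with the 384-dimensional embedding matrix and the same `LogisticRegression` call applies unchanged.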
MLflow:
- What it is: An open-source platform for managing the ML lifecycle
- Why we use it: Tracks experiments, parameters, metrics, and models for reproducibility
- What it does: Logs everything about our training runs so we can compare different approaches
- In this demo: Records model accuracy, parameters, and saves the trained model for future use
For Docker approach (recommended):
- Docker (version 20.10+)
- Docker Compose (version 2.0+)
- Make (for convenience commands)
For local Python approach:
- Python 3.8+ (3.11 recommended)
- pip (Python package manager)
- Git (for MLflow experiment tracking)
System requirements:
- RAM: 4GB minimum (8GB recommended)
- Disk Space: 2GB free space
- Network: Internet connection for model downloads (~90MB first run)
✅ Fully Supported:
- macOS (Intel and Apple Silicon)
- Linux (Ubuntu, CentOS, Debian, etc.)
- Windows (with WSL2 or Docker Desktop)
- Use Docker Desktop for best experience
- For local Python: Use WSL2 or Git Bash
- Make commands: Use Git Bash or PowerShell with Make installed
Check if you have the basics:
```bash
# Check Docker
docker --version
docker-compose --version

# Check Python (if running locally)
python3 --version
pip --version

# Check Make (optional, for convenience)
make --version
```

Need help installing prerequisites? See INSTALL.md for detailed installation instructions for Windows, macOS, and Linux.
The easiest way to run this demo:
```bash
make run
```

This will:
- ✅ Run the complete training pipeline with rich progress feedback
- ✅ Start the MLflow web interface automatically
- ✅ Keep running until you stop it (Ctrl+C)
- ✅ Serve results at http://localhost:5001

Model Caching: The first run downloads the MiniLM model (~90MB). For faster subsequent runs, you can pre-download the model with `make download-model`.
When you see "Starting MLflow UI server...":
- ✅ Training is completely finished
- ✅ Model has been saved
- ✅ Only the web server is running
- You can now explore results in the browser
When you run `make run` and open http://localhost:5001, you'll see:
- MiniLM_Light_Demo: Your experiment containing all training runs
- Run History: Each time you run the demo, it creates a new run with timestamp
Click on any run to see:
Parameters (What we configured):
- model_type: "LogisticRegression"
- embedding_model: "all-MiniLM-L6-v2"
- dataset_size: 10 (number of training examples)
- test_size: 0.3 (30% of data used for testing)
Metrics (How well it performed):
- accuracy: Model accuracy on test data (e.g., 0.33 = 33%)
- training_samples: Number of examples used for training
- test_samples: Number of examples used for testing
Artifacts (What we saved):
- model/: The trained classifier you can use for predictions
- sample_predictions.json: Example predictions on test data
- Saved Models: Your trained classifier ready for deployment
- Model Signatures: Input/output format for the model
- Sample Input: Example of what the model expects
The demo uses a very small dataset (10 examples) for speed, so:
- Low accuracy is expected (33% in this case)
- Real-world systems would use thousands of examples
- The point is the process, not perfect accuracy
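The 33% figure follows directly from the logged parameters: 10 examples with `test_size=0.3` leaves only 3 test samples, so a single correct prediction already moves accuracy by a third. A quick check (the labels below are illustrative):

```python
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# 10 examples with test_size=0.3, matching the demo's logged parameters.
data = list(range(10))          # placeholder for the 10 inspection texts
train, test = train_test_split(data, test_size=0.3, random_state=42)
print(len(train), len(test))    # 7 3

# With only 3 test samples, one correct prediction gives 33% accuracy.
print(round(accuracy_score([0, 1, 1], [0, 0, 0]), 2))  # 0.33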
ImportError: No module named 'google'
- This happens when MLflow dependencies aren't properly installed
- Solution: Use the updated requirements.txt or the Docker approach
CUDA/GPU issues
- The demo works fine on CPU
- If you have GPU issues, the Docker approach handles this automatically
Port 5000 already in use (macOS Control Center)
- Docker version automatically uses port 5001: http://localhost:5001
- For local MLflow: run `mlflow ui --port 5001`
- Or stop other services using port 5000
Windows:
- "make is not recognized": Install Make via Chocolatey (`choco install make`) or use Git Bash
- Docker Desktop not starting: Ensure WSL2 is enabled and updated
- Permission denied: Run PowerShell as Administrator for Docker commands
- Path issues: Use forward slashes in paths, or use WSL2
macOS:
- Apple Silicon (M1/M2): Docker works natively, no special setup needed
- Intel Macs: Full compatibility with Docker Desktop
- Port conflicts: macOS Control Center uses port 5000, demo uses 5001
Linux:
- Docker permission denied: Add your user to the docker group: `sudo usermod -aG docker $USER`
- Make not found: Install via package manager (`sudo apt install make` on Ubuntu)
- Python version: Ensure Python 3.8+ is installed (`python3 --version`)
Docker Issues:
- "Cannot connect to Docker daemon": Start Docker Desktop or Docker service
- Out of disk space: Clean up with `docker system prune -a`
- Build failures: Try `docker-compose build --no-cache`
Run the health check script to verify your environment:
```bash
python health_check.py
```

This will verify all dependencies are properly installed.
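The repo's `health_check.py` may check more than this, but the core of such a script is just probing imports. A minimal sketch (the `REQUIRED` list is an assumption based on the demo's dependencies):

```python
import importlib.util

# Assumed dependency list; the real health_check.py may check more,
# e.g. package versions, Docker availability, or free disk space.
REQUIRED = ["sklearn", "mlflow", "sentence_transformers"]

def check_environment(modules=REQUIRED):
    """Return the list of required modules that cannot be imported."""
    missing = []
    for name in modules:
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing

if __name__ == "__main__":
    missing = check_environment()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All dependencies installed ✅")
```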
```bash
make help            # Show all available commands
make health-check    # Verify your environment
make clean           # Clean up Docker containers
make demo            # Quick demo (training only, exits cleanly)
make train           # Training only (no UI)
make download-model  # Pre-download model to cache (run once for faster runs)
```

Note: If `make` is not available on your system, you can run the Docker commands directly (see "Direct Docker Commands" section below).
```bash
# Full demo with UI (same as 'make run')
docker-compose up --build

# Quick demo (exits cleanly)
docker-compose --profile demo up --build demo

# Training only
docker-compose --profile train up --build train-only
```

Linux/macOS:
```bash
# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run demo
python train_light.py

# Start MLflow UI (in separate terminal)
mlflow ui
```

Windows:
```bash
# Create virtual environment
python -m venv venv
venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run demo
python train_light.py

# Start MLflow UI (in separate terminal)
mlflow ui
```

Windows (WSL2):
```bash
# Same as Linux/macOS commands
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python train_light.py
mlflow ui
```

Light demo (train_light.py):
- Fast: Completes in under 30 seconds
- Rich feedback: Clear progress indicators and step-by-step output
- Reliable: Uses a smaller model, less likely to fail
- Better logging: Includes model signatures and sample predictions
- Fixed warnings: No Git or MLflow signature warnings

Original demo:
- Educational: Shows basic MLOps concepts
- Slower: Takes longer due to larger model downloads
- Basic feedback: Minimal progress indicators
- CI/CD concepts apply directly to ML: track everything, automate everything.
- MLflow acts like Jenkins for ML experiments — keeping runs reproducible and comparable.
- This example uses a toy dataset, but you can extend it with real data (e.g. DVLA MOT datasets).