Cloud agnostic tech stack for starting an MLOps platform (Level 1)
"We'll build a pipeline - after we deploy the model."
Model drift will hit when it's least convenient for you
To run: Make sure Docker is running and you have Docker Compose installed.
-
Clone the project
git clone https://github.com/jmeisele/ml-ops.git
-
Change directories into the repo
cd ml-ops
-
Run database migrations and create the first Airflow user account.
docker-compose up airflow-init
-
Build our images and launch with docker compose
docker-compose pull && docker-compose up
-
Open a browser and log in to MinIO
user: minioadmin
password: minioadmin
Create a bucket called mlflow
-
Open a browser and log in to Grafana
user: admin
password: admin
Both Prometheus and InfluxDB data sources have already been provisioned, along with an MLOps Demo Dashboard and a Notification Channel.
-
Start the send_data.py script, which sends a POST request every 0.1 seconds
-
Open a browser and turn on the Airflow DAG used to retrain our ML model
user: airflow
password: airflow
- Lower the alarm threshold to see the Airflow DAG pipeline get triggered
-
Check MLflow after the Airflow DAG has run to see the model artifacts stored using MinIO as the object storage layer.
-
(Optional) Send a POST request to our model service API endpoint
curl -v -H "Content-Type: application/json" -X POST -d '{ "median_income_in_block": 8.3252, "median_house_age_in_block": 41, "average_rooms": 6, "average_bedrooms": 1, "population_per_block": 322, "average_house_occupancy": 2.55, "block_latitude": 37.88, "block_longitude": -122.23 }' http://localhost/model/predict
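The same request can also be sent from Python. The following is a minimal sketch using only the standard library: the URL and payload are copied from the curl command above, while the function names (`build_predict_request`, `predict`) are illustrative, and treating the response body as JSON is an assumption about the model service.

```python
import json
import urllib.request

# Endpoint and payload copied from the curl example above.
PREDICT_URL = "http://localhost/model/predict"

SAMPLE_BLOCK = {
    "median_income_in_block": 8.3252,
    "median_house_age_in_block": 41,
    "average_rooms": 6,
    "average_bedrooms": 1,
    "population_per_block": 322,
    "average_house_occupancy": 2.55,
    "block_latitude": 37.88,
    "block_longitude": -122.23,
}


def build_predict_request(features, url=PREDICT_URL):
    """Build (but do not send) a JSON POST request for the model service."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, url=PREDICT_URL, timeout=5.0):
    """Send the request and decode the response.

    Assumption: the service answers with a JSON body.
    """
    request = build_predict_request(features, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the stack running, `predict(SAMPLE_BLOCK)` issues one request. Calling it in a loop with `time.sleep(0.1)` between calls approximates the traffic that send_data.py generates.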
-
(Optional) If you are so bold, you can also simulate production traffic using Locust. Keep in mind that you have a lot of services running on your local machine; you would never deploy a production ML API on your local machine to handle production traffic.
- nginx: Load Balancer
- python-model-service1: FastAPI Machine Learning API 1
- python-model-service2: FastAPI Machine Learning API 2
- postgresql: RDBMS
- rabbitmq: Message Queue
- rabbitmq workers: Workers listening to RabbitMQ
- locust: Load testing and simulate production traffic
- prometheus: Metrics scraping
- minio: Object storage
- mlflow: Machine Learning Experiment Management
- influxdb: Time Series Database
- chronograf: Admin & WebUI for InfluxDB
- grafana: Performance Monitoring
- redis: Cache
- airflow: Workflow Orchestrator
- bridge server: Receives webhooks from Grafana and translates them into Airflow REST API calls
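
To illustrate the bridge server's role, here is a rough sketch of the translation step, not the repository's actual implementation. It assumes that Grafana's webhook body is JSON with `state` and `ruleName` fields, and that Airflow exposes its stable REST API (`POST /api/v1/dags/{dag_id}/dagRuns`) with basic authentication. The base URL `http://localhost:8080` and the DAG id `retrain_model` are hypothetical placeholders. The function only builds the request, so it can be inspected without a running stack.

```python
import base64
import json
import urllib.request

AIRFLOW_URL = "http://localhost:8080"  # assumption: Airflow webserver address
DAG_ID = "retrain_model"               # hypothetical DAG id, for illustration only


def should_trigger(alert):
    """Only a firing alert should start a retraining run."""
    return alert.get("state") == "alerting"


def build_dag_run_request(alert, base_url=AIRFLOW_URL, dag_id=DAG_ID,
                          user="airflow", password="airflow"):
    """Translate a Grafana alert into an Airflow 'trigger DAG run' request."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    # Pass the alert's rule name along as DAG run configuration.
    body = json.dumps({"conf": {"rule_name": alert.get("ruleName", "")}})
    return urllib.request.Request(
        f"{base_url}/api/v1/dags/{dag_id}/dagRuns",
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


def handle_webhook(raw_body):
    """Return the Airflow request for a firing alert, or None otherwise."""
    alert = json.loads(raw_body)
    if not should_trigger(alert):
        return None
    return build_dag_run_request(alert)
```

A real bridge would wrap `handle_webhook` in an HTTP handler and pass the returned request to `urllib.request.urlopen`. Keeping the translation as a pure function makes the mapping between the two payloads easy to check in isolation.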
Warning: scripts in /docker-entrypoint-initdb.d are only run if you start the container with a data directory that is empty; any pre-existing database will be left untouched on container startup.
Thanks goes to these incredible people: