This project builds an MLOps pipeline using Evidently for monitoring model performance and Prefect for task orchestration. It processes NYC taxi data, stores metrics in PostgreSQL, and visualizes results in Grafana via Docker Compose.

Mannerow/mlops-homework-05

mlops-homework-05

📝 Description

This project builds an MLOps pipeline that uses Evidently to monitor a machine learning model, tracking metrics such as prediction drift and the median fare amount. Prefect orchestrates the workflow, managing tasks such as database updates and metric calculations. The metrics are stored in PostgreSQL and visualized in interactive Grafana dashboards for real-time analysis. A Docker Compose environment runs the PostgreSQL, Adminer, and Grafana services that handle data storage, management, and visualization.

🔧 Command to Run

docker-compose up

This will initiate the following steps:

1. Runs the Jupyter notebook: 'baseline_model_nyc_taxi_data.ipynb'

This notebook develops a regression model on New York City taxi data to predict trip durations or fare amounts. It uses pandas for data manipulation, scikit-learn for model building and evaluation, and matplotlib and seaborn for visualization. The workflow covers data cleaning, exploratory data analysis, feature engineering, model training, and validation. The notebook also uses Evidently to monitor prediction drift, the number of drifted columns, missing values, and regression performance quality, and to track the median fare amount.
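To make the monitored quantities above more concrete, here is a minimal sketch that computes a median, a share of missing values, and a simple drift score using only the Python standard library. It is not Evidently's API and does not reproduce how Evidently chooses its statistical tests; the two-sample Kolmogorov-Smirnov statistic is swapped in purely as one common way to quantify drift. The function names and the sample values are hypothetical.

```python
import bisect
import statistics
from typing import Optional, Sequence


def median_ignoring_missing(values: Sequence[Optional[float]]) -> float:
    """Return the median (0.5 quantile) of the non-missing values."""
    present = [v for v in values if v is not None]
    return statistics.median(present)


def share_of_missing(values: Sequence[Optional[float]]) -> float:
    """Return the fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def ks_statistic(reference: Sequence[float], current: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic.

    This is the largest absolute gap between the empirical CDFs of the
    two samples: 0.0 means identical distributions, 1.0 means no overlap.
    """
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in sorted(set(ref) | set(cur)):
        ref_cdf = bisect.bisect_right(ref, x) / len(ref)
        cur_cdf = bisect.bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(ref_cdf - cur_cdf))
    return max_gap


# Hypothetical placeholder values, not taken from the NYC taxi data.
fare_amounts = [10.0, None, 12.0, 30.0]
reference_predictions = [11.0, 12.5, 14.0, 15.5]
current_predictions = [18.0, 19.5, 21.0, 22.5]

median_fare = median_ignoring_missing(fare_amounts)
missing_share = share_of_missing(fare_amounts)
drift_score = ks_statistic(reference_predictions, current_predictions)
```

In a monitoring setup, `reference_predictions` would come from a baseline period and `current_predictions` from the period being checked; a drift score above a chosen threshold would flag the column as drifted.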

2. Runs the monitoring script: 'evidently_metrics.py'

This Python script uses the Evidently library to monitor key metrics such as prediction drift, the number of drifted columns, and the median fare amount. Prefect orchestrates the pipeline, managing tasks such as database preparation and daily metric calculation. The metrics generated by Evidently are written to a PostgreSQL database through the psycopg library. Grafana visualizes the stored metrics in interactive dashboards for real-time monitoring and analysis. The Docker Compose setup provides the PostgreSQL, Adminer, and Grafana services.
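The "prepare the database, then store one row of metrics per day" pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of 'evidently_metrics.py': it uses Python's built-in sqlite3 as a stand-in for PostgreSQL and psycopg so that it runs without a database server, and the table name, column names, function names, dates, and metric values are all hypothetical.

```python
import datetime
import sqlite3

# Hypothetical table layout: one row of monitoring metrics per day.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS monitoring_metrics (
    day TEXT PRIMARY KEY,
    prediction_drift REAL,
    num_drifted_columns INTEGER,
    median_fare_amount REAL
)
"""


def prepare_database(conn: sqlite3.Connection) -> None:
    """Create the metrics table if it does not exist yet."""
    conn.execute(CREATE_TABLE)
    conn.commit()


def store_daily_metrics(
    conn: sqlite3.Connection,
    day: datetime.date,
    prediction_drift: float,
    num_drifted_columns: int,
    median_fare_amount: float,
) -> None:
    """Insert the metrics for one day, replacing any earlier row for that day."""
    conn.execute(
        "INSERT OR REPLACE INTO monitoring_metrics VALUES (?, ?, ?, ?)",
        (day.isoformat(), prediction_drift, num_drifted_columns, median_fare_amount),
    )
    conn.commit()


def load_metrics(conn: sqlite3.Connection) -> list:
    """Return all stored rows ordered by day, as a dashboard query might."""
    query = (
        "SELECT day, prediction_drift, num_drifted_columns, median_fare_amount "
        "FROM monitoring_metrics ORDER BY day"
    )
    return conn.execute(query).fetchall()


# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
prepare_database(conn)

# Placeholder metric values for two hypothetical days.
store_daily_metrics(conn, datetime.date(2024, 3, 1), 0.02, 0, 12.5)
store_daily_metrics(conn, datetime.date(2024, 3, 2), 0.15, 2, 14.0)
# Re-running a day overwrites its row instead of duplicating it.
store_daily_metrics(conn, datetime.date(2024, 3, 2), 0.16, 2, 14.0)

rows = load_metrics(conn)
```

With psycopg and PostgreSQL the same structure would apply, but the details differ: psycopg uses `%s` placeholders rather than `?`, and PostgreSQL expresses the overwrite with `INSERT ... ON CONFLICT` rather than `INSERT OR REPLACE`. In a Prefect pipeline, functions like `prepare_database` and `store_daily_metrics` would typically be wrapped as tasks and called from a flow.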

Open the Grafana UI to View the Dashboards

Navigate to http://localhost:3000/

Open the dashboard titled 'Taxi Duration Prediction'
