mathewsrc/machine-learning-monitoring-with-evidently

machine-learning-monitoring-with-evidently

ML Monitoring Capstone

Summary

  • Development of machine learning models with PySpark ML.
  • Model monitoring (data drift, prediction drift, and model performance metrics) with Evidently AI.
  • Extraction, transformation, and loading (ETL) of API data with PySpark, Spark SQL, and DuckDB.
  • Data quality checks with the Soda library.
  • Experiment tracking and model registry with MLflow.

Sections

Problem Understanding

TODO

Architecture Overview

TODO

Project Structure

TODO

How to Run this Project

Generating plots using CLI

Run the following command to see all CLI options:

poetry run python src/visualizations/visualizer.py 

For example, run the following command to visualize contracts over time:

poetry run python src/visualizations/visualizer.py contracts-over-time
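
The way a subcommand such as contracts-over-time can be wired up is sketched below. This is only an illustration built on the standard-library argparse module; the actual visualizer.py may use a different CLI library and different internals. The function names build_parser, plot_contracts_over_time, and main are hypothetical, and the plot function is a placeholder that returns a message instead of drawing anything.

```python
import argparse


def plot_contracts_over_time() -> str:
    # Placeholder (hypothetical): a real implementation would load the data
    # and render the plot here.
    return "contracts-over-time plot requested"


# Maps each subcommand name to the function that handles it.
COMMANDS = {
    "contracts-over-time": plot_contracts_over_time,
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="visualizer.py",
        description="Generate plots from the command line.",
    )
    subparsers = parser.add_subparsers(dest="command", metavar="COMMAND")
    subparsers.add_parser(
        "contracts-over-time",
        help="Plot the number of contracts over time.",
    )
    return parser


def main(argv=None) -> int:
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.command is None:
        # No subcommand given: list the available options, as the README describes.
        parser.print_help()
        return 0
    print(COMMANDS[args.command]())
    return 0
```

Calling main([]) prints the help text with the available commands, while main(["contracts-over-time"]) dispatches to the placeholder plot function.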

Prerequisites

TODO

Examples

TODO

Data Collection

TODO

Data Preparation

TODO

image

Model Training

TODO

image

Model Evaluation

Batch date periods

┌─────────────────┬────────────┐
│ min(start_date) │  end_date  │
│      date       │    date    │
├─────────────────┼────────────┤
│ 2022-01-03      │ 2022-07-03 │
│ 2022-02-01      │ 2022-08-01 │
│ 2022-03-01      │ 2022-09-01 │
└─────────────────┴────────────┘
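
The table above suggests that each batch spans six calendar months from its start date. A minimal sketch of that date arithmetic follows, assuming this reading is correct; how the project actually derives end_date is not documented here, and the helper add_months is hypothetical.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift d by a number of calendar months, clamping the day to the month's end."""
    month_index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(month_index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


# Start dates taken from the table above.
start_dates = [date(2022, 1, 3), date(2022, 2, 1), date(2022, 3, 1)]

# Assumption: each batch ends six calendar months after it starts.
batches = [(start, add_months(start, 6)) for start in start_dates]

for start, end in batches:
    print(start.isoformat(), end.isoformat())
```

The clamping step only matters for start dates near the end of a month, for example a batch starting on the 31st whose end month is shorter.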

Reference dataset

1991-12-03 00:00:00
2021-01-01 00:00:00
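
If the two timestamps above are the earliest and latest dates covered by the reference dataset (an assumption, since they are not labeled), separating reference rows from later batch rows could look like the sketch below. The record layout, the start_date key, and the sample rows are hypothetical and exist only for illustration.

```python
from datetime import datetime

# Assumed bounds of the reference dataset, read from the two timestamps above.
REFERENCE_START = datetime(1991, 12, 3)
REFERENCE_END = datetime(2021, 1, 1)


def is_reference(ts: datetime) -> bool:
    """Return True if ts falls inside the assumed reference period (inclusive)."""
    return REFERENCE_START <= ts <= REFERENCE_END


def split_reference_current(records):
    """Split records into (reference, current) lists by their 'start_date' value."""
    reference = [r for r in records if is_reference(r["start_date"])]
    current = [r for r in records if r["start_date"] > REFERENCE_END]
    return reference, current


# Hypothetical sample rows, not taken from the project data.
sample = [
    {"id": 1, "start_date": datetime(2005, 6, 15)},
    {"id": 2, "start_date": datetime(2021, 1, 1)},
    {"id": 3, "start_date": datetime(2022, 1, 3)},
]

reference, current = split_reference_current(sample)
print(len(reference), len(current))
```

In a monitoring setup, the reference rows serve as the baseline against which each later batch is compared for drift.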

image

Model Deployment

TODO

Model Monitoring

TODO

TestSuite result (partial)

image

Metrics result (partial)

image

image