ML Monitoring Capstone
- Development of machine learning models with PySpark ML.
- Model monitoring (data drift, prediction drift, and model performance metrics) with Evidently AI.
- Extraction, transformation, and loading (ETL) of API data with PySpark, Spark SQL, and DuckDB.
- Data quality checks with the Soda library.
- Experiment tracking and model registry with MLflow.
Sections
- Problem Understanding
- Architecture Overview
- How to Run this Project
- Project Structure
- Data Collection
- Data Preparation
- Model Training
- Model Evaluation
- Model Deployment
- Model Monitoring
TODO
TODO
TODO
Run the following command to see all CLI options:
poetry run python src/visualizations/visualizer.py
For example, run the following command to visualize contracts over time:
poetry run python src/visualizations/visualizer.py contracts-over-time
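The source of `visualizer.py` is not shown here, so the following is only a hypothetical sketch of how a subcommand-based CLI with this behavior could be structured. It uses the standard-library `argparse` (the real script may use a different CLI framework); `build_parser`, `main`, and the help texts are illustrative names, and only the `contracts-over-time` command comes from this README.

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per visualization (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        prog="visualizer.py",
        description="Render project visualizations.",
    )
    subparsers = parser.add_subparsers(dest="command")
    # Only this command name is taken from the README; the help text is an assumption.
    subparsers.add_parser("contracts-over-time", help="Plot contracts over time.")
    return parser


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Print the available options when no command is given, else dispatch."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.command is None:
        # Mirrors running the script with no arguments to list the CLI options.
        parser.print_help()
        return 0
    # Placeholder for the real plotting code, which is not part of this sketch.
    print(f"Would render: {args.command}")
    return 0
```

Calling `main([])` prints the help text, while `main(["contracts-over-time"])` dispatches to the named visualization.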
TODO
TODO
TODO
TODO
TODO
Batches date period
┌─────────────────┬────────────┐
│ min(start_date) │  end_date  │
│      date       │    date    │
├─────────────────┼────────────┤
│ 2022-01-03      │ 2022-07-03 │
│ 2022-02-01      │ 2022-08-01 │
│ 2022-03-01      │ 2022-09-01 │
└─────────────────┴────────────┘
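In the table above, each batch ends six months after its earliest start date. As a minimal sketch, assuming that six-month window is the intended rule (the project's actual batching query is not shown), the end dates can be reproduced with the standard library. The helper names `add_months` and `batch_windows` are illustrative.

```python
import calendar
from datetime import date
from typing import Iterable, List, Tuple


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to the target month's length."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def batch_windows(starts: Iterable[date], months: int = 6) -> List[Tuple[date, date]]:
    """Pair each batch start date with an end date `months` later (assumed rule)."""
    return [(start, add_months(start, months)) for start in starts]


# Start dates copied from the table above.
windows = batch_windows([date(2022, 1, 3), date(2022, 2, 1), date(2022, 3, 1)])
for start, end in windows:
    print(start.isoformat(), end.isoformat())
```

The day is clamped so that a start date such as the 31st still maps to a valid date in a shorter month.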
Reference dataset
1991-12-03 00:00:00
2021-01-01 00:00:00
TODO
TODO
TestSuite result (partial)
Metrics result (partial)
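To give a feel for what a drift metric measures when a batch is compared against the reference dataset, here is a minimal, library-free sketch of the Population Stability Index (PSI) for one numeric feature. This is a generic illustration, not Evidently's implementation or API; the function name `psi`, the equal-width binning, and the `eps` floor are assumptions made for the example.

```python
import math
from typing import List, Sequence


def psi(reference: Sequence[float], current: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a reference and a current sample.

    Bins are equal-width and derived from the reference sample; current values
    outside that range are clamped into the first or last bin.
    """
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample must contain more than one distinct value")
    width = (hi - lo) / bins

    def shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor empty bins at eps so the logarithm below stays defined.
        return [max(c / len(values), eps) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


reference_values = [float(v) for v in range(100)]
shifted_values = [v + 30.0 for v in reference_values]
print(psi(reference_values, reference_values))  # identical samples give no drift
print(psi(reference_values, shifted_values))    # a shifted sample gives a positive score
```

A score of zero means the binned distributions match; larger values indicate that the current batch has moved away from the reference.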