The purpose of this project is to build out a simple classification model to predict the credit risk of bank customers. The resulting model will then be deployed to production using MLOps best practices.
The dataset can be downloaded from Kaggle via this link.
The dataset contains 1000 entries with 20 categorical/symbolic attributes. Each entry represents a person who takes credit from a bank and is classified as having good or bad credit risk according to a set of attributes.
The advantage of using such a small dataset is that we can experiment faster with fewer resources, and that we get to address problems we often don't face when working on larger datasets. Additionally, many companies - in particular startups - have limited datasets to work with in the first place, so a small dataset better simulates that situation.
- Cloud: GCP
- Experiment Tracking: MLFlow
- Workflow Orchestration: Prefect
- Containerisation: Docker and Docker Compose
- Model Deployment: Flask, Docker, Dash, MLFlow
- Monitoring: Evidently, Grafana, Prometheus, MongoDB
- Best Practices: Unit and Integration Tests, Makefile
- (CI/CD: GitHub Actions)
- Build notebook with initial model development
- Data Preparation
- Exploratory Data Analysis
- Model Pipeline for Tuning and Training
- Evaluate and Analyse Model Performance
- Experiment Tracking
- Workflow Orchestration
- Web Service to Expose the Model Predictions
- Model Monitoring
- Tests (partially)
- Unit Tests
- Integration Tests
- Makefile (CI/CD not yet implemented)
- GCP Cloud Deployment (partially)
- Use Docker containers for each service
Each of the implemented steps above can be further developed and improved. However, some good practices have not been implemented at all and should be considered for more mature projects. These include:
- Complete deployment on the cloud. Currently everything is deployed in separate Docker containers; these can fairly easily be moved to the cloud as they are.
- Host the generated Evidently reports for easier inspection.
- Adding CI/CD
- Adding IaC (Infrastructure as Code)
Start all services by executing
```bash
docker compose -f docker-compose.yml up --build
```
This will create a separate Docker container for each service.
To add the model training and Evidently report generation flows to the queue, run the following commands:
```bash
make model-train-flow
```
and
```bash
make evidently-report-flow
```
At this stage, you can access the different services via the following URLs:
- MLFlow: http://localhost:5051/
- Prefect UI: http://localhost:4200/
- Model UI, aka Risk-O-Meter (allows you to send data to the model and receive predictions. It's a simple simulation of how a bank clerk might use such a system): http://localhost:9696/
- Prometheus: http://localhost:9091/
- Grafana Dashboard: http://localhost:3000/ (default user/password: admin/admin)
The model was developed in a Jupyter Notebook (Model-development.ipynb), which includes data cleaning, data visualisation, basic feature selection, model tuning and model evaluation using 10-fold cross-validation.
Three different algorithms were evaluated: Logistic Regression, Random Forest and LightGBM. LightGBM was chosen due to its higher specificity and comparable performance on the other metrics. It was also significantly faster to train and resulted in a lighter model than the second-best candidate.
Below are 10-fold cross-validated results for each of the evaluated models.
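As a rough illustration, a comparison along these lines can be set up with scikit-learn's `cross_validate`. This is a minimal sketch, not the exact notebook code: the synthetic data stands in for the credit dataset, and the model parameters and scorer names are assumptions.

```python
# Sketch of a 10-fold CV model comparison; synthetic data stands in
# for the actual credit dataset used in Model-development.ipynb.
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import cross_validate

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "RandomForest": RandomForestClassifier(n_estimators=300, random_state=42),
    "LightGBM": LGBMClassifier(random_state=42),
}

# Specificity is the recall of the negative class.
scoring = {
    "accuracy": "accuracy",
    "auc": "roc_auc",
    "recall": "recall",
    "specificity": make_scorer(recall_score, pos_label=0),
}

for name, model in models.items():
    scores = cross_validate(model, X, y, cv=10, scoring=scoring)
    summary = {k: round(v.mean(), 3) for k, v in scores.items() if k.startswith("test_")}
    print(name, summary)
```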
MLFlow is used to register model training runs. Artifacts, which include models, plots and statistics, are stored in a Google Cloud Bucket, while the server runs in a local Docker container and is accessible via the URL specified above.
After each model training, tuning and cross-validation run, the 5 best models are logged with their corresponding accuracy, AUC, recall and specificity on both the validation and test sets. To speed things up, only 5 different parameter combinations are currently evaluated, but this can easily be adjusted.
The single best model is then registered in the Model Registry, and it's up to a human evaluator to decide whether or not to move it into the Production stage.
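The logging and registration steps might look roughly like the sketch below. The experiment name, registered model name and metric values are assumptions; only the tracking URI matches the service listed above.

```python
import mlflow
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification

# Stand-in data and model; the real run logs the tuned LightGBM pipeline.
X, y = make_classification(n_samples=200, n_features=20, random_state=42)
model = LGBMClassifier(random_state=42).fit(X, y)

mlflow.set_tracking_uri("http://localhost:5051")  # local MLFlow server
mlflow.set_experiment("credit-risk")  # experiment name is an assumption

with mlflow.start_run() as run:
    mlflow.log_params(model.get_params())
    mlflow.log_metric("val_auc", 0.79)  # illustrative value only
    mlflow.sklearn.log_model(model, artifact_path="model")

# Register the best run in the Model Registry; promoting it to the
# Production stage remains a manual, human decision.
mlflow.register_model(f"runs:/{run.info.run_id}/model", "credit-risk-classifier")
```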
Prefect is used to orchestrate the runs, or DAGs. The model is currently set to automatically retrain every Sunday night to mitigate model degradation. Evidently reports are generated every 3 hours and stored in a Google Cloud Bucket, from where they can be downloaded and inspected.
The model training and report generation flows are deployed in different Docker containers.
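A Prefect 2 flow for the training run might be structured roughly as below. The task breakdown and names are assumptions, and the placeholder bodies stand in for the real pipeline; the exact deployment/scheduling code depends on the Prefect version in use.

```python
from prefect import flow, task


@task
def load_data() -> dict:
    # Placeholder for reading and pre-processing the credit dataset.
    return {"n_rows": 1000}


@task
def train_and_log(data: dict) -> None:
    # Placeholder for tuning, cross-validating and logging runs to MLFlow.
    print(f"training on {data['n_rows']} rows")


@flow(name="model-train-flow")
def model_train_flow():
    train_and_log(load_data())


if __name__ == "__main__":
    # The deployed version attaches a cron schedule instead, e.g.
    # "0 2 * * 0" for the weekly Sunday-night retrain and
    # "0 */3 * * *" for the Evidently report flow (times assumed).
    model_train_flow()
```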
In order to create a simple simulation of how a bank clerk might use a decision system like this, there's also a UI. The Risk-O-Meter, as it's called, allows the user to upload a .csv file with data about a list of clients and receive a prediction about their credit risk.
- Green credit risk means that the risk is lower for the bank and that they should make an offer to the client.
- Red means that the risk is too high and the bank should not make an offer.
The model's confidence in its prediction for each client is also displayed, so the clerk can better assess the prediction.
The Risk-O-Meter automatically fetches the model currently in the Production stage from the MLFlow Model Registry.
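Fetching the Production model and serving predictions could look like the Flask sketch below. The registered model name and the /predict route are assumptions; the actual service also renders the Dash UI and displays per-client confidence.

```python
import mlflow.pyfunc
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)

# Load whichever model version is currently in the Production stage.
# The registered model name "credit-risk-classifier" is an assumption.
mlflow.set_tracking_uri("http://localhost:5051")
model = mlflow.pyfunc.load_model("models:/credit-risk-classifier/Production")


@app.route("/predict", methods=["POST"])
def predict():
    # The Risk-O-Meter posts a CSV of clients; return one prediction each.
    clients = pd.read_csv(request.files["file"])
    preds = model.predict(clients)
    return jsonify({"predictions": preds.tolist()})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=9696)
```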
- Grafana is used to monitor data drift and Prometheus stores these metrics.
- Evidently is used to calculate data drift and for more advanced monitoring analysis. The reports are stored in HTML format in a Google Cloud Bucket and, in JSON format, locally in MongoDB.
The HTML version is easier to inspect visually, while the JSON metrics can be very useful for building custom visualisations and statistics if needed.
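Generating both formats with Evidently's Report API might look like the sketch below. The file paths are placeholders, and the upload to the Cloud Bucket and the MongoDB insert are omitted.

```python
import json

import pandas as pd
from evidently.metric_preset import DataDriftPreset
from evidently.report import Report

# reference: data the model was trained on; current: recent requests.
reference = pd.read_csv("data/reference.csv")  # paths are assumptions
current = pd.read_csv("data/current.csv")

report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference, current_data=current)

report.save_html("drift_report.html")      # human-friendly copy (to GCS)
drift_metrics = json.loads(report.json())  # structured copy (to MongoDB)
```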
Some basic unit and integration tests are also available. These mainly check that the trained model returns predictions that are by and large accurate. This allows us to be fairly confident that the entire data pre-processing pipeline and the model training steps work as expected, that we are using the correct package versions, and so on.
Since only AUC is taken into account during the tests, they shouldn't be used to decide whether a model is ready for production or not.
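A test along these lines might look like the sketch below. The helper imports, file path and AUC threshold are hypothetical, not the actual test code in the repository.

```python
# test_model_auc.py -- illustrative sketch; the real suite lives in
# model_orchestration_and_tracking/tests/.
import pandas as pd
from sklearn.metrics import roc_auc_score

# Hypothetical helpers standing in for the project's own loading code.
from model_orchestration_and_tracking.pipeline import load_model, prepare


def test_model_auc_above_threshold():
    data = pd.read_csv("data/test_sample.csv")  # held-out sample (assumed path)
    X, y = prepare(data)
    model = load_model()
    auc = roc_auc_score(y, model.predict_proba(X)[:, 1])
    # A loose sanity check, not a production-readiness gate (see above).
    assert auc > 0.7
```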
Run the tests from the VSCode UI, or by executing the following command:
```bash
python -m pytest model_orchestration_and_tracking/tests/*
```