- Problem Explanation
- Environment
- Model: Classification model that predicts whether an employee will leave the company.
- Experiment Tracking with MLflow
- Orchestration of the project
- Deployment
- Monitoring
conda create -n project_enviroment python=3.9
conda activate project_enviroment
pip install -r requirements.txt
Download the dataset from Kaggle:
wget "https://www.kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset?resource=download"
To work on model.ipynb, install ipykernel in the environment:
conda install -n project_enviroment ipykernel --update-deps --force-reinstall
Run the following command in your terminal to track experiments on your local machine:
mlflow ui --backend-store-uri sqlite:///mydb.sqlite
This command creates a database file called mydb.sqlite in the current directory, which is used to store the experiment data.
Add this code to your notebook to track experiments on your local machine using the SQLite database:
import mlflow
mlflow.set_tracking_uri('sqlite:///mydb.sqlite')
And start a run with:
mlflow.start_run()
The model is built with scikit-learn, so MLflow can log and register it with the following call:
# Register the model
mlflow.sklearn.log_model(
    sk_model=logreg,
    artifact_path='models/logreg',
    registered_model_name='sk-learn-logreg-model'
)
I'm going to use Prefect to orchestrate the project.
conda install prefect -c conda-forge
prefect cloud login --key <YOUR-KEY>
prefect orion start
See the options with the following command:
prefect deployment build --help
prefect deployment build .\model.py:applying_model --name Project-Deployment --tag MLOps
prefect deployment apply applying_model-deployment.yaml
We can't run the deployment from the UI yet. We need a work queue and an agent to run the deployment.
Work queues and agents are the mechanisms by which the Prefect API orchestrates deployment flow runs in remote execution environments.
Work queues let you organize flow runs into queues for execution. Agents pick up work from queues and execute the flows.
Start an agent with:
prefect agent start -t tag
where tag is the tag you used to build the deployment (MLOps in the command above).
Now, when you run a deployment built with that tag, the agent picks up the work from the queue and executes the flow.
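The relationship between work queues and agents can be illustrated with a toy model in plain Python. This is only a conceptual sketch and does not use or reproduce Prefect's API or internals: the classes FlowRun and WorkQueues and the function run_agent are hypothetical names, and matching on a single tag is a simplifying assumption.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Optional


@dataclass
class FlowRun:
    """A scheduled run of a flow, labeled with its deployment's tag."""
    name: str
    tag: str
    flow: Callable[[], str]


@dataclass
class WorkQueues:
    """Toy stand-in for the orchestration API: one FIFO queue per tag."""
    queues: Dict[str, Deque[FlowRun]] = field(default_factory=dict)

    def submit(self, run: FlowRun) -> None:
        # Route the run to the queue that matches its tag.
        self.queues.setdefault(run.tag, deque()).append(run)

    def next_run(self, tag: str) -> Optional[FlowRun]:
        # Return the oldest waiting run for this tag, or None if there is none.
        queue = self.queues.get(tag)
        return queue.popleft() if queue else None


def run_agent(api: WorkQueues, tag: str) -> List[str]:
    """Drain the queue for `tag`, executing each flow run in order."""
    results = []
    while (run := api.next_run(tag)) is not None:
        results.append(run.flow())
    return results


if __name__ == "__main__":
    api = WorkQueues()
    api.submit(FlowRun("applying_model", "MLOps", lambda: "model applied"))
    api.submit(FlowRun("unrelated", "ETL", lambda: "data loaded"))
    # An agent started for the MLOps tag only executes MLOps runs.
    print(run_agent(api, "MLOps"))
```

In this sketch the run tagged ETL stays in its queue until an agent for that tag is started, which mirrors why a deployment cannot run until a matching agent exists.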
- Go to the UI
- Select Add Schedule
- Select Cron with a value of 0 0 * * *, which means every day at 12:00 AM. The Timezone setting is important, so be sure to select the correct timezone.
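To see why the timezone matters for a 0 0 * * * schedule, the following sketch computes the next daily midnight in a given timezone using only the Python standard library. It does not call Prefect; the function name next_daily_midnight and the sample timezones are assumptions made for illustration, and it ignores the rare zones where a daylight saving change skips midnight.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def next_daily_midnight(now_utc: datetime, tz_name: str) -> datetime:
    """Return the next 00:00 wall-clock time in `tz_name` strictly after `now_utc`."""
    tz = ZoneInfo(tz_name)
    # Convert the current instant to local wall-clock time in the target zone.
    local_now = now_utc.astimezone(tz)
    # The next run is at midnight on the following local calendar day.
    next_day = local_now.date() + timedelta(days=1)
    return datetime(next_day.year, next_day.month, next_day.day, tzinfo=tz)


if __name__ == "__main__":
    now = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
    for name in ("UTC", "America/New_York", "Asia/Tokyo"):
        run_at = next_daily_midnight(now, name)
        # Show the same scheduled run as a UTC instant for comparison.
        print(name, run_at.astimezone(timezone.utc).isoformat())
```

Starting from 12:00 UTC on 2024-01-15, the next run falls at 05:00 UTC on 2024-01-16 for America/New_York but at 15:00 UTC on 2024-01-15 for Asia/Tokyo, so the same cron value triggers at different moments depending on the selected timezone.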
I'm going to use Evidently to monitor the experiment.
You can install it with the following command:
pip install evidently