- Hosts the model as a RESTful API service that allows users to submit new data and receive predictions in return
- Model training is not important here; any simple example model or a pretrained one from the web will do.
- Pipeline components/steps:
  - Preprocessor - prepares the data
  - ModelScorer - performs the inference task by calling the model API
  - Postprocessor - saves inference results to a CSV file
- The pipeline should be modular and easily reusable for additional model APIs or data sources
- Dockerization and Deployment:
- The entire inference pipeline should be packaged into Docker image(s) so it can potentially be used by different orchestration systems
- Describe how the dockerized pipeline could be integrated into different orchestration systems (Airflow, Kubeflow, SageMaker, Vertex AI); see the Airflow sketch below.
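As one illustration of such an integration, the ml-pipelines image could be scheduled from Airflow with the DockerOperator. This is a minimal sketch assuming Airflow 2.x with the apache-airflow-providers-docker package installed and the image available to the Airflow workers; it is not part of this repository.

```python
# airflow_dag_sketch.py - illustrative only
from datetime import datetime

from airflow import DAG
from airflow.providers.docker.operators.docker import DockerOperator

with DAG(
    dag_id="batch_inference",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Run the containerized pipeline exactly as it runs locally
    run_pipeline = DockerOperator(
        task_id="run_batch_inference",
        image="ml-pipelines",
        command="python batch_inference_pipeline.py",
        docker_url="unix://var/run/docker.sock",
        network_mode="bridge",
    )
```

Kubeflow Pipelines, SageMaker Pipelines, and Vertex AI Pipelines could consume the same image in an analogous way, as a container component/step in their respective pipeline definitions.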
- Implement (or describe) a monitoring and observability strategy:
- A system to track the performance of the inference pipeline over time (see the instrumentation sketch below)
- Set up rules to notify you of potential issues or performance degradation
- Monitor and analyze data to identify patterns and trends in pipeline behavior
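One possible way to instrument the pipeline is sketched below with the prometheus_client library; the metric names and the port are assumptions. Alerting rules on top of these metrics (e.g. on error rate or latency percentiles) would then live in Prometheus/Alertmanager or Grafana.

```python
# monitoring_sketch.py - illustrative only; metric names and port are assumptions
import time

from prometheus_client import Counter, Histogram, start_http_server

STEP_DURATION = Histogram(
    "pipeline_step_duration_seconds", "Time spent in each pipeline step", ["step"]
)
STEP_FAILURES = Counter(
    "pipeline_step_failures_total", "Number of failed step executions", ["step"]
)


def run_step(name, func, data):
    """Run a pipeline step while recording its duration and failures."""
    start = time.perf_counter()
    try:
        return func(data)
    except Exception:
        STEP_FAILURES.labels(step=name).inc()
        raise
    finally:
        STEP_DURATION.labels(step=name).observe(time.perf_counter() - start)


if __name__ == "__main__":
    # Expose the metrics for Prometheus to scrape
    start_http_server(9000)
```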
- Testing
- Documentation
- Code Quality
- Fine-tune the selected model to improve its predictive performance.
- Evaluate the fine-tuned model's performance and compare it to the original model.
To create this Python dev environment from scratch run:
```
conda env create
```
To check conda envs:
```
conda env list && conda list
```
To update the env during development:
```
conda env update -n mlops-dev --prune
```
To recreate this env:
```
conda activate base &&
conda env remove -n mlops-dev &&
conda env create &&
conda env list &&
conda activate mlops-dev &&
conda list
```
Go to the ml-service directory to prepare and test a Model artifact.
Build the ml-service image with the ModelServer that will serve the model through the REST API:
```
docker buildx build -t ml-service --progress plain -f ml-service.Dockerfile .
```
To verify the image:
```
docker image ls ml-service
```
To rebuild the image from scratch:
```
docker buildx build -t ml-service --progress plain --no-cache --pull -f ml-service.Dockerfile .
```
To run the containerized Inference Service:
```
docker run -it -p 8080:8080 ml-service
```
Once the service starts you can open the /metrics endpoint in your browser: http://localhost:8080/metrics and observe how the endpoint behaves.
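The actual ModelServer lives in the ml-service directory; purely as an illustration, a minimal FastAPI-based server exposing a prediction endpoint and a Prometheus /metrics endpoint might look like the sketch below. The /predict path, payload schema, and model artifact name are assumptions, not the project's actual API.

```python
# model_server_sketch.py - illustrative only, not the actual ModelServer
import joblib
from fastapi import FastAPI
from prometheus_client import make_asgi_app
from pydantic import BaseModel

app = FastAPI()
app.mount("/metrics", make_asgi_app())  # Prometheus scrape endpoint

model = joblib.load("model.joblib")  # assumed pre-trained model artifact


class PredictRequest(BaseModel):
    instances: list[list[float]]  # assumed payload shape


@app.post("/predict")  # assumed endpoint path
def predict(request: PredictRequest):
    predictions = model.predict(request.instances)
    return {"predictions": [int(p) for p in predictions]}
```

Such a sketch could be served with e.g. `uvicorn model_server_sketch:app --host 0.0.0.0 --port 8080`.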
To test the REST API run the simple script (a cURL replacement on Windows):
```
python try_ml_service.py
```
The output should be:
```
<Response [200]>
{"predictions":[8,9,8]}
```
Go to the ml-pipelines directory to build and test the pipeline components.
Build the ml-pipelines image with the components that will be used in the batch prediction pipeline:
```
docker buildx build -t ml-pipelines --progress plain -f ml-pipelines.Dockerfile .
```
To verify the image:
```
docker image ls ml-pipelines
```
You can also run it to see if it works as expected:
```
docker run -it ml-pipelines
```
To rebuild the image from scratch:
```
docker buildx build -t ml-pipelines --progress plain --no-cache --pull -f ml-pipelines.Dockerfile .
```
To run the pipeline, first you need to start the Inference Service:
```
docker run -it -p 8080:8080 ml-service
```
Once the service is ready, run the batch pipeline:
```
python batch_inference_pipeline.py
```
The output should look like this:
```
(mlops-dev) ..\mlops-case-study>python batch_inference_pipeline.py
Starting the pipeline...
Pipeline: PreProcessor step starting...
Pipeline: PreProcessor step DONE!
Pipeline: ModelScore step starting...
Pipeline: ModelScore step DONE!
Pipeline: PostProcessor step starting...
Pipeline: PostProcessor step DONE!
Pipeline DONE!
```
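For reference, the three steps shown in the log above could be wired together roughly as in the sketch below. This is a minimal, illustrative structure assuming a pandas DataFrame flowing between steps and a /predict endpoint on the Inference Service; the class names and payload shape are assumptions, not the actual implementation.

```python
# pipeline_sketch.py - illustrative only
from abc import ABC, abstractmethod

import pandas as pd
import requests


class Step(ABC):
    """A single pipeline step with a uniform interface."""

    @abstractmethod
    def run(self, data: pd.DataFrame) -> pd.DataFrame:
        ...


class PreProcessor(Step):
    def run(self, data: pd.DataFrame) -> pd.DataFrame:
        # Prepare the raw data (cleaning, feature selection, etc.)
        return data.dropna()


class ModelScorer(Step):
    def __init__(self, api_url: str):
        self.api_url = api_url  # e.g. "http://localhost:8080/predict" (assumed path)

    def run(self, data: pd.DataFrame) -> pd.DataFrame:
        # Call the model REST API and attach the predictions
        response = requests.post(self.api_url, json={"instances": data.values.tolist()})
        response.raise_for_status()
        data["prediction"] = response.json()["predictions"]
        return data


class PostProcessor(Step):
    def __init__(self, output_path: str = "predictions.csv"):
        self.output_path = output_path

    def run(self, data: pd.DataFrame) -> pd.DataFrame:
        # Persist the inference results to a CSV file
        data.to_csv(self.output_path, index=False)
        return data


class Pipeline:
    """Chains steps so the same skeleton can be reused for other APIs or data sources."""

    def __init__(self, steps: list[Step]):
        self.steps = steps

    def run(self, data: pd.DataFrame) -> pd.DataFrame:
        for step in self.steps:
            data = step.run(data)
        return data
```

Reusing the pipeline for another model API or data source then comes down to swapping the ModelScorer URL or the PreProcessor, without touching the runner.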
- Logging, emitting metrics, proper app instrumentation and monitoring -> to be discussed
- Testing - showed a proper structure -> details to be discussed
- Code Quality
- focused on a good repo layout and code structure
- more DRY refactoring needed (lack of time)
- details to be discussed
- Documentation
- mostly skipped (lack of time); only some docstrings included -> to be discussed
- I recommend performing HPO with the Optuna framework (see the sketch below)
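A minimal Optuna sketch for such HPO, assuming a scikit-learn classifier on the digits dataset; the model and search space are illustrative, not the case-study model.

```python
# hpo_sketch.py - illustrative only; model and search space are assumptions
import optuna
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score


def objective(trial):
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 300),
        "max_depth": trial.suggest_int("max_depth", 2, 16),
    }
    X, y = load_digits(return_X_y=True)
    model = RandomForestClassifier(**params, random_state=42)
    # Cross-validated accuracy is the objective to maximize
    return cross_val_score(model, X, y, cv=3, scoring="accuracy").mean()


study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params, study.best_value)
```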