Note: The purpose of the project to explore the libraries and learn how to use them. Not to build a SOTA model.

Requirements:

This project uses Python 3.8

Create a virtual env with the following command:

conda create --name project-setup python=3.8
conda activate project-setup

Install the requirements:

pip install -r requirements.txt

Running

Training

After installing the requirements, in order to train the model simply run:

python train.py

Monitoring

Once the training is completed in the end of the logs you will see something like:

wandb: Synced 5 W&B file(s), 4 media file(s), 3 artifact file(s) and 0 other file(s)
wandb:
wandb: Synced proud-mountain-77: https://wandb.ai/raviraja/MLOps%20Basics/runs/3vp1twdc

Follow the link to see the wandb dashboard which contains all the plots.

Versioning data

Refer to the blog: DVC Configuration

Exporting model to ONNX

Once the model is trained, convert the model using the following command:

python convert_model_to_onnx.py

Inference

Inference using standard pytorch

python inference.py

Inference using ONNX Runtime

python inference_onnx.py

S3 & ECR

Follow the instructions mentioned in the blog post for creating S3 bucket and ECR repository.

Configuring dvc

dvc init (this has to be done at root folder)
dvc remote add -d model-store s3://models-dvc/trained_models/

AWS credentials

Create the credentials as mentioned in the blog post

Do not share the secrets with others

Set the ACCESS key and id values in environment variables.

export AWS_ACCESS_KEY_ID=<ACCESS KEY ID>
export AWS_SECRET_ACCESS_KEY=<ACCESS SECRET>

Trained model in DVC

Sdd the trained model(onnx) to dvc using the following command:

cd dvcfiles
dvc add ../models/model.onnx --file trained_model.dvc

Push the model to remote storage

dvc push trained_model.dvc

Docker

Install the docker using the instructions here

Build the image using the command

docker build -t mlops-basics:latest .

The default command in dockerfile is modified to support the lambda. If you want to run without lambda use the last weeks dockerfile.

Then run the container using the command

docker run -p 8000:8000 --name inference_container mlops-basics:latest

(or)

Build and run the container using the command

docker-compose up

Pushing the image to ECR

Follow the instructions mentioned in blog post for creating ECR repository.

Authenticating docker client to ECR

aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin 246113150184.dkr.ecr.us-west-2.amazonaws.com

Tagging the image

docker tag mlops-basics:latest 246113150184.dkr.ecr.us-west-2.amazonaws.com/mlops-basics:latest

Pushing the image

docker push 246113150184.dkr.ecr.us-west-2.amazonaws.com/mlops-basics:latest

Refer to .github/workflows/build_docker_image.yaml file for automatically creating the docker image with trained model and pushing it to ECR.

Serveless - Lambda

Refer to the Blog Post here for detailed instructions on configuring lambda with the docker image and invoking it using a API.

Monitoring - Kibana

Refer to the Blog Post here for detailed instructions on configuring kibana using elasticsarch cluster and integrating with cloudwatch logs.

Running notebooks

I am using Jupyter lab to run the notebooks.

Since I am using a virtualenv, when I run the command jupyter lab it might or might not use the virtualenv.

To make sure to use the virutalenv, run the following commands before running jupyter lab

conda install ipykernel
python -m ipykernel install --user --name project-setup
pip install ipywidgets

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Requirements:

Running

Training

Monitoring

Versioning data

Exporting model to ONNX

Inference

Inference using standard pytorch

Inference using ONNX Runtime

S3 & ECR

Configuring dvc

AWS credentials

Trained model in DVC

Docker

Pushing the image to ECR

Serveless - Lambda

Monitoring - Kibana

Running notebooks

Files

README.md

Latest commit

History

README.md

File metadata and controls

Requirements:

Running

Training

Monitoring

Versioning data

Exporting model to ONNX

Inference

Inference using standard pytorch

Inference using ONNX Runtime

S3 & ECR

Configuring dvc

AWS credentials

Trained model in DVC

Docker

Pushing the image to ECR

Serveless - Lambda

Monitoring - Kibana

Running notebooks