Monitoring in AWS is now ready
svpino committed Jan 22, 2025
1 parent 0eba784 commit 175f46b
Showing 23 changed files with 514 additions and 631 deletions.
2 changes: 1 addition & 1 deletion .guide/aws/cleaning-up.md
@@ -20,7 +20,7 @@ If you aren't planning to return to the program, you can also remove the CloudFo
aws cloudformation delete-stack --stack-name mlschool
```

Finally, you can run the following command to delete the endpoint from SageMaker:
Finally, you can run the following command to delete the endpoint from Sagemaker:

```shell
just sagemaker-delete
16 changes: 8 additions & 8 deletions .guide/aws/deploying-model.md
@@ -1,16 +1,16 @@
# Deploying Model To SageMaker
# Deploying Model To Sagemaker

You can use the Deployment pipeline to deploy the latest model from the Model Registry to different deployment targets. The pipeline will connect to the specified target platform, create a new endpoint to host the model, and run a few samples to test that everything works as expected.

To deploy the model to SageMaker, you'll need access to `ml.m4.xlarge` instances. By default, the quota for most new accounts is zero, so you might need to request a quota increase. You can do this in your AWS account under "Service Quotas" > "AWS Services" > "Amazon SageMaker". Find `ml.m4.xlarge for endpoint usage` and request a quota increase of 8 instances.
To deploy the model to Sagemaker, you'll need access to `ml.m4.xlarge` instances. By default, the quota for most new accounts is zero, so you might need to request a quota increase. You can do this in your AWS account under "Service Quotas" > "AWS Services" > "Amazon Sagemaker". Find `ml.m4.xlarge for endpoint usage` and request a quota increase of 8 instances.

Start by creating an environment variable with the endpoint name you want to create. The following command will append the variable to the `.env` file and export it in your current shell:

```shell
export $((echo "ENDPOINT_NAME=penguins" >> .env; cat .env) | xargs)
```
Before deploying the model to SageMaker, you must build a Docker image and push it to the [Elastic Container Registry](https://aws.amazon.com/ecr/) (ECR). You can do this by running the following command:
Before deploying the model to Sagemaker, you must build a Docker image and push it to the [Elastic Container Registry](https://aws.amazon.com/ecr/) (ECR). You can do this by running the following command:
```shell
uv run -- mlflow sagemaker build-and-push-container
@@ -22,9 +22,9 @@ Once the image finishes uploading, run the [Training pipeline](.guide/training-p
```shell
uv run -- python pipelines/deployment.py \
--config endpoint config/sagemaker.json \
--config backend config/sagemaker.json \
--environment conda run \
--endpoint endpoint.Sagemaker
--backend backend.Sagemaker
```
You can also run the pipeline using `just` along with the `sagemaker` recipe:
@@ -33,10 +33,10 @@ You can also run the pipeline using `just` along with the `sagemaker` recipe:
just sagemaker-deploy
```
To deploy the model to SageMaker, we need to use the `--endpoint` parameter to specify the `endpoint.Sagemaker` class. We'll use the `config/sagemaker.json` configuration file to set up the instance. Here are the available configuration parameters:
To deploy the model to Sagemaker, we need to use the `--backend` parameter to specify the `backend.Sagemaker` class. We'll use the `config/sagemaker.json` configuration file to set up the instance. Here are the available configuration parameters:
* `target`: The target endpoint name. You can set this parameter to the `ENDPOINT_NAME` environment variable.
* `data-capture-destination`: The S3 bucket where SageMaker will store the input data and predictions. This parameter is optional. If you specify it, SageMaker will automatically capture the input data received by the endpoint and the predictions generated by the model. This information will be stored in the specified location. You can use this later to monitor the model's performance.
* `data-capture-destination`: The S3 bucket where Sagemaker will store the input data and predictions. This parameter is optional. If you specify it, Sagemaker will automatically capture the input data received by the endpoint and the predictions generated by the model. This information will be stored in the specified location. You can use this later to monitor the model's performance.
* `region`: The AWS region where the endpoint will be created. You can set this parameter to the `AWS_REGION` environment variable.
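
As a rough illustration, a `config/sagemaker.json` covering the parameters above might be created as follows. This is a sketch only: the exact schema of the file is not shown in this guide, and the bucket name and region below are placeholder values rather than ones taken from the repository.

```shell
# Illustrative sketch: the key layout, bucket, and region are assumptions.
# "target" reuses the ENDPOINT_NAME value created earlier in this section.
cat > config/sagemaker.json <<'EOF'
{
  "target": "penguins",
  "data-capture-destination": "s3://mlschool-example-bucket/datastore",
  "region": "us-east-1"
}
EOF
```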
After the pipeline finishes running, you can test the endpoint from your terminal using the following command:
@@ -45,4 +45,4 @@ After the pipeline finishes running, you can test the endpoint from your termina
just sagemaker-invoke
```
As soon as you are done with the SageMaker endpoint, delete it to avoid unnecessary costs. Check the [Cleaning up AWS resources](.guide/aws/cleaning-up.md) section for more information.
As soon as you are done with the Sagemaker endpoint, delete it to avoid unnecessary costs. Check the [Cleaning up AWS resources](.guide/aws/cleaning-up.md) section for more information.
45 changes: 29 additions & 16 deletions .guide/aws/monitoring-model.md
@@ -1,13 +1,14 @@
### Monitoring Model In SageMaker
# Monitoring Model In Sagemaker

The Monitoring pipeline supports monitoring models hosted in SageMaker. For more information on how the pipeline works, check the [Monitoring pipeline](.guide/monitoring-pipeline/introduction.md) section.
The Monitoring pipeline supports monitoring models hosted in Sagemaker. For more information on how the pipeline works, check the [Monitoring pipeline](.guide/monitoring-pipeline/introduction.md) section.

Before running the Monitoring pipeline, we'll generate some traffic for the hosted model. You can do that by running the following command:

```shell
uv run -- python pipelines/traffic.py \
--config backend config/sagemaker.json \
--environment conda run \
--endpoint endpoint.Sagemaker
--backend backend.Sagemaker
```

You can also run the pipeline using `just` along with the `sagemaker-traffic` recipe:
@@ -16,30 +16,42 @@ You can also run the pipeline using `just` along with the `sagemaker-traffic` re
just sagemaker-traffic
```

It will take a few minutes for SageMaker to store the captured data in the location specified when you deployed the model. After that, you can run the following command to generate fake ground truth labels for the captured data:
It will take a few minutes for Sagemaker to store the captured data in the location specified when you deployed the model. After that, you can run the following command to generate fake ground truth labels for the captured data:

```shell
python3 pipelines/endpoint.py --environment=pypi run \
--action labeling \
--target sagemaker \
--target-uri s3://$BUCKET/datastore \
--ground-truth-uri s3://$BUCKET/ground-truth
uv run -- python pipelines/labels.py \
--config backend config/sagemaker.json \
--environment conda run \
--backend backend.Sagemaker
```

The `--target-uri` parameter should point to the location where SageMaker stores the data captured from the endpoint. The `--ground-truth-uri` parameter should point to the S3 location where you want to store the generated labels.

Set up Metaflow's built-in viewer for the Monitoring pipeline by running the command below and navigating in your browser to [localhost:8324](http://localhost:8324/):

```shell
python3 pipelines/monitoring.py --environment=pypi card server
uv run -- python pipelines/monitoring.py \
--environment conda card server
```


You can also use the `just` command with the `sagemaker-monitor-viewer` recipe:

```shell
just sagemaker-monitor-viewer
```

Finally, run the Monitoring pipeline using the command below:

```shell
uv run -- python pipelines/monitoring.py \
--config backend config/sagemaker.json \
--environment conda run \
--backend backend.Sagemaker
```

Finally, run the Monitoring pipeline using the command below. Replace the location of the captured data and the ground truth labels with the values you specified before:
You can also use the `just` command with the `sagemaker-monitor` recipe:

```shell
python3 pipelines/monitoring.py --environment=pypi run \
--datastore-uri s3://$BUCKET/datastore \
--ground-truth-uri s3://$BUCKET/ground-truth
just sagemaker-monitor
```

You will see every report generated by the pipeline in the built-in viewer opened in your browser.
4 changes: 2 additions & 2 deletions .guide/inference-pipeline/initializing-backend.md
@@ -8,14 +8,14 @@ The first step is to initialize the backend instance that the pipeline will use

The custom model relies on the `MODEL_BACKEND` environment variable to determine which class it should dynamically load to store the data. By default, the pipeline will not store the inputs and predictions.

One of the backend implementations included as part of the code available to the pipeline is a SQLite backend. You can use this backend by setting the `MODEL_BACKEND` environment variable to `backend.SQLite` in the environment where the model is running.
One of the backend implementations included as part of the code available to the pipeline is a Local backend that stores the data in a SQLite database. You can use this backend by setting the `MODEL_BACKEND` environment variable to `backend.Local` in the environment where the model is running.

If `MODEL_BACKEND` is specified, the pipeline will create and initialize an instance of the class:

```python
module, cls = backend_class.rsplit(".", 1)
module = importlib.import_module(module)
backend = getattr(module, cls)(config=backend_config)
backend = getattr(module, cls)(config=...)
```

If the `MODEL_BACKEND_CONFIG` environment variable is specified, the pipeline will attempt to load it as a JSON file and pass a dictionary of settings to the backend implementation for initialization.
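
For example, a hedged sketch of wiring both variables together before serving the model. The assumption that `MODEL_BACKEND_CONFIG` holds the path to a JSON file follows from the description above; `config/local.json` is the example configuration file referenced elsewhere in this guide.

```shell
# Hypothetical sketch: assumes MODEL_BACKEND_CONFIG points at a JSON file of backend settings.
export MODEL_BACKEND=backend.Local
export MODEL_BACKEND_CONFIG=config/local.json
just serve
```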
12 changes: 6 additions & 6 deletions .guide/monitoring-pipeline/generating-fake-labels.md
@@ -17,21 +17,21 @@ just labels

The Labels pipeline will connect to the model backend, load any unlabeled data, and generate a fake label to serve as the ground truth for that particular sample. This pipeline is only helpful for testing the Monitoring pipeline. In a production environment, you must determine the actual ground truth for every sample.

By default, the pipeline uses the `backend.SQLite` implementation to load the production data from a SQLite database. You can change the [backend implementation](pipelines/inference/backend.py) by specifying the `--backend` property:
By default, the pipeline uses the `backend.Local` implementation to load the production data from a SQLite database. You can change the [backend implementation](pipelines/inference/backend.py) by specifying the `--backend` property:

```shell
uv run -- python pipelines/labels.py \
--environment conda run \
--backend backend.SQLite
--backend backend.Local
```

To provide configuration settings to a specific backend implementation, you can use the `--config` parameter to supply a JSON configuration file to the pipeline. The [`config/sqlite.json`](config/sqlite.json) file is an example configuration file for the [`backend.SQLite`](pipelines/inference/backend.py) backend. You can use this file as follows:
To provide configuration settings to a specific backend implementation, you can use the `--config` parameter to supply a JSON configuration file to the pipeline. The [`config/local.json`](config/local.json) file is an example configuration file for the [`backend.Local`](pipelines/inference/backend.py) backend. You can use this file as follows:

```shell
uv run -- python pipelines/labels.py \
--environment conda \
--config backend config/sqlite.json run \
--backend backend.SQLite
--config backend config/local.json \
--environment conda run \
--backend backend.Local
```

The Labels pipeline relies on the `--ground-truth-quality` parameter to determine how close the fake ground truth information should be to the predictions the model generated. Setting this parameter to a value less than `1.0` will introduce noise to simulate inaccurate model predictions. By default, this parameter has a value of `0.8`.
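
For instance, here is a sketch of an invocation that lowers the quality to simulate a noisier model. The value `0.5` is arbitrary, and placing the flag after `run` alongside the other parameters is an assumption.

```shell
# Sketch only: lowers --ground-truth-quality from its 0.8 default to inject more noise.
uv run -- python pipelines/labels.py \
    --config backend config/local.json \
    --environment conda run \
    --backend backend.Local \
    --ground-truth-quality 0.5
```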
14 changes: 7 additions & 7 deletions .guide/monitoring-pipeline/generating-fake-traffic.md
@@ -17,15 +17,15 @@ just traffic

This pipeline loads the original dataset, randomly selects a number of samples, and sends them to the hosted model in batches.

Using the `--endpoint` parameter, you can specify how to communicate with the hosted model. This parameter expects the name of a class implementing the [`endpoint.Endpoint`](pipelines/inference/endpoint.py) abstract class. By default, this parameter will use the [`endpoint.Server`](pipelines/inference/endpoint.py) implementation, which knows how to submit requests to an inference server created using the `mlflow models serve` command.
Using the `--backend` parameter, you can specify how to communicate with the hosted model. This parameter expects the name of a class implementing the [`backend.Backend`](pipelines/inference/backend.py) abstract class. By default, this parameter will use the [`backend.Local`](pipelines/inference/backend.py) implementation, which knows how to submit requests to an inference server created using the `mlflow models serve` command.

To specify the location of the hosted model, you can use the `--target` parameter. By default, the pipeline assumes you are running the model locally, on port `8080`, on the same computer from where you are running the Traffic pipeline:
To provide configuration settings to a specific backend implementation, you can use the `--config` parameter to supply a JSON configuration file to the pipeline. The [`config/local.json`](config/local.json) file is an example configuration file for the [`backend.Local`](pipelines/inference/backend.py) backend. You can use this file as follows:

```python
target = Parameter(
"target",
default="http://127.0.0.1:8080/invocations",
)
```shell
uv run -- python pipelines/traffic.py \
--config backend config/local.json \
--environment conda run \
--backend backend.Local
```

By default, the Traffic pipeline will send 200 samples to the hosted model. If you want to send a different number, use the `--samples` parameter:
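A sketch of such an invocation, mirroring the local-backend command above; the sample count is arbitrary and the flag placement after `run` is an assumption:

```shell
# Sketch only: sends 500 samples instead of the default 200.
uv run -- python pipelines/traffic.py \
    --config backend config/local.json \
    --environment conda run \
    --backend backend.Local \
    --samples 500
```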
10 changes: 5 additions & 5 deletions .guide/monitoring-pipeline/introduction.md
@@ -19,21 +19,21 @@ You can also use the `just` command with the `monitor` recipe:
just monitor
```

The pipeline will load the reference and production datasets and generate a series of reports to evaluate the quality of the data and the model's performance. By default, the pipeline uses the `backend.SQLite` implementation to load the production data from a SQLite database. You can change the [backend implementation](pipelines/inference/backend.py) by specifying the `--backend` property:
The pipeline will load the reference and production datasets and generate a series of reports to evaluate the quality of the data and the model's performance. By default, the pipeline uses the `backend.Local` implementation to load the production data from a SQLite database. You can change the [backend implementation](pipelines/inference/backend.py) by specifying the `--backend` property:

```shell
uv run -- python pipelines/monitoring.py \
--environment conda run \
--backend backend.SQLite
--backend backend.Local
```

To provide configuration settings to a specific backend implementation, you can use the `--config` parameter to supply a JSON configuration file to the pipeline. The [`config/sqlite.json`](config/sqlite.json) file is an example configuration file for the [`backend.SQLite`](pipelines/inference/backend.py) backend. You can use this file as follows:
To provide configuration settings to a specific backend implementation, you can use the `--config` parameter to supply a JSON configuration file to the pipeline. The [`config/local.json`](config/local.json) file is an example configuration file for the [`backend.Local`](pipelines/inference/backend.py) backend. You can use this file as follows:

```shell
uv run -- python pipelines/monitoring.py \
--environment conda \
--config backend config/sqlite.json run \
--backend backend.SQLite
--config backend config/local.json run \
--backend backend.Local
```

By default, the pipeline will load the latest 500 samples stored in the backend and use them to generate the reports. You can change the number of samples to load by using the `--limit` parameter when running the pipeline:
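A sketch of such an invocation, based on the command above; the limit value is arbitrary and the flag placement after `run` is an assumption:

```shell
# Sketch only: loads the latest 1000 samples from the backend instead of the default 500.
uv run -- python pipelines/monitoring.py \
    --environment conda \
    --config backend config/local.json run \
    --backend backend.Local \
    --limit 1000
```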
6 changes: 3 additions & 3 deletions .guide/serving-model/introduction.md
@@ -13,13 +13,13 @@ You can see the actual command behind the `serve` recipe by opening the [`justfi
If we want the model to capture the input data and the predictions it generates, we must specify a backend implementation using the `MODEL_BACKEND` environment variable. You can do that by running the following command:

```shell
MODEL_BACKEND=backend.SQLite just serve
MODEL_BACKEND=backend.Local just serve
```

The command above will use the `backend.SQLite` implementation to capture the data in a SQLite database. You can also export the `MODEL_BACKEND` environment variable in your shell to avoid specifying it every time you run the command:
The command above will use the `backend.Local` implementation and will capture the data in a SQLite database. You can also export the `MODEL_BACKEND` environment variable in your shell to avoid specifying it every time you run the command:

```shell
export MODEL_BACKEND=backend.SQLite
export MODEL_BACKEND=backend.Local
just serve
```

6 changes: 3 additions & 3 deletions .guide/serving-model/invoking-the-model.md
@@ -6,11 +6,11 @@ After the model server is running, we can invoke it by sending a request with a
just invoke
```

You can see the actual command behind the `serve` recipe by opening the [`justfile`](/justfile) file. Notice we are using a simple `curl` command to send a request to the model server.
You can see the actual command behind the `invoke` recipe by opening the [`justfile`](/justfile) file. Notice we are using a simple `curl` command to send a request to the model server.

If the model is capturing data, we can check whether the data was stored correctly. For example, if we are using the SQLite backend, we can query the database to make sure every new request and prediction is being stored.
If the model is capturing data, we can check whether the data was stored correctly. For example, if we are using the `backend.Local` backend, we can query the SQLite database to make sure every new request and prediction is being stored.

By default, the SQLite backend implementation stores the data in a file named `penguins.db` located in the repository's root directory. We can display the number of samples in the SQLite database by running the following command:
By default, `backend.Local` stores the data in a file named `penguins.db` located in the repository's root directory. We can display the number of samples in the SQLite database by running the following command:

```shell
uv run -- sqlite3 penguins.db "SELECT COUNT(*) FROM data;"
26 changes: 20 additions & 6 deletions .guide/toc.json
@@ -498,7 +498,7 @@
"actions": [
{
"label": "Serve model",
"target": "MODEL_BACKEND=backend.SQLite just serve"
"target": "MODEL_BACKEND=backend.Local just serve"
}
],
"lessons": [
@@ -632,12 +632,26 @@
"markdown": ".guide/aws/monitoring-model.md",
"actions": [
{
"label": "Run deployment pipeline",
"target": "just sagemaker-deploy"
"label": "Generate fake traffic",
"target": "just sagemaker-traffic"
},
{
"label": "Delete Sagemaker endpoint",
"target": "just sagemaker-delete"
"label": "Generate fake labels",
"target": "just sagemaker-labels"
},
{
"label": "Run Monitoring pipeline",
"target": "just sagemaker-monitor"
},
{
"label": "Run Monitoring card server",
"target": "just sagemaker-monitor-viewer",
"terminal": "Monitoring Card Server"
},
{
"label": "Open card viewer",
"action": "browser",
"target": "http://localhost:8324"
}
]
},
@@ -655,7 +669,7 @@
]
},
{
"label": "Deploying the model to SageMaker",
"label": "Deploying the model to Sagemaker",
"description": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.",
"file": "README.md",
"markdown": "README.md",