
Clarify how deployment to AWS works #1098

Closed
@ssami

Description

This is a request to enhance the documentation that describes deploying to AWS and calling a remote endpoint.

Motivation

A lot goes on behind the scenes in an AWS deployment that does not in a local one. Between the `cortex cluster up` command and `cortex deploy`, about 15 minutes pass and many resources are created.

It's also not totally clear how to find the remote endpoint to call, so it might be good to remind users of the `cortex get sentiment-classifier` command.

Additional context

Could we add more documentation? e.g.

cortex cluster up

This brings up a cluster in AWS by deploying a number of CloudFormation stacks; it takes about 15 minutes. Additional information about the deployment and its resources is printed on the command line.

After the cluster is deployed, you'll see:

cortex is ready!

an environment named "aws" has been configured for this cluster; append `--env=aws` to cortex commands to reference it, or set it as your default with `cortex env default aws`

To get the right endpoint to call, use cortex get:

cortex get sentiment-classifier --env aws

This returns a lot of useful information like:

cortex get sentiment-classifier --env aws

status     up-to-date   stale   requested   last update   avg request   2XX
updating   0            1       1           21s           -             -

endpoint: http://*****elb.us-east-1.amazonaws.com/sentiment-classifier
curl: curl http://*****.elb.us-east-1.amazonaws.com/sentiment-classifier -X POST -H "Content-Type: application/json" -d @sample.json

configuration
name: sentiment-classifier
endpoint: /sentiment-classifier
predictor:
  type: python
  path: predictor.py
  image: cortexlabs/python-predictor-cpu:0.16.1
compute:
  cpu: 1
  mem: 2G
autoscaling:
  min_replicas: 1
  max_replicas: 100
  init_replicas: 1
  workers_per_replica: 1
  threads_per_worker: 1
  target_replica_concurrency: 1.0
  max_replica_concurrency: 1024
  window: 1m0s
  downscale_stabilization_period: 5m0s
  upscale_stabilization_period: 1m0s
  max_downscale_factor: 0.75
  max_upscale_factor: 1.5
  downscale_tolerance: 0.05
  upscale_tolerance: 0.05
update_strategy:
  max_surge: 25%
  max_unavailable: 25%

Then, to request a prediction from the AWS endpoint:

curl http://***.amazonaws.com/iris-classifier \
    -X POST -H "Content-Type: application/json" \
    -d '{"sepal_length": 5.2, "sepal_width": 3.6, "petal_length": 1.4, "petal_width": 0.3}'

"setosa"
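The docs could also show the same call from Python. Below is a minimal sketch, not official Cortex documentation: the endpoint URL is a hypothetical placeholder (substitute the one printed by `cortex get`), and the payload matches the curl example above. Only standard-library `urllib` is used.

```python
import json
import urllib.request

# Hypothetical placeholder -- substitute the endpoint printed by
# `cortex get iris-classifier --env aws`
ENDPOINT = "http://<load-balancer>.elb.us-east-1.amazonaws.com/iris-classifier"

# Same sample payload as the curl example above
payload = {"sepal_length": 5.2, "sepal_width": 3.6,
           "petal_length": 1.4, "petal_width": 0.3}

def build_request(endpoint: str, payload: dict) -> urllib.request.Request:
    """Build the JSON POST request; actually sending it requires a live cluster."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request(ENDPOINT, payload)
# urllib.request.urlopen(req) would return the prediction on a live cluster
```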
