Description
This is a request to enhance the documentation that describes deploying to AWS and calling a remote endpoint.
Motivation
A lot happens behind the scenes in an AWS deployment that does not happen in a local deployment. Between the `cortex cluster up` command and `cortex deploy`, about 15 minutes pass and many resources are created.

It's also not entirely clear how to obtain the remote endpoint to call, so it would be helpful to remind users of the `cortex get sentiment-classifier` command.
Additional context
Could we add more documentation? e.g.
```
cortex cluster up
```
This brings up a cluster in AWS by deploying a number of CloudFormation stacks and takes about 15 mins. Additional information about deployment and resources is shown on the command line.
After the cluster is deployed, you'll see:
```
cortex is ready! an environment named "aws" has been configured for this cluster; append `--env=aws` to cortex commands to reference it, or set it as your default with `cortex env default aws`
```
To get the right endpoint to call, use `cortex get`:

```
cortex get sentiment-classifier --env aws
```
This returns a lot of useful information like:
```
cortex get sentiment-classifier --env aws

status     up-to-date   stale   requested   last update   avg request   2XX
updating   0            1       1           21s           -             -

endpoint: http://*****elb.us-east-1.amazonaws.com/sentiment-classifier
curl: curl http://*****.elb.us-east-1.amazonaws.com/sentiment-classifier -X POST -H "Content-Type: application/json" -d @sample.json

configuration
name: sentiment-classifier
endpoint: /sentiment-classifier
predictor:
  type: python
  path: predictor.py
  image: cortexlabs/python-predictor-cpu:0.16.1
compute:
  cpu: 1
  mem: 2G
autoscaling:
  min_replicas: 1
  max_replicas: 100
  init_replicas: 1
  workers_per_replica: 1
  threads_per_worker: 1
  target_replica_concurrency: 1.0
  max_replica_concurrency: 1024
  window: 1m0s
  downscale_stabilization_period: 5m0s
  upscale_stabilization_period: 1m0s
  max_downscale_factor: 0.75
  max_upscale_factor: 1.5
  downscale_tolerance: 0.05
  upscale_tolerance: 0.05
update_strategy:
  max_surge: 25%
  max_unavailable: 25%
```
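Since this request is largely about surfacing the endpoint, the docs could also show how to pull the endpoint URL out of the `cortex get` output programmatically. A minimal Python sketch (the `parse_endpoint` helper is hypothetical, not part of the Cortex CLI):

```python
def parse_endpoint(output: str) -> str:
    """Return the first endpoint URL found in `cortex get` output.

    The output contains two "endpoint:" lines: the full URL near the top,
    and the relative path (e.g. "/sentiment-classifier") in the configuration
    section. Matching on "endpoint: http" selects only the full URL.
    """
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("endpoint: http"):
            return stripped.split("endpoint:", 1)[1].strip()
    raise ValueError("no endpoint URL found in cortex get output")
```

This could be used, for example, to wire the endpoint into a test script without copying it by hand.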
Then, to serve predictions in AWS:
```
curl http://***.amazonaws.com/iris-classifier \
  -X POST -H "Content-Type: application/json" \
  -d '{"sepal_length": 5.2, "sepal_width": 3.6, "petal_length": 1.4, "petal_width": 0.3}'

"setosa"
```
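The docs could also show the programmatic equivalent of the curl call for users calling the API from application code. A minimal Python sketch using only the standard library; the endpoint URL is a placeholder to be replaced with the one reported by `cortex get`:

```python
import json
from urllib import request


def build_request(endpoint: str, features: dict) -> request.Request:
    """Build the POST request the deployed API expects: a JSON body of features."""
    return request.Request(
        endpoint,
        data=json.dumps(features).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(endpoint: str, features: dict):
    """Send the features to the deployed API and return the decoded prediction."""
    with request.urlopen(build_request(endpoint, features)) as resp:
        return json.loads(resp.read())


# Example usage (requires a live cluster; substitute the real endpoint):
# predict("http://***.amazonaws.com/iris-classifier",
#         {"sepal_length": 5.2, "sepal_width": 3.6,
#          "petal_length": 1.4, "petal_width": 0.3})
```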