Cortex is a machine learning deployment platform that runs in your AWS account. It takes exported models from S3 and deploys them as web APIs. It handles autoscaling, rolling updates, log streaming, inference on CPUs or GPUs, and more.
Cortex is maintained by a venture-backed team of infrastructure engineers and we're hiring.
Define your deployment using declarative configuration:
```yaml
# cortex.yaml

- kind: api
  name: my-api
  model: s3://my-bucket/my-model.onnx
  request_handler: handler.py
  compute:
    gpu: 1
```
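For CPU-only inference, the GPU request can simply be dropped (a minimal sketch, assuming the `compute` section is optional; the model path is a placeholder):

```yaml
# cortex.yaml — CPU-only variant (hypothetical)
- kind: api
  name: my-cpu-api
  model: s3://my-bucket/my-model.onnx
  request_handler: handler.py
```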
Customize request handling:
```python
# handler.py

# Load data for preprocessing or postprocessing. For example:
labels = download_labels_from_s3()

def pre_inference(sample, metadata):
    # Python code
    pass

def post_inference(prediction, metadata):
    # Python code
    pass
```
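As a concrete sketch, the handlers might map a JSON sample to the model's feature vector, and map the model's class scores back to a human-readable label. The feature names, the `scores` key, and the label list below are illustrative assumptions, not part of Cortex:

```python
# Hypothetical handlers for a 3-class classifier.
labels = ["cat", "dog", "mouse"]  # stand-in for download_labels_from_s3()

def pre_inference(sample, metadata):
    # Flatten the JSON sample into the ordered feature list the model expects.
    return [sample["a"], sample["b"], sample["c"]]

def post_inference(prediction, metadata):
    # Map the highest-scoring class index back to a label.
    scores = prediction["scores"]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return {"prediction": labels[best]}
```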
Deploy to AWS:
```bash
$ cortex deploy

Deploying ...

http://***.amazonaws.com/my-api  # Your API is ready!
```
Serve real-time predictions via autoscaling JSON APIs:
```bash
$ curl http://***.amazonaws.com/my-api -d '{"a": 1, "b": 2, "c": 3}'

{"prediction": "def"}
```
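The same request can be made from application code. A minimal Python client sketch, assuming only that the endpoint accepts a JSON POST body (the endpoint URL is a placeholder, and `build_request`/`predict` are illustrative helpers, not part of Cortex):

```python
import json
from urllib import request

def build_request(endpoint, sample):
    """Build a JSON POST request for the API endpoint."""
    return request.Request(
        endpoint,
        data=json.dumps(sample).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def predict(endpoint, sample):
    """Send the sample to the API and return the parsed JSON prediction."""
    with request.urlopen(build_request(endpoint, sample)) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example (requires a live deployment):
# predict("http://***.amazonaws.com/my-api", {"a": 1, "b": 2, "c": 3})
```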
```bash
# Download the install script
$ curl -O https://raw.githubusercontent.com/cortexlabs/cortex/0.7/cortex.sh && chmod +x cortex.sh

# Install the Cortex CLI on your machine
$ ./cortex.sh install cli

# Set your AWS credentials
$ export AWS_ACCESS_KEY_ID=***
$ export AWS_SECRET_ACCESS_KEY=***

# Configure AWS instance settings
$ export CORTEX_NODE_TYPE="p2.xlarge"
$ export CORTEX_NODES_MIN="1"
$ export CORTEX_NODES_MAX="3"

# Provision infrastructure on AWS and install Cortex
$ ./cortex.sh install
```
- **Minimal declarative configuration:** deployments can be defined in a single `cortex.yaml` file.
- **Autoscaling:** Cortex can automatically scale APIs to handle production workloads.
- **Multi-framework:** Cortex supports TensorFlow, Keras, PyTorch, scikit-learn, XGBoost, and more.
- **Rolling updates:** Cortex updates deployed APIs without any downtime.
- **Log streaming:** Cortex streams logs from your deployed models to your CLI.
- **Prediction monitoring:** Cortex can monitor network metrics and track predictions.
- **CPU / GPU support:** Cortex can run inference on CPU or GPU infrastructure.
- Text generation with GPT-2
- Sentiment analysis with BERT
- Image classification with ImageNet