Cortex is a highly scalable and cost-effective serverless computing platform that runs on your AWS account. It scales microservices, data processing, machine learning, and other compute-intensive real-time and batch workloads. Cortex is designed to handle production traffic of up to 20M QPS and is up to 90% less expensive than AWS Lambda.
Workload autoscaling - set autoscaling policies per workload based on its traffic.
Resource requests - configure CPU, GPU, and memory requests per workload, without limits.
Container deployments - customize the runtime and request concurrency for each container.
Cluster autoscaling - elastically scale your cluster to meet demand.
Spot instances - run workloads on spot instances without sacrificing reliability.
Multi-instance - use multiple instance types to optimize the price-performance ratio per workload.
Workload observability - monitor latency and resource utilization with pre-built dashboards.
Cost transparency - visualize your costs using the latest AWS pricing information.
Predictable spend - set limits on resource consumption globally and per workload.