Cortex is a highly scalable serverless computing platform that is up to 90% less expensive than AWS Lambda. It is designed for scaling compute-intensive realtime, async, and batch workloads in your AWS account. Organizations worldwide use Cortex to scale microservices, data processing, machine learning, and other applications in production.
Workload autoscaling - set autoscaling policies per workload based on its traffic.
Resource requests - configure CPU, GPU, and memory requests per workload, without limits.
Container deployments - customize the runtime and request concurrency for each container.
Cluster autoscaling - elastically scale your cluster to meet demand.
Spot instances - run workloads on spot instances without sacrificing reliability.
Multi-instance - use multiple instance types to optimize the price-performance ratio per workload.
Metrics dashboard - monitor latency and resource utilization with pre-built dashboards.
Spend visibility - visualize your spend using the latest AWS pricing information.
Resource limits - set limits on resource consumption globally and per workload.
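
To make the workload autoscaling item above more concrete, here is a minimal sketch of what a traffic-based policy can look like in general. It is not Cortex's implementation or configuration schema: the `AutoscalingPolicy` class, its field names, and the `desired_replicas` function are hypothetical, and the rule shown (divide in-flight requests by a per-replica target, then clamp to a minimum and maximum) is just one common way to size a workload from its traffic.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoscalingPolicy:
    """Hypothetical per-workload policy; names are illustrative, not Cortex's schema."""
    min_replicas: int
    max_replicas: int
    target_in_flight_per_replica: int


def desired_replicas(policy: AutoscalingPolicy, in_flight_requests: int) -> int:
    """Return the replica count needed for the observed traffic, within policy bounds."""
    if policy.target_in_flight_per_replica <= 0:
        raise ValueError("target_in_flight_per_replica must be positive")
    # Replicas needed so that each one handles at most the target load.
    needed = math.ceil(in_flight_requests / policy.target_in_flight_per_replica)
    # Clamp to the configured bounds for this workload.
    return max(policy.min_replicas, min(policy.max_replicas, needed))


policy = AutoscalingPolicy(min_replicas=1, max_replicas=10, target_in_flight_per_replica=8)
print(desired_replicas(policy, in_flight_requests=50))
```

Because each workload carries its own policy object in this sketch, a latency-sensitive service and a batch job can scale on different targets and bounds, which is the idea behind setting policies per workload.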