
cortexlabs/cortex


Website • Slack • Docs



Serverless containers on AWS

Deploy, manage, and scale containers without managing infrastructure.


Build APIs faster

  • Realtime APIs - respond to requests in real-time and autoscale based on in-flight request volumes.
  • Batch APIs - run distributed and fault-tolerant batch processing jobs on-demand.
  • Async APIs - process requests asynchronously and autoscale based on request queue length.
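
The autoscaling signals named above (in-flight requests for Realtime APIs, queue length for Async APIs) can be illustrated with a target-based replica calculation. This is a minimal sketch of the general technique, not Cortex's actual algorithm: the function name `desired_replicas` and the parameters `target_per_replica`, `min_replicas`, and `max_replicas` are hypothetical and chosen only for illustration.

```python
import math


def desired_replicas(load, target_per_replica, min_replicas=1, max_replicas=100):
    """Return how many replicas are needed so that each one handles
    roughly `target_per_replica` units of load.

    `load` is the autoscaling signal: in-flight requests for a realtime
    workload, or queued requests for an async workload (an assumption
    made for this sketch).
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    if min_replicas > max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")

    # Round up so that no replica is asked to exceed its target.
    raw = math.ceil(load / target_per_replica)

    # Clamp to the configured bounds; a nonzero floor keeps replicas warm.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    print(desired_replicas(45, 8))                    # 45 in-flight, target 8 each
    print(desired_replicas(0, 8))                     # idle, held at the floor
    print(desired_replicas(10_000, 8, max_replicas=20))  # capped at the ceiling
```

Keeping `min_replicas` above zero is what avoids cold starts in this model: the workload never scales below the floor, even when the signal drops to zero.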

Scale without limits

  • No resource limits - allocate as much CPU, GPU, and memory as each workload requires.
  • No cold starts - keep a minimum number of API replicas running to ensure that requests are handled in real-time.
  • No timeouts - run workloads for as long as you want.

Reduce your AWS spend

  • Spot instance management - Cortex runs workloads on spot instances and falls back to on-demand instances to ensure reliability.
  • Multi-instance type clusters - Cortex runs different workloads on different EC2 instances to ensure efficient resource utilization.
  • Efficient autoscaling - Cortex optimizes the autoscaling behavior of each workload to minimize idle resources.
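
The spot-to-on-demand fallback described above follows a common pattern: try the cheaper capacity pool first and move to the next one only when capacity is unavailable. The sketch below shows that pattern in isolation. It does not call AWS or Cortex; `CapacityUnavailable`, `launch_with_fallback`, and the two demo launchers are hypothetical names invented for this example.

```python
class CapacityUnavailable(Exception):
    """Raised by a launcher when its capacity pool cannot supply an instance."""


def launch_with_fallback(launchers):
    """Try each (pool_name, launch_fn) pair in order.

    Returns (pool_name, instance) from the first launcher that succeeds.
    Raises CapacityUnavailable if every pool fails.
    """
    errors = []
    for pool_name, launch in launchers:
        try:
            return pool_name, launch()
        except CapacityUnavailable as exc:
            # Record the failure and move on to the next, pricier pool.
            errors.append(f"{pool_name}: {exc}")
    raise CapacityUnavailable("; ".join(errors))


# Hypothetical launchers standing in for real provisioning calls.
def launch_spot():
    raise CapacityUnavailable("no spot capacity")


def launch_on_demand():
    return "i-hypothetical"


if __name__ == "__main__":
    pool, instance = launch_with_fallback(
        [("spot", launch_spot), ("on-demand", launch_on_demand)]
    )
    print(pool, instance)
```

Ordering the list from cheapest to most expensive keeps cost low in the common case while still guaranteeing that a workload is placed whenever any pool has capacity.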

About

Production infrastructure for machine learning at scale

Contributors 22