
A multi-cluster batch queuing system for high-throughput workloads on Kubernetes.

behouba/armada

 
 

Armada

Armada is a batch job scheduler that runs across multiple Kubernetes clusters.

Armada is designed to address the following issues:

  1. A single Kubernetes cluster cannot be scaled indefinitely, and managing very large Kubernetes clusters is challenging. Hence, Armada is a multi-cluster scheduler built on top of several Kubernetes clusters.
  2. Achieving very high throughput using the in-cluster storage backend, etcd, is challenging. Hence, queueing and scheduling are performed partly out-of-cluster using a specialized storage layer.

Armada is designed primarily for machine learning, AI, and data analytics workloads, and to:

  • Manage compute clusters composed of tens of thousands of nodes in total.
  • Schedule a thousand or more pods per second, on average.
  • Enqueue tens of thousands of jobs over a few seconds.
  • Divide resources fairly between users.
  • Provide visibility for users and admins.
  • Ensure near-constant uptime.
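
To make the goal of dividing resources fairly between users more concrete, here is a minimal sketch of weighted max-min fair sharing in Go. It is an illustration only and is not taken from Armada's scheduler, which this README does not describe in detail. The `Queue` type, its fields, and the `FairShares` function are hypothetical names introduced for this example.

```go
package main

import "fmt"

// Queue is a hypothetical per-user queue. The fields are assumptions made
// for this sketch and do not mirror Armada's own types.
type Queue struct {
	Name   string
	Weight float64 // relative share; a larger weight earns a larger slice
	Demand float64 // resources requested by pending jobs, e.g. CPU cores
}

// FairShares divides capacity among queues using weighted max-min fairness:
// each queue is offered capacity in proportion to its weight, and whatever a
// queue cannot use (because its demand is smaller) is re-offered to the rest.
func FairShares(queues []Queue, capacity float64) map[string]float64 {
	alloc := make(map[string]float64, len(queues))
	var active []Queue
	for _, q := range queues {
		alloc[q.Name] = 0
		if q.Weight > 0 && q.Demand > 0 {
			active = append(active, q)
		}
	}

	remaining := capacity
	for len(active) > 0 && remaining > 0 {
		totalWeight := 0.0
		for _, q := range active {
			totalWeight += q.Weight
		}

		var unsatisfied []Queue
		used := 0.0
		for _, q := range active {
			share := remaining * q.Weight / totalWeight
			if q.Demand <= share {
				// The queue needs no more than its share: grant its full demand.
				alloc[q.Name] = q.Demand
				used += q.Demand
			} else {
				unsatisfied = append(unsatisfied, q)
			}
		}

		if len(unsatisfied) == len(active) {
			// Every queue wants more than its share: split what is left
			// by weight and stop.
			for _, q := range active {
				alloc[q.Name] = remaining * q.Weight / totalWeight
			}
			break
		}

		// Some queues were fully satisfied; re-offer the leftover capacity.
		remaining -= used
		active = unsatisfied
	}
	return alloc
}

func main() {
	queues := []Queue{
		{Name: "a", Weight: 1, Demand: 10},
		{Name: "b", Weight: 1, Demand: 100},
		{Name: "c", Weight: 2, Demand: 100},
	}
	shares := FairShares(queues, 100)
	for _, q := range queues {
		fmt.Printf("%s: %.1f\n", q.Name, shares[q.Name])
	}
}
```

With this sample input, queue `a` is capped at its demand of 10, and the remaining 90 units are split 1:2 between `b` and `c`, giving them 30 and 60. A production scheduler also has to account for multiple resource types, priorities, and preemption, none of which this sketch attempts.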

Armada is a CNCF Sandbox project used in production at G-Research.

For an overview of Armada, see these videos:

Armada adheres to the CNCF Code of Conduct.

Documentation

For an overview of the architecture and design of Armada, and instructions for submitting jobs, see:

There are two methods of setting Armada up for local development:

For API reference, see:

We expect readers of the documentation to have a basic understanding of Docker and Kubernetes; see, e.g., the following links:
