JobSet: a k8s native API for distributed ML training and HPC workloads

danielvegamyhre/jobset

JobSet

JobSet is a Kubernetes-native API for managing a group of Kubernetes Jobs as a unit. It aims to offer a unified API for deploying HPC (e.g., MPI) and AI/ML training workloads (e.g., PyTorch, JAX, TensorFlow) on Kubernetes.

Take a look at the concepts page for a brief description of how to use JobSet.
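
To make "a group of Jobs as a unit" concrete, here is a minimal sketch that builds a JobSet manifest as a Python dict and prints it as JSON. It is illustrative only: the helper `build_jobset_manifest`, the job name `workers`, and the `busybox:1.36` image are hypothetical, and the `jobset.x-k8s.io/v1alpha2` API version and the `replicatedJobs` field layout are assumptions that should be checked against the concepts page and the CRD installed in your cluster.

```python
import json


def build_jobset_manifest(name: str, replicas: int, image: str = "busybox:1.36") -> dict:
    """Return a JobSet manifest as a plain dict (hypothetical helper, not part of JobSet)."""
    # A standard batch/v1 Job spec, used as the template for every replica.
    job_template = {
        "spec": {
            "parallelism": 1,
            "completions": 1,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": "worker", "image": image, "command": ["sleep", "10"]}
                    ],
                }
            },
        }
    }
    return {
        # Assumed API group/version; verify against the installed CRD.
        "apiVersion": "jobset.x-k8s.io/v1alpha2",
        "kind": "JobSet",
        "metadata": {"name": name},
        "spec": {
            # Each entry asks for `replicas` copies of the Job template,
            # all managed together as one JobSet.
            "replicatedJobs": [
                {"name": "workers", "replicas": replicas, "template": job_template}
            ]
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_jobset_manifest("demo", replicas=2), indent=2))
```

On a cluster where JobSet is already installed, the printed JSON could be piped to `kubectl apply -f -`, since `kubectl` accepts JSON as well as YAML.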

Installation

Read the installation guide to learn more.

Troubleshooting common issues

See the FAQ for help with troubleshooting common issues.

Community, discussion, contribution, and support

Learn how to engage with the Kubernetes community on the community page.

You can reach the maintainers of this project at:

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.
