Collection of best practices, reference architectures, model training examples and utilities to train large models on AWS.
Updated Sep 19, 2025 - Shell
Experimental scripts for Amazon SageMaker HyperPod
Infrastructure deployment automation for SageMaker HyperPod clusters orchestrated with EKS and SLURM, plus training job definitions for the ESM-2 protein language model, including NVIDIA BioNeMo