Ph.D. student at The University of Hong Kong
Pinned
- awslabs/optimizing-multitask-training-through-dynamic-pipelines (Public). Official repository for the paper DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines.
- Megatron-LM (Public, forked from NVIDIA/Megatron-LM). Artifact for DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines.
- awslabs/Lancet-Accelerating-MoE-Training-via-Whole-Graph-Computation-Communication-Overlapping (Public). Official implementation for the paper Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapping, published in MLSys'24.