🚀 Feature
The current implementation is based on Megatron and only supports GPUs. Now that we’re migrating fairseq to this implementation, we should add TPU support here as well.
Motivation
Several fairseq users and internal projects would benefit from TPU support. For example, see facebookresearch/fairseq#2503.
Pitch
Replace all the CUDA-specific calls with device-agnostic versions that are compatible with PyTorch/XLA.
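For illustration, here is a minimal sketch of the kind of substitution the pitch describes. It is not the actual fairscale diff; the `get_default_device` helper is hypothetical, and the sketch assumes PyTorch/XLA is installed on TPU hosts.

```python
import torch


def get_default_device() -> torch.device:
    """Pick an accelerator if one is available, falling back to CPU.

    The torch_xla import is optional so the same code also runs on
    hosts without an XLA build.
    """
    try:
        import torch_xla.core.xla_model as xm  # present only in XLA builds

        return xm.xla_device()
    except ImportError:
        if torch.cuda.is_available():
            return torch.device("cuda", torch.cuda.current_device())
        return torch.device("cpu")


# CUDA-specific (GPU only):
#   weight = torch.empty(out_features, in_features).cuda()
# Device-agnostic (GPU, TPU, or CPU):
weight = torch.empty(16, 32, device=get_default_device())
```

The same pattern would apply to other CUDA-specific calls, such as explicit `.cuda()` tensor moves and CUDA-only RNG state handling.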
Alternatives
Do not support TPUs.
Additional context
I have a preliminary version of the needed changes, but they are based on an old version of Megatron, so they would need to be rebased onto the (newer) Megatron fork in fairscale.
myleott changed the title from "Support TPUs (PyTorch/XLA) for model parallel training" to "Support TPUs (PyTorch/XLA) for intra-layer model parallel training" on Sep 14, 2020.
myleott changed the title from "Support TPUs (PyTorch/XLA) for intra-layer model parallel training" to "Support TPUs for intra-layer model parallel training" on Sep 14, 2020.
min-xu-ai changed the title from "Support TPUs for intra-layer model parallel training" to "[feat] Support TPUs for intra-layer model parallel training" on Dec 16, 2020.
* Two small changes:
  - Linked to DeepSpeed in the doc.
  - Added a quick test in the init to catch a common user error (see the sketch after this list).
* Addressed comments.
* Removed an overly strong assert.
* Addressed comments.
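The "quick test in the init" is not shown in this thread. Below is a plausible sketch, assuming the common user error is calling the model-parallel initialization routine before the default process group exists; the function body and error messages are illustrative, not the actual fairscale code.

```python
import torch


def initialize_model_parallel(model_parallel_size: int) -> None:
    # Fail fast on a common user error: calling this before
    # torch.distributed.init_process_group() has been run.
    if not torch.distributed.is_initialized():
        raise RuntimeError(
            "torch.distributed is not initialized; call "
            "torch.distributed.init_process_group() first"
        )
    world_size = torch.distributed.get_world_size()
    if world_size % model_parallel_size != 0:
        raise ValueError(
            f"world size ({world_size}) must be divisible by "
            f"model_parallel_size ({model_parallel_size})"
        )
    # ... build the model-parallel and data-parallel process groups ...
```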