@tjruwase tjruwase commented Aug 9, 2021

Calling DeepSpeedConfig() without the available mpu can trigger avoidable assertions about batch size and gradient accumulation steps. These assertions occur because the mpu determines the world size used in the batch-size check.
Paired with deepspeedai/DeepSpeed#1271.
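
To illustrate why the world size matters here, the following is a minimal sketch of the batch-size consistency check, assuming the usual DeepSpeed relation `train_batch_size == micro_batch_per_gpu * gradient_accumulation_steps * world_size`. The helper `check_batch_config` and the GPU counts are hypothetical and are not DeepSpeed code; the sketch only assumes that with an mpu the world size is the data-parallel size, and without one it falls back to the total number of ranks.

```python
# Hypothetical stand-in for the batch-size consistency check; not DeepSpeed's code.
def check_batch_config(train_batch_size, micro_batch_per_gpu, grad_accum_steps, world_size):
    """Return True when the three batch parameters agree for this world size."""
    expected = micro_batch_per_gpu * grad_accum_steps * world_size
    return train_batch_size == expected


# Assumed example setup: 8 GPUs split into model-parallel groups of 2.
total_gpus = 8
model_parallel_size = 2
data_parallel_size = total_gpus // model_parallel_size  # 4 data-parallel replicas

config = dict(train_batch_size=32, micro_batch_per_gpu=4, grad_accum_steps=2)

# Without an mpu, the check sees all 8 ranks: 4 * 2 * 8 = 64, which does not match 32.
without_mpu = check_batch_config(world_size=total_gpus, **config)

# With an mpu, the check sees only the 4 data-parallel ranks: 4 * 2 * 4 = 32.
with_mpu = check_batch_config(world_size=data_parallel_size, **config)

print(without_mpu, with_mpu)
```

Under these assumptions the same configuration fails the check when the world size is taken from all ranks and passes when it is taken from the data-parallel group, which is the kind of avoidable assertion described above.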

mpatwary and others added 26 commits June 9, 2021 23:19
…com:12051/ADLR/megatron-lm into main_retriver_merge_dpr
DPR evaluation hangs and Readme

See merge request ADLR/megatron-lm!280
Update T5 scripts

See merge request ADLR/megatron-lm!279
Changes in Readme (Retriever)

See merge request ADLR/megatron-lm!281
Pull in some GitHub PRs

See merge request ADLR/megatron-lm!282
…eed into olruwase/zero_init_mpu"

This reverts commit 3e10eba, reversing
changes made to 3644a9d.

tjruwase commented Aug 9, 2021

Closing because of strange commits.

@tjruwase tjruwase closed this Aug 9, 2021
saforem2 referenced this pull request in saforem2/Megatron-DeepSpeed Oct 11, 2024

9 participants