[DeepSpeed] Remove partitioning of model in ZeRO 3 #10655
…to use parametrization
It seems this is not so simple. If you load the model in inference mode you cannot save, which makes sense because it's the optimizer's responsibility to save the ZeRO sharded weights. Do we think the case where the user does not call fit before saving should be supported?

    trainer = Trainer(...)
    trainer.test(model)
    trainer.save_checkpoint(...)

If so I think it's better we keep the optimizer code as is and use the
Even if we do that, we do not force users to define configure_optimizers during inference, so the possibility of not getting an optimizer still exists. So there are 2 cases now:
Maybe you can add an error/warning for the case where there is no optimizer and the user saves under a DeepSpeed config.
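The suggested error/warning could look roughly like the sketch below. It is only an illustration of the guard being discussed, not code from this PR or from the Lightning/DeepSpeed APIs: `MisconfigurationError`, `check_can_save_sharded_checkpoint`, and the `zero_stage_3` and `strict` parameters are hypothetical names chosen for the example.

```python
import warnings
from typing import Optional, Sequence


class MisconfigurationError(Exception):
    """Hypothetical stand-in for a framework-specific configuration error."""


def check_can_save_sharded_checkpoint(
    optimizers: Optional[Sequence[object]],
    zero_stage_3: bool,
    strict: bool = True,
) -> bool:
    """Return True if saving looks safe; otherwise raise (strict) or warn.

    Assumption for this sketch: under ZeRO stage 3 the sharded weights can
    only be saved when at least one optimizer has been configured.
    """
    if not zero_stage_3:
        # Nothing is sharded, so the missing optimizer does not matter here.
        return True
    if optimizers:
        # At least one optimizer exists, so the save path is available.
        return True

    message = (
        "Saving a checkpoint under a DeepSpeed ZeRO stage 3 config without an "
        "optimizer is not supported; define configure_optimizers or call fit "
        "before saving."
    )
    if strict:
        raise MisconfigurationError(message)
    warnings.warn(message)
    return False
```

Whether this should be a hard error or only a warning is the open question in the comment above; the `strict` flag simply makes both options visible in one place.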
LGTM !
Pulled out the other issue, will make a separate PR for the deepspeed inference/optimizer fix.
# Conflicts:
#   CHANGELOG.md
#   pytorch_lightning/plugins/training_type/deepspeed.py
#   tests/plugins/test_deepspeed_plugin.py
This PR will partially fix the related issue, but there is still the case of partially defined partitioned weights to tackle.
(cherry picked from commit c66cd12)
What does this PR do?
Partially Fixes #10510
Also adds parametrization to a few tests since it's supported, and handles deprecated args with the latest DeepSpeed.
Before submitting
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:
Did you have fun?
Make sure you had fun coding 🙃
cc @Borda @justusschock @awaelchli @akihironitta @SeanNaren