Split parameter offload from z3 #2009
I can confirm that with this PR I'm able to run CPU offload on a 6B-parameter model with limited RAM (32 GB). Usage peaks at 24 GB before settling at a constant 20 GB. The same test script crashes with an OOM error on the latest version of DeepSpeed.
Thank you for validating the fix, @sgugger! Tunji, is it good to merge? Quite a few users are impacted by this memory issue, and your PR would help many of them. Thank you!
…pSpeed into olruwase/parameter_offload
Parameter offloading can live separately from stage 3 optimizer, right?
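For readers following the thread, CPU parameter offload is switched on through the `zero_optimization` section of the DeepSpeed config. The snippet below is a minimal sketch of such a config, not the setup used in the test above: the helper name `build_zero3_offload_config` and the batch size are hypothetical, and only the config keys themselves are taken from DeepSpeed's documented schema.

```python
import json


def build_zero3_offload_config(micro_batch_size: int = 1) -> dict:
    """Build a DeepSpeed config dict enabling ZeRO stage 3 with CPU parameter offload.

    The helper and the default batch size are illustrative assumptions;
    the keys follow DeepSpeed's documented config schema.
    """
    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "zero_optimization": {
            "stage": 3,
            # Keep partitioned parameters in host memory instead of on the GPU.
            "offload_param": {"device": "cpu", "pin_memory": True},
        },
    }


config = build_zero3_offload_config()
# Serialize to JSON, the format DeepSpeed config files normally use.
config_json = json.dumps(config, indent=2)
print(config_json)
```

The resulting JSON can be saved to a file and passed to a DeepSpeed launch as its config; whether optimizer state is also offloaded is a separate setting that this sketch leaves out.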