freeze modules does not reduce memory in dllib #8884

Open
emartinezs44 opened this issue Sep 4, 2023 · 2 comments
@emartinezs44

Approaches like LoRA aim to reduce memory usage during the training phase, but in dllib freezing does not work as expected. Freezing 99% of the model consumes the same amount of memory as training the model without freezing any node. It seems the only thing freeze does is skip the weight updates; moreover, using more threads increases memory usage, which it shouldn't, since the frozen weights never change. Any suggestion for changing this behaviour?
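
For context, a minimal sketch of the setup being described (assuming the BigDL 2.x dllib Scala package layout; the layer sizes and names are made up for illustration). Freezing everything except a small head is expected to cut gradient and activation memory, but in practice the footprint stays the same:

```scala
import com.intel.analytics.bigdl.dllib.nn.{Linear, ReLU, Sequential}

// Hypothetical model: a large "encoder" layer plus a small classifier head.
val model = Sequential[Float]()
  .add(Linear[Float](768, 768).setName("encoder_fc"))
  .add(ReLU[Float]())
  .add(Linear[Float](768, 2).setName("classifier"))

// Freeze everything except the small classifier head.
model.freeze("encoder_fc")
```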

@qiuxin2012
Contributor

I just checked the code. In Linear, for example, the gradWeight is still resized to (outputSize, inputSize), but its values are all zeros.
So the memory consumption is the same as for an unfrozen module.
I will let you know if I find a solution.
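
A rough way to see this (a sketch, assuming the dllib 2.x package names; the sizes are arbitrary): after freeze(), the layer's gradWeight keeps its full (outputSize, inputSize) shape, it just stays zero.

```scala
import com.intel.analytics.bigdl.dllib.nn.Linear
import com.intel.analytics.bigdl.dllib.tensor.Tensor

val fc = Linear[Float](1024, 1024)
fc.freeze()

// Run one forward/backward pass with random data.
val input = Tensor[Float](8, 1024).rand()
val gradOutput = Tensor[Float](8, 1024).rand()
fc.forward(input)
fc.backward(input, gradOutput)

// gradWeight of the frozen layer: a full 1024 x 1024 tensor, all zeros,
// so the allocation matches an unfrozen layer even though it is never applied.
val (_, gradParams) = fc.parameters()
println(gradParams(0).size().mkString("x")) // 1024x1024
```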

@emartinezs44
Author

That is expected; the problem appears when you start more than one thread in the training phase. Every thread creates its own copy of each module's output in the forward pass, regardless of whether the module is frozen, and that is the problem. Since those weights are frozen, a thread should not need to store a copy of the forward result for every frozen layer, because its weights won't change. The same applies to the weight synchronization. So the changes are not trivial. I will keep the ticket open in case you have any suggestions.
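
A sketch of the duplication being pointed out (not the actual optimizer code; cloneModule() here just stands in for the per-thread replicas that local data-parallel training creates):

```scala
import com.intel.analytics.bigdl.dllib.nn.{Linear, ReLU, Sequential}

val model = Sequential[Float]()
  .add(Linear[Float](1024, 1024).setName("frozen_fc"))
  .add(ReLU[Float]())
  .add(Linear[Float](1024, 10).setName("head"))
model.freeze("frozen_fc")

// One model replica per training thread, as in local data-parallel training.
val coreNumber = 4
val replicas = (1 to coreNumber).map(_ => model.cloneModule())

// After each replica runs forward(), every layer in every replica holds its own
// `output` tensor, including the frozen ones, so activation memory scales with
// the number of threads even though the frozen weights never change.
```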
