freeze modules does not reduce memory in dllib #8884

Open
emartinezs44 opened this issue Sep 4, 2023 · 2 comments
@emartinezs44

Approaches like LoRA aim to reduce memory usage during the training phase, but in dllib freezing does not work as expected. Freezing 99% of the model consumes the same amount of memory as training the model without freezing any node. It seems the only thing freeze does is skip the weight updates; moreover, using more threads increases memory usage, which it shouldn't, since the frozen weights never change. Any suggestion for changing this behaviour?
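
For context, a minimal sketch of the setup being described (assuming the BigDL 2.x dllib Scala package layout; the layer sizes and names are made up for illustration). Freezing everything except a small head is expected to cut gradient and activation memory, but in practice the footprint stays the same:

```scala
import com.intel.analytics.bigdl.dllib.nn.{Linear, ReLU, Sequential}

// Hypothetical model: a large "encoder" layer plus a small classifier head.
val model = Sequential[Float]()
  .add(Linear[Float](768, 768).setName("encoder_fc"))
  .add(ReLU[Float]())
  .add(Linear[Float](768, 2).setName("classifier"))

// Freeze everything except the small classifier head.
model.freeze("encoder_fc")
```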

@qiuxin2012
Contributor

I just checked the code. In Linear, for example, the gradWeight is still resized to (outputSize, inputSize), but its values are all zeros.
So the memory consumption is the same as for an unfrozen module.
I will let you know if I find a solution.
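
A rough way to see this (a sketch, assuming the dllib 2.x package names; the sizes are arbitrary): after freeze(), the layer's gradWeight keeps its full (outputSize, inputSize) shape, it just stays zero.

```scala
import com.intel.analytics.bigdl.dllib.nn.Linear
import com.intel.analytics.bigdl.dllib.tensor.Tensor

val fc = Linear[Float](1024, 1024)
fc.freeze()

// Run one forward/backward pass with random data.
val input = Tensor[Float](8, 1024).rand()
val gradOutput = Tensor[Float](8, 1024).rand()
fc.forward(input)
fc.backward(input, gradOutput)

// gradWeight of the frozen layer: a full 1024 x 1024 tensor, all zeros,
// so the allocation matches an unfrozen layer even though it is never applied.
val (_, gradParams) = fc.parameters()
println(gradParams(0).size().mkString("x")) // 1024x1024
```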

@emartinezs44
Author

That is expected; the problem appears when you start more than one thread in the training phase. Every thread creates its own copy of each module's output in the forward pass, regardless of whether the module is frozen, and that is the problem. Since those weights are frozen, a thread should not need to store a copy of the forward result for every frozen layer, because its weights won't change. The same applies to the weight synchronization. So the changes are not trivial. I will keep the ticket open in case you have any suggestions.
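
A sketch of the duplication being pointed out (not the actual optimizer code; cloneModule() here just stands in for the per-thread replicas that local data-parallel training creates):

```scala
import com.intel.analytics.bigdl.dllib.nn.{Linear, ReLU, Sequential}

val model = Sequential[Float]()
  .add(Linear[Float](1024, 1024).setName("frozen_fc"))
  .add(ReLU[Float]())
  .add(Linear[Float](1024, 10).setName("head"))
model.freeze("frozen_fc")

// One model replica per training thread, as in local data-parallel training.
val coreNumber = 4
val replicas = (1 to coreNumber).map(_ => model.cloneModule())

// After each replica runs forward(), every layer in every replica holds its own
// `output` tensor, including the frozen ones, so activation memory scales with
// the number of threads even though the frozen weights never change.
```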
