Engine side fix for loading llama checkpoint fine-tuned with zero3 #3981

minjiaz · 2023-07-18T19:01:02Z

Loading a Llama checkpoint fails if the model was fine-tuned with ZeRO-3. This prevents loading the fine-tuned Llama actor and critic models in DS-Chat for PPO stage 3 fine-tuning. This PR resolves the problem by making the Llama container policy aware of ZeRO-3.

deepspeed/module_inject/containers/llama.py

XuehaiPan · 2023-07-24T17:30:35Z

Hi DeepSpeed team, I encountered the same issue when enabling HybridEngine for the LLaMA model with ZeRO-3.

After several hours of debugging, I found a similar fix:

    def get_hidden_heads(self):
-       return self.client_module.self_attn.q_proj.weight.shape[1], \
+       return self.client_module.self_attn.q_proj.in_features, \
                self.client_module.self_attn.num_heads, \
                self.client_module.input_layernorm.variance_epsilon, \
-               self.client_module.mlp.gate_proj.weight.shape[0]
+               self.client_module.mlp.gate_proj.out_features
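
For readers following along, here is a minimal sketch of why reading `in_features` / `out_features` avoids the failure. It rests on an assumption not stated in this thread: that under ZeRO-3 the locally held weight is a placeholder whose `.shape` no longer reflects the full dimensions. `FakeWeight`, `FakeLinear`, and both helper functions are hypothetical stand-ins, not PyTorch or DeepSpeed classes.

```python
class FakeWeight:
    """Hypothetical stand-in for a weight tensor; only exposes .shape."""

    def __init__(self, shape):
        self.shape = shape


class FakeLinear:
    """Hypothetical stand-in for a linear layer.

    Like torch.nn.Linear, it stores in_features / out_features as plain
    ints at construction time, independently of the weight tensor.
    """

    def __init__(self, in_features, out_features, partitioned=False):
        self.in_features = in_features
        self.out_features = out_features
        # Assumption: a partitioned weight exposes an empty placeholder shape.
        local_shape = (0,) if partitioned else (out_features, in_features)
        self.weight = FakeWeight(local_shape)


def hidden_size_from_weight(layer):
    # Breaks when the local shape is only a placeholder.
    return layer.weight.shape[1]


def hidden_size_from_metadata(layer):
    # Does not depend on the state of the weight tensor.
    return layer.in_features


q_proj = FakeLinear(4096, 4096, partitioned=True)
print(hidden_size_from_metadata(q_proj))
try:
    hidden_size_from_weight(q_proj)
except IndexError as exc:
    print("weight.shape lookup failed:", exc)
```

The trade-off is that this approach only works for modules that carry such size attributes; it says nothing about the weight itself.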

XuehaiPan · 2023-07-24T17:35:47Z

deepspeed/module_inject/containers/llama.py

+        try: # for zero stage 3
+            return self.client_module.self_attn.q_proj.weight.ds_shape[1], \
+            self.client_module.self_attn.num_heads, \
+            self.client_module.input_layernorm.variance_epsilon, \
+            self.client_module.mlp.gate_proj.weight.ds_shape[0]
+        except:
+            return self.client_module.self_attn.q_proj.weight.shape[1], \
+                    self.client_module.self_attn.num_heads, \
+                    self.client_module.input_layernorm.variance_epsilon, \
+                    self.client_module.mlp.gate_proj.weight.shape[0]


It would be nicer with temporary variables and `getattr`.

Suggested change

-        try: # for zero stage 3
-            return self.client_module.self_attn.q_proj.weight.ds_shape[1], \
-            self.client_module.self_attn.num_heads, \
-            self.client_module.input_layernorm.variance_epsilon, \
-            self.client_module.mlp.gate_proj.weight.ds_shape[0]
-        except:
-            return self.client_module.self_attn.q_proj.weight.shape[1], \
-                    self.client_module.self_attn.num_heads, \
-                    self.client_module.input_layernorm.variance_epsilon, \
-                    self.client_module.mlp.gate_proj.weight.shape[0]
+        q_proj_weight = self.client_module.self_attn.q_proj.weight
+        gate_proj_weight = self.client_module.mlp.gate_proj.weight
+        return getattr(q_proj_weight, "ds_shape", q_proj_weight.shape)[1], \
+                self.client_module.self_attn.num_heads, \
+                self.client_module.input_layernorm.variance_epsilon, \
+                getattr(gate_proj_weight, "ds_shape", gate_proj_weight.shape)[0]
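
The `getattr` fallback in this suggestion can be tried in isolation with the rough sketch below. Only the attribute name `ds_shape` comes from the diff; `PlainWeight`, `PartitionedWeight`, and `full_shape` are hypothetical names made up for illustration, and the `(0,)` placeholder shape is an assumption about what a partitioned weight looks like locally.

```python
class PlainWeight:
    """Hypothetical stand-in for an ordinary weight: only has .shape."""

    def __init__(self, shape):
        self.shape = shape


class PartitionedWeight:
    """Hypothetical stand-in for a ZeRO-3 partitioned weight."""

    def __init__(self, full_shape):
        self.shape = (0,)           # assumption: local placeholder shape
        self.ds_shape = full_shape  # full shape, attribute name as in the diff


def full_shape(weight):
    # Prefer ds_shape when present, otherwise fall back to the regular shape.
    return getattr(weight, "ds_shape", weight.shape)


q_proj_weight = PartitionedWeight((4096, 4096))
gate_proj_weight = PlainWeight((11008, 4096))

hidden_size = full_shape(q_proj_weight)[1]
intermediate_size = full_shape(gate_proj_weight)[0]
print(hidden_size, intermediate_size)
```

Compared with the bare `except:` version, this handles only the case where `ds_shape` is missing, so unrelated errors raised while building the return value are not silently swallowed.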

…icrosoft#3981) * Engine side fix for loading llama checkpoint fine-tuned with zero3 * Fixes to support llama fine-tuning in ds-chat * Refactored the code to avoid using an except block. * formatting * revert permissions change --------- Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>

Engine side fix for loading llama checkpoint fine-tuned with zero3

056d959

minjiaz requested review from RezaYazdaniAminabadi, jeffra, mrwyattii, awan-10, cmikeh2 and arashb as code owners July 18, 2023 19:01

mrwyattii reviewed Jul 18, 2023

XuehaiPan reviewed Jul 24, 2023

XuehaiPan mentioned this pull request Jul 24, 2023

Pass missing positional arguments in DeepSpeedHybridEngine.generate() #4026

Merged

minjiaz added 2 commits July 25, 2023 17:38

Fixes to support llama fine-tuning in ds-chat

fe7503b

Refactored the code to avoid using an except block.

7a8088a

XuehaiPan reviewed Jul 25, 2023

mrwyattii added 2 commits July 26, 2023 13:52

formatting

43787da

revert permissions change

aefe074

mrwyattii approved these changes Jul 26, 2023

mrwyattii enabled auto-merge July 26, 2023 22:16

mrwyattii added this pull request to the merge queue Jul 26, 2023

Merged via the queue into master with commit 15f94ae Jul 26, 2023
