Describe the bug
When I use the fine-tuned LLaMA-3 model to run the examples/raft_align.py script, I encounter the following error:
Traceback (most recent call last):
  File "/home/work/user-job-dir/app/liubiao/llm/LMflow/examples/raft_align.py", line 220, in <module>
    main()
  File "/home/work/user-job-dir/app/liubiao/llm/LMflow/examples/raft_align.py", line 183, in main
    outputs = model.generate(**inputs, **generation_kwargs)
  File "/home/naie/.local/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/naie/.local/lib/python3.9/site-packages/transformers/generation/utils.py", line 1758, in generate
    result = self._sample(
  File "/home/naie/.local/lib/python3.9/site-packages/transformers/generation/utils.py", line 2397, in _sample
    outputs = self(
  File "/home/naie/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/naie/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/naie/.local/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 1164, in forward
    outputs = self.model(
  File "/home/naie/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/naie/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/naie/.local/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 968, in forward
    layer_outputs = decoder_layer(
  File "/home/naie/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/naie/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/naie/.local/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 713, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/naie/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/naie/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/naie/.local/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 331, in forward
    query_states = query_states.view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
RuntimeError: shape '[2, 206, 32, 128]' is invalid for input of size 412
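For what it's worth, a quick read of the failing view (my interpretation, not verified against DeepSpeed internals): the reshape expects bsz * q_len * num_heads * head_dim = 2 * 206 * 32 * 128 = 1,687,552 elements, but the tensor coming out of q_proj holds only 412 = 2 * 206, i.e., one value per token with the hidden dimension collapsed. That is consistent with the attention projection weights still being ZeRO-3-partitioned placeholders when generate runs, rather than with anything wrong in the prompt or the merged weights.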
However, when I use the following test script, the generate step completes and text is produced without any errors:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "/home/work/user-job-dir/app/liubiao/huggingface/merge_instruct_llama3_sft"
tokenizer_name = "/home/naie/work/liubiao/huggingface/Meta-Llama-3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

device = "npu"  # Ascend NPU backend
model.to(device)

# Batch of two identical multi-turn Llama-3 chat prompts.
input_texts = ["<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nShould you buy a case to protect your cell phone?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nIt depends on your circumstances. If you carry your phone in a pocket or a purse then you probably want a case. But if you only need a phone for quick interactions, a case may actually cause more harm than good. What do you need the phone for? Are you a parent, or do you work from home?<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nWhat harm could it do?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nA phone case can damage the screen, for one thing. It can also get you in trouble if you have your phone turned off for some reason. Then you will turn it back on and it won’t do anything. If you can afford to replace it, then you need a case to protect it. The problem is that most people aren’t able to afford to replace their phones all the time.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nThanks for letting me know.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"] * 2

# Resolve the stop token id. Note: tokenizer.encode() prepends the BOS token
# (<|begin_of_text|>) for Llama-3 by default, so encode(stop_token)[0] would
# return the BOS id instead of <|eot_id|>; convert_tokens_to_ids avoids that.
stop_token = "<|eot_id|>"
stop_token_id = tokenizer.convert_tokens_to_ids(stop_token)

generation_kwargs = {
    "max_new_tokens": 96,
    "min_length": 1,
    "top_k": 0,  # must be an int; 0 disables top-k filtering
    "top_p": 1.0,
    "do_sample": True,
    "pad_token_id": tokenizer.eos_token_id,
    "eos_token_id": stop_token_id,
    "temperature": 0.85,
    "repetition_penalty": 1.2,
}

tokenizer.pad_token = tokenizer.eos_token
inputs = tokenizer(input_texts, return_tensors="pt", padding=True).to(device)
print("Input IDs size:", inputs["input_ids"].size())

with torch.no_grad():
    outputs = model.generate(**inputs, **generation_kwargs)

print("Generated Outputs size:", outputs.size())
outputs = outputs.cpu()
generated_texts = tokenizer.batch_decode(outputs, skip_special_tokens=True)
for i, text in enumerate(generated_texts):
    print(f"Generated text {i+1}: {text}")
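As a side note, instead of hand-writing the <|start_header_id|>/<|eot_id|> markers into the prompt string, the same conversation can be built from the tokenizer's chat template. A minimal sketch (assuming the tokenizer ships the official Meta-Llama-3-8B-Instruct chat template; the abridged message contents stand in for the full turns above):

# Build the same multi-turn prompt via the chat template instead of raw markers.
messages = [
    {"role": "user", "content": "Should you buy a case to protect your cell phone?"},
    {"role": "assistant", "content": "It depends on your circumstances. ..."},
    {"role": "user", "content": "What harm could it do?"},
    {"role": "assistant", "content": "A phone case can damage the screen, ..."},
    {"role": "user", "content": "Thanks for letting me know."},
]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,  # appends the assistant header for the next turn
)
inputs = tokenizer([prompt] * 2, return_tensors="pt", padding=True).to(device)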
Expected behavior
Text is generated successfully during the RAFT alignment step.
This seems to be a DeepSpeed problem: with ZeRO-3, model.generate does not work properly, but in multi_gpu mode it works fine.
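For reference, the usual workaround on the generation side under ZeRO-3 is to keep all ranks stepping together. A minimal sketch (synced_gpus is a standard transformers generate argument; the other names come from the test script above):

# Under ZeRO-3 each rank holds only a shard of the weights, so every rank
# must keep calling forward until the longest sequence in the batch finishes.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        **generation_kwargs,
        synced_gpus=True,  # keep all ZeRO-3 ranks in lockstep while decoding
    )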