Status: Closed
Labels: bug (Something isn't working)
Description
Your current environment
ValueError: The decoder prompt (length 8274) is longer than the maximum model length of 4096. Make sure that `max_model_len` is no smaller than the number of text tokens.
🐛 Describe the bug
I used the 'reward' task with Qwen2.5-Math-RM-72B to process long prompts, which works fine in the original Hugging Face implementation, but I get the above error in vLLM. Enabling RoPE scaling makes the run succeed, but I'm not sure that's correct. I also tried raising `max_model_len`; no error is reported, but the output is a NaN tensor, which is weird. Finally, I checked the original config of Qwen2.5-Math-RM-72B: `max_position_embeddings` is indeed 4096, yet printing `rm_tokenizer.model_max_length` from their tokenizer gives 131072. So I'm really confused about what is wrong...
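For reference, a minimal sketch that reproduces the two observations above: the mismatch between the config's `max_position_embeddings` and the tokenizer's `model_max_length`, and the oversized `max_model_len` run. The constructor arguments (`task="reward"`, `max_model_len`) are assumptions about the vLLM version in use, not a verified recipe:

```python
from transformers import AutoConfig, AutoTokenizer
from vllm import LLM

MODEL = "Qwen/Qwen2.5-Math-RM-72B"

# Compare the two context-length fields mentioned above.
cfg = AutoConfig.from_pretrained(MODEL, trust_remote_code=True)
tok = AutoTokenizer.from_pretrained(MODEL, trust_remote_code=True)
print(cfg.max_position_embeddings)  # reportedly 4096
print(tok.model_max_length)         # reportedly 131072

# Raising max_model_len above max_position_embeddings (without RoPE
# scaling) is the configuration that reportedly produced NaN reward
# tensors, so this only reproduces the report; it is not a fix.
llm = LLM(model=MODEL, task="reward", max_model_len=8192)
```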
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.