[Bug]: New bug in last few days for phi-3-vision. The model's max seq len (131072) is larger than the maximum number of tokens that can be stored in KV cache (50944) #5976
Basically, something is wrong now that was OK before. I can't even run Phi-3 Vision on an 80GB H100 now.
Hi @pseudotensor! This is in fact not a bug, but a fix to a previous bug in the initial Phi-3 PR that image payload was always If you limit your
I've also made #5981 to avoid this confusion.
OK, I've misunderstood max_num_seqs then. I thought that was a maximum, not a required limit. I would have expected the context length to take precedence over the number of sequences, with the number of sequences automatically reduced to accommodate my chosen context length.
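For context, the error in the title comes from a startup check: vLLM profiles GPU memory, allocates a fixed pool of KV-cache blocks, and refuses to start if a single sequence at the configured max_model_len could not fit in that pool. The sketch below is a hypothetical simplification of that check (check_kv_cache_fits is not vLLM's actual function), using the numbers from this issue, assuming a block size of 16 tokens:

```python
# Hypothetical sketch (not vLLM's actual code) of the startup check behind
# the error in this issue: the engine refuses to run when the model's
# configured max sequence length exceeds the total number of tokens the
# allocated KV-cache blocks can hold.

def check_kv_cache_fits(max_model_len: int, num_gpu_blocks: int, block_size: int) -> None:
    """Raise if a sequence of max_model_len tokens cannot fit in the KV cache."""
    max_kv_tokens = num_gpu_blocks * block_size
    if max_model_len > max_kv_tokens:
        raise ValueError(
            f"The model's max seq len ({max_model_len}) is larger than the "
            f"maximum number of tokens that can be stored in KV cache "
            f"({max_kv_tokens})."
        )

# Numbers from this issue: a 131072-token context vs. a 50944-token KV cache
# (50944 = 3184 blocks x 16 tokens/block, assuming 16-token blocks).
try:
    check_kv_cache_fits(max_model_len=131072, num_gpu_blocks=3184, block_size=16)
except ValueError as e:
    print("startup fails:", e)

# Capping the context length (e.g. via --max-model-len) makes the check pass.
check_kv_cache_fits(max_model_len=32768, num_gpu_blocks=3184, block_size=16)
```

The number of available blocks depends on how much memory is left after model weights and the per-sequence image-payload profiling mentioned above, which is why a larger max_num_seqs shrinks the cache and can trip this check.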
Your current environment
🐛 Describe the bug
Same launch command as in #5969.
The only difference is hash 2cd402e (latest main as of earlier today).
The GPU is completely free, so this is a new bug in vLLM introduced between hashes e9de9dd and 2cd402e.