I think this is related. Maybe not much needs to be done here beyond implementing this code. I will try to test whether it breaks anything else. Here is the vLLM reference for this feature: vllm-project/vllm#4638
Currently, we auto-scale using the `--max-model-len` argument. It may be more appropriate to have specific options for the scaling factor, etc.
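
For illustration only, here is a minimal sketch of how a dedicated option could sit alongside the current auto-scaling. Everything in it is an assumption rather than existing behavior: the `--scaling-factor` flag, the `resolve_scaling_factor` helper, and the idea that the automatic factor is the ratio of the requested length to the model's native context length are all hypothetical.

```python
import argparse


def resolve_scaling_factor(max_model_len, native_len, explicit_factor=None):
    """Pick a scaling factor (hypothetical helper, not an existing API).

    If the user passed an explicit factor, use it. Otherwise fall back to
    auto-scaling, assumed here to be the ratio of the requested length to
    the model's native context length, never going below 1.0.
    """
    if explicit_factor is not None:
        if explicit_factor < 1.0:
            raise ValueError("scaling factor must be >= 1.0")
        return explicit_factor
    return max(1.0, max_model_len / native_len)


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--max-model-len", type=int, required=True)
    # Hypothetical flag: overrides the automatic factor when provided.
    parser.add_argument("--scaling-factor", type=float, default=None)
    return parser
```

With this shape, `--max-model-len 16384` on a model with a native length of 4096 would auto-scale by 4.0, while also passing `--scaling-factor 2.0` would take precedence over the derived value.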