[New Model]: Qwen2-VL #8139
Comments
It will be released once …
@DarkLight1337 It is supported in transformers: https://huggingface.co/docs/transformers/main/en/model_doc/qwen2_vl#qwen2vl
I mean we need to wait until they release a new version with the change. It is not in v4.44.2.
@DarkLight1337 But where should I get the image_embeds dynamically? Thanks for the help.
This code is designed for precomputed embedding inputs. You can get the embeddings by running just the Qwen2-VL visual encoder + projection on images/videos (outside of vLLM) to get their visual token embeddings. If you have a mechanism to cache the embeddings of particular input images/videos, this can speed up inference as you don't need to run the visual encoder again. Most users won't be using this, though.
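For illustration, a rough sketch of that workflow might look like the following: run the Hugging Face Qwen2-VL visual tower once per image, cache the result, and feed it back to vLLM as embeddings. The multi_modal_data layout with "image_embeds"/"image_grid_thw", the prompt template, and the file names are assumptions (not taken from this thread); the embedding-input interface has changed between vLLM versions, so check the multimodal docs of the version you have installed.

```python
# Sketch: precompute Qwen2-VL visual embeddings outside of vLLM, then reuse them.
# Assumed details: the "image_embeds"/"image_grid_thw" keys, the prompt template,
# and "example.jpg"; adjust to your vLLM version.
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration
from vllm import LLM

MODEL = "Qwen/Qwen2-VL-7B-Instruct"

# 1. Run only the visual encoder + merger from the HF checkpoint.
processor = AutoProcessor.from_pretrained(MODEL)
hf_model = Qwen2VLForConditionalGeneration.from_pretrained(
    MODEL, torch_dtype=torch.float16
)
visual = hf_model.visual.cuda()

image = Image.open("example.jpg")
image_inputs = processor.image_processor(images=[image], return_tensors="pt")

with torch.no_grad():
    # `visual` is the ViT plus the patch-merger projection; its output is the
    # sequence of visual token embeddings the language model consumes.
    image_embeds = visual(
        image_inputs["pixel_values"].cuda().to(torch.float16),
        grid_thw=image_inputs["image_grid_thw"].cuda(),
    )

# These two tensors are what you would cache per image.
torch.save(
    {
        "image_embeds": image_embeds.cpu(),
        "image_grid_thw": image_inputs["image_grid_thw"],
    },
    "cached_image.pt",
)

# 2. Later, hand the cached embeddings to vLLM instead of raw pixels.
cached = torch.load("cached_image.pt")
llm = LLM(model=MODEL)
prompt = (
    "<|im_start|>user\n<|vision_start|><|image_pad|><|vision_end|>"
    "Describe this image.<|im_end|>\n<|im_start|>assistant\n"
)
outputs = llm.generate({
    "prompt": prompt,
    "multi_modal_data": {
        "image": {
            "image_embeds": cached["image_embeds"],
            "image_grid_thw": cached["image_grid_thw"],
        }
    },
})
print(outputs[0].outputs[0].text)
```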
Thanks for the detailed explanation!
Process SpawnProcess-1: …
Please install vLLM from source as mentioned in #7905.
Please use the latest version of vLLM. It supports Qwen2-VL without this error.
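For reference, a minimal single-image offline-inference sketch on a recent vLLM release might look like this; the prompt template, max_model_len value, and image path are assumptions based on the general Qwen2-VL chat format rather than anything in this thread.

```python
# Minimal sketch: single-image inference with Qwen2-VL on a recent vLLM release.
# Engine arguments and the prompt template below are assumptions; adjust them to
# your model card and hardware.
from PIL import Image
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2-VL-7B-Instruct",
    max_model_len=8192,  # assumed; reduce if KV-cache memory is tight
)

prompt = (
    "<|im_start|>user\n<|vision_start|><|image_pad|><|vision_end|>"
    "Describe this image.<|im_end|>\n<|im_start|>assistant\n"
)

outputs = llm.generate(
    {
        "prompt": prompt,
        "multi_modal_data": {"image": Image.open("example.jpg")},
    },
    SamplingParams(temperature=0.0, max_tokens=128),
)
print(outputs[0].outputs[0].text)
```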
@DarkLight1337
You may increase …
The model to consider.
https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct
The closest model vllm already supports.
https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/qwen2.py
What's your difficulty of supporting the model you want?
No response
Before submitting a new issue...