-
-
Notifications
You must be signed in to change notification settings - Fork 8.4k
[Model] Aya Vision #15441
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Model] Aya Vision #15441
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
Signed-off-by: Jennifer Zhao <ai.jenniferzhao@gmail.com>
Ready for review. It took me something to get the Do you know how could I get the max_model_len for both 8b and 32b? Thank you! |
Hi! Also, #15441 (comment) |
Added the key: https://huggingface.co/CohereForAI/aya-vision-8b/blob/main/config.json#L26 |
thank you! testing it now |
@saurabhdash just to confirm the https://huggingface.co/CohereForAI/aya-vision-32b/blob/main/config.json#L22 |
IIrc, back when we merged the first commandR model, we wanted to enable 128k context on vLLM while keeping it capped to 8k for HF -- thus we came about the 2 different lengths. I believe |
in that case I think this vllm logic needs to be updated Lines 2693 to 2719 in f98a492
model_max_length only
|
Signed-off-by: Jennifer Zhao <ai.jenniferzhao@gmail.com>
all updated and tested again
@saurabhdash I wonder if we should verify on your eval as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the great work! I left some final nits.
trust_remote_code=True) | ||
messages = [[{ | ||
'role': 'user', | ||
'content': f"<image>\n{question}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JenZhao Can we actually change this to the plain prompt? We have it already in the test pipeline too.
Signed-off-by: Jennifer Zhao <ai.jenniferzhao@gmail.com>
updated
|
|
ci needs the aya model access
|
I can verify this today to make sure things look good! Looking at the generations, things should be okay but would be nice to confirm. |
Thank you! Please let me know if you notice any discrepancies or regression in the metrics. |
Signed-off-by: Jennifer Zhao <ai.jenniferzhao@gmail.com> Signed-off-by: Roger Wang <ywang@roblox.com> Co-authored-by: Roger Wang <ywang@roblox.com> Signed-off-by: xinyuxiao <xinyuxiao2024@gmail.com>
Signed-off-by: Jennifer Zhao <ai.jenniferzhao@gmail.com> Signed-off-by: Roger Wang <ywang@roblox.com> Co-authored-by: Roger Wang <ywang@roblox.com> Signed-off-by: Louis Ulmer <ulmerlouis@gmail.com>
Signed-off-by: Jennifer Zhao <ai.jenniferzhao@gmail.com> Signed-off-by: Roger Wang <ywang@roblox.com> Co-authored-by: Roger Wang <ywang@roblox.com>
Signed-off-by: Jennifer Zhao <ai.jenniferzhao@gmail.com> Signed-off-by: Roger Wang <ywang@roblox.com> Co-authored-by: Roger Wang <ywang@roblox.com>
Signed-off-by: Jennifer Zhao <ai.jenniferzhao@gmail.com> Signed-off-by: Roger Wang <ywang@roblox.com> Co-authored-by: Roger Wang <ywang@roblox.com> Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>
CLOSES #14216
Introduction
This PR introduces support for the Aya Vision models by CohereForAI. Aya Vision models excel in multilingual and multimodal tasks, significantly advancing performance in vision-language understanding.
Supported models:
For more details on Aya Vision training, see: A Deep Dive into Aya Vision: Advancing the Frontier of Multilingual Multimodality
Example Usage
Example inference with single or multiple images:
Single image inference:
Multi-image inference:
Serving Aya Vision 32B model: