
[aya vision] fix processor for vLLM #38371


Merged
merged 2 commits into huggingface:main on May 27, 2025

Conversation

zucchini-nlp
Member

What does this PR do?

Fixes #38350

@DarkLight1337 , one quick question. I am trying to add a test on our side so we don't break anything that vLLM relies on. For the case of aya-vision, the error log shows the inputs as (text="<image>", images=None), but doing so will fail on other processors where we have extra checks that "num input images == num image tokens".

I wonder how that is bypassed in vLLM, and how I can add a test to cover the assumptions vLLM makes about our processors.
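For reference, the failing call pattern described in the linked issue looks roughly like the sketch below. The checkpoint name and prompt text are assumptions for illustration, not taken from the PR; only the (text with an <image> placeholder, images=None) shape of the call comes from the error log quoted above.

```python
from transformers import AutoProcessor

# Hypothetical reproduction of the vLLM-style call: the prompt already
# contains the <image> placeholder, but no image data is passed with it.
processor = AutoProcessor.from_pretrained("CohereForAI/aya-vision-8b")  # checkpoint name assumed

inputs = processor(
    text="<image>Describe the picture.",  # placeholder token with no matching image
    images=None,
    return_tensors="pt",
)
# Before this PR's fix, this call raised instead of returning token ids.
print(inputs["input_ids"].shape)
```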

DarkLight1337 commented May 26, 2025

doing so will fail on other processors where we have extra checks that "num input images == num image tokens"

This behavior seems quite inconsistent across models. I found that most processors are able to take extra <image> tokens (etc.) in the text without problem, i.e. the condition is loosened to num_input_images <= num_image_tokens. As a result, we also adopt this loosened assumption in vLLM.

For the processors that perform this check more strictly, in vLLM we override _call_hf_processor with a special case to call the tokenizer directly if no multi-modal data is provided.

Example:

https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/idefics3.py#L321-L325
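For readers who don't want to follow the link, a minimal sketch of that kind of override is shown below. The class and base-class names are placeholders and the attribute access is modeled on the linked idefics3 code; this is not the exact vLLM implementation.

```python
from transformers import BatchFeature

class AyaVisionMultiModalProcessor(BaseMultiModalProcessor):  # base class name is a placeholder
    def _call_hf_processor(self, prompt, mm_data, mm_kwargs):
        if not mm_data:
            # Text-only request: tokenize directly so the HF processor's
            # "num input images == num image tokens" check is never triggered.
            prompt_ids = self.info.get_tokenizer().encode(prompt)
            return BatchFeature(dict(input_ids=[prompt_ids]), tensor_type="pt")
        # Otherwise defer to the regular HF processor call.
        return super()._call_hf_processor(prompt, mm_data, mm_kwargs)
```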


@zucchini-nlp
Member Author

Ah okay, that makes sense. We haven't been consistent about checking placeholder tokens in processors; each model contributor added the check as it was in the original implementation.

In that case I don't think I can add a robust test. I believe the integration with the Transformers backend will automatically resolve future issues; I am planning to get it ready this week.

@zucchini-nlp zucchini-nlp requested a review from Cyrilvallez May 26, 2025 09:51
@zucchini-nlp zucchini-nlp added the for patch Tag issues / labels that should be included in the next patch label May 26, 2025
@DarkLight1337

Sounds good, thanks for fixing!

@ArthurZucker
Collaborator

Thanks for adding the for-patch tag as well!

@zucchini-nlp zucchini-nlp enabled auto-merge (squash) May 27, 2025 09:31
@zucchini-nlp zucchini-nlp merged commit 1a5be2f into huggingface:main May 27, 2025
10 checks passed

DarkLight1337 commented Jun 2, 2025

Looks like this PR didn't make it into the release :(

@zucchini-nlp
Member Author

Hmm, cc @ArthurZucker. On the main branch it is fine, since I merged the last vLLM-related PR last Friday. The latest patch release doesn't include it though 🥲

@ArthurZucker
Collaborator

It did, but the diff was empty.

Labels: for patch
Linked issue: Unable to pass images to Aya Vision processor (#38350)