
Conversation

@DarkLight1337 (Member) commented Oct 13, 2025

Purpose

Since vLLM doesn't support the special attention mask used by PaliGemma and Gemma3-MM (not to be confused with Gemma3n), this PR removes our custom implementations so that the Transformers backend is used for these models.
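For context, the "special attention mask" here is a prefix-LM pattern: image (and prompt-prefix) tokens attend to each other bidirectionally, while generated text tokens remain strictly causal. A minimal illustrative sketch of that mask shape (not vLLM's or Transformers' actual implementation):

```python
import numpy as np

def prefix_lm_mask(num_prefix: int, seq_len: int) -> np.ndarray:
    """Boolean attention mask where the first `num_prefix` tokens
    (e.g. image + prompt tokens) attend bidirectionally and the
    remaining tokens attend causally. mask[i, j] == True means
    token i may attend to token j."""
    # Start from a standard causal (lower-triangular) mask.
    mask = np.tril(np.ones((seq_len, seq_len), dtype=bool))
    # Let prefix tokens see each other in both directions.
    mask[:num_prefix, :num_prefix] = True
    return mask

mask = prefix_lm_mask(num_prefix=3, seq_len=5)
# Prefix rows see the whole prefix; later rows stay causal.
print(mask.astype(int))
```

This is the pattern a plain causal-attention backend cannot express without extra support, which is why falling back to the Transformers backend is the correct fix.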

cc @hmellor

@NickLucche it would be great if you could test if gemma3 works with Transformers backend on TPU!
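For anyone picking up the TPU check, one way to exercise this code path is to force the Transformers backend explicitly rather than waiting for automatic selection. A hypothetical smoke-test command (the model name is an example; `--model-impl transformers` selects the fallback that this PR makes the default for these models):

```shell
# Serve Gemma 3 via vLLM's Transformers backend and confirm it loads
# and answers a multimodal request. Example invocation, not from this PR.
vllm serve google/gemma-3-4b-it --model-impl transformers
```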

Test Plan

Transformers backend tests should pass.

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
@DarkLight1337 DarkLight1337 requested a review from hmellor October 13, 2025 16:25
@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 13, 2025
mergify bot commented Oct 13, 2025

Documentation preview: https://vllm--26715.org.readthedocs.build/en/26715/

@mergify mergify bot added documentation Improvements or additions to documentation multi-modality Related to multi-modality (#4194) new-model Requests to new models rocm Related to AMD ROCm labels Oct 13, 2025
gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request correctly identifies that the custom vLLM implementations for PaliGemma and Gemma3-MM do not properly handle their special attention masks. The solution to remove these custom implementations and fall back to the Hugging Face Transformers backend is a sound approach that prioritizes correctness. The changes are implemented thoroughly, with corresponding updates to model registries, documentation, and test suites. Notably, the removal of now-obsolete skipped tests and the addition of new tests for the Transformers backend demonstrate good testing practices. I find no high or critical issues in this pull request; it is a solid improvement.

@DarkLight1337 changed the title from [Chore] to [Chore] Always use Transformers backend for PaliGemma and Gemma3-MM on Oct 13, 2025
@DarkLight1337 DarkLight1337 removed the ready ONLY add when PR is ready to merge/full CI is needed label Oct 13, 2025
@DarkLight1337 changed the title from [Chore] Always use Transformers backend for PaliGemma and Gemma3-MM to [Model] Always use Transformers backend for PaliGemma and Gemma3-MM on Oct 13, 2025
@DarkLight1337 (Member, Author) commented:

@hmellor @zucchini-nlp Seems that do_pan_and_scan=True doesn't work for the Transformers backend of Gemma 3, can you take a look?

@hmellor (Member) left a comment


LGTM

@zucchini-nlp (Contributor) commented:

Yeah, the do_pan_and_scan=True option isn't implemented in the utility functions, since the officially released checkpoints have it set to False. I haven't seen much usage of this flag personally, but I can add the code in the transformers repo if we want to support it

Though it'll land in the v5 release, which comes with several breaking changes

@mergify mergify bot added the needs-rebase label Oct 16, 2025
@mergify mergify bot removed the needs-rebase label Oct 17, 2025
@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 17, 2025
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) October 17, 2025 03:08
@DarkLight1337 DarkLight1337 merged commit 8c017b3 into vllm-project:main Oct 17, 2025
56 checks passed
@DarkLight1337 DarkLight1337 deleted the drop-gemma branch October 17, 2025 05:03
@github-project-automation github-project-automation bot moved this from In Progress to Done in Transformers backend Oct 17, 2025
Zhuul pushed a commit to Zhuul/vllm that referenced this pull request Oct 17, 2025
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
vllm-bot pushed a commit that referenced this pull request Oct 22, 2025 (#27309)
JorgenTrondsen pushed a commit to JorgenTrondsen/vllm that referenced this pull request Oct 22, 2025 (vllm-project#27309)
usberkeley pushed a commit to usberkeley/vllm that referenced this pull request Oct 23, 2025 (vllm-project#27309)
albertoperdomo2 pushed a commit to albertoperdomo2/vllm that referenced this pull request Oct 23, 2025 (vllm-project#26715, vllm-project#27309)
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025 (vllm-project#26715)
kingsmad pushed a commit to kingsmad/vllm that referenced this pull request Oct 25, 2025 (vllm-project#27309)
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025 (vllm-project#26715, vllm-project#27309)
Chenyaaang pushed a commit to Chenyaaang/vllm that referenced this pull request Oct 28, 2025 (vllm-project#27309)
3 participants