
Conversation

@hmellor (Member) commented Oct 15, 2025

This PR refactors the Transformers backend to use mixin classes that add model-specific functionality such as multi-modality, mixture-of-experts, or pooling.

This lets us compose these functionalities to create new Transformers backend classes supporting any combination of these features (see the sketch below). The two new classes added by this PR are for multimodal embedding and multimodal sequence classification.
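
As a rough illustration of the composition pattern (the class names and docstrings below are hypothetical stand-ins, not the exact ones added by this PR):

```python
# Hypothetical sketch of mixin composition in the Transformers backend.
class TransformersBase:
    """Shared plumbing common to all Transformers backend models."""

class MultiModalMixin:
    """Adds multimodal input handling (e.g. image embeddings)."""

class MoEMixin:
    """Adds mixture-of-experts weight loading and routing."""

class PoolingMixin:
    """Adds pooling heads for embedding and classification tasks."""

# New model classes compose the mixins they need with the shared base:
class TransformersMultiModalEmbedding(MultiModalMixin, PoolingMixin, TransformersBase):
    pass

class TransformersMultiModalSequenceClassification(MultiModalMixin, PoolingMixin, TransformersBase):
    pass
```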

It also adds a module-level __getattr__ to __init__.py so that a user who requests a combination of these functionalities that doesn't exist yet is prompted to open an issue asking us to add it.
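
A minimal sketch of that module-level __getattr__ hook (PEP 562); the module path, registry name, and error wording are illustrative assumptions, not the exact implementation:

```python
# vllm/model_executor/models/transformers/__init__.py (illustrative sketch)
_IMPLEMENTED = {
    "TransformersForCausalLM",
    "TransformersMultiModalEmbedding",
    # ...
}

def __getattr__(name: str):
    # Invoked only when `name` is not found in the module (PEP 562).
    if name.startswith("Transformers") and name not in _IMPLEMENTED:
        raise AttributeError(
            f"{name} is not implemented yet. If you need this combination "
            "of features, please open an issue at "
            "https://github.com/vllm-project/vllm/issues"
        )
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
```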


This PR enables #26715 by adding support for multimodal embedding.


This PR also:

@gemini-code-assist (bot) left a comment

Code Review

This pull request is a significant and well-executed refactoring of the Transformers backend. It modularizes the backend by introducing mixin classes for different functionalities like multi-modality, Mixture-of-Experts (MoE), and pooling. This greatly improves code organization, maintainability, and extensibility. The new structure, with a dedicated transformers package, is much cleaner. I've found one potential high-severity issue related to the torch.compile configuration for multimodal models that could lead to incorrect behavior or errors.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


@DarkLight1337 (Member) left a comment

Thanks for the cleanup, just one comment.

@hmellor (Member, Author) commented Oct 15, 2025

Still running local testing, so no need to enable CI just yet

@mergify (bot) commented Oct 15, 2025

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @hmellor.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify bot added the needs-rebase label Oct 15, 2025
DarkLight1337 enabled auto-merge (squash) October 15, 2025 15:30
github-actions bot added the ready label Oct 15, 2025
mergify bot added the multi-modality label Oct 15, 2025
@hmellor (Member, Author) commented Oct 15, 2025

#24172 broke Qwen VL models for the Transformers backend because the backend relied on MRotaryEmbedding.get_input_positions_tensor, which that PR removed.

We will forward-fix this.
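
For context, M-RoPE tracks three position components (temporal, height, width), and for text-only tokens all three coincide with the ordinary sequence index. Below is a toy sketch of just that text-only case, stated as an assumption about the general scheme rather than the actual fix in this PR:

```python
import torch

def mrope_text_positions(num_tokens: int) -> torch.Tensor:
    # For text tokens the temporal/height/width components are identical,
    # so the positions are three stacked copies of a plain arange.
    pos = torch.arange(num_tokens)
    return pos.unsqueeze(0).expand(3, -1)  # shape: (3, num_tokens)

print(mrope_text_positions(4))
# tensor([[0, 1, 2, 3],
#         [0, 1, 2, 3],
#         [0, 1, 2, 3]])
```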

@hmellor (Member, Author) commented Oct 15, 2025

It ended up not being too hard; I've fixed M-RoPE for the Transformers backend in this PR.

@hmellor (Member, Author) commented Oct 16, 2025

royokong/e5-v has a very old config that is not compatible with the way processors currently work in Transformers.

I've switched to the same Gemma 3 checkpoint used in the multimodal classifier test from your PR, @DarkLight1337.

DarkLight1337 merged commit fb5e10d into vllm-project:main Oct 16, 2025
53 checks passed
github-project-automation bot moved this from In Progress to Done in Transformers backend Oct 16, 2025
hmellor deleted the transformers-backend-mixins branch October 16, 2025 21:51
Zhuul pushed a commit to Zhuul/vllm that referenced this pull request Oct 17, 2025
chaunceyjiang mentioned this pull request Oct 17, 2025
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
albertoperdomo2 pushed a commit to albertoperdomo2/vllm that referenced this pull request Oct 23, 2025
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025

Labels

ci/build
deepseek (Related to DeepSeek models)
multi-modality (Related to multi-modality (#4194))
new-model (Requests to new models)
ready (ONLY add when PR is ready to merge/full CI is needed)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[Bug]: google/embeddinggemma-300m when using transformers backend doesn't match the output of Sentence Transformers (or model_impl="vllm")

3 participants