
Conversation

@hmellor (Member) commented Oct 15, 2025

This PR refactors the Transformers backend to use mixin classes that add model-specific functionality such as multi-modality, mixture-of-experts, or pooling.

This lets us compose these functionalities to create new Transformers backend classes supporting any combination of these features (see the sketch below). The two new classes added by this PR are for multimodal embedding and multimodal sequence classification.
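
As a rough illustration of the composition pattern (the class names and docstrings below are hypothetical stand-ins, not the exact ones added by this PR):

```python
# Hypothetical sketch of mixin composition in the Transformers backend.
class TransformersBase:
    """Shared plumbing common to all Transformers backend models."""

class MultiModalMixin:
    """Adds multimodal input handling (e.g. image embeddings)."""

class MoEMixin:
    """Adds mixture-of-experts weight loading and routing."""

class PoolingMixin:
    """Adds pooling heads for embedding and classification tasks."""

# New model classes compose the mixins they need with the shared base:
class TransformersMultiModalEmbedding(MultiModalMixin, PoolingMixin, TransformersBase):
    pass

class TransformersMultiModalSequenceClassification(MultiModalMixin, PoolingMixin, TransformersBase):
    pass
```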

It also adds a module-level __getattr__ to __init__.py so that a user who requests a combination of these functionalities that doesn't exist yet is prompted to open an issue asking us to add it.
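
A minimal sketch of that module-level __getattr__ hook (PEP 562); the module path, registry name, and error wording are illustrative assumptions, not the exact implementation:

```python
# vllm/model_executor/models/transformers/__init__.py (illustrative sketch)
_IMPLEMENTED = {
    "TransformersForCausalLM",
    "TransformersMultiModalEmbedding",
    # ...
}

def __getattr__(name: str):
    # Invoked only when `name` is not found in the module (PEP 562).
    if name.startswith("Transformers") and name not in _IMPLEMENTED:
        raise AttributeError(
            f"{name} is not implemented yet. If you need this combination "
            "of features, please open an issue at "
            "https://github.com/vllm-project/vllm/issues"
        )
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
```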


This PR enables #26715 by adding support for multimodal embedding.


This PR also:

@gemini-code-assist (bot) left a comment

Code Review

This pull request is a significant and well-executed refactoring of the Transformers backend. It modularizes the backend by introducing mixin classes for different functionalities like multi-modality, Mixture-of-Experts (MoE), and pooling. This greatly improves code organization, maintainability, and extensibility. The new structure, with a dedicated transformers package, is much cleaner. I've found one potential high-severity issue related to the torch.compile configuration for multimodal models that could lead to incorrect behavior or errors.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


@DarkLight1337 (Member) left a comment

Thanks for the cleanup, just one comment.

@hmellor (Member, Author) commented Oct 15, 2025

Still running local testing, so no need to enable CI just yet

@mergify (bot) commented Oct 15, 2025

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @hmellor.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify bot added the needs-rebase label Oct 15, 2025
DarkLight1337 enabled auto-merge (squash) October 15, 2025 15:30
github-actions bot added the ready label Oct 15, 2025
mergify bot added the multi-modality label Oct 15, 2025
@hmellor (Member, Author) commented Oct 15, 2025

#24172 broke Qwen VL models for the Transformers backend because the backend relied on MRotaryEmbedding.get_input_positions_tensor, which that PR removed.

We will forward-fix this.
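
For context, M-RoPE tracks three position components (temporal, height, width), and for text-only tokens all three coincide with the ordinary sequence index. Below is a toy sketch of just that text-only case, stated as an assumption about the general scheme rather than the actual fix in this PR:

```python
import torch

def mrope_text_positions(num_tokens: int) -> torch.Tensor:
    # For text tokens the temporal/height/width components are identical,
    # so the positions are three stacked copies of a plain arange.
    pos = torch.arange(num_tokens)
    return pos.unsqueeze(0).expand(3, -1)  # shape: (3, num_tokens)

print(mrope_text_positions(4))
# tensor([[0, 1, 2, 3],
#         [0, 1, 2, 3],
#         [0, 1, 2, 3]])
```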

@hmellor (Member, Author) commented Oct 15, 2025

It ended up not being too hard; I've fixed M-RoPE for the Transformers backend in this PR.

@hmellor (Member, Author) commented Oct 16, 2025

royokong/e5-v has a very old config that is not compatible with the way processors currently work in Transformers.

I've switched to the same Gemma 3 checkpoint used in the multimodal classifier test from your PR, @DarkLight1337.

DarkLight1337 merged commit fb5e10d into vllm-project:main Oct 16, 2025
53 checks passed
github-project-automation bot moved this from In Progress to Done in Transformers backend Oct 16, 2025
hmellor deleted the transformers-backend-mixins branch October 16, 2025 21:51
Zhuul pushed a commit to Zhuul/vllm that referenced this pull request Oct 17, 2025
chaunceyjiang mentioned this pull request Oct 17, 2025
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
albertoperdomo2 pushed a commit to albertoperdomo2/vllm that referenced this pull request Oct 23, 2025
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025

Labels

ci/build
deepseek (Related to DeepSeek models)
multi-modality (Related to multi-modality (#4194))
new-model (Requests to new models)
ready (ONLY add when PR is ready to merge/full CI is needed)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[Bug]: google/embeddinggemma-300m when using transformers backend doesn't match the output of Sentence Transformers (or model_impl="vllm")

3 participants