
Conversation

@AzizCode92 (Contributor) commented Sep 3, 2025

Purpose

Introduces a support_mrope interface on the model class to delegate M-RoPE implementation details to each model, rather than using a centralized helper.

The Qwen2-VL model is updated to use this new interface as the first implementation.

Related to: #24165

#24172 is also working on the same issue; feel free to close mine if that prior PR does the job.
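For readers unfamiliar with the design, here is a minimal sketch of the intended shape: a runtime-checkable protocol that models opt into, plus a helper the model runner can use for dispatch. The names `SupportsMRoPE`, `supports_mrope`, and `get_mrope_input_positions` come from the PR discussion; the parameter list is illustrative, not the merged signature.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsMRoPE(Protocol):
    """Models that compute their own M-RoPE input positions."""

    def get_mrope_input_positions(self, input_tokens, hf_config,
                                  image_grid_thw, video_grid_thw):
        """Return (positions of shape (3, seq_len), mrope_position_delta)."""
        ...


def supports_mrope(model: object) -> bool:
    # A runtime-checkable Protocol isinstance check only verifies that the
    # required method exists, which is enough for runner-side dispatch.
    return isinstance(model, SupportsMRoPE)
```

A model such as Qwen2VLForConditionalGeneration would then implement `get_mrope_input_positions` itself instead of relying on the centralized helper in mrope.py.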

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: AzizCode92 <azizbenothman76@gmail.com>
@mergify bot added the `qwen` (Related to Qwen models) label Sep 3, 2025
@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces a new SupportsMRoPE interface to delegate model-specific M-RoPE logic, which is a positive step towards improving modularity. My review focuses on ensuring the new interface is correctly defined and implemented, following existing patterns in the codebase. I've identified some issues in the interface definition in interfaces.py regarding consistency and correctness. Additionally, I've suggested an improvement to the Qwen2VLForConditionalGeneration implementation to make it more self-contained and robust, which aligns better with the goals of this PR.

@AzizCode92 closed this Sep 3, 2025
@AzizCode92 reopened this Sep 3, 2025
@pratapyash (Contributor) commented

Can we also expect an update to Qwen-2.5-Omni to use this new interface as well?

Currently, it emits this warning at model init:

Unrecognized keys in `rope_scaling` for 'rope_type'='default': {'mrope_section'}

AzizCode92 and others added 3 commits September 3, 2025 22:00
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Aziz <azizbenothman76@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Aziz <azizbenothman76@gmail.com>
Signed-off-by: AzizCode92 <azizbenothman76@gmail.com>
@AzizCode92 (Contributor, Author) replied

Can we also expect an update to Qwen-2.5-Omni to use this new interface as well?

Currently, it emits this warning at model init:

Unrecognized keys in `rope_scaling` for 'rope_type'='default': {'mrope_section'}

We will incrementally migrate the other models to the new interface design. This PR focuses, as a first step, on adapting qwen2_vl; once approved, we can move on to the other models.

@wwl2755 (Contributor) commented Sep 3, 2025

Is qwen2-vl using the shared M-RoPE mentioned in #24165? It seems to have already built its own M-RoPE in qwen2_vl.py:

class Qwen2VisionRotaryEmbedding(nn.Module):

@AzizCode92 (Contributor, Author) replied

Is qwen2-vl using the shared M-RoPE mentioned in #24165? It seems to have already built its own M-RoPE in qwen2_vl.py:

class Qwen2VisionRotaryEmbedding(nn.Module):

Good question. I think _vl_get_input_positions_tensor in vllm/model_executor/layers/rotary_embedding/mrope.py refers to Qwen2-VL/Qwen2.5-VL.

cc @Isotr0py

MRO of your model class.
"""

@classmethod
@DarkLight1337 (Member) commented Sep 4, 2025


Can we make this an instance method? Also, the model runner should actually use this method instead of calling the one in mrope.py when available.

@AzizCode92 (Contributor, Author) replied

Done.

Signed-off-by: AzizCode92 <azizbenothman76@gmail.com>
@Isotr0py (Member) commented Sep 4, 2025

Good question, I think in vllm/model_executor/layers/rotary_embedding/mrope.py the _vl_get_input_positions_tensor refers to Qwen2-VL/Qwen2.5-VL.

The Qwen2VisionRotaryEmbedding is only used in ViT, which is different from the MRotaryEmbedding used in Qwen2-VL's backbone.
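To illustrate the distinction: the ViT rotary embedding is a plain rotary over patch positions, while the backbone's M-RoPE tracks three axes (temporal, height, width) per token, and for pure-text spans all three axes carry the same index. A rough sketch of that text-only case (pure Python for clarity; illustrative, not vLLM's implementation):

```python
def text_only_mrope_positions(num_tokens: int, offset: int = 0) -> list[list[int]]:
    # For text tokens the temporal, height, and width axes all advance
    # together, so the three rows of the (3, seq_len) position array are
    # identical; they only diverge inside image/video token spans.
    row = list(range(offset, offset + num_tokens))
    return [row[:] for _ in range(3)]
```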

@divyanshsinghvi (Contributor) commented

Seems a duplicate of #24172? @AzizCode92

Signed-off-by: AzizCode92 <azizbenothman76@gmail.com>
Signed-off-by: AzizCode92 <azizbenothman76@gmail.com>
@wwl2755 (Contributor) commented Sep 4, 2025

Good question, I think in vllm/model_executor/layers/rotary_embedding/mrope.py the _vl_get_input_positions_tensor refers to Qwen2-VL/Qwen2.5-VL.

The Qwen2VisionRotaryEmbedding is only used in ViT, which is different from the MRotaryEmbedding used in Qwen2-VL's backbone.

Oh yeah, I missed that; you're right. Thanks for pointing that out.

context_len=num_computed_tokens + prompt_part_len,
num_new_tokens=completion_part_len,
)
if supports_mrope(self.model):
@wwl2755 (Contributor) commented

Is this necessary? The original logic is to call get_next_input_positions_tensor().

@AzizCode92 (Contributor, Author) commented Sep 5, 2025

Yes. The original logic used get_input_positions_tensor from vllm.model_executor.layers.rotary_embedding; we are now switching to the get_mrope_input_positions defined in the new interface. To check whether the model supports M-RoPE, we defined supports_mrope.
See here: https://github.com/AzizCode92/vllm/blob/8df4a1091bb465e4a3058ba9a8a10cb168d27d82/vllm/model_executor/models/interfaces.py#L872-L923

@wwl2755 (Contributor) replied

Uh, I mean: here it is using get_next_input_positions_tensor(), not get_input_positions_tensor(); I think they implement different logic.

If you want to refactor this as well, you may need another get_next_mrope_input_positions() in your interface.
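For reference, the decode-time path only needs to extend positions for newly generated text tokens, where all three M-RoPE axes advance in lockstep from the context length shifted by the per-sequence mrope_position_delta computed at prefill. A hypothetical get_next_mrope_input_positions along those lines (an illustrative sketch, not the merged code):

```python
def get_next_mrope_input_positions(mrope_position_delta: int,
                                   context_len: int,
                                   num_new_tokens: int) -> list[list[int]]:
    # No new multimodal content appears during decode, so the temporal,
    # height, and width axes share one shifted, contiguous range.
    start = context_len + mrope_position_delta
    row = list(range(start, start + num_new_tokens))
    return [row[:] for _ in range(3)]
```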

@AzizCode92 (Contributor, Author) replied

Great catch! That was definitely unnecessary to change.

Signed-off-by: AzizCode92 <azizbenothman76@gmail.com>
Signed-off-by: AzizCode92 <azizbenothman76@gmail.com>
Signed-off-by: AzizCode92 <azizbenothman76@gmail.com>
@DarkLight1337 (Member) commented

Sorry for the delay. Let's merge this one first; then #24172 can be modified to update the rest of the models.

@DarkLight1337 enabled auto-merge (squash) September 17, 2025 13:12
@github-actions bot added the `ready` (ONLY add when PR is ready to merge/full CI is needed) label Sep 17, 2025
@DarkLight1337 (Member) commented

PTAL at the test failure in multimodal tests

Signed-off-by: AzizCode92 <azizbenothman76@gmail.com>
auto-merge was automatically disabled September 18, 2025 07:31

Head branch was pushed to by a user without write access

Signed-off-by: AzizCode92 <azizbenothman76@gmail.com>
@AzizCode92 (Contributor, Author) commented

@DarkLight1337 multimodal tests are green now.
Let me know your feedback if anything needs to be adjusted. Thanks!

@DarkLight1337 enabled auto-merge (squash) September 18, 2025 16:59
@DarkLight1337 (Member) commented

Merging with main to fix the unrelated CI failures

@DarkLight1337 merged commit 38db529 into vllm-project:main Sep 18, 2025
49 checks passed
ywang96 pushed a commit to ywang96/vllm that referenced this pull request Sep 19, 2025
Signed-off-by: AzizCode92 <azizbenothman76@gmail.com>
Signed-off-by: Aziz <azizbenothman76@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
debroy-rh pushed a commit to debroy-rh/vllm that referenced this pull request Sep 19, 2025
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025

Labels

qwen (Related to Qwen models), ready (ONLY add when PR is ready to merge/full CI is needed), v1

7 participants