[Refactor]: Use M-RoPE interface directly while defining model class instead of maintaining model specific M-RoPE implementation in mrope.py #24172
Conversation
@Isotr0py is there some model I should attempt to modify using this interface, or should I create a test case for the same?
The documentation also needs updating to reflect the new interface, but I couldn't find the corresponding markdown file.
@divyanshsinghvi You can find all of them in `vllm/model_executor/layers/rotary_embedding/mrope.py`, lines 361 to 423 in 28f350e.
Yeah, I meant: should I change all of the models that are already implemented, or make a new model along these lines to verify?
Personally, I prefer to modify one model (e.g. Qwen2-VL) with existing tests and keep the remaining models backward compatible in the initial PR, then migrate the remaining models to the new interface in a follow-up PR, because some models like Ernie can't run on our CI and need manual validation.
I think there is no need to make a new model.
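To make the discussion concrete, here is a minimal sketch of the per-model interface being described. The protocol name `SupportsMRoPE`, the method signature, and the example model class are assumptions based on this thread (including the `get_input_positions_tensor` to `get_mrope_input_positions` rename mentioned later); the exact vLLM definitions may differ.

```python
# Hedged sketch, not vLLM's actual definitions.
from typing import Any, Optional, Protocol, runtime_checkable

import torch


@runtime_checkable
class SupportsMRoPE(Protocol):
    """Models compute their own M-RoPE input positions instead of
    relying on per-model branches inside mrope.py."""

    def get_mrope_input_positions(
        self,
        input_tokens: list[int],
        hf_config: Any,
        image_grid_thw: Optional[torch.Tensor],
        video_grid_thw: Optional[torch.Tensor],
    ) -> tuple[torch.Tensor, int]:
        """Return (3, seq_len) position ids and the M-RoPE delta."""
        ...


class MyVLModel:  # hypothetical model class implementing the interface
    def get_mrope_input_positions(self, input_tokens, hf_config,
                                  image_grid_thw, video_grid_thw):
        # Real models would use the image/video grid shapes; a
        # text-only fallback (identical t/h/w positions) is shown
        # here for brevity.
        seq_len = len(input_tokens)
        positions = torch.arange(seq_len).unsqueeze(0).expand(3, -1)
        return positions, 0
```

With this shape, mrope.py no longer needs a branch per model: anything implementing the protocol supplies its own positions, and unmigrated models can keep the legacy path during the transition.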
Force-pushed from 99252fc to 421e80d.
I don't see that the original issue creators were added as reviewers. Let me know if something else needs to be added.
This pull request has merge conflicts that must be resolved before it can be merged.
@DarkLight1337 let me know if I need to do something here. |
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…instead of maintaining model specific M-RoPE implementation in mrope.py (vllm-project#24172) Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com> Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: wwl2755 <wangwenlong2755@gmail.com> Signed-off-by: Dhruvil Bhatt <bhattdbh@amazon.com>
…instead of maintaining model specific M-RoPE implementation in mrope.py (vllm-project#24172) Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com> Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: wwl2755 <wangwenlong2755@gmail.com> Signed-off-by: bbartels <benjamin@bartels.dev>
…instead of maintaining model specific M-RoPE implementation in mrope.py (vllm-project#24172) Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com> Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: wwl2755 <wangwenlong2755@gmail.com>
### What this PR does / why we need it?
This is step 1 of refactoring the code to adapt to vLLM main; this PR is aligned with vllm-project/vllm@17c540a.
1. Refactor DeepSeek to the latest code arch as of vllm-project/vllm@17c540a.
2. Bunches of fixes due to vLLM changes:
   - Fix `AscendScheduler` `__post_init__`, caused by vllm-project/vllm#25075
   - Fix `AscendScheduler` init getting an unexpected arg `block_size`, caused by vllm-project/vllm#26296
   - Fix `KVCacheManager` `get_num_common_prefix_blocks` arg, caused by vllm-project/vllm#23485
   - Fix `MLAAttention` import, caused by vllm-project/vllm#25103
   - Fix `SharedFusedMoE` import, caused by vllm-project/vllm#26145
   - Fix `LazyLoader` import, caused by vllm-project/vllm#27022
   - Fix `vllm.utils.swap_dict_values` import, caused by vllm-project/vllm#26990
   - Fix `Backend` enum import, caused by vllm-project/vllm#25893
   - Fix the `CompilationLevel` to `CompilationMode` renaming issue introduced by vllm-project/vllm#26355
   - Fix fused_moe ops, caused by vllm-project/vllm#24097
   - Fix the bert model because of `inputs_embeds`, caused by vllm-project/vllm#25922
   - Fix M-RoPE because of the `get_input_positions_tensor` to `get_mrope_input_positions` rename, caused by vllm-project/vllm#24172 (a call-site sketch follows this description)
   - Fix `splitting_ops` changes introduced by vllm-project/vllm#25845
   - Fix multi-modality changes introduced by vllm-project/vllm#16229
   - Fix the lora bias dropping issue introduced by vllm-project/vllm#25807
   - Fix the structured output break introduced by vllm-project/vllm#26737

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
CI passed with existing tests.
- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Icey <1790571317@qq.com>
Co-authored-by: Icey <1790571317@qq.com>
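As flagged in the M-RoPE bullet above, downstream callers had to adapt to the rename from `get_input_positions_tensor` to `get_mrope_input_positions`. Below is a hedged sketch of such a call-site update; the `model` object, the argument list, and the legacy `MRotaryEmbedding` signature are illustrative assumptions, not the exact vllm-ascend or vLLM code.

```python
def compute_mrope_positions(model, input_tokens, hf_config,
                            image_grid_thw, video_grid_thw):
    # Prefer the per-model method introduced by vllm-project/vllm#24172.
    if hasattr(model, "get_mrope_input_positions"):
        return model.get_mrope_input_positions(
            input_tokens,
            hf_config=hf_config,
            image_grid_thw=image_grid_thw,
            video_grid_thw=video_grid_thw,
        )
    # Otherwise fall back to the legacy mrope.py helper; the exact
    # signature here is assumed, not copied from vLLM.
    from vllm.model_executor.layers.rotary_embedding.mrope import (
        MRotaryEmbedding)
    return MRotaryEmbedding.get_input_positions_tensor(
        input_tokens,
        hf_config=hf_config,
        image_grid_thw=image_grid_thw,
        video_grid_thw=video_grid_thw,
    )
```

Keeping a fallback branch like this matches the migration strategy discussed earlier in the thread: one model moves first, the rest stay backward compatible until follow-up PRs migrate them.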
…instead of maintaining model specific M-RoPE implementation in mrope.py (vllm-project#24172) Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com> Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: wwl2755 <wangwenlong2755@gmail.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
…instead of maintaining model specific M-RoPE implementation in mrope.py (vllm-project#24172) Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com> Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com> Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk> Co-authored-by: wwl2755 <wangwenlong2755@gmail.com> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
Purpose
Fixes #24165
- Adds support for the various models to move away from `mrope.py`, for better handling of the implementation.
- Uses `self.get_model()` instead of `self.model` for interface checking, to handle cases where the model is wrapped.

Test Plan
`pytest tests/models/multimodal/generation/test_qwen2_vl.py::test_qwen2_vl_image_embeddings_input`
Test Result
Passed
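On the `self.get_model()` point from the Purpose section: a minimal sketch of why interface checks should go through an unwrapping accessor rather than `self.model`. The wrapper and runner classes here are hypothetical stand-ins, not vLLM's actual classes, and `SupportsMRoPE` is the protocol sketched earlier in the thread.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsMRoPE(Protocol):  # as sketched earlier in the thread
    def get_mrope_input_positions(self, *args, **kwargs): ...


class CompiledModelWrapper:
    """Stands in for e.g. a torch.compile or LoRA wrapper that hides
    the real model behind an attribute."""

    def __init__(self, model):
        self.model = model


class ModelRunner:  # hypothetical runner
    def __init__(self, model):
        self.model = CompiledModelWrapper(model)

    def get_model(self):
        # Unwrap so isinstance() sees the underlying model class.
        return self.model.model

    def uses_mrope(self) -> bool:
        # isinstance(self.model, SupportsMRoPE) would test the wrapper
        # and wrongly return False; get_model() tests the real model.
        return isinstance(self.get_model(), SupportsMRoPE)
```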
Essential Elements of an Effective PR Description Checklist
- (If applicable) Update `supported_models.md` and `examples` for a new model.