
Conversation

@divyanshsinghvi
Contributor

@divyanshsinghvi divyanshsinghvi commented Sep 3, 2025

Purpose

Fixes #24165
Adds support for the various models to move away from mrope.py, letting each model define its own M-RoPE implementation for better handling.

  • By @DarkLight1337 and @wwl2755 : Use self.get_model() instead of self.model for interface checking to handle cases when the model is wrapped.
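The `self.get_model()` suggestion above can be illustrated with a minimal, self-contained sketch. The class and protocol names below (`SupportsMRoPE`, `ModelWrapper`, `QwenLikeModel`) are hypothetical stand-ins, not vLLM's actual classes; the point is only why an interface check on a wrapper object fails while a check on the unwrapped model succeeds.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsMRoPE(Protocol):
    """Marker protocol for models that provide their own M-RoPE logic."""

    def get_mrope_input_positions(self) -> None: ...


class QwenLikeModel:
    def get_mrope_input_positions(self) -> None:
        pass


class ModelWrapper:
    """Stand-in for e.g. a quantization or LoRA wrapper around the real model."""

    def __init__(self, model):
        self.model = model

    def get_model(self):
        # Unwrap to the innermost model for interface checks.
        return self.model


runner = ModelWrapper(QwenLikeModel())

# Checking the wrapper directly misses the interface...
assert not isinstance(runner, SupportsMRoPE)
# ...while checking the unwrapped model succeeds.
assert isinstance(runner.get_model(), SupportsMRoPE)
```

This is why the interface check goes through `get_model()` rather than the attribute directly: the wrapper object does not itself expose the interface methods.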

Test Plan

pytest tests/models/multimodal/generation/test_qwen2_vl.py::test_qwen2_vl_image_embeddings_input

Test Result

Passed

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@divyanshsinghvi divyanshsinghvi changed the title interfaces mrope added; initial commit mrope interface added Sep 3, 2025
@divyanshsinghvi divyanshsinghvi changed the title mrope interface added [Refactor]: mrope interface added Sep 3, 2025
@divyanshsinghvi divyanshsinghvi changed the title [Refactor]: mrope interface added [Refactor]: Use M-RoPE from interface with each model defining the M-RoPE instead of from mrope.py Sep 3, 2025
@divyanshsinghvi divyanshsinghvi marked this pull request as ready for review September 3, 2025 11:01
@divyanshsinghvi
Contributor Author

@Isotr0py is there some model I should try modifying to use this interface, or should I create a test case for it?

@divyanshsinghvi
Contributor Author

divyanshsinghvi commented Sep 3, 2025

Need to update the docs to reflect the new interface; I couldn't find a markdown file for it.

@divyanshsinghvi divyanshsinghvi changed the title [Refactor]: Use M-RoPE from interface with each model defining the M-RoPE instead of from mrope.py [Refactor]: Use M-RoPE interface directly while defining model class instead of maintaining model specific M-RoPE implementation in mrope.py Sep 3, 2025
@Isotr0py
Member

Isotr0py commented Sep 3, 2025

@divyanshsinghvi You can find all the model_types that need M-RoPE here (the default vl branch covers Qwen2-VL/Qwen2.5-VL):

@classmethod
def get_input_positions_tensor(
    cls,
    input_tokens: list[int],
    hf_config: PretrainedConfig,
    image_grid_thw: Union[list[list[int]], torch.Tensor],
    video_grid_thw: Union[list[list[int]], torch.Tensor],
    second_per_grid_ts: list[float],
    context_len: int = 0,
    seq_len: Optional[int] = None,
    audio_feature_lengths: Optional[torch.Tensor] = None,
    use_audio_in_video: bool = False,
) -> tuple[torch.Tensor, int]:
    from vllm.transformers_utils.config import thinker_uses_mrope
    if thinker_uses_mrope(hf_config):
        return cls._omni_get_input_positions_tensor(
            input_tokens=input_tokens,
            hf_config=hf_config,
            image_grid_thw=image_grid_thw,
            video_grid_thw=video_grid_thw,
            second_per_grid_ts=second_per_grid_ts,
            context_len=context_len,
            seq_len=seq_len,
            audio_feature_lengths=audio_feature_lengths,
            use_audio_in_video=use_audio_in_video,
        )
    elif hf_config.model_type in ["glm4v", "glm4v_moe"]:
        return cls._glm4v_get_input_positions_tensor(
            input_tokens=input_tokens,
            hf_config=hf_config,
            image_grid_thw=image_grid_thw,
            video_grid_thw=video_grid_thw,
            context_len=context_len,
            seq_len=seq_len,
        )
    elif hf_config.model_type in ["ernie4_5_moe_vl", "ernie4_5_vl"]:
        return cls._ernie_get_input_positions_tensor(
            input_tokens=input_tokens,
            hf_config=hf_config,
            image_grid_thw=image_grid_thw,
            video_grid_thw=video_grid_thw,
            context_len=context_len,
            seq_len=seq_len,
        )
    elif "KeyeVL1_5" in hf_config.model_type:
        return cls._keye_get_input_positions_tensor(
            input_tokens=input_tokens,
            hf_config=hf_config,
            image_grid_thw=image_grid_thw,
            video_grid_thw=video_grid_thw,
            context_len=context_len,
            seq_len=seq_len,
        )
    else:
        return cls._vl_get_input_positions_tensor(
            input_tokens=input_tokens,
            hf_config=hf_config,
            image_grid_thw=image_grid_thw,
            video_grid_thw=video_grid_thw,
            second_per_grid_ts=second_per_grid_ts,
            context_len=context_len,
            seq_len=seq_len,
        )
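The refactor this PR proposes replaces the central `model_type` dispatch above with a method on each model class (later commits in the timeline refer to it as `get_mrope_input_positions`). The sketch below is a simplified illustration of that direction, not vLLM's exact signatures: the class names and the text-only position logic are illustrative stand-ins.

```python
class MRoPEBase:
    """Hypothetical interface: each model supplies its own M-RoPE positions."""

    def get_mrope_input_positions(self, input_tokens, context_len=0):
        raise NotImplementedError


class Qwen2VLLikeModel(MRoPEBase):
    def get_mrope_input_positions(self, input_tokens, context_len=0):
        # Stub: in the text-only case the temporal/height/width position
        # ids all advance together, so the 3 rows are identical.
        n = len(input_tokens) - context_len
        pos = list(range(context_len, context_len + n))
        return [pos, pos, pos], n  # 3 x n position ids, plus the delta


def get_input_positions(model, input_tokens, context_len=0):
    # The runner no longer branches on hf_config.model_type; it simply
    # calls the interface method on whichever model it holds.
    return model.get_mrope_input_positions(input_tokens, context_len)
```

With this shape, adding a new M-RoPE model means implementing the method in the model file rather than extending the if/elif chain in mrope.py.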

@divyanshsinghvi
Contributor Author

@divyanshsinghvi You can find all the model_types that need M-RoPE here (the default vl branch covers Qwen2-VL/Qwen2.5-VL):


Yeah, I meant: should I change all of the models that are already implemented, or create a new model along these lines to verify?

@Isotr0py
Member

Isotr0py commented Sep 3, 2025

should I change all of the models that are already implemented

Personally, I'd prefer to modify one model (e.g. Qwen2-VL) with existing tests and keep the remaining models backward compatible in this initial PR, then migrate the remaining models to the new interface in a follow-up PR, because some models like Ernie can't run on our CI and need manual validation.

create a new model along these lines to verify?

I think there is no need to make a new model.
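The staged migration described above (one model on the new interface, the rest still on the legacy path) can be sketched as a dispatch with a fallback. All names here are illustrative assumptions, not vLLM's actual code: the idea is that the runner prefers a model-defined `get_mrope_input_positions` and otherwise falls back to the centralized mrope.py logic, stubbed out below.

```python
def get_input_positions(model, hf_config, input_tokens, context_len=0):
    fn = getattr(model, "get_mrope_input_positions", None)
    if fn is not None:
        # New path: the model defines its own M-RoPE positions.
        return fn(input_tokens, context_len)
    # Legacy path: fall back to the centralized mrope.py dispatch keyed
    # on hf_config.model_type (represented here by a stub).
    return legacy_mrope_dispatch(hf_config, input_tokens, context_len)


def legacy_mrope_dispatch(hf_config, input_tokens, context_len=0):
    # Stand-in for the existing model_type-based branching; hf_config is
    # unused in this stub but kept to mirror the legacy signature.
    pos = list(range(context_len, len(input_tokens)))
    return [pos, pos, pos], len(pos)
```

This keeps unmigrated models working unchanged while each model file is converted one PR at a time.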

@divyanshsinghvi divyanshsinghvi marked this pull request as draft September 3, 2025 13:27
@mergify mergify bot added the qwen Related to Qwen models label Sep 3, 2025
@divyanshsinghvi divyanshsinghvi force-pushed the refactor_mrope_implementation branch from 99252fc to 421e80d Compare September 3, 2025 17:03
@mergify mergify bot added the new-model Requests to new models label Sep 3, 2025
@divyanshsinghvi divyanshsinghvi marked this pull request as ready for review September 3, 2025 19:14
@divyanshsinghvi
Contributor Author

divyanshsinghvi commented Sep 4, 2025

I don't see that the original issue creators were added as reviewers. Let me know if anything else needs to be added.
cc: @DarkLight1337 @Isotr0py @ywang96

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) October 10, 2025 16:49
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 10, 2025
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) October 10, 2025 16:50
@mergify

mergify bot commented Oct 10, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @divyanshsinghvi.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Oct 10, 2025
@DarkLight1337
Member

@ywang96 can you update this PR for Qwen3-Omni? Please remember to add @wwl2755 to co-authors before merging. Thanks

@divyanshsinghvi
Contributor Author

@DarkLight1337 let me know if I need to do something here.

@ywang96
Member

ywang96 commented Oct 10, 2025

@DarkLight1337 let me know if I need to do something here.

Let me update this PR for Qwen3-Omni too - I will also add @wwl2755 as co-author
On second thought, it seems easier to add support for Qwen3-Omni in a separate PR instead. Will work on that soon.

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
@mergify mergify bot removed the needs-rebase label Oct 11, 2025
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) October 11, 2025 05:25
@DarkLight1337 DarkLight1337 merged commit 727144b into vllm-project:main Oct 11, 2025
57 checks passed
Dhruvilbhatt pushed a commit to Dhruvilbhatt/vllm that referenced this pull request Oct 14, 2025
…instead of maintaining model specific M-RoPE implementation in mrope.py (vllm-project#24172)

Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com>
Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: wwl2755 <wangwenlong2755@gmail.com>
Signed-off-by: Dhruvil Bhatt <bhattdbh@amazon.com>
bbartels pushed a commit to bbartels/vllm that referenced this pull request Oct 16, 2025
…instead of maintaining model specific M-RoPE implementation in mrope.py (vllm-project#24172)

Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com>
Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: wwl2755 <wangwenlong2755@gmail.com>
Signed-off-by: bbartels <benjamin@bartels.dev>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
…instead of maintaining model specific M-RoPE implementation in mrope.py (vllm-project#24172)

Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com>
Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: wwl2755 <wangwenlong2755@gmail.com>
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
…instead of maintaining model specific M-RoPE implementation in mrope.py (vllm-project#24172)

Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com>
Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: wwl2755 <wangwenlong2755@gmail.com>
wangxiyuan pushed a commit to vllm-project/vllm-ascend that referenced this pull request Oct 24, 2025
### What this PR does / why we need it?
This is the step 1 of refactoring code to adapt with vllm main, and this
pr aligned with
vllm-project/vllm@17c540a

1. refactor deepseek to the latest code arch as of
vllm-project/vllm@17c540a
 
2. bunches of fixes due to vllm changes
- Fix `AscendScheduler` `__post_init__`, caused by
vllm-project/vllm#25075
- Fix `AscendScheduler` init got an unexpected arg `block_size`, caused
by vllm-project/vllm#26296
- Fix `KVCacheManager` `get_num_common_prefix_blocks` arg, caused by
vllm-project/vllm#23485
- Fix `MLAAttention` import,caused by
vllm-project/vllm#25103
- Fix `SharedFusedMoE` import, caused by
vllm-project/vllm#26145
- Fix `LazyLoader` improt, caused by
vllm-project/vllm#27022
- Fix `vllm.utils.swap_dict_values` improt, caused by
vllm-project/vllm#26990
- Fix `Backend` enum import, caused by
vllm-project/vllm#25893
- Fix `CompilationLevel` renaming to `CompilationMode` issue introduced
by vllm-project/vllm#26355
- Fix fused_moe ops, caused by
vllm-project/vllm#24097
- Fix bert model because of `inputs_embeds`, caused by
vllm-project/vllm#25922
- Fix MRope because of `get_input_positions_tensor` to
`get_mrope_input_positions`, caused by
vllm-project/vllm#24172
- Fix `splitting_ops` changes introduced by
vllm-project/vllm#25845
- Fix multi-modality changes introduced by
vllm-project/vllm#16229
- Fix lora bias dropping issue introduced by
vllm-project/vllm#25807
- Fix structured ouput break introduced by
vllm-project/vllm#26737

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
CI passed with existing test.


- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Icey <1790571317@qq.com>
Co-authored-by: Icey <1790571317@qq.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
…instead of maintaining model specific M-RoPE implementation in mrope.py (vllm-project#24172)

Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com>
Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: wwl2755 <wangwenlong2755@gmail.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
…instead of maintaining model specific M-RoPE implementation in mrope.py (vllm-project#24172)

Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com>
Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: wwl2755 <wangwenlong2755@gmail.com>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
…instead of maintaining model specific M-RoPE implementation in mrope.py (vllm-project#24172)

Signed-off-by: Divyansh Singhvi <divyanshsinghvi@gmail.com>
Signed-off-by: dsinghvi <divyanshsinghvi@gmail.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: wwl2755 <wangwenlong2755@gmail.com>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>

Labels

new-model: Requests to new models
qwen: Related to Qwen models
ready: ONLY add when PR is ready to merge/full CI is needed
v1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Refactor]: Let each modeling file define M-RoPE implementation

5 participants