
Conversation

@MengqingCao (Collaborator) commented Oct 16, 2025

What this PR does / why we need it?

  1. cherry-pick [CI] Upgrade vllm to newest commit #3423
  2. refactor the DeepSeek code
    • separate the DeepSeek V3.2 modeling code from the other DeepSeek versions
    • use the latest MLA code architecture in vLLM

How was this patch tested?

Tested locally; passed with AscendScheduler enabled in both torchair and eager mode on DeepSeek-V3.2-Exp-W8A8.

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing, smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message and fill in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally, following the Contributing and Testing guides.

@gemini-code-assist bot (Contributor) left a comment
Code Review

This pull request refactors the codebase to align with upstream vllm main, introducing version compatibility checks across multiple files. The changes are extensive and mostly correct. However, I've identified a critical issue in a test case that could break CI for a specific vllm version, and a code duplication issue that should be addressed for better maintainability.

Comment on lines 344 to 355
        self.assertEqual(
            vllm_config.compilation_config.level,
-           CompilationLevel.NO_COMPILATION,
+           CompilationMode.NONE,
        )

critical

The assertion for vllm_config.compilation_config.level is incorrect when vllm_version_is("0.11.0") is true. The test will fail in that case. The assertion should be conditional, similar to other tests in this file, to check for CompilationLevel.NO_COMPILATION for vllm 0.11.0 and CompilationMode.NONE otherwise.

            if vllm_version_is("0.11.0"):
                self.assertEqual(
                    vllm_config.compilation_config.level,
                    CompilationLevel.NO_COMPILATION,
                )
            else:
                self.assertEqual(
                    vllm_config.compilation_config.level,
                    CompilationMode.NONE,
                )
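
The version-gated assertion pattern suggested above can be sketched standalone. In this sketch, `CompilationLevel`, `CompilationMode`, and `vllm_version_is` are simplified stand-ins for the real vLLM / vllm-ascend symbols, just to show how the expected sentinel is chosen per version:

```python
from enum import Enum

class CompilationLevel(Enum):   # stand-in for the pre-0.11.1 vLLM enum
    NO_COMPILATION = 0

class CompilationMode(Enum):    # stand-in for the newer vLLM enum
    NONE = 0

def vllm_version_is(version: str, current: str = "0.11.0") -> bool:
    # Stand-in for the version-check helper used in the vllm-ascend tests.
    return current == version

# Pick the expected sentinel based on the installed vLLM version,
# mirroring the if/else structure of the suggested fix.
expected = (CompilationLevel.NO_COMPILATION
            if vllm_version_is("0.11.0")
            else CompilationMode.NONE)
print(expected)  # -> CompilationLevel.NO_COMPILATION
```

With the real symbols, the same branch selects which enum the test asserts against, so the test passes on both vLLM 0.11.0 and newer commits.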

Comment on lines +2340 to 2531
    # In multi-DP scenarios, there may be situations where all DP groups are executing dummy runs.
    # If sequence parallelism is enabled, it is essential to ensure that num_tokens is divisible by tp_size.
    if self.use_aclgraph and enable_sp(self.vllm_config):
        tp_size = self.vllm_config.parallel_config.tensor_parallel_size
        num_tokens = math.ceil(num_tokens / tp_size) * tp_size

    # In multi-DP scenarios, there may be situations where all DP groups are executing dummy runs.
    # If sequence parallelism is enabled, it is essential to ensure that num_tokens is divisible by tp_size.
    if self.use_aclgraph and enable_sp(self.vllm_config):

high

This block of code, which pads num_tokens for sequence parallelism, appears to be duplicated. The second block should be removed to avoid redundancy.

Suggested change

      # In multi-DP scenarios, there may be situations where all DP groups are executing dummy runs.
      # If sequence parallelism is enabled, it is essential to ensure that num_tokens is divisible by tp_size.
      if self.use_aclgraph and enable_sp(self.vllm_config):
          tp_size = self.vllm_config.parallel_config.tensor_parallel_size
          num_tokens = math.ceil(num_tokens / tp_size) * tp_size
-     # In multi-DP scenarios, there may be situations where all DP groups are executing dummy runs.
-     # If sequence parallelism is enabled, it is essential to ensure that num_tokens is divisible by tp_size.
-     if self.use_aclgraph and enable_sp(self.vllm_config):
-         tp_size = self.vllm_config.parallel_config.tensor_parallel_size
-         num_tokens = math.ceil(num_tokens / tp_size) * tp_size
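
Removing the duplicate is safe because the padding is idempotent: rounding an already-padded value changes nothing. A minimal standalone sketch of the rounding (the `pad_to_multiple` wrapper is hypothetical; the formula is taken verbatim from the quoted code):

```python
import math

def pad_to_multiple(num_tokens: int, tp_size: int) -> int:
    # Round num_tokens up to the nearest multiple of tp_size, as in the
    # quoted line: math.ceil(num_tokens / tp_size) * tp_size
    return math.ceil(num_tokens / tp_size) * tp_size

print(pad_to_multiple(10, 4))   # -> 12
print(pad_to_multiple(12, 4))   # -> 12 (already divisible, unchanged)
print(pad_to_multiple(1, 8))    # -> 8
```

Since `pad_to_multiple(pad_to_multiple(n, t), t) == pad_to_multiple(n, t)`, the duplicated block was a no-op on the second pass; deleting it only removes redundancy.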

@whx-sjtu (Collaborator) left a comment

The adaptation of the MLA part LGTM. It's OK for me to delete the deepseek_v2 modeling after this PR is merged.

@github-actions

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@MengqingCao force-pushed the refactor_ds_1016_rebase_main branch from b8e0d5e to 85b1d42 on October 17, 2025 12:09
@MengqingCao added the ready (ready for review) and ready-for-test (start test by label for PR) labels on Oct 17, 2025
@MengqingCao force-pushed the refactor_ds_1016_rebase_main branch from 71aeb39 to 1ebbd54 on October 18, 2025 06:19
@MengqingCao (Collaborator, Author) commented Oct 18, 2025

@github-actions

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@MengqingCao force-pushed the refactor_ds_1016_rebase_main branch from 366fa84 to a1038da on October 19, 2025 06:48
@github-actions

This pull request has conflicts, please resolve those before we can evaluate the pull request.

MengqingCao and others added 9 commits October 20, 2025 03:51
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Icey <1790571317@qq.com>
  * fix bert model
  * fix guided decoding
  * revert skipped e2e test
  * fix lora vllm-project/vllm#25807
  * fix vl

Signed-off-by: MengqingCao <cmq0113@163.com>
@MengqingCao (Collaborator, Author)

The singlecard e2e tests only failed on test_mtp_torchair_correctness_piecewise and test_mtp_torchair_correctness_full (both reported as SKIPPED):

2025-10-18T15:07:07.2022784Z tests/e2e/singlecard/spec_decode_v1/test_v1_mtp_torchair_correctness.py::test_mtp_torchair_correctness_piecewise SKIPPED
2025-10-18T15:07:09.4163583Z tests/e2e/singlecard/spec_decode_v1/test_v1_mtp_torchair_correctness.py::test_mtp_torchair_correctness_full SKIPPED

in https://github.com/vllm-project/vllm-ascend/actions/runs/18616600778/job/53081964430?pr=3504

Signed-off-by: MengqingCao <cmq0113@163.com>
@github-actions

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: MengqingCao <cmq0113@163.com>
@MengqingCao (Collaborator, Author)

Closing, as this work is done in #3612.
