[Bugfix] Fix the bug that qwen3 moe doesn't work with aclgraph #2478

shen-shanshan · 2025-08-21T12:23:42Z

What this PR does / why we need it?

What this PR does:

Move AscendSparseMoeBlock into the qwen3 model file, since only the qwen3 model uses it.
Disable AscendSparseMoeBlock when aclgraph is enabled, because AscendSparseMoeBlock does not currently work with aclgraph.
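
The selection described above could be sketched as follows. This is a minimal illustration, not the code in this PR: `RunConfig`, `DefaultSparseMoeBlock`, and `select_moe_block` are hypothetical names, and the assumption that aclgraph is off exactly when `enforce_eager` is true is made only for the example.

```python
from dataclasses import dataclass


@dataclass
class RunConfig:
    # Assumption for this sketch: True means eager mode, i.e. aclgraph is off.
    enforce_eager: bool = False


class DefaultSparseMoeBlock:
    """Hypothetical stand-in for the default MoE block used in graph mode."""


class AscendSparseMoeBlock:
    """Stand-in for the Ascend-specific block, which needs eager mode for now."""


def select_moe_block(config: RunConfig) -> type:
    # Pick the Ascend-specific block only when aclgraph is disabled.
    if config.enforce_eager:
        return AscendSparseMoeBlock
    return DefaultSparseMoeBlock
```

Keeping the choice in one function means the model's layer constructor has a single place to change once AscendSparseMoeBlock supports aclgraph.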

Does this PR introduce any user-facing change?

How was this patch tested?

Signed-off-by: shen-shanshan <467638484@qq.com>

gemini-code-assist

Code Review

This pull request aims to fix an issue where qwen3 MoE models do not work with aclgraph. The changes involve moving the Ascend-specific MoE block implementation to the qwen3 model file and conditionally disabling it when aclgraph is enabled. While the overall approach is sound, the current implementation introduces critical bugs. Specifically, the method signatures for the newly introduced CustomSparseMoeBlock and the conditionally used Qwen3MoeSparseMoeBlock are incompatible with how they are called, which will lead to runtime TypeErrors. I have provided detailed comments and suggestions to address these critical issues.
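
The kind of mismatch described here can be illustrated with a small sketch. The class names, the `attn_metadata` parameter, and the call site below are hypothetical and are not taken from the reviewed diff; they only show how two interchangeable blocks with different `forward` signatures produce a TypeError at a shared call site.

```python
class UpstreamStyleBlock:
    # Hypothetical: forward takes only the hidden states.
    def forward(self, hidden_states):
        return hidden_states


class CustomStyleBlock:
    # Hypothetical: forward also accepts an optional extra argument.
    def forward(self, hidden_states, attn_metadata=None):
        return hidden_states


def run_layer(block, hidden_states, attn_metadata=None):
    # The shared call site always passes two positional arguments.
    return block.forward(hidden_states, attn_metadata)


def signature_mismatch(block) -> bool:
    """Return True if the call site's arguments do not fit block.forward."""
    try:
        run_layer(block, [1.0, 2.0])
    except TypeError:
        return True
    return False
```

Under these assumptions, `signature_mismatch(UpstreamStyleBlock())` is true and `signature_mismatch(CustomStyleBlock())` is false, so blocks that are swapped conditionally need the same `forward` signature or a call site that adapts to each.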

vllm_ascend/models/qwen3_moe.py


wangxiyuan · 2025-08-22T06:13:11Z

tests/multicard/test_qwen3_moe.py

+    example_prompts = [
+        "Hello, my name is",
+    ]
+    dtype = "half"


vllm runner actually runs in eager mode by default.

Fix the bug that qwen3 moe doesn't work with aclgraph

dd8965c

Signed-off-by: shen-shanshan <467638484@qq.com>

github-actions bot added module:tests module:ops labels Aug 21, 2025

shen-shanshan mentioned this pull request Aug 21, 2025

[Release]: Release checklist for v0.9.1rc3 #2396 (Open, 25 tasks)

gemini-code-assist bot reviewed Aug 21, 2025

vllm_ascend/models/qwen3_moe.py (outdated, resolved)

vllm_ascend/models/qwen3_moe.py (resolved)

shen-shanshan added 2 commits August 21, 2025 12:37

update

46d426c

Signed-off-by: shen-shanshan <467638484@qq.com>

update

8d8fd84

Signed-off-by: shen-shanshan <467638484@qq.com>

github-actions bot removed the module:ops label Aug 21, 2025

update

ff50da6

Signed-off-by: shen-shanshan <467638484@qq.com>

shen-shanshan force-pushed the v0.9.1-dev branch from f3fb510 to ff50da6 Compare August 22, 2025 01:29

shen-shanshan added 3 commits August 22, 2025 02:53

update

c5fe817

Signed-off-by: shen-shanshan <467638484@qq.com>

update

a5e1e6b

Signed-off-by: shen-shanshan <467638484@qq.com>

update

e04fb7a

Signed-off-by: shen-shanshan <467638484@qq.com>

wangxiyuan approved these changes Aug 22, 2025

wangxiyuan merged commit 9f590c7 into vllm-project:v0.9.1-dev Aug 22, 2025
17 checks passed
