support aclgraph #426
Conversation
Please help implement the unit test cases and system test cases.
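A possible shape for such a system test, sketched here for discussion — the model name and the greedy-sampling determinism assumption are illustrative, not from this PR:

```python
from vllm import LLM, SamplingParams

MODEL = "Qwen/Qwen2.5-0.5B-Instruct"  # example model used elsewhere in this PR

def test_aclgraph_matches_eager():
    prompts = ["Hello, my name is"]
    params = SamplingParams(temperature=0.0, max_tokens=8)  # greedy for determinism

    graph_llm = LLM(model=MODEL)  # aclgraph path, on by default per this PR
    graph_text = graph_llm.generate(prompts, params)[0].outputs[0].text
    del graph_llm  # release NPU memory before starting a second engine

    eager_llm = LLM(model=MODEL, enforce_eager=True)  # graph capture disabled
    eager_text = eager_llm.generate(prompts, params)[0].outputs[0].text

    assert graph_text == eager_text
```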
@@ -171,6 +179,12 @@ def __init__(self, vllm_config: VllmConfig, device: torch.device):
        self.input_positions_cpu = torch.arange(0,
                                                self.max_num_tokens,
                                                device="cpu")
        self.use_cuda_graph = (self.vllm_config.compilation_config.level
rename to self.use_acl_graph
self.use_npu_graph is better
        self.use_cuda_graph = (self.vllm_config.compilation_config.level
                               == CompilationLevel.PIECEWISE
                               and not self.model_config.enforce_eager)
        self.cudagraph_batch_sizes = list(
ditto
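Taken together, the two suggestions would rename both attributes. A sketch of the result — the second right-hand side is an assumption based on vllm's GPU model runner, not shown in this diff:

```python
# Reviewer-suggested "npu_graph" naming applied to both attributes;
# surrounding code is assumed to match the hunks above.
self.use_npu_graph = (self.vllm_config.compilation_config.level
                      == CompilationLevel.PIECEWISE
                      and not self.model_config.enforce_eager)
self.npu_graph_batch_sizes = list(
    reversed(self.vllm_config.compilation_config.cudagraph_capture_sizes))
```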
    from vllm.v1.sample.rejection_sampler import INVALID_TOKEN_ID, RejectionSampler
else:
    INVALID_TOKEN_ID = None
    RejectionSampler = None
Why this change? HAS_TRITON is always false in vllm-ascend. So I guess you want to override vllm.v1.sample.rejection_sampler's INVALID_TOKEN_ID and RejectionSampler in vllm-ascend here?
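The hunk above is truncated at the comment anchor; the full guard presumably looks like the sketch below. The `vllm.triton_utils` import path is an assumption on my part:

```python
from vllm.triton_utils import HAS_TRITON  # assumed import path

if HAS_TRITON:
    from vllm.v1.sample.rejection_sampler import (INVALID_TOKEN_ID,
                                                  RejectionSampler)
else:
    # On vllm-ascend, HAS_TRITON is always False, so the triton-backed
    # rejection sampler is stubbed out here.
    INVALID_TOKEN_ID = None
    RejectionSampler = None
```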
vllm_ascend/utils.py (Outdated)
        self.name = name


def register_dummy_fusion_op() -> None:
move to ops module
done
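From the visible fragment (`self.name = name`), the helper appears to register placeholder objects for CUDA-only fused ops so attribute lookups succeed on NPU. A hedged sketch of that idea — the class shape and op names are guesses, not the PR's code:

```python
import torch

class DummyFusionOp:
    """Placeholder for a CUDA-only fused op that has no NPU kernel."""

    def __init__(self, name: str = ""):
        self.name = name

def register_dummy_fusion_op() -> None:
    # Attach placeholders so references to torch.ops._C fused ops resolve
    # during graph compilation; the op names here are illustrative guesses.
    torch.ops._C.rms_norm = DummyFusionOp(name="rms_norm")
    torch.ops._C.fused_add_rms_norm = DummyFusionOp(name="fused_add_rms_norm")
```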
requirements.txt (Outdated)
torch >= 2.5.1
torch_npu == 2.5.1rc1
do not limit torch-npu version here.
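With the pin dropped as requested, the requirement lines might simply read:

```
torch >= 2.5.1
torch_npu
```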
vllm_ascend/__init__.py (Outdated)
@@ -15,6 +15,8 @@
# This file is a part of the vllm-ascend project.
#

from torch_npu.contrib import transfer_to_npu  # noqa: F401
Why do we need this here? This will hide some issues and break some scenarios in RL, where torch.cuda is expected to be called normally.
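For context: `transfer_to_npu` monkey-patches `torch.cuda` calls to run on NPU. The sketch below shows the explicit-device style that avoids the shim and keeps `torch.cuda` meaning what it says (illustrative only, not from the PR):

```python
import torch
import torch_npu  # noqa: F401  # registers the "npu" device type with torch

# Place tensors on the NPU explicitly rather than shimming torch.cuda.
x = torch.randn(4, 4, device="npu")
y = torch.nn.functional.relu(x)
print(y.device)  # e.g. npu:0
```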
Removed.
What this PR does / why we need it?
This PR connects vllm-ascend to the piecewise_graph feature provided by the v1 engine:
1. Register unified_ascend_attention_with_output for piecewise_graph to split the graph.
2. Support NPUGraph to accelerate kernel launch.
Does this PR introduce any user-facing change?
NPUGraph is enabled by default; users can disable it by setting enforce_eager, as in the snippet below. This requires versions of torch_npu and CANN that support graph capture.
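The example from the description as a runnable snippet:

```python
from vllm import LLM

# enforce_eager=True skips graph capture (NPUGraph/aclgraph) and runs eagerly.
llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct", enforce_eager=True)
```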
How was this patch tested?
The feature is on by default, so it is exercised by the existing CI and tests.