[2/N][refactor] torchair deepseek mla backend refactor #2459
Conversation
Code Review
This pull request refactors the attention backend selection logic and introduces a new TorchAir MLA backend for DeepSeek models on Ascend NPUs. The refactoring in platform.py correctly handles the selection of the different attention backends, and the new implementation in vllm_ascend/torchair/torchair_mla.py adds the TorchAir-based MLA backend. I've found a critical issue in the decode path of this new implementation that needs to be addressed.
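As a rough illustration of the selection logic this review refers to (a minimal sketch only; the function signature and the specific class names and dotted paths, apart from torchair_mla.py and mla_v1.py which are named in this PR, are assumptions rather than the actual vllm-ascend code):

```python
# Hypothetical sketch of attention-backend selection on the Ascend platform.
# Argument names and returned class paths are illustrative assumptions.
def get_attn_backend_cls(use_mla: bool, use_torchair_graph: bool) -> str:
    if use_mla and use_torchair_graph:
        # DeepSeek MLA running through the TorchAir graph backend.
        return "vllm_ascend.torchair.torchair_mla.AscendMLATorchairBackend"
    if use_mla:
        # Eager-mode MLA backend.
        return "vllm_ascend.attention.mla_v1.AscendMLABackend"
    # Fallback: the regular (non-MLA) Ascend attention backend.
    return "vllm_ascend.attention.attention_v1.AscendAttentionBackend"
```

Returning the backend as a dotted-path string keeps the platform module free of heavy imports; the concrete backend class is only imported once it is actually selected.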
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.
Force-pushed from 9d75ec4 to 5c278a8
Force-pushed from 6ad2f4d to 0b5b5b2
Codecov Report
❌ Your patch check has failed because the patch coverage (79.62%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.
Additional details and impacted files:
@@            Coverage Diff             @@
##             main    #2459      +/-   ##
==========================================
+ Coverage   76.18%   77.56%   +1.37%
==========================================
  Files         120      130      +10
  Lines       13532    17149    +3617
==========================================
+ Hits        10310    13302    +2992
- Misses       3222     3847     +625
Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
This pull request has conflicts; please resolve them before we can evaluate the pull request.
Force-pushed from 582e3b8 to 9b0233d
Force-pushed from afd1b79 to 2d4e437
Signed-off-by: linfeng-yuan <1102311262@qq.com>
Force-pushed from 2d4e437 to d8671e8
The CI failure is not related to this PR.
What this PR does / why we need it?
This PR moves the current unified MLA backend into the torchair folder and removes the torchair-related code from attention/mla_v1.py (roughly 1.3k -> 0.9k lines).
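To illustrate the shape of this split (a hedged sketch; the class and method names below are hypothetical and only indicate where the eager and TorchAir code paths end up, they are not the actual code in this PR):

```python
# Sketch: the eager MLA path stays in attention/mla_v1.py, while the
# torchair-specific behavior moves into a subclass under
# vllm_ascend/torchair/torchair_mla.py. All names here are illustrative.

class AscendMLAMetadataBuilder:
    """Eager-mode MLA metadata builder (attention/mla_v1.py)."""

    def build(self, num_reqs: int, num_tokens: int) -> dict:
        # Build prefill/decode metadata for the eager MLA path only.
        return {"num_reqs": num_reqs, "num_tokens": num_tokens}


class AscendMLATorchairMetadataBuilder(AscendMLAMetadataBuilder):
    """TorchAir graph-mode builder (torchair/torchair_mla.py)."""

    def build(self, num_reqs: int, num_tokens: int) -> dict:
        metadata = super().build(num_reqs, num_tokens)
        # Graph-mode-specific handling (e.g. padding the batch to a captured
        # graph size) that previously sat behind torchair checks in mla_v1.py.
        metadata["graph_mode"] = True
        return metadata
```

Keeping the torchair branches in a dedicated module means mla_v1.py only has to cover the eager path, which is what shrinks it from about 1.3k to 0.9k lines.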
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Tested by running eager mode with the MLA backend, and torchair mode with the code before #2445.