[2/N][refactor] torchair deepseek mla backend refactor #2459

linfeng-yuan (Contributor) commented Aug 20, 2025

What this PR does / why we need it?

This PR moves the current unified MLA backend to the torchair folder and removes the torchair-related code from attention/mla_v1.py (1.3k -> 0.9k lines).

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Ran eager mode with the MLA backend, and torchair mode with the code from before 2445.

gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request refactors the attention backend selection logic and introduces a new TorchAir MLA backend for DeepSeek models on Ascend NPUs. The refactoring in platform.py correctly handles the selection of different attention backends. The new implementation in vllm_ascend/torchair/torchair_mla.py adds the TorchAir-based MLA backend. I've found a critical issue in the decode path of this new implementation that needs to be addressed.

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling in the PR description, so that reviewers and future developers can understand the change.

If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.

linfeng-yuan force-pushed the torchair_deepseek_modeling_refactore_01 branch from 9d75ec4 to 5c278a8 (August 20, 2025 15:46)
linfeng-yuan force-pushed the torchair_deepseek_modeling_refactore_01 branch 2 times, most recently from 6ad2f4d to 0b5b5b2 (August 20, 2025 16:59)

codecov bot commented Aug 20, 2025

Codecov Report

❌ Patch coverage is 79.62264% with 216 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.56%. Comparing base (2bb7e55) to head (0b5b5b2).
⚠️ Report is 24 commits behind head on main.

Files with missing lines                Patch %   Lines
vllm_ascend/torchair/torchair_mla.py    61.70%    211 Missing ⚠️
vllm_ascend/attention/mla_v1.py         85.29%    5 Missing ⚠️

❌ Your patch check has failed because the patch coverage (79.62%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.
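
The gap between the reported patch coverage and the target can be estimated from the figures quoted in this report. The sketch below is only a back-of-the-envelope check, assuming that patch coverage is simply covered changed lines divided by total changed lines; the variable names are illustrative and are not part of Codecov's API.

```python
import math

# Figures quoted in the Codecov report above.
missing_lines = 216          # changed lines without coverage
patch_coverage = 0.7962264   # reported patch coverage (79.62264%)
target_pct = 80              # target patch coverage, in percent

# Assumption: patch coverage = covered / total changed lines,
# so total = missing / (1 - coverage).
total_lines = round(missing_lines / (1 - patch_coverage))
covered_lines = total_lines - missing_lines

# Smallest number of additional covered lines that reaches the target.
needed = math.ceil(target_pct * total_lines / 100) - covered_lines

print(total_lines, covered_lines, needed)
```

Under that assumption the patch touches about 1060 measured lines, of which 844 are covered, so only a handful of additional covered lines would have been enough to meet the 80% target.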

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2459      +/-   ##
==========================================
+ Coverage   76.18%   77.56%   +1.37%     
==========================================
  Files         120      130      +10     
  Lines       13532    17149    +3617     
==========================================
+ Hits        10310    13302    +2992     
- Misses       3222     3847     +625     
Flag        Coverage Δ
unittests   77.56% <79.62%> (+1.37%) ⬆️

Flags with carried forward coverage won't be shown.

This pull request has conflicts; please resolve them before we can evaluate it.

@wangxiyuan
Collaborator

linfeng-yuan force-pushed the torchair_deepseek_modeling_refactore_01 branch 2 times, most recently from 582e3b8 to 9b0233d (August 21, 2025 01:59)
linfeng-yuan force-pushed the torchair_deepseek_modeling_refactore_01 branch 3 times, most recently from afd1b79 to 2d4e437 (August 21, 2025 04:06)
Signed-off-by: linfeng-yuan <1102311262@qq.com>
linfeng-yuan force-pushed the torchair_deepseek_modeling_refactore_01 branch from 2d4e437 to d8671e8 (August 21, 2025 04:51)
wangxiyuan (Collaborator) commented

The CI failure is not related to this PR.

wangxiyuan merged commit 0ca3f48 into vllm-project:main on Aug 21, 2025
18 of 20 checks passed