[WIP] support sm90_fp4_mqa_logits #326

Draft
laixinn wants to merge 4 commits into deepseek-ai:main from laixinn:sm90_fp4_mqa_logits

Conversation

laixinn commented May 1, 2026

This PR dequantizes W4A4 to W8A8 to support the FP4 Indexer on SM90. It is intended to be part of sgl-project/sglang#23602.
Currently, this dequantization adds a 40% latency overhead.
cc @AniZpZ
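
For readers unfamiliar with the conversion, here is a minimal sketch of what dequantizing FP4 elements to FP8 can look like. It is not the kernel from this PR. It assumes FP4 means E2M1, FP8 means E4M3, and two FP4 codes are packed per byte with the low nibble first. All function names are hypothetical, and block scale factors are ignored. Under these assumptions every E2M1 value is exactly representable in E4M3, so the element conversion itself is lossless and reduces to a small table lookup.

```python
# Illustrative only: names and the nibble order are assumptions, not this PR's code.

# The eight E2M1 magnitudes, indexed by the low three bits of an FP4 code.
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
# FP8 E4M3 bit patterns (sign bit cleared, exponent bias 7) for the same magnitudes.
E2M1_TO_E4M3_BITS = (0x00, 0x30, 0x38, 0x3C, 0x40, 0x44, 0x48, 0x4C)


def decode_e2m1(code):
    """Decode one 4-bit E2M1 code (0..15) to a float."""
    sign = -1.0 if code & 0x8 else 1.0
    return sign * E2M1_MAGNITUDES[code & 0x7]


def fp4_code_to_e4m3_bits(code):
    """Map one 4-bit E2M1 code to the E4M3 byte holding the same value."""
    sign_bit = (code & 0x8) << 4  # move the sign from bit 3 to bit 7
    return sign_bit | E2M1_TO_E4M3_BITS[code & 0x7]


def decode_e4m3(bits):
    """Decode an E4M3 byte to a float (NaN encodings are not handled here)."""
    sign = -1.0 if bits & 0x80 else 1.0
    exponent = (bits >> 3) & 0xF
    mantissa = bits & 0x7
    if exponent == 0:  # subnormal range
        return sign * (mantissa / 8.0) * 2.0 ** -6
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


def unpack_fp4_to_fp8(packed):
    """Expand packed FP4 bytes (low nibble first, an assumption) to E4M3 bytes."""
    out = bytearray()
    for byte in packed:
        out.append(fp4_code_to_e4m3_bits(byte & 0xF))
        out.append(fp4_code_to_e4m3_bits(byte >> 4))
    return bytes(out)
```

The output is twice the size of the packed input, so a conversion like this adds work and memory traffic before the FP8 path can run.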

laixinn changed the title from "[Feature] support sm90_fp4_mqa_logits" to "[WIP] support sm90_fp4_mqa_logits" on May 1, 2026