[ROCm][Bugfix] Aiter mha fp8 fix #24991

dllehr-amd · 2025-09-16T19:26:12Z

Fixes an accuracy issue when running with VLLM_ROCM_USE_AITER_MHA=1 on gfx architectures that use the torch.float8_e4m3fn datatype.
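As a rough illustration of the kind of check this fix describes, the sketch below accepts both float8 e4m3 variants instead of only one. It is not the code from rocm_aiter_fa.py: the helper names `is_fp8_e4m3_dtype` and `is_fp8_fnuz_only` and the constant `FP8_E4M3_DTYPE_NAMES` are hypothetical, and dtypes are compared by their string form so the example runs without PyTorch installed.

```python
# Hypothetical helpers for illustration; not the actual vLLM implementation.
# Dtypes are matched by string form, relying on the fact that
# str(torch.float8_e4m3fn) is "torch.float8_e4m3fn".
FP8_E4M3_DTYPE_NAMES = frozenset({
    "torch.float8_e4m3fn",    # OCP variant
    "torch.float8_e4m3fnuz",  # FNUZ variant
})


def is_fp8_e4m3_dtype(dtype: object) -> bool:
    """Return True if `dtype` names either float8 e4m3 variant."""
    return str(dtype) in FP8_E4M3_DTYPE_NAMES


def is_fp8_fnuz_only(dtype: object) -> bool:
    """A narrower check that recognizes only the FNUZ variant."""
    return str(dtype) == "torch.float8_e4m3fnuz"


if __name__ == "__main__":
    for name in ("torch.float8_e4m3fn", "torch.float8_e4m3fnuz", "torch.bfloat16"):
        print(name, is_fp8_e4m3_dtype(name), is_fp8_fnuz_only(name))
```

With PyTorch available, a real dtype object such as `torch.float8_e4m3fn` could be passed directly, since the comparison goes through `str(dtype)`. The narrower helper shows how a check written against a single variant would miss the other one.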

Purpose

Test Plan

Test Result



mergify bot added the documentation, ci/build, llama, performance, gpt-oss, and rocm labels Sep 16, 2025

github-project-automation bot added this to gpt-oss Issues & Enhancements Sep 16, 2025

mergify bot added the v1 label Sep 16, 2025

github-project-automation bot moved this to To Triage in gpt-oss Issues & Enhancements Sep 16, 2025

gshtras removed request for DarkLight1337, WoosukKwon, alexm-redhat, gshtras, houseroad, robertgshaw2-redhat, yewentao256 and ywang96 September 16, 2025 19:35

mergify bot added the v1 label Sep 16, 2025

dllehr-amd force-pushed the aiter_mha_fp8_fix branch from e69addc to d12d4ea Compare September 16, 2025 20:33

gshtras added a commit to ROCm/vllm that referenced this pull request Sep 17, 2025

Cherry picking vllm-project#24991

02d1f85

Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>

Add check for float8 type in aiter mha

d662144

Change f8 kv-cache check in rocm_aiter_fa.py to account for both float8_e4m3fnuz and float8_e4m3fn datatypes.

Signed-off-by: Doug Lehr <douglehr@amd.com>

dllehr-amd force-pushed the aiter_mha_fp8_fix branch from d12d4ea to d662144 Compare September 17, 2025 20:26

gshtras approved these changes Sep 17, 2025


github-project-automation bot moved this from To Triage to Ready in gpt-oss Issues & Enhancements Sep 17, 2025

gshtras added the ready label Sep 17, 2025

gshtras enabled auto-merge (squash) September 17, 2025 20:50

gshtras changed the title from "Aiter mha fp8 fix" to "[ROCm][Bugfix] Aiter mha fp8 fix" Sep 17, 2025

gshtras merged commit 1a456c7 into vllm-project:main Sep 17, 2025
53 of 55 checks passed

github-project-automation bot moved this from Ready to Done in gpt-oss Issues & Enhancements Sep 17, 2025

debroy-rh pushed a commit to debroy-rh/vllm that referenced this pull request Sep 19, 2025

Aiter mha fp8 fix (vllm-project#24991)

23afac1

Signed-off-by: Doug Lehr <douglehr@amd.com> Co-authored-by: Doug Lehr <douglehr@amd.com>

FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025

Aiter mha fp8 fix (vllm-project#24991)

f62de80

Signed-off-by: Doug Lehr <douglehr@amd.com> Co-authored-by: Doug Lehr <douglehr@amd.com>

charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025

Aiter mha fp8 fix (vllm-project#24991)

85262ca

Signed-off-by: Doug Lehr <douglehr@amd.com> Co-authored-by: Doug Lehr <douglehr@amd.com> Signed-off-by: charlifu <charlifu@amd.com>

xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025

Aiter mha fp8 fix (vllm-project#24991)

57a4aac

Signed-off-by: Doug Lehr <douglehr@amd.com> Co-authored-by: Doug Lehr <douglehr@amd.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>

choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025

Aiter mha fp8 fix (vllm-project#24991)

3794cd9

Signed-off-by: Doug Lehr <douglehr@amd.com> Co-authored-by: Doug Lehr <douglehr@amd.com>

xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025

Aiter mha fp8 fix (vllm-project#24991)

1247a47

Signed-off-by: Doug Lehr <douglehr@amd.com> Co-authored-by: Doug Lehr <douglehr@amd.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
