@dllehr-amd dllehr-amd commented Sep 16, 2025

Fixes an accuracy issue when VLLM_ROCM_USE_AITER_MHA=1 is set on gfx architectures that use the torch.float8_e4m3fn datatype.



@mergify mergify bot added the v1 label Sep 16, 2025
gshtras added a commit to ROCm/vllm that referenced this pull request Sep 17, 2025
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
Change f8 kv-cache check in rocm_aiter_fa.py to account for both float8_e4m3fnuz
and float8_e4m3fn datatypes.

Signed-off-by: Doug Lehr <douglehr@amd.com>
@github-project-automation github-project-automation bot moved this from To Triage to Ready in gpt-oss Issues & Enhancements Sep 17, 2025
@gshtras gshtras added the ready label (ONLY add when PR is ready to merge/full CI is needed) Sep 17, 2025
@gshtras gshtras enabled auto-merge (squash) September 17, 2025 20:50
@gshtras gshtras changed the title from "Aiter mha fp8 fix" to "[ROCm][Bugfix] Aiter mha fp8 fix" Sep 17, 2025
@gshtras gshtras merged commit 1a456c7 into vllm-project:main Sep 17, 2025
53 of 55 checks passed
debroy-rh pushed a commit to debroy-rh/vllm that referenced this pull request Sep 19, 2025
Signed-off-by: Doug Lehr <douglehr@amd.com>
Co-authored-by: Doug Lehr <douglehr@amd.com>
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
Signed-off-by: Doug Lehr <douglehr@amd.com>
Co-authored-by: Doug Lehr <douglehr@amd.com>
charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025
Signed-off-by: Doug Lehr <douglehr@amd.com>
Co-authored-by: Doug Lehr <douglehr@amd.com>
Signed-off-by: charlifu <charlifu@amd.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
Signed-off-by: Doug Lehr <douglehr@amd.com>
Co-authored-by: Doug Lehr <douglehr@amd.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025
Signed-off-by: Doug Lehr <douglehr@amd.com>
Co-authored-by: Doug Lehr <douglehr@amd.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
Signed-off-by: Doug Lehr <douglehr@amd.com>
Co-authored-by: Doug Lehr <douglehr@amd.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Labels

ready (ONLY add when PR is ready to merge/full CI is needed), rocm (Related to AMD ROCm), v1

Projects

Status: Done
