ROCm mx-fp8 Gemm #2066

petrex · 2025-04-16T23:16:03Z

TLDR: This pull request introduces support for AMD MI355x GPUs with HIPBLASLT kernels in the MX formats prototype, alongside several updates to improve compatibility and functionality for these GPUs. Key changes include updates to configuration options, validation logic, and GEMM kernel handling to integrate HIPBLASLT support.

Note that this feature requires ROCm 6.5+ and gfx950.
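
As a rough illustration of the ROCm 6.5+ / gfx950 requirement, the check below is a minimal sketch and not code from this PR. The helper name `meets_mx_hipblaslt_requirements` is hypothetical, and it assumes the version string has the `major.minor...` shape that `torch.version.hip` reports on ROCm builds and that the architecture name may carry feature suffixes after a colon.

```python
import re


def meets_mx_hipblaslt_requirements(hip_version, gcn_arch_name):
    """Hypothetical helper: True if the version is >= 6.5 and the arch is gfx950."""
    if not hip_version:
        # On non-ROCm PyTorch builds the HIP version is None.
        return False
    match = re.match(r"(\d+)\.(\d+)", hip_version)
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    # Assumption: arch names can look like "gfx950:sramecc+:xnack-",
    # so only the part before the first colon is compared.
    arch = (gcn_arch_name or "").split(":")[0]
    return (major, minor) >= (6, 5) and arch == "gfx950"


# Possible usage on a ROCm machine (assumes a visible GPU at index 0):
#   import torch
#   props = torch.cuda.get_device_properties(0)
#   ok = meets_mx_hipblaslt_requirements(torch.version.hip, props.gcnArchName)
```

Keeping the check a pure function of two strings makes it easy to test without AMD hardware; whether the HIP version string tracks the ROCm release exactly is an assumption worth verifying on the target system.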

AMD MI355x GPU Support:

torchao/prototype/mx_formats/config.py:
- Added HIPBLASLT as a new MXGemmKernelChoice and included it in the MXLinearRecipeName for configuration presets.
- Updated _validate_gemm_kernel_choice to include validation logic for HIPBLASLT, ensuring proper block size, data type, and ROCm availability.

torchao/prototype/mx_formats/mx_ops.py:
- Extended mx_mm to support HIPBLASLT for scaled matrix multiplication and real GEMM operations.
- Adjusted error messaging for unsupported kernel choices in FP4 operations.
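
The validation described for the HIPBLASLT kernel choice (block size, data type, ROCm availability) can be sketched roughly as follows. This is a self-contained illustration, not the PR's `_validate_gemm_kernel_choice`: the `GemmKernelChoice` enum is a stand-in for `MXGemmKernelChoice`, the function name is hypothetical, and the concrete values (block size 32, `float8_e4m3fn`) are assumptions rather than values quoted from the diff.

```python
from enum import Enum


class GemmKernelChoice(Enum):
    """Hypothetical stand-in for MXGemmKernelChoice."""
    EMULATED = "emulated"
    HIPBLASLT = "hipblaslt"


# Assumed constraints for the HIPBLASLT path (not taken from the PR diff).
ASSUMED_BLOCK_SIZE = 32
ASSUMED_VALID_DTYPES = ("float8_e4m3fn",)


def hipblaslt_config_errors(choice, block_size, elem_dtype, rocm_available):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if choice is not GemmKernelChoice.HIPBLASLT:
        # Other kernel choices are out of scope for this sketch.
        return errors
    if block_size != ASSUMED_BLOCK_SIZE:
        errors.append(
            f"block_size must be {ASSUMED_BLOCK_SIZE} for HIPBLASLT, got {block_size}"
        )
    if elem_dtype not in ASSUMED_VALID_DTYPES:
        errors.append(
            f"elem_dtype must be one of {ASSUMED_VALID_DTYPES}, got {elem_dtype}"
        )
    if not rocm_available:
        errors.append("HIPBLASLT requires a ROCm build of PyTorch")
    return errors
```

Returning a list instead of asserting on the first failure reports every problem in one pass; real validation code may prefer to fail fast with an assertion.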

Documentation Updates:

torchao/prototype/mx_formats/README.md:
- Updated the README to reflect AMD MI355x GPU support, including instructions for using HIPBLASLT kernels and ongoing optimization efforts for AMD hardware.

Minor Code Refinements:

torchao/prototype/mx_formats/mx_ops.py:
- Improved readability in mx_view_op by reformatting conditions for FP6 element packing.


pytorch-bot · 2025-04-16T23:16:07Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2066

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 129a6d6 with merge base 801af03:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

petrex added 2 commits April 16, 2025 15:59

Enhance MX formats to support HIPBLASLT kernel choice and update vali…

c21d24c

…dation logic. Added MXFP8_HIPBLASLT recipe and adjusted mx_mm function to accommodate new kernel options.

Update README.md to include support for AMD MI355x hardware and HIPBL…

36dd5b7

…ASLT kernel choice for mxfp8 gemm. Enhance documentation on end-to-end performance optimization efforts for AMD GPUs.

pytorch-bot bot added the module: rocm label Apr 16, 2025

facebook-github-bot added the CLA Signed label Apr 16, 2025

petrex added the mx label Apr 17, 2025

lint

c75df8e

petrex requested a review from vkuzo April 18, 2025 16:59

petrex added the topic: new feature and ciflow/rocm labels Apr 18, 2025

petrex and others added 7 commits April 23, 2025 17:34

Merge branch 'main' into rocm_mx_gemm

9b7b602

Merge branch 'main' into rocm_mx_gemm

5ee124e

lint

df2c220

Merge branch 'main' into rocm_mx_gemm

8df1d85

Update HIPBLASLT comment in config.py and adjust assertion in mx_ops.…

8ae4021

…py to include HIPBLASLT as a valid kernel choice for MX FP8 operations.

lint

8505860

lint

129a6d6
