Enable autotuning and bf16 accumulation for SYCL CUTLASS #4

Merged

sommerlukas
Collaborator

@sommerlukas sommerlukas commented Apr 30, 2025

Enable autotuning for SYCL CUTLASS by completing the SYCL benchmark request class.

It also removes the temporary workaround that forced float32 accumulation, so GEMM can now accumulate in bfloat16.

This addresses one of the items left open in #2.
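
For context on the accumulation change, here is a minimal pure-Python sketch of how the accumulator data type can affect a dot product whose inputs are bfloat16. It is an illustration only and not the CUTLASS kernel or the code in this PR: the helper names `to_bf16`, `to_f32`, `dot_bf16_acc`, and `dot_fp32_acc` are hypothetical, bfloat16 is emulated by rounding the float32 bit pattern to its top 16 bits, and NaN/infinity handling is omitted.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bf16(x: float) -> float:
    """Emulate bfloat16 rounding (round-to-nearest-even) for finite values.

    bfloat16 is the upper 16 bits of the float32 encoding, so we round the
    float32 bit pattern to a multiple of 2**16.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add 0x7FFF plus the lowest kept bit so exact ties round to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (bits + rounding_bias) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def dot_bf16_acc(a, b):
    """Dot product of bf16 inputs with the running sum kept in bf16."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_bf16(acc + to_bf16(x) * to_bf16(y))
    return acc


def dot_fp32_acc(a, b):
    """Dot product of bf16 inputs with a float32 running sum.

    The result is rounded to bf16 only once, at the end.
    """
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_f32(acc + to_bf16(x) * to_bf16(y))
    return to_bf16(acc)


if __name__ == "__main__":
    ones = [1.0] * 512
    print("bf16 accumulator:", dot_bf16_acc(ones, ones))
    print("fp32 accumulator:", dot_fp32_acc(ones, ones))
```

With 512 ones, the bf16 running sum in this sketch stops growing at 256, because 257 is not representable with bfloat16's 8-bit significand and the tie rounds back to 256, while the float32 running sum reaches 512. Whether such effects matter for a given workload depends on the problem sizes and values involved.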

@sommerlukas sommerlukas self-assigned this Apr 30, 2025
Enable autotuning for SYCL CUTLASS by completing
the SYCL benchmark request class.

Also adds a temporary workaround to allow bf16 GEMM
to accumulate in FP32 in code paths used when
auto-tuning is active.

Signed-off-by: Lukas Sommer <lukas.sommer@codeplay.com>
@sommerlukas sommerlukas force-pushed the cutlass-sycl-autotune branch from e21c49d to d76676d May 5, 2025 14:46
@sommerlukas sommerlukas changed the title from "Enable autotuning for SYCL CUTLASS" to "Enable autotuning and bf16 accumulation for SYCL CUTLASS" May 5, 2025
@sommerlukas
Collaborator Author

This PR depends on codeplaysoftware/cutlass-sycl#356 for GEMM accumulation in bf16. It can only be merged once that PR has been merged and the third_party/cutlass submodule has been updated to include its changes.

Signed-off-by: Lukas Sommer <lukas.sommer@codeplay.com>
@sommerlukas sommerlukas merged commit bc53ae6 into codeplaysoftware:sycl-develop May 16, 2025
47 checks passed
@sommerlukas sommerlukas deleted the cutlass-sycl-autotune branch May 16, 2025 08:53
3 participants