FP8 splitgemm user defined triton kernel #263
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/263
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 113ecde with merge base 5e28109.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This is passing locally on nightlies but failing in CI. I'll check what's up; I suspect it's a caching issue with compile.
This is interesting. Do we have benchmark or accuracy data on how this compares to the cuBLAS float8 GEMM?
I don't believe the announcement baselined against cuBLAS, but perhaps @AdnanHoque can share more detail.
* FP8 splitgemm user defined triton kernel
* yolo
* Trigger CI
* yolo
* yolo
* yolo
* yolo
* Update test_fp8.py
This is borrowing the kernel from https://github.com/pytorch-labs/applied-ai/blob/main/kernels/triton/inference/fp8/splitk_gemm_fp8.py
We had an issue testing this in our CI, but it works fine locally and in PyTorch CI (pytorch/pytorch#126982).
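For readers unfamiliar with the split-K idea the borrowed kernel is named after, here is a minimal pure-Python sketch of the general technique. It is not the Triton kernel from this PR and does not model FP8. The function names `matmul_reference` and `matmul_split_k` and the `split_k` parameter are hypothetical, chosen only for illustration. The reduction (K) dimension is cut into chunks, each chunk yields a partial product, and the partials are summed into the output; on a GPU each chunk would typically be handled by a separate program instance.

```python
# Illustrative sketch of split-K GEMM on plain Python lists.
# Assumption: this mirrors only the general decomposition, not the PR's kernel.

def matmul_reference(a, b):
    """Plain matmul: C[i][j] = sum over the full K dimension."""
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def matmul_split_k(a, b, split_k=2):
    """Matmul where the K dimension is divided into `split_k` chunks.

    Each chunk produces a partial result that is accumulated into C.
    The outer loop over `s` stands in for work that a GPU kernel would
    distribute across parallel program instances.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    chunk = (k + split_k - 1) // split_k  # ceil(k / split_k)
    for s in range(split_k):
        k_start = s * chunk
        k_end = min(k_start + chunk, k)  # last chunk may be short or empty
        for i in range(m):
            for j in range(n):
                partial = sum(a[i][p] * b[p][j] for p in range(k_start, k_end))
                c[i][j] += partial  # accumulate this chunk's contribution
    return c
```

With exact integer inputs the two functions agree for any `split_k`. With low-precision floating-point types the accumulation order can change rounding, which is one reason the accuracy question raised above is worth measuring directly.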