perf[gpu]: fused AoT for and bitpacking kernel
#4872
for and bitpacking kernel
Codecov Report: All modified and coverable lines are covered by tests.
ed33d7f to bba773b
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>
d74cf62 to d77b6eb
# Conflicts:
#	fls-gpu-kernel-gen/src/bit_unpack.rs
    launch.arg(&self.packed);
    launch.arg(&self.unpacked);
    launch.arg(&self.reference);
    launch.record_kernel_launch(CU_EVENT_DEFAULT);
you don't need this
I have only implemented a PoC fused FoR-BP kernel. I don't want to implement them all, since there would be a lot of duplication. I think we will likely need to compile these at runtime.
I have also fixed up the kernel build system.
Fused is fast
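For readers unfamiliar with the idea, here is a minimal CPU-side sketch in Rust of what fusing frame-of-reference (FoR) decoding with bit-unpacking means. It is not the kernel from this PR. It assumes 32-bit values and a simple contiguous, LSB-first bit layout, which is almost certainly not the layout the GPU kernel uses. All function names (`pack`, `unpack_one`, `unpack_then_for`, `fused_unpack_for`) are hypothetical; only the parameter names `packed`, `unpacked`, and `reference` mirror the launch arguments in the diff.

```rust
/// Mask with the low `width` bits set (1 <= width <= 32).
fn mask(width: u32) -> u64 {
    (1u64 << width) - 1
}

/// Pack each value into `width` bits, LSB-first, into u32 words.
/// Assumed layout for illustration only.
fn pack(values: &[u32], width: u32) -> Vec<u32> {
    assert!(width >= 1 && width <= 32);
    let total_bits = values.len() * width as usize;
    let mut out = vec![0u32; (total_bits + 31) / 32];
    for (i, &v) in values.iter().enumerate() {
        let bit = i * width as usize;
        let (word, shift) = (bit / 32, (bit % 32) as u32);
        let wide = (v as u64 & mask(width)) << shift;
        out[word] |= wide as u32;
        if shift + width > 32 {
            // The value straddles a word boundary.
            out[word + 1] |= (wide >> 32) as u32;
        }
    }
    out
}

/// Extract the i-th `width`-bit value from `packed`.
fn unpack_one(packed: &[u32], width: u32, i: usize) -> u32 {
    let bit = i * width as usize;
    let (word, shift) = (bit / 32, (bit % 32) as u32);
    let lo = packed[word] as u64;
    let hi = if shift + width > 32 { packed[word + 1] as u64 } else { 0 };
    ((((hi << 32) | lo) >> shift) & mask(width)) as u32
}

/// Unfused: unpack into a temporary buffer, then add the reference
/// in a second pass over that buffer.
fn unpack_then_for(packed: &[u32], width: u32, reference: u32, n: usize) -> Vec<u32> {
    let tmp: Vec<u32> = (0..n).map(|i| unpack_one(packed, width, i)).collect();
    tmp.iter().map(|&d| d.wrapping_add(reference)).collect()
}

/// Fused: unpack and add the reference in one pass, writing each
/// output element once and never materialising the intermediate.
fn fused_unpack_for(packed: &[u32], width: u32, reference: u32, unpacked: &mut [u32]) {
    for (i, slot) in unpacked.iter_mut().enumerate() {
        *slot = unpack_one(packed, width, i).wrapping_add(reference);
    }
}
```

Both functions produce the same output. The fused version skips the intermediate buffer, so each element is written once instead of being written, read back, and written again. On a GPU that would presumably also mean one kernel launch instead of two, though this sketch does not measure anything.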
Signed-off-by: Joe Isaacs <joe.isaacs@live.co.uk>