fix: CUDA scalar inplace #6567

Merged
0ax1 merged 1 commit into develop from ad/fix-scalar-inplace-cuda
Feb 18, 2026


0ax1 commented Feb 18, 2026

Summary

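Previously, scalar_kernel_inplace called into scalar_kernel(const InputT *__restrict in, OutputT *__restrict out..), which for an in-place operation means the input and output refer to the same buffer. This was unsound: __restrict is a promise to the compiler that the two pointers refer to separate buffers, and the compiler relies on that promise to apply more optimizations. Breaking the promise can lead to incorrect results.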
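To illustrate the distinction, here is a minimal host-side C++ sketch, not the actual kernels from this PR. The names apply_scalar, apply_scalar_inplace, add_one_out_of_place, and add_one_inplace are hypothetical, and the sketch assumes a compiler that accepts the __restrict extension (GCC, Clang, and MSVC do). The out-of-place helper keeps the no-alias promise, while the in-place helper takes a single unqualified pointer instead of forwarding the same buffer as both arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical out-of-place helper. `__restrict` promises the compiler that
// `in` and `out` never overlap, so it may reorder or batch loads and stores.
template <typename InputT, typename OutputT, typename Op>
void apply_scalar(const InputT *__restrict in, OutputT *__restrict out,
                  std::size_t n, Op op) {
    // Debug-only sanity check. It catches only the exact-same-pointer case,
    // not partial overlap between the two buffers.
    assert(n == 0 ||
           static_cast<const void *>(in) != static_cast<const void *>(out));
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<OutputT>(op(in[i]));
    }
}

// Hypothetical in-place helper. It takes one unqualified pointer, so no
// no-alias promise is made and reading and writing the same element is fine.
template <typename T, typename Op>
void apply_scalar_inplace(T *data, std::size_t n, Op op) {
    for (std::size_t i = 0; i < n; ++i) {
        data[i] = static_cast<T>(op(data[i]));
    }
}

// Sound use of the out-of-place helper: source and destination are distinct.
inline std::vector<int> add_one_out_of_place(const std::vector<int> &src) {
    std::vector<int> dst(src.size());
    apply_scalar(src.data(), dst.data(), src.size(),
                 [](int x) { return x + 1; });
    return dst;
}

// Sound in-place update: the buffer is never passed through `__restrict`.
inline std::vector<int> add_one_inplace(std::vector<int> values) {
    apply_scalar_inplace(values.data(), values.size(),
                         [](int x) { return x + 1; });
    return values;
}
```

The unsound pattern described above would correspond to calling apply_scalar(p, p, n, op) in this sketch. Giving the in-place path its own loop body costs a few duplicated lines but keeps the __restrict promise truthful for every caller.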

0ax1 added the changelog/fix (A bug fix) label Feb 18, 2026
0ax1 enabled auto-merge (squash) February 18, 2026 11:34
0ax1 merged commit b251c1e into develop Feb 18, 2026
47 of 48 checks passed
0ax1 deleted the ad/fix-scalar-inplace-cuda branch February 18, 2026 11:40