CUDA: Fallback to UnsafeAtomics #56

Open
wants to merge 4 commits into
base: vc/cuda_atomicswap

vchuravy
Member

No description provided.

@vchuravy
Member Author

vchuravy commented May 13, 2025

This of course does not work:

using Atomix
using UnsafeAtomics

using CUDA

function kernel(a, b)
    x = UnsafeAtomics.load(pointer(a, 1))
    UnsafeAtomics.store!(pointer(b, 1), x)
    return nothing
end


b = CUDA.zeros(Int, 1)
a = CUDA.ones(Int, 1)

@cuda kernel(a, b)

This yields:

ERROR: LLVM error: Cannot select: 0x1e7845f0: ch = AtomicStore<(store seq_cst (s64) into %ir.4, addrspace 1)> 0x1e784580:1, 0x1e783a90, 0x1e784580, /home/vchuravy/.julia/packages/LLVM/xTJfF/src/interop/base.jl:39 @[ /home/vchuravy/.julia/packages/UnsafeAtomics/vpyYB/ext/UnsafeAtomicsLLVM/atomics.jl:178 @[ /home/vchuravy/.julia/packages/UnsafeAtomics/vpyYB/ext/UnsafeAtomicsLLVM/atomics.jl:131 @[ /home/vchuravy/.julia/packages/UnsafeAtomics/vpyYB/ext/UnsafeAtomicsLLVM/internal.jl:16 @[ /home/vchuravy/.julia/packages/UnsafeAtomics/vpyYB/src/core.jl:8 @[ /home/vchuravy/.julia/packages/UnsafeAtomics/vpyYB/src/core.jl:2 @[ /home/vchuravy/niklas.jl:8 ] ] ] ] ] ]
  0x1e783a90: i64,ch = load<(dereferenceable invariant load (s64) from `i64 addrspace(101)* null`, addrspace 101)> 0x1da29370, TargetExternalSymbol:i64'_Z6kernel13CuDeviceArrayI5Int64Li1ELi1EES1__param_2', undef:i64
    0x1e783f60: i64 = TargetExternalSymbol'_Z6kernel13CuDeviceArrayI5Int64Li1ELi1EES1__param_2'
    0x1e783780: i64 = undef
  0x1e784580: i64,ch = AtomicLoad<(load seq_cst (s64) from %ir.2, addrspace 1)> 0x1da29370, 0x1e783a20, /home/vchuravy/.julia/packages/LLVM/xTJfF/src/interop/base.jl:39 @[ /home/vchuravy/.julia/packages/UnsafeAtomics/vpyYB/ext/UnsafeAtomicsLLVM/atomics.jl:94 @[ /home/vchuravy/.julia/packages/UnsafeAtomics/vpyYB/ext/UnsafeAtomicsLLVM/atomics.jl:94 @[ /home/vchuravy/.julia/packages/UnsafeAtomics/vpyYB/ext/UnsafeAtomicsLLVM/internal.jl:12 @[ /home/vchuravy/.julia/packages/UnsafeAtomics/vpyYB/src/core.jl:7 @[ /home/vchuravy/.julia/packages/UnsafeAtomics/vpyYB/src/core.jl:1 @[ /home/vchuravy/niklas.jl:7 ] ] ] ] ] ]
    0x1e783a20: i64,ch = load<(dereferenceable invariant load (s64) from `i64 addrspace(101)* null`, addrspace 101)> 0x1da29370, TargetExternalSymbol:i64'_Z6kernel13CuDeviceArrayI5Int64Li1ELi1EES1__param_1', undef:i64
      0x1e783860: i64 = TargetExternalSymbol'_Z6kernel13CuDeviceArrayI5Int64Li1ELi1EES1__param_1'
      0x1e783780: i64 = undef
In function: _Z6kernel13CuDeviceArrayI5Int64Li1ELi1EES1_

So seq_cst is not supported...

function kernel(a, b)
    x = UnsafeAtomics.load(pointer(a, 1), UnsafeAtomics.monotonic)
    UnsafeAtomics.store!(pointer(b, 1), x, UnsafeAtomics.monotonic)
    return nothing
end

b = CUDA.zeros(Int, 1)
a = CUDA.ones(Int, 1)

@cuda kernel(a, b)

This does work.

@maleadt
Member

maleadt commented May 13, 2025

PTX does not support atom.seq_cst. Instead, cmpxchg seq_cst is emulated as "fence.sc; atom.cas"
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-atom
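
As a rough illustration of the "fence.sc; atom.cas" pattern, here is a minimal sketch that issues a device-scope fence and then a plain compare-and-swap using CUDA.jl's own intrinsics. It is not code from this PR. The kernel name `fenced_cas_kernel` and the `old` output array are hypothetical, and it assumes that `CUDA.threadfence()` lowers to `membar.gl`, which the PTX documentation treats as equivalent to `fence.sc.gpu` on sm_70 and newer.

```julia
using CUDA

# Hypothetical kernel (not from the PR): a device-scope fence followed by a
# plain atomic compare-and-swap, mirroring the "fence.sc; atom.cas" pattern.
function fenced_cas_kernel(a, old)
    ptr = pointer(a, 1)
    # Assumption: lowers to membar.gl, i.e. fence.sc.gpu on sm_70 and newer.
    CUDA.threadfence()
    # Store 42 only if the slot still holds 1; the value that was seen is returned.
    old[1] = CUDA.atomic_cas!(ptr, 1, 42)
    return nothing
end

a = CUDA.ones(Int, 1)
old = CUDA.zeros(Int, 1)

@cuda fenced_cas_kernel(a, old)
```

Copying `a` and `old` back with `Array` shows whether the swap happened and which value the CAS observed. Whether this matches what a fallback in the PR should emit is an open question; the sketch only shows the shape of the emulation.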

@vchuravy
Member Author

Yeah, the whole order business is a mess.

2 participants