With CUDA.jl v5.5.2 and Julia 1.10.3 this works:

```julia
using CUDA, StaticArrays, Random

f(x, rng) = SVector(rand(rng), rand(rng))
f.(CuArray(ones(5)), Random.GLOBAL_RNG)
```

```
5-element CuArray{SVector{2, Float64}, 1, CUDA.DeviceMemory}:
 [0.36764910175195165, 0.4794875174237414]
 [0.010348667621972396, 0.6326148712929516]
 [0.5735762055272076, 0.42568438586826485]
 [0.4907457866677476, 0.19396317615182368]
 [0.02563223451906871, 0.553653790980341]
```
However, with Julia 1.11.0 it errors:

```
ERROR: InvalidIRError: compiling MethodInstance for (::GPUArrays.var"#34#36")(::CUDA.CuKernelContext, ::CuDeviceVector{…}, ::Base.Broadcast.Broadcasted{…}, ::Int64) resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to julia.get_pgcstack)
Stacktrace:
[1] _broadcast_getindex_evalf
@ ./broadcast.jl:673
[2] _broadcast_getindex
@ ./broadcast.jl:646
[3] getindex
@ ./broadcast.jl:605
[4] #34
@ ~/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl:59
Hint: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code with Cthulhu.jl
Stacktrace:
[1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, args::LLVM.Module)
@ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/validation.jl:147
[2] macro expansion
@ ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:382 [inlined]
[3] macro expansion
@ ~/.julia/packages/TimerOutputs/NRdsv/src/TimerOutput.jl:253 [inlined]
[4] macro expansion
@ ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:381 [inlined]
[5]
@ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/utils.jl:108
[6] emit_llvm
@ ~/.julia/packages/GPUCompiler/2CW9L/src/utils.jl:106 [inlined]
[7]
@ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:100
[8] codegen
@ ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:82 [inlined]
[9] compile(target::Symbol, job::GPUCompiler.CompilerJob; kwargs::@Kwargs{})
@ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:79
[10] compile
@ ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:74 [inlined]
[11] #1145
@ ~/.julia/dev/CUDA/src/compiler/compilation.jl:250 [inlined]
[12] JuliaContext(f::CUDA.var"#1145#1148"{GPUCompiler.CompilerJob{…}}; kwargs::@Kwargs{})
@ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:34
[13] JuliaContext(f::Function)
@ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:25
[14] compile(job::GPUCompiler.CompilerJob)
@ CUDA ~/.julia/dev/CUDA/src/compiler/compilation.jl:249
[15] actual_compilation(cache::Dict{…}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{…}, compiler::typeof(CUDA.compile), linker::typeof(CUDA.link))
@ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/execution.jl:237
[16] cached_compilation(cache::Dict{…}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{…}, compiler::Function, linker::Function)
@ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/execution.jl:151
[17] macro expansion
@ ~/.julia/dev/CUDA/src/compiler/execution.jl:380 [inlined]
[18] macro expansion
@ ./lock.jl:273 [inlined]
[19] cufunction(f::GPUArrays.var"#34#36", tt::Type{Tuple{…}}; kwargs::@Kwargs{})
@ CUDA ~/.julia/dev/CUDA/src/compiler/execution.jl:375
[20] cufunction
@ ~/.julia/dev/CUDA/src/compiler/execution.jl:372 [inlined]
[21] macro expansion
@ ~/.julia/dev/CUDA/src/compiler/execution.jl:112 [inlined]
[22] #launch_heuristic#1200
@ ~/.julia/dev/CUDA/src/gpuarrays.jl:17 [inlined]
[23] launch_heuristic
@ ~/.julia/dev/CUDA/src/gpuarrays.jl:15 [inlined]
[24] _copyto!
@ ~/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl:78 [inlined]
[25] copyto!
@ ~/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl:44 [inlined]
[26] copy
@ ~/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl:29 [inlined]
[27] materialize(bc::Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{…}, Nothing, typeof(f), Tuple{…}})
@ Base.Broadcast ./broadcast.jl:867
[28] top-level scope
@ REPL[3]:1
Some type information was truncated. Use `show(err)` to see complete types.
```
I know there are some complexities with broadcasting random numbers (#1480), but I wonder if this is an intended regression.
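In the meantime, one workaround that avoids passing a host RNG object into device code is to draw the components with CUDA.jl's array-level `CUDA.rand` and assemble the `SVector`s in a second broadcast. A minimal sketch (the two-pass structure and the variable names here are mine, not anything CUDA.jl prescribes):

```julia
using CUDA, StaticArrays

# Generate both components with CUDA.jl's array-level RNG, so no
# host-side RNG object needs to be captured by the broadcast kernel.
a = CUDA.rand(Float64, 5)
b = CUDA.rand(Float64, 5)

# Broadcast only the SVector constructor over the device arrays;
# this compiles fine since it involves no task-local state.
v = SVector.(a, b)  # 5-element CuArray{SVector{2, Float64}, 1, ...}
```

CUDA.jl also supports calling `rand()` directly inside hand-written kernels via its device-side RNG, which may be another way around this, though I haven't checked whether that path is affected on 1.11.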
My `CUDA.versioninfo()` output is:

```
CUDA runtime 12.6, artifact installation
CUDA driver 12.6
NVIDIA driver 535.183.1
CUDA libraries:
- CUBLAS: 12.6.3
- CURAND: 10.3.7
- CUFFT: 11.3.0
- CUSOLVER: 11.7.1
- CUSPARSE: 12.5.4
- CUPTI: 2024.3.2 (API 24.0.0)
- NVML: 12.0.0+535.183.1
Julia packages:
- CUDA: 5.5.2
- CUDA_Driver_jll: 0.10.3+0
- CUDA_Runtime_jll: 0.15.3+0
Toolchain:
- Julia: 1.11.0
- LLVM: 16.0.6
2 devices:
  0: NVIDIA RTX A6000 (sm_86, 46.941 GiB / 47.988 GiB available)
  1: NVIDIA RTX A6000 (sm_86, 45.888 GiB / 47.988 GiB available)
```