Broadcasting over random static array errors on Julia 1.11 #2523

Closed as not planned

Description

@jgreener64

With CUDA v5.5.2 and Julia 1.10.3 this works:

using CUDA, StaticArrays, Random
f(x, rng) = SVector(rand(rng), rand(rng))
f.(CuArray(ones(5)), Random.GLOBAL_RNG)
5-element CuArray{SVector{2, Float64}, 1, CUDA.DeviceMemory}:
 [0.36764910175195165, 0.4794875174237414]
 [0.010348667621972396, 0.6326148712929516]
 [0.5735762055272076, 0.42568438586826485]
 [0.4907457866677476, 0.19396317615182368]
 [0.02563223451906871, 0.553653790980341]

However with Julia 1.11.0 it errors:

ERROR: InvalidIRError: compiling MethodInstance for (::GPUArrays.var"#34#36")(::CUDA.CuKernelContext, ::CuDeviceVector{…}, ::Base.Broadcast.Broadcasted{…}, ::Int64) resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to julia.get_pgcstack)
Stacktrace:
 [1] _broadcast_getindex_evalf
   @ ./broadcast.jl:673
 [2] _broadcast_getindex
   @ ./broadcast.jl:646
 [3] getindex
   @ ./broadcast.jl:605
 [4] #34
   @ ~/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl:59
Hint: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code with Cthulhu.jl
Stacktrace:
  [1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams}, args::LLVM.Module)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/validation.jl:147
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:382 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/TimerOutputs/NRdsv/src/TimerOutput.jl:253 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:381 [inlined]
  [5] 
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/utils.jl:108
  [6] emit_llvm
    @ ~/.julia/packages/GPUCompiler/2CW9L/src/utils.jl:106 [inlined]
  [7] 
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:100
  [8] codegen
    @ ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:82 [inlined]
  [9] compile(target::Symbol, job::GPUCompiler.CompilerJob; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:79
 [10] compile
    @ ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:74 [inlined]
 [11] #1145
    @ ~/.julia/dev/CUDA/src/compiler/compilation.jl:250 [inlined]
 [12] JuliaContext(f::CUDA.var"#1145#1148"{GPUCompiler.CompilerJob{…}}; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:34
 [13] JuliaContext(f::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/driver.jl:25
 [14] compile(job::GPUCompiler.CompilerJob)
    @ CUDA ~/.julia/dev/CUDA/src/compiler/compilation.jl:249
 [15] actual_compilation(cache::Dict{…}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{…}, compiler::typeof(CUDA.compile), linker::typeof(CUDA.link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/execution.jl:237
 [16] cached_compilation(cache::Dict{…}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{…}, compiler::Function, linker::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/2CW9L/src/execution.jl:151
 [17] macro expansion
    @ ~/.julia/dev/CUDA/src/compiler/execution.jl:380 [inlined]
 [18] macro expansion
    @ ./lock.jl:273 [inlined]
 [19] cufunction(f::GPUArrays.var"#34#36", tt::Type{Tuple{…}}; kwargs::@Kwargs{})
    @ CUDA ~/.julia/dev/CUDA/src/compiler/execution.jl:375
 [20] cufunction
    @ ~/.julia/dev/CUDA/src/compiler/execution.jl:372 [inlined]
 [21] macro expansion
    @ ~/.julia/dev/CUDA/src/compiler/execution.jl:112 [inlined]
 [22] #launch_heuristic#1200
    @ ~/.julia/dev/CUDA/src/gpuarrays.jl:17 [inlined]
 [23] launch_heuristic
    @ ~/.julia/dev/CUDA/src/gpuarrays.jl:15 [inlined]
 [24] _copyto!
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl:78 [inlined]
 [25] copyto!
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl:44 [inlined]
 [26] copy
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl:29 [inlined]
 [27] materialize(bc::Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{…}, Nothing, typeof(f), Tuple{…}})
    @ Base.Broadcast ./broadcast.jl:867
 [28] top-level scope
    @ REPL[3]:1
Some type information was truncated. Use `show(err)` to see complete types.

I know there are some complexities around broadcasting random numbers (#1480), but I wonder whether this regression is intended.
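For anyone hitting this in the meantime, one workaround that sidesteps the problem is to draw the random numbers up front with CUDA's own array-level RNG, so no task-local `Random.GLOBAL_RNG` state is touched inside the compiled kernel. This is only a sketch of a workaround, not a fix for the regression itself; it assumes the randomness can be generated eagerly rather than per broadcast call:

```julia
using CUDA, StaticArrays

x = CuArray(ones(5))          # as in the original example, x only supplies the shape

# Generate the random draws on the device before broadcasting, using
# CUDA's array-level RNG instead of the task-local GLOBAL_RNG:
r1 = CUDA.rand(Float64, 5)
r2 = CUDA.rand(Float64, 5)

# The broadcast kernel is now RNG-free and should compile on 1.11:
g(x, a, b) = SVector(a, b)
y = g.(x, r1, r2)
```

This trades the lazy per-element `rand(rng)` calls for two extra device allocations, which may matter for large arrays, but it avoids compiling any RNG access into the kernel.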

My CUDA.versioninfo() is:

CUDA runtime 12.6, artifact installation
CUDA driver 12.6
NVIDIA driver 535.183.1

CUDA libraries: 
- CUBLAS: 12.6.3
- CURAND: 10.3.7
- CUFFT: 11.3.0
- CUSOLVER: 11.7.1
- CUSPARSE: 12.5.4
- CUPTI: 2024.3.2 (API 24.0.0)
- NVML: 12.0.0+535.183.1

Julia packages: 
- CUDA: 5.5.2
- CUDA_Driver_jll: 0.10.3+0
- CUDA_Runtime_jll: 0.15.3+0

Toolchain:
- Julia: 1.11.0
- LLVM: 16.0.6

2 devices:
  0: NVIDIA RTX A6000 (sm_86, 46.941 GiB / 47.988 GiB available)
  1: NVIDIA RTX A6000 (sm_86, 45.888 GiB / 47.988 GiB available)
