ndrange provided in KernelAbstractions kernels is broken #283

Closed
@VarLad

I can't pinpoint where the error occurs, since debugging KernelAbstractions kernels differs from debugging ordinary Julia code, so some guidance here would be helpful.
The symptom is that some elements of the output array are 0.

I also can't test this with any driver other than PoCL, since NVIDIA doesn't support SPIR-V, so it would be helpful if someone could check whether Intel drivers behave differently.
This probably points to a problem somewhere in our current code. The behavior is the same before and after the USM PR.

Edit:

Most recently:

Here's a reproducer:

using OpenCL, pocl_jll, KernelAbstractions

@kernel inbounds=true function _mwe!(@Const(v))
    temp = @localmem Int8 (1,)
    i = @index(Global, Linear)
    @print i "\n"
    @synchronize()
end

v = CLArray(rand(Float32, 10))

_mwe!(OpenCLBackend(), 256)(v, ndrange=length(v))

This prints 1...256.

The CUDA version of the same code prints 1...10.

using CUDA

b = CuArray(rand(Float32, 10))

_mwe!(CUDABackend(false, false), 256)(b, ndrange=length(b))

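The likely cause is that ndrange is not respected on the OpenCL backend: the kernel body runs for all 256 work-items of the workgroup instead of only the first 10.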

Consistent with this, a CLArray whose length is a multiple of 256 works without any issues for the any and all functions, as well as for the merge_sort function.
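
To make the arithmetic behind this guess concrete, here is a minimal sketch in plain Julia (no GPU needed). It assumes, without having confirmed it in the backend code, that the launch is rounded up to a whole number of workgroups and that the extra work-items are not masked out. The helper name padded_global_size is hypothetical and is not part of KernelAbstractions or OpenCL.jl.

```julia
# Hypothetical helper (not a KernelAbstractions or OpenCL.jl function):
# round the requested ndrange up to a whole number of workgroups.
padded_global_size(ndrange, workgroupsize) = cld(ndrange, workgroupsize) * workgroupsize

workgroupsize = 256

# Assumption: the backend launches the padded size without masking.
launched = padded_global_size(10, workgroupsize)

# Work-items that a correct ndrange check would let through.
in_range = count(i -> i <= 10, 1:launched)

# Work-items that fall past the end of a 10-element array.
extra = launched - in_range

println("launched = ", launched, ", in range = ", in_range, ", extra = ", extra)
```

Under the same assumption, padded_global_size(512, 256) equals 512, so no extra work-items exist when the length is a multiple of 256, which would match the observation above.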
