
CLBlast: byte offset / element count confusion #3307

Closed
@shibe2



Expected Behavior

Correct uploading of contiguous 3D tensor data to GPU.

Current Behavior

ggml_cl_h2d_tensor_2d uses its offset argument as a byte offset in its call to clEnqueueWriteBuffer, but ggml_cl_transform_tensor passes an element count as that offset. The two values agree only if the element size is exactly 1 byte.
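To illustrate the mismatch, here is a minimal host-only sketch. It does not call ggml or OpenCL: fake_write_buffer, elements_to_bytes and upload_plane are hypothetical stand-ins that I made up for this example, with a plain byte array playing the role of the device buffer. The only property borrowed from clEnqueueWriteBuffer is that the offset is measured in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a device write. As with clEnqueueWriteBuffer,
   offset_bytes and size_bytes are measured in bytes, not elements. */
static void fake_write_buffer(uint8_t *dst, size_t capacity_bytes,
                              size_t offset_bytes,
                              const void *src, size_t size_bytes) {
    assert(offset_bytes + size_bytes <= capacity_bytes);
    memcpy(dst + offset_bytes, src, size_bytes);
}

/* Convert an element count into the byte offset the write expects. */
static size_t elements_to_bytes(size_t n_elements, size_t element_size) {
    return n_elements * element_size;
}

/* Upload one plane of a contiguous 3D float tensor (hypothetical helper).
   convert != 0: the element offset is scaled by sizeof(float) (correct).
   convert == 0: the element count is passed as the offset unchanged,
                 which is the confusion described in this issue. */
static void upload_plane(uint8_t *dst, size_t capacity_bytes,
                         const float *src, size_t plane_elems,
                         size_t plane, int convert) {
    size_t elem_offset = plane * plane_elems;
    size_t offset = convert
        ? elements_to_bytes(elem_offset, sizeof(float))
        : elem_offset;
    fake_write_buffer(dst, capacity_bytes, offset,
                      src + elem_offset, plane_elems * sizeof(float));
}
```

With two planes of four floats each, the unscaled variant writes the second plane at byte 4 instead of byte 16, so it overwrites most of the first plane and leaves the tail of the buffer untouched.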

Also, I don't understand why ggml_cl_mul_f32 passes non-zero offset to ggml_cl_h2d_tensor_2d.

Environment and Context

AMD GPU
Linux

Steps to Reproduce

  1. Pass a 3D tensor with contiguous GGML_TYPE_F16 or GGML_TYPE_F32 data to ggml_cl_transform_tensor.
  2. Read the data back from GPU memory, or run ggml_cl_mul_mat on that tensor.
  3. Observe that the data or the result is incorrect.

Ping

@0cc4m
@JohannesGaessler
@SlyEcho
