Vector conversion does not work correctly on CUDA #11840

Closed
@AlexeySachkov

Description

Describe the bug

#11770 re-implemented vec::convert to use whole-vector conversions (like __spirv_SConvert_*) instead of per-element ones, and #11821 adds E2E tests for vec::convert.

CI logs indicate that only the first element of a vector is handled correctly, while the rest are set to 0 on the device:

# RUN: at line 7
/__w/llvm/llvm/toolchain/bin//clang++   -fsycl -fsycl-targets=nvptx64-nvidia-cuda /__w/llvm/llvm/llvm/sycl/test-e2e/Basic/vector/int-convert.cpp -o /__w/llvm/llvm/build-e2e/Basic/vector/Output/int-convert.cpp.tmp.out -DSYCL2020_DISABLE_DEPRECATION_WARNINGS
# executed command: /__w/llvm/llvm/toolchain/bin//clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda /__w/llvm/llvm/llvm/sycl/test-e2e/Basic/vector/int-convert.cpp -o /__w/llvm/llvm/build-e2e/Basic/vector/Output/int-convert.cpp.tmp.out -DSYCL2020_DISABLE_DEPRECATION_WARNINGS
# note: command had no output on stdout or stderr
# RUN: at line 8
env SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT=1 ONEAPI_DEVICE_SELECTOR=ext_oneapi_cuda:gpu  /__w/llvm/llvm/build-e2e/Basic/vector/Output/int-convert.cpp.tmp.out
# executed command: env SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT=1 ONEAPI_DEVICE_SELECTOR=ext_oneapi_cuda:gpu /__w/llvm/llvm/build-e2e/Basic/vector/Output/int-convert.cpp.tmp.out
# .---command stdout------------
# | host and device results do not match (vec<long, 4>::convert<unsigned char>)
# | 	{37, 0, 245, 13} vs {37, 0, 0, 0}
# | device results don't match reference (vec<long, 4>::convert<unsigned char>)
# | 	{37, 0, 0, 0} vs {37, 0, 245, 13}
# | host and device results do not match (vec<unsigned long, 4>::convert<unsigned char>)
# | 	{37, 0, 11, 13} vs {37, 0, 0, 0}
# | device results don't match reference (vec<unsigned long, 4>::convert<unsigned char>)
# | 	{37, 0, 0, 0} vs {37, 0, 11, 13}
# | host and device results do not match (vec<long, 4>::convert<signed char>)
# | 	{37, 0, -11, 13} vs {37, 0, 0, 0}
# | device results don't match reference (vec<long, 4>::convert<signed char>)
# | 	{37, 0, 0, 0} vs {37, 0, -11, 13}
# | host and device results do not match (vec<long, 4>::convert<unsigned char>)
# | 	{37, 0, 245, 13} vs {37, 0, 0, 0}
# | device results don't match reference (vec<long, 4>::convert<unsigned char>)
# | 	{37, 0, 0, 0} vs {37, 0, 245, 13}
# | host and device results do not match (vec<unsigned long, 4>::convert<signed char>)
# | 	{37, 0, 11, 13} vs {37, 0, 0, 0}
# | device results don't match reference (vec<unsigned long, 4>::convert<signed char>)
# | 	{37, 0, 0, 0} vs {37, 0, 11, 13}
# | host and device results do not match (vec<unsigned long, 4>::convert<unsigned char>)
# | 	{37, 0, 11, 13} vs {37, 0, 0, 0}
# | device results don't match reference (vec<unsigned long, 4>::convert<unsigned char>)

... Many more lines like this for other combinations of data types

To Reproduce

Ensure that you have #11821 in your environment, then run sycl/test-e2e/Basic/vector/int-convert.cpp.

Environment (please complete the following information):

  • OS: Linux
  • Target device and vendor: CUDA

Additional context

Until this is fixed, vec::convert will be switched to a per-element approach for CUDA.
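
To illustrate what a per-element approach means, here is a minimal host-only sketch in plain C++. It is not the actual SYCL implementation: the helper name convert_per_element is hypothetical, std::array stands in for sycl::vec, and the input {37, 0, -11, 13} is an assumption inferred from the reference values in the log above.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper (not part of SYCL): converts every element
// independently with static_cast instead of issuing one whole-vector
// conversion. std::array is used here as a stand-in for sycl::vec.
template <typename To, typename From, std::size_t N>
std::array<To, N> convert_per_element(const std::array<From, N> &in) {
  std::array<To, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = static_cast<To>(in[i]); // each lane is converted on its own
  return out;
}
```

With the assumed input, convert_per_element<unsigned char> on a std::array<long, 4> holding {37, 0, -11, 13} yields {37, 0, 245, 13}, and convert_per_element<signed char> yields {37, 0, -11, 13}. These match the reference values in the log, whereas the failing device results keep only the first element.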

Labels: bug, cuda