Describe the bug
#11770 re-implemented `vec::convert` to use whole-vector conversions (like `__spirv_SConvert_*`) instead of per-element ones, and #11821 adds an E2E test for `vec::convert`.
CI logs indicate that only the first element of a vector is handled properly, whilst the rest are set to 0 on device:
# RUN: at line 7
/__w/llvm/llvm/toolchain/bin//clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda /__w/llvm/llvm/llvm/sycl/test-e2e/Basic/vector/int-convert.cpp -o /__w/llvm/llvm/build-e2e/Basic/vector/Output/int-convert.cpp.tmp.out -DSYCL2020_DISABLE_DEPRECATION_WARNINGS
# executed command: /__w/llvm/llvm/toolchain/bin//clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda /__w/llvm/llvm/llvm/sycl/test-e2e/Basic/vector/int-convert.cpp -o /__w/llvm/llvm/build-e2e/Basic/vector/Output/int-convert.cpp.tmp.out -DSYCL2020_DISABLE_DEPRECATION_WARNINGS
# note: command had no output on stdout or stderr
# RUN: at line 8
env SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT=1 ONEAPI_DEVICE_SELECTOR=ext_oneapi_cuda:gpu /__w/llvm/llvm/build-e2e/Basic/vector/Output/int-convert.cpp.tmp.out
# executed command: env SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT=1 ONEAPI_DEVICE_SELECTOR=ext_oneapi_cuda:gpu /__w/llvm/llvm/build-e2e/Basic/vector/Output/int-convert.cpp.tmp.out
# .---command stdout------------
# | host and device results do not match (vec<long, 4>::convert<unsigned char>)
# | {37, 0, 245, 13} vs {37, 0, 0, 0}
# | device results don't match reference (vec<long, 4>::convert<unsigned char>)
# | {37, 0, 0, 0} vs {37, 0, 245, 13}
# | host and device results do not match (vec<unsigned long, 4>::convert<unsigned char>)
# | {37, 0, 11, 13} vs {37, 0, 0, 0}
# | device results don't match reference (vec<unsigned long, 4>::convert<unsigned char>)
# | {37, 0, 0, 0} vs {37, 0, 11, 13}
# | host and device results do not match (vec<long, 4>::convert<signed char>)
# | {37, 0, -11, 13} vs {37, 0, 0, 0}
# | device results don't match reference (vec<long, 4>::convert<signed char>)
# | {37, 0, 0, 0} vs {37, 0, -11, 13}
# | host and device results do not match (vec<long, 4>::convert<unsigned char>)
# | {37, 0, 245, 13} vs {37, 0, 0, 0}
# | device results don't match reference (vec<long, 4>::convert<unsigned char>)
# | {37, 0, 0, 0} vs {37, 0, 245, 13}
# | host and device results do not match (vec<unsigned long, 4>::convert<signed char>)
# | {37, 0, 11, 13} vs {37, 0, 0, 0}
# | device results don't match reference (vec<unsigned long, 4>::convert<signed char>)
# | {37, 0, 0, 0} vs {37, 0, 11, 13}
# | host and device results do not match (vec<unsigned long, 4>::convert<unsigned char>)
# | {37, 0, 11, 13} vs {37, 0, 0, 0}
# | device results don't match reference (vec<unsigned long, 4>::convert<unsigned char>)
... Many more lines like this for other combinations of data types
To Reproduce
Ensure that you have #11821 in your environment, then run `sycl/test-e2e/Basic/vector/int-convert.cpp` with the CUDA backend.
Environment (please complete the following information):
- OS: Linux
- Target device and vendor: CUDA
Additional context
Until this is fixed, `vec::convert` will be switched to a per-element approach for CUDA.
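
For reference, a per-element conversion can be sketched in plain host C++ as below. This is only an illustration of the idea, not the code in the SYCL headers: `convert_per_element` is a hypothetical helper, and `std::array` stands in for `sycl::vec`.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper, not part of the SYCL headers: converts every element
// independently with static_cast instead of issuing a single whole-vector
// conversion.
template <typename To, typename From, std::size_t N>
std::array<To, N> convert_per_element(const std::array<From, N> &in) {
  std::array<To, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = static_cast<To>(in[i]); // each element is converted on its own
  return out;
}
```

Assuming the test input for `vec<long, 4>` is `{37, 0, -11, 13}` (inferred from the reference values in the log above, not taken from the test source), `convert_per_element<unsigned char>` yields `{37, 0, 245, 13}` and `convert_per_element<signed char>` yields `{37, 0, -11, 13}`, which matches the host results that the device output is compared against.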