Description
Describe the bug
In the past I've made use of sycl::atomic
for atomic operations in our code, which used to behave well on all backends that I tried. (Including the CUDA and HIP ones.) Now that the latest nightlies warn about sycl::atomic
being deprecated, I tried to teach our code to use sycl::atomic_ref
instead when it is available.
But when I do that, building the code for an NVIDIA backend fails like:
ptxas fatal : Unresolved extern function '_Z18__spirv_AtomicIAddPiN5__spv5Scope4FlagENS0_19MemorySemanticsMask4FlagEi'
llvm-foreach:
clang-15: error: ptxas command failed with exit code 255 (use -v to see invocation)
clang version 15.0.0 (https://github.com/intel/llvm.git 64e92cbc41d1bcde9c728798e4e1fed9e3fab253)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin
clang-15: note: diagnostic msg: Error generating preprocessed source(s).
To Reproduce
Take the following example:
#include <CL/sycl.hpp>
#include <cassert>
#include <iostream>
//#define USE_ATOMIC_REF
int main() {
// Set up the queue.
sycl::queue queue;
std::cout << "Running on device: " << queue.get_device().get_info<sycl::info::device::name>() << std::endl;
// Set up the array.
const std::size_t ARRAY_SIZE = 100;
int* mem = static_cast<int*>(sycl::malloc_shared(ARRAY_SIZE * sizeof(int), queue));
for (std::size_t i = 0; i < ARRAY_SIZE; ++i) {
mem[i] = 0;
}
// Modify the array atomically on a device.
queue.submit([mem, ARRAY_SIZE](sycl::handler& h) {
h.parallel_for<class atomic_test>(sycl::range<1>(ARRAY_SIZE),
[mem](sycl::id<1> id) {
#ifdef USE_ATOMIC_REF
sycl::atomic_ref<int, sycl::memory_order::relaxed,
sycl::memory_scope::device> aref(mem[id]);
aref.fetch_add(1);
#else
sycl::atomic_fetch_add<int>(sycl::atomic<int>(sycl::global_ptr<int>(mem + id)), 1);
#endif
});
}).wait_and_throw();
// Check the array's payload.
for (std::size_t i = 0; i < ARRAY_SIZE; ++i) {
assert(mem[i] == 1);
}
// Finish up.
std::cout << "All OK!" << std::endl;
sycl::free(mem, queue);
return 0;
}
In this exact form (without USE_ATOMIC_REF
being defined) it builds and runs correctly, albeit with a good number of warnings.
[bash][Legolas]:sycl > clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda -g atomic_ref_error.cpp
atomic_ref_error.cpp:30:46: warning: 'atomic<int, sycl::access::address_space::global_space>' is deprecated: sycl::atomic is deprecated since SYCL 2020 [-Wdeprecated-declarations]
sycl::atomic_fetch_add<int>(sycl::atomic<int>(sycl::global_ptr<int>(mem + id)), 1);
^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/atomic.hpp:171:7: note: 'atomic<int, sycl::access::address_space::global_space>' has been explicitly marked deprecated here
class __SYCL2020_DEPRECATED(
^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/detail/defines_elementary.hpp:54:40: note: expanded from macro '__SYCL2020_DEPRECATED'
#define __SYCL2020_DEPRECATED(message) __SYCL_DEPRECATED(message)
^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/detail/defines_elementary.hpp:45:38: note: expanded from macro '__SYCL_DEPRECATED'
#define __SYCL_DEPRECATED(message) [[deprecated(message)]]
^
warning: linking module '/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/lib/clang/15.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc': Linking two modules of different target triples: '/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/lib/clang/15.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc' is 'nvptx64-unknown-nvidiacl' whereas 'atomic_ref_error.cpp' is 'nvptx64-nvidia-cuda'
[-Wlinker-warnings]
2 warnings generated.
atomic_ref_error.cpp:30:46: warning: 'atomic<int, sycl::access::address_space::global_space>' is deprecated: sycl::atomic is deprecated since SYCL 2020 [-Wdeprecated-declarations]
sycl::atomic_fetch_add<int>(sycl::atomic<int>(sycl::global_ptr<int>(mem + id)), 1);
^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/atomic.hpp:171:7: note: 'atomic<int, sycl::access::address_space::global_space>' has been explicitly marked deprecated here
class __SYCL2020_DEPRECATED(
^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/detail/defines_elementary.hpp:54:40: note: expanded from macro '__SYCL2020_DEPRECATED'
#define __SYCL2020_DEPRECATED(message) __SYCL_DEPRECATED(message)
^
/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin/../include/sycl/CL/sycl/detail/defines_elementary.hpp:45:38: note: expanded from macro '__SYCL_DEPRECATED'
#define __SYCL_DEPRECATED(message) [[deprecated(message)]]
^
1 warning generated.
[bash][Legolas]:sycl > SYCL_DEVICE_FILTER=CUDA ./a.out
Running on device: NVIDIA GeForce RTX 3080
All OK!
[bash][Legolas]:sycl >
But if I un-comment the line defining USE_ATOMIC_REF
, the build fails with:
[bash][Legolas]:sycl > clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda -g atomic_ref_error.cpp
warning: linking module '/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/lib/clang/15.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc': Linking two modules of different target triples: '/home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/lib/clang/15.0.0/../../clc/remangled-l64-signed_char.libspirv-nvptx64--nvidiacl.bc' is 'nvptx64-unknown-nvidiacl' whereas 'atomic_ref_error.cpp' is 'nvptx64-nvidia-cuda'
[-Wlinker-warnings]
1 warning generated.
ptxas fatal : Unresolved extern function '_Z18__spirv_AtomicIAddPiN5__spv5Scope4FlagENS0_19MemorySemanticsMask4FlagEi'
llvm-foreach:
clang-15: error: ptxas command failed with exit code 255 (use -v to see invocation)
clang version 15.0.0 (https://github.com/intel/llvm.git 64e92cbc41d1bcde9c728798e4e1fed9e3fab253)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin
clang-15: note: diagnostic msg: Error generating preprocessed source(s).
[bash][Legolas]:sycl >
I tried a few different arguments for the memory order and memory scope template arguments, but that didn't seem to make much of a difference.
Environment (please complete the following information)
- OS: Ubuntu 20.04
- Target device and vendor: NVIDIA GPU
- DPC++ version:
[bash][Legolas]:sycl > clang++ -v
clang version 15.0.0 (https://github.com/intel/llvm.git 64e92cbc41d1bcde9c728798e4e1fed9e3fab253)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/krasznaa/software/intel/clang/nightly-20220217/x86_64-ubuntu2004-gcc9-opt/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/7.5.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64
Found CUDA installation: /home/krasznaa/software/cuda/11.5.2/x86_64-ubuntu2004, version 11.5
Found HIP installation: /opt/rocm, version 4.2.21155-37cb3a34
[bash][Legolas]:sycl >
- Dependencies version: N/A