SYCL Device Selector can return an invalid device #3947

Closed
@Michoumichmich

Hello,

My question concerns device selectors. It seems that clang lets you construct a device and a queue for a 'target' without having compiled for it.

Consider the following code:

#include <sycl/sycl.hpp>

#include <exception>
#include <iostream>

// Accepts only devices on the CUDA backend that report themselves as available.
class cuda_selector : public sycl::device_selector {
public:
    int operator()(const sycl::device &device) const override {
        const bool is_cuda = device.get_platform().get_backend() == sycl::backend::cuda;
        return (is_cuda && device.get_info<sycl::info::device::is_available>()) ? 1 : -1;
    }
};

/**
 * Tries to get a queue for the device chosen by the selector;
 * falls back to the host device if that fails.
 */
static inline sycl::queue try_get_queue(const sycl::device_selector &selector) {
    auto exception_handler = [](const sycl::exception_list &exceptions) {
        for (std::exception_ptr const &e : exceptions) {
            try {
                std::rethrow_exception(e);
            }
            catch (sycl::exception const &e) {
                std::cout << "Caught asynchronous SYCL exception: " << e.what() << std::endl;
            }
            catch (std::exception const &e) {
                std::cout << "Caught asynchronous STL exception: " << e.what() << std::endl;
            }
        }
    };

    sycl::device dev;
    sycl::queue q;
    try {
        dev = sycl::device(selector);
        q = sycl::queue(dev, exception_handler);
        //if (dev.is_cpu() || dev.is_gpu()) q.single_task([]() {}).wait_and_throw(); //workaround
    }
    catch (...) {
        dev = sycl::device(sycl::host_selector());
        q = sycl::queue(dev, exception_handler);
        std::cout << "Warning: Expected device not found! Fall back on: " << dev.get_info<sycl::info::device::name>() << std::endl;
    }
    return q;
}

// Submits an empty kernel to check that the queue's device can actually run code.
static inline void probe_queue(sycl::queue q) {
    try {
        q.single_task([]() {}).wait_and_throw();
        std::cout << "Successfully ran on " << q.get_device().get_info<sycl::info::device::name>() << std::endl;
    } catch (...) {
        std::cout << "Something went wrong running on " << q.get_device().get_info<sycl::info::device::name>() << std::endl;
    }
}

int main(){
    auto cuda_q = try_get_queue(cuda_selector{});
    auto cpu_q = try_get_queue(sycl::cpu_selector{});
    auto host_q = try_get_queue(sycl::host_selector{});
    probe_queue(cuda_q);
    probe_queue(cpu_q);
    probe_queue(host_q);
}

And the "makefile":

all:
	clang++ -fsycl -fsycl-targets=spir64_x86_64-unknown-unknown-sycldevice,nvptx64-nvidia-cuda-sycldevice -sycl-std=2020 -std=c++20 -fsycl-unnamed-lambda main.cpp
cuda:
	clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda-sycldevice -sycl-std=2020 -std=c++20 -fsycl-unnamed-lambda main.cpp
cpu:
	clang++ -fsycl -fsycl-targets=spir64_x86_64-unknown-unknown-sycldevice -sycl-std=2020 -std=c++20 -fsycl-unnamed-lambda main.cpp
host:
	clang++ -fsycl -sycl-std=2020 -std=c++20 -fsycl-unnamed-lambda main.cpp

Expected behaviour
We would expect try_get_queue(cuda_selector{}) to return a device that satisfies sycl::info::device::is_available and has a CUDA backend. It turns out that compiling the code without the nvptx64-nvidia-cuda-sycldevice target still returns a queue attached to an Nvidia GPU. Running the probing function on this queue results in CL_INVALID_BINARY (caught here), since no PTX code is embedded in the binary. The same behaviour occurs when using sycl::cpu_selector{} without providing the CPU target.

The program shown should only print "Successfully ran on <device compiled for>".

Finally, there is also a bug with the all target: we would expect to run on the host, the CPU and the GPU, but that is not the case.

Workaround
Running a dummy kernel in a try/catch block, such as

q.single_task([]() {}).wait_and_throw();

on the queue makes it possible to check whether the device is "usable". See the commented-out line in the previous code.
This solution is not perfect: if the device really is usable, the probe blocks the caller. Furthermore, how long the caller stays blocked is potentially unknown and can depend on the device's and the runtime's schedulers.
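
As a rough illustration, the workaround can be factored into a small generic helper. This is only a sketch: make_probed is a hypothetical name, not part of SYCL, and it is written against plain C++17 so it does not depend on the SYCL headers. The intended SYCL usage is shown in the trailing comment and reuses only calls from the code above.

```cpp
#include <optional>
#include <type_traits>

// Hypothetical helper (not part of SYCL): builds an object with `make_queue`,
// runs `probe` on it, and returns it only if neither step throws.
template <typename MakeQueue, typename Probe>
auto make_probed(MakeQueue make_queue, Probe probe)
    -> std::optional<std::decay_t<decltype(make_queue())>> {
    try {
        auto q = make_queue();
        probe(q);  // e.g. submit an empty kernel and wait for it
        return q;
    } catch (...) {
        return std::nullopt;  // construction or probing failed
    }
}

// Intended use with the code above (assumes <sycl/sycl.hpp> and cuda_selector):
//   auto q = make_probed(
//       [] { return sycl::queue{sycl::device{cuda_selector{}}}; },
//       [](sycl::queue &q) { q.single_task([]() {}).wait_and_throw(); });
//   if (!q) { /* fall back to the host device */ }
```

The probe still blocks the caller, so the limitation described above remains. The helper only makes the "construct, probe, fall back" pattern explicit.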

[Open] Solutions
The standard is vague about is_available: it only states that it "returns true if the SYCL device is available and returns false if the device is not available". One solution could combine get_info<sycl::info::device::is_available>() with a check that the binary contains at least some code for the device, before using the SYCL runtime to detect the devices. Having the platform simply not list devices that cannot possibly be used seems hard to imagine, as it would conflict with sycl-ls, I guess.

Any suggestions?
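
To make the idea concrete, here is a minimal sketch of selection that combines availability with a "has device code" check. It is deliberately generic C++17 and does not call SYCL: select_device and usable_score are hypothetical names, and the two predicates are supplied by the caller. In real code the has_image predicate might be built on the SYCL 2020 function sycl::has_kernel_bundle<sycl::bundle_state::executable>(ctx, {dev}), but whether a given toolchain implements it is an assumption here.

```cpp
#include <optional>
#include <vector>

// Mirrors SYCL selector semantics: the highest score wins and a negative
// score means "never select". Returns std::nullopt if nothing qualifies.
template <typename Device, typename Score>
std::optional<Device> select_device(const std::vector<Device> &devices, Score score) {
    std::optional<Device> best;
    int best_score = -1;
    for (const Device &d : devices) {
        const int s = score(d);
        if (s >= 0 && s > best_score) {
            best = d;
            best_score = s;
        }
    }
    return best;
}

// Hypothetical scoring rule: reject a device unless it is both reported as
// available and has device code embedded for it in the binary.
template <typename Device, typename IsAvailable, typename HasImage>
int usable_score(const Device &d, IsAvailable is_available, HasImage has_image) {
    return (is_available(d) && has_image(d)) ? 1 : -1;
}
```

With such a rule, a selector would reject the Nvidia GPU when the binary was built without the CUDA target, and the fallback path in try_get_queue would be taken at construction time rather than at the first kernel submission.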
