Building a simple SYCL project with CMake results in a "No kernel named _(insert mangled name here)_ was found -46 (PI_ERROR_INVALID_KERNEL_NAME)" error. #11568

Open
@jonathan-ramsey

When I build any SYCL-enabled project with CMake and the compiler built from the sycl branch of intel/llvm (see below), the build succeeds, but the resulting binary throws an error along the lines of "No kernel named (mangled name) was found -46 (PI_ERROR_INVALID_KERNEL_NAME)" as soon as it tries to execute any SYCL kernel.

I can get the error to arise even with the vector-add-buffers example in the oneAPI-samples repository.

However, if I use an installed copy of oneAPI Base toolkit (version 2023.2.1), then the example (and indeed other SYCL-enabled projects) work fine!

After digging around, the consensus seems to be that I should link with the compiler driver (clang-cl.exe), whereas the default build process links with lld-link.exe. So I switched the link step to clang-cl.exe with the appropriate options, but that does not fix the issue.

After carefully comparing the build process for the oneAPI toolkit with that of the self-built clang/llvm from this repo, it seems that the linking step of the oneAPI toolkit build (which uses icx.exe) does many more things than when I link using the self-built clang-cl.exe or lld-link.exe (e.g. icx.exe makes multiple calls to clang-offload-bundler).

Another strong indicator that the two builds differ is that the executable built with the oneAPI toolkit is almost 4 times larger than the one built with the self-built clang compiler.

Important note: In the case of the vector-add-buffers oneAPI sample, if I forgo CMake and compile the single source file with a one-liner (e.g. clang-cl -fsycl /EHsc vector-add-buffers.cpp -o vector-add-buffers.exe), then it does work as expected.
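
For comparison with the one-liner above, here is a minimal sketch of a CMakeLists.txt that passes -fsycl at both the compile and the link step. This is a hypothetical file, not the one attached to this issue, and the project and target names are placeholders. It assumes CMake performs the link through the compiler driver (as it does with a GNU-style driver such as clang++); with clang-cl, CMake's default rules invoke the linker directly, so the link option below would not reach the driver and the offload link steps would likely be skipped.

```cmake
# Hypothetical minimal CMakeLists.txt; not the file attached to this issue.
cmake_minimum_required(VERSION 3.20)
project(vector_add_buffers LANGUAGES CXX)

add_executable(vector-add-buffers vector-add-buffers.cpp)

# -fsycl is needed when compiling, so that device code is generated ...
target_compile_options(vector-add-buffers PRIVATE -fsycl)

# ... and again when linking, so that the driver runs its offload link steps.
# Assumption: the link is done by the compiler driver, not by the linker directly.
target_link_options(vector-add-buffers PRIVATE -fsycl)
```

Building with `cmake --build <build-dir> --verbose` prints the full compile and link command lines, which makes it possible to check whether -fsycl actually appears in the link command and which program performs the link.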
 
Can anyone tell me what additional steps I need to take to get the linking step of the self-built clang to behave like that of the oneAPI toolkit?

An alternative way to phrase this might be, how do I deploy the self-built clang/DPC++ toolchain on my local system?

Environment:

  • OS: Windows 10 Pro
  • Target device: Any device I have it seems, but let's stick with an Intel Core i7-10875H for simplicity.
  • DPC++ version: clang version 18.0.0, commit 47083f8 (tag: nightly-2023-09-28)
  • clang was built locally with no additional options to configure.py or compile.py following the Getting Started instructions.
  • Using set ONEAPI_DEVICE_SELECTOR=opencl:cpu to force calculation on the CPU.

P.S. If you're wondering why I'd want to use SYCL to target CPUs, it is only a stepping stone to NVIDIA GPU offloading...once I can get things working...

P.P.S. I have tried other commits, including a nightly from late last week, and one from the end of June matching the last release date of the oneAPI toolkit, but the problem persists.

Thank you in advance for any help or suggestions you can give!

My slightly modified vector-add-buffers.cpp and the corresponding CMakeLists.txt are attached.
vector-add-buffers.txt
CMakeLists.txt

Labels: Documentation, Windows, cuda