Correspondence of unnamed lambdas as kernels across host and device compilation #454

Open
tahonermann opened this issue Aug 12, 2023 · 1 comment

Comments

@tahonermann

Per SYCL 2020 4.12.2 (Defining kernels as lambda functions), a C++ lambda can be used as a SYCL kernel. Per section 5.2 (Naming of kernels), lambdas used as kernels can optionally be given a name via a template parameter. Appendix B (Feature sets) specifies a reduced feature set that omits support for unnamed lambdas. Finally, section 5.6 (Preprocessor directives and macros) specifies that an implementation that does not support unnamed lambdas shall not define SYCL_FEATURE_SET_FULL and shall define SYCL_FEATURE_SET_REDUCED=1.

Consider the following example, which contains a lambda expression that is conditionally compiled for host compilation only. Implementations that rely on the Itanium C++ ABI name mangling specification may have difficulty correlating the names of the unnamed lambda passed as a kernel to cgh.single_task(), because the host-only lambda causes name mangling discriminators to be allocated differently in host and device compilation.

#include <sycl/sycl.hpp>
int main() {
  sycl::queue q(sycl::cpu_selector_v);
  q.submit([](sycl::handler &cgh) {
#if !defined(__SYCL_DEVICE_ONLY__)
    // The presence of a lambda that is only present for one of
    // host or device compilation might cause names synthesized
    // for unnamed lambdas not to coincide.
    []{}();
#endif
    cgh.single_task([]{});
  });
  q.wait();
}

Is support for an example like the one above intended to be required for an implementation to claim support for the full feature set?
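
To make the discriminator misalignment described above concrete, here is a minimal sketch in plain C++ (no SYCL required). It assumes an Itanium-ABI compiler such as GCC or Clang. The helper names (closure_suffix, kernel_suffix_device_like, kernel_suffix_host_like) are hypothetical and only for illustration. The two functions stand in for the device and host views of the same code.

```cpp
#include <string>
#include <typeinfo>

// Returns the trailing closure-type portion of an Itanium-mangled name.
// Falls back to the full name if the "Ul" closure marker is not found
// (for example, on a non-Itanium ABI).
inline std::string closure_suffix(const std::string &mangled) {
  const auto pos = mangled.rfind("Ul");
  return pos == std::string::npos ? mangled : mangled.substr(pos);
}

// Mimics the device pass: the kernel lambda is the first lambda in the function.
inline std::string kernel_suffix_device_like() {
  auto kernel = [] {};
  return closure_suffix(typeid(kernel).name());
}

// Mimics the host pass: an extra lambda precedes the kernel lambda.
inline std::string kernel_suffix_host_like() {
  [] {}();  // stands in for the host-only lambda
  auto kernel = [] {};
  return closure_suffix(typeid(kernel).name());
}
```

On Itanium-ABI compilers the two suffixes typically come out as UlvE_ and UlvE0_, so a host and device pass that each synthesize a kernel name from the closure type would disagree.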

@AerialMantis
Collaborator

SYCL WG call:

  • Agreed that it's not realistic to expect an implementation to support this example
  • Specification likely needs clarification
  • Certain usage of macros can result in different definitions for host and device
  • Hard to define what a valid or invalid use of macros would be
  • CUDA has if device, which means it's no longer an ODR violation
  • DPC++ has an extension (if_device) which is similar
  • It's not just __SYCL_DEVICE_ONLY__; it applies to other macros too
  • Hard to define rules as it's dependent on implementation-defined behaviour
  • In the long term, should we move away from using __SYCL_DEVICE_ONLY__?
  • __SYCL_DEVICE_ONLY__ is also not implementable for a single-pass compiler
  • Need to identify what users are using this for, to provide alternatives
  • Possible F2F topic
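
One alternative that SYCL 2020 already offers (section 5.2, cited in the issue description) is to name the kernel explicitly, for example cgh.single_task<InitKernel>([]{}). The call notes above do not discuss it. The sketch below is plain C++, not SYCL: launch_named, kernel_tag, and InitKernel are hypothetical stand-ins, and it assumes the implementation derives the kernel identifier from the name type rather than from the closure type. Whether a given implementation fully avoids the mismatch this way is not confirmed here.

```cpp
#include <string>
#include <typeinfo>

// Hypothetical kernel name type, forward-declared at namespace scope.
class InitKernel;

// Wrapper so that typeid can be applied even though the name type is incomplete.
template <typename Name> struct kernel_tag {};

// Hypothetical stand-in for a named-kernel launch: runs the callable and
// returns an identifier derived only from Name, not from the closure type.
template <typename Name, typename Func>
std::string launch_named(Func func) {
  func();
  return typeid(kernel_tag<Name>).name();
}

// Mimics the device pass: no extra lambda before the kernel.
inline std::string id_device_like() {
  return launch_named<InitKernel>([] {});
}

// Mimics the host pass: an extra lambda shifts closure-type numbering.
inline std::string id_host_like() {
  [] {}();  // stands in for the host-only lambda
  return launch_named<InitKernel>([] {});
}
```

Both functions return the same identifier, because the extra lambda changes only the closure type's mangled name and not the name type.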
