Description
Is your feature request related to a problem? Please describe
Currently, group_local_memory and group_local_memory_for_overwrite are implemented as function wrappers around a call to the builtin __sycl_allocateLocalMemory. Since each call to group_local_memory_for_overwrite represents a distinct allocation of group local memory, the group_local_memory_for_overwrite function has to be inlined before the builtin __sycl_allocateLocalMemory can be resolved. This restriction causes two problems:
- __sycl_allocateLocalMemory is lowered in SYCLLowerWGLocalMemoryPass, which must run after AlwaysInlinerPass. Although AlwaysInlinerPass is guaranteed to run in the LLVM pass pipeline, it does not run at the start of the O2 pipeline. This is a problem if a backend compiler requires __sycl_allocateLocalMemory to be lowered before AlwaysInlinerPass runs. In addition, the dependence on AlwaysInlinerPass is implicit and makes SYCLLowerWGLocalMemoryPass not self-contained. PR #16356, "[SYCL][SYCLLowerWGLocalMemoryPass] Remove implicit dependency on AlwaysInlinerPass and move to PipelineStart", does the inlining within SYCLLowerWGLocalMemoryPass and thus removes the restriction. However, the PR introduces a new attribute, which might become tech debt.
- syclcompat::local_mem directly calls group_local_memory_for_overwrite:
  llvm/sycl/include/syclcompat/memory.hpp, lines 71 to 75 in 44c58bb
I am not sure whether this issue is directly related to the restriction that group_local_memory must be defined at kernel function scope. That behavior, which prevents definition in a non-kernel function, aligns with OpenCL. Please also see the note below in the spec:
llvm/sycl/doc/extensions/supported/sycl_ext_oneapi_local_memory.asciidoc, lines 111 to 112 in 44c58bb
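To illustrate why the wrapper has to be inlined before the builtin is resolved, here is a toy model in plain C++. It is only a sketch and not the actual compiler or runtime behavior: `toy_allocate_local_memory`, `wrapper_not_inlined`, `wrapper_inlined` and the `CallSite` tag are hypothetical names made up for this example, and a function-local static buffer stands in for the per-call-site allocation that the lowering pass would create.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the lowered builtin: every distinct call site
// (modeled here by the compile-time CallSite tag) gets its own static buffer.
template <int CallSite, std::size_t Size>
void *toy_allocate_local_memory() {
  alignas(std::max_align_t) static unsigned char buffer[Size];
  return buffer;
}

// Models a wrapper that is NOT inlined: it contains exactly one call site of
// the builtin, so every caller of the wrapper ends up with the same buffer.
template <typename T>
T *wrapper_not_inlined() {
  return static_cast<T *>(toy_allocate_local_memory<0, sizeof(T)>());
}

// Models the wrapper after inlining: each use of the wrapper carries its own
// builtin call site, represented by a distinct CallSite tag.
template <typename T, int CallSite>
T *wrapper_inlined() {
  return static_cast<T *>(toy_allocate_local_memory<CallSite, sizeof(T)>());
}

// Two uses of the non-inlined wrapper alias the same allocation.
bool shared_without_inlining() {
  int *a = wrapper_not_inlined<int>();
  int *b = wrapper_not_inlined<int>();
  return a == b;
}

// Two uses of the "inlined" wrapper get distinct allocations.
bool distinct_with_inlining() {
  int *a = wrapper_inlined<int, 1>();
  int *b = wrapper_inlined<int, 2>();
  return a != b;
}
```

In this model, `shared_without_inlining()` shows two logically separate allocations collapsing into one when the builtin is resolved inside the wrapper, while `distinct_with_inlining()` shows the intended behavior where each call represents its own allocation.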
Describe the solution you would like
No response
Describe alternatives you have considered
No response
Additional context
No response