Skip to content

Remove restriction that group_local_memory/group_local_memory_for_overwrite function requires to be inlined #16617

Open
@wenju-he

Description

@wenju-he

Is your feature request related to a problem? Please describe

Currently group_local_memory/group_local_memory_for_overwrite is implemented with a function wrapper to builtin __sycl_allocateLocalMemory call. Since a call to group_local_memory_for_overwrite represents a distinct allocation of group local memory, we have to inline group_local_memory_for_overwrite function before builtin __sycl_allocateLocalMemory can be resolved. The restriction causes two problems:

  1. __sycl_allocateLocalMemory is lowered in SYCLLowerWGLocalMemoryPass which must run after AlwaysInlinerPass. Though AlwaysInlinerPass is guaranteed to run in LLVM pass pipeline, but it isn't run at O2 pipeline start. This is problematic if there is conflict that backend compiler requires __sycl_allocateLocalMemory to be lowered earlier than the AlwaysInlinerPass. In addition, the dependence on AlwaysInlinerPass is implicit and make SYCLLowerWGLocalMemoryPass not self-contained. PR [SYCL][SYCLLowerWGLocalMemoryPass] Remove implicit dependency on AlwaysInlinerPass and move to PipelineStart #16356 does inlining within SYCLLowerWGLocalMemoryPass, and thus removes the restriction. However, the PR introduces a new attribute which might be a tech debt.
  2. syclcompat::local_mem directly calls group_local_memory_for_overwrite:
    template <typename AllocT> auto *local_mem() {
    sycl::multi_ptr<AllocT, sycl::access::address_space::local_space>
    As_multi_ptr =
    sycl::ext::oneapi::group_local_memory_for_overwrite<AllocT>(
    sycl::ext::oneapi::this_work_item::get_work_group<3>());
    . This use propagates the inlining dependency and function syclcompat::local_mem must be inlined as well. Please also note that spec doesn't allow the use of group_local_memory in library function.

I am not sure if the issue is directly related to the restriction that group_local_memory must be defined at kernel function scope. The behavior, which prevents definition in non-kernel function, aligns with OpenCL. Please also see below note in the spec:

NOTE: The restriction that group-local variables must be defined at kernel
functor scope may be lifted in a future version of this extension.

Describe the solution you would like

No response

Describe alternatives you have considered

No response

Additional context

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions