Skip to content

[SYCL][CUDA][ROCm] USMEnqueuePrefetch flags #4467

Closed
@AidanBeltonS

Description

@AidanBeltonS

This issue is to ask for clarification on flags passed to piextUSMEnqueuePrefetch.
So CUDA and ROCm USMEnqueuePrefetch can be fully implemented.
It appears this parameter will only recieve one flag PI_USM_MIGRATION_TBD0.
In each plugin the flags value does not appear to effect the functions behaviour.

Plugins:
Both cuda and rocm have asserts to fail if a flag is specified.

// TODO implement handling the flags once the expected behaviour
// of piextUSMEnqueuePrefetch is detailed in the USM extension
assert(flags == 0u);

Level zero fails if the Flag is not set to PI_USM_MIGRATION_TBD0 or 0

PI_ASSERT(!(Flags & ~PI_USM_MIGRATION_TBD0), PI_INVALID_VALUE);

OpenCL currently does not use the flag but has a commented out implementation which does use it.

if (Err != PI_SUCCESS) {
    RetVal = Err;
  } else {
    RetVal = cast<pi_result>(FuncPtr(
        cast<cl_command_queue>(queue), ptr, size, flags, num_events_in_waitlist,
        reinterpret_cast<const cl_event *>(events_waitlist),
        reinterpret_cast<cl_event *>(event)));
  }

Flag Usage:
Currently only one prefetch flag exists:

typedef enum : pi_bitfield {
  PI_USM_MIGRATION_TBD0 = (1 << 0)
} _pi_usm_migration_flags;

Usage of piextUSMEnqueuePrefetch in source/detail/memory_manager.cpp shows that PI_USM_MIGRATION_TBD0 is the only flag used.

void MemoryManager::prefetch_usm(void *Mem, QueueImplPtr Queue, size_t Length,
                                 std::vector<RT::PiEvent> DepEvents,
                                 RT::PiEvent &OutEvent) {
  sycl::context Context = Queue->get_context();

  if (Context.is_host()) {
    // TODO: Potentially implement prefetch on the host.
  } else {
    const detail::plugin &Plugin = Queue->getPlugin();
    Plugin.call<PiApiKind::piextUSMEnqueuePrefetch>(
        Queue->getHandleRef(), Mem, Length, PI_USM_MIGRATION_TBD0,
        DepEvents.size(), DepEvents.data(), &OutEvent);
  }
}

Questions:
What is the purpose of the flag, is this a placeholder for future opencl features?
How should plugins adjust their behaviour based on its value?
Is it still neccessary to have asserts if a flag is set for CUDA and ROCm backends?

Proposal:
If the flag is currently not effecting the functions behaviour, remove asserts in CUDA and ROCm backends. or replacing with level_zero assertion.

Metadata

Metadata

Assignees

No one assigned

    Labels

    cudaCUDA back-endenhancementNew feature or requesthipIssues related to execution on HIP backend.runtimeRuntime library related issue

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions