Description
This issue is to ask for clarification on flags passed to piextUSMEnqueuePrefetch.
So CUDA and ROCm USMEnqueuePrefetch can be fully implemented.
It appears this parameter will only recieve one flag PI_USM_MIGRATION_TBD0
.
In each plugin the flags value does not appear to effect the functions behaviour.
Plugins:
Both cuda and rocm have asserts to fail if a flag is specified.
// TODO implement handling the flags once the expected behaviour
// of piextUSMEnqueuePrefetch is detailed in the USM extension
assert(flags == 0u);
Level zero fails if the Flag is not set to PI_USM_MIGRATION_TBD0
or 0
PI_ASSERT(!(Flags & ~PI_USM_MIGRATION_TBD0), PI_INVALID_VALUE);
OpenCL currently does not use the flag but has a commented out implementation which does use it.
if (Err != PI_SUCCESS) {
RetVal = Err;
} else {
RetVal = cast<pi_result>(FuncPtr(
cast<cl_command_queue>(queue), ptr, size, flags, num_events_in_waitlist,
reinterpret_cast<const cl_event *>(events_waitlist),
reinterpret_cast<cl_event *>(event)));
}
Flag Usage:
Currently only one prefetch flag exists:
typedef enum : pi_bitfield {
PI_USM_MIGRATION_TBD0 = (1 << 0)
} _pi_usm_migration_flags;
Usage of piextUSMEnqueuePrefetch
in source/detail/memory_manager.cpp
shows that PI_USM_MIGRATION_TBD0
is the only flag used.
void MemoryManager::prefetch_usm(void *Mem, QueueImplPtr Queue, size_t Length,
std::vector<RT::PiEvent> DepEvents,
RT::PiEvent &OutEvent) {
sycl::context Context = Queue->get_context();
if (Context.is_host()) {
// TODO: Potentially implement prefetch on the host.
} else {
const detail::plugin &Plugin = Queue->getPlugin();
Plugin.call<PiApiKind::piextUSMEnqueuePrefetch>(
Queue->getHandleRef(), Mem, Length, PI_USM_MIGRATION_TBD0,
DepEvents.size(), DepEvents.data(), &OutEvent);
}
}
Questions:
What is the purpose of the flag, is this a placeholder for future opencl features?
How should plugins adjust their behaviour based on its value?
Is it still neccessary to have asserts if a flag is set for CUDA and ROCm backends?
Proposal:
If the flag is currently not effecting the functions behaviour, remove asserts in CUDA and ROCm backends. or replacing with level_zero assertion.