-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL][PI] Removal of PI_CALL family of macros & implementation of Plugin_impl class #1030
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][PI] Removal of PI_CALL family of macros & implementation of Plugin_impl class #1030
Conversation
@smaslov-intel, @kbobrovs, it looks like implementation of plug-in interface matured enough to fill some of design unknowns. Could you update plug-in documentation and replace TBDs with the implemented design, please? |
@@ -78,6 +82,7 @@ class KernelProgramCache { | |||
|
|||
ProgramCacheT MCachedPrograms; | |||
KernelCacheT MKernelsPerProgramCache; | |||
PlatformImplPtr MPlatform; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems very strange to have platform object in the cache.
@s-kanaev Do you know if we can do better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand cache only requires plugin information, not the whole platform. I'd prefer to keep plugin reference/pointer instead of platform.
Besides., is it even possible to retrieve plugin information out of PI kernel/program object/pointer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems very strange to have platform object in the cache.
@s-kanaev Do you know if we can do better?
plugin_impl is only kept in Platform object.
To access it, platform_impl has been included here. The platform_impl here is from the context_impl to which the cache belongs.
Will it be better to include a reference to the parent Context? I wasn't able to achieve it in the code. So i decided to pass the platform the context belongs to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand cache only requires plugin information, not the whole platform. I'd prefer to keep plugin reference/pointer instead of platform.
Yes, only the plugin information is needed. The overall design takes care that only platform_impl owns a shared_ptr to the plugin, and others get a reference to the plugin only through platform_impl class. Storing a reference to plugin_impl has not been done anywhere. So I refrained from storing it here.
Besides., is it even possible to retrieve plugin information out of PI kernel/program object/pointer?
No we cannot access the plugin_information through any PI class objects. Atleast not with the current design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it be better to include a reference to the parent Context? I wasn't able to achieve it in the code. So i decided to pass the platform the context belongs to.
This may be quite reasonable.
One may store a reference to parent context as well as a mere pointer. Besides, pointer will be wrapped with weak_ptr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm.. Let me try.. I'll update the code.
Thank you for pointing out the weak_ptr here. I didn't know about the circular dependency issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have included the ptr instead of weak_ptr.
@smaslov-intel Please review. |
@@ -78,6 +82,7 @@ class KernelProgramCache { | |||
|
|||
ProgramCacheT MCachedPrograms; | |||
KernelCacheT MKernelsPerProgramCache; | |||
PlatformImplPtr MPlatform; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand cache only requires plugin information, not the whole platform. I'd prefer to keep plugin reference/pointer instead of platform.
Besides., is it even possible to retrieve plugin information out of PI kernel/program object/pointer?
@@ -78,6 +82,7 @@ class KernelProgramCache { | |||
|
|||
ProgramCacheT MCachedPrograms; | |||
KernelCacheT MKernelsPerProgramCache; | |||
PlatformImplPtr MPlatform; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to have multiple plugins/platforms during the same runtime? Can we make plugin a singleton?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently no. But in the future yes. The design enables you to connect to multiple Plugins and have multiple platforms from different plugins at the same time.
We have a GlobalPlugin which stores one plugin in use,it is currently used for OpenCL Interoperability constructors only. It will be a bad design to use it here, because we would like to delete that GlobalPlugin in the future.
@bjoernknafla @keryell |
@@ -78,6 +82,7 @@ class KernelProgramCache { | |||
|
|||
ProgramCacheT MCachedPrograms; | |||
KernelCacheT MKernelsPerProgramCache; | |||
PlatformImplPtr MPlatform; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems very strange to have platform object in the cache.
@s-kanaev Do you know if we can do better?
plugin_impl is only kept in Platform object.
To access it, platform_impl has been included here. The platform_impl here is from the context_impl to which the cache belongs.
Will it be better to include a reference to the parent Context? I wasn't able to achieve it in the code. So i decided to pass the platform the context belongs to.
@@ -78,6 +82,7 @@ class KernelProgramCache { | |||
|
|||
ProgramCacheT MCachedPrograms; | |||
KernelCacheT MKernelsPerProgramCache; | |||
PlatformImplPtr MPlatform; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand cache only requires plugin information, not the whole platform. I'd prefer to keep plugin reference/pointer instead of platform.
Yes, only the plugin information is needed. The overall design takes care that only platform_impl owns a shared_ptr to the plugin, and others get a reference to the plugin only through platform_impl class. Storing a reference to plugin_impl has not been done anywhere. So I refrained from storing it here.
Besides., is it even possible to retrieve plugin information out of PI kernel/program object/pointer?
No we cannot access the plugin_information through any PI class objects. Atleast not with the current design.
@@ -78,6 +82,7 @@ class KernelProgramCache { | |||
|
|||
ProgramCacheT MCachedPrograms; | |||
KernelCacheT MKernelsPerProgramCache; | |||
PlatformImplPtr MPlatform; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently no. But in the future yes. The design enables you to connect to multiple Plugins and have multiple platforms from different plugins at the same time.
We have a GlobalPlugin which stores one plugin in use,it is currently used for OpenCL Interoperability constructors only. It will be a bad design to use it here, because we would like to delete that GlobalPlugin in the future.
PI_CALL(piMemGetInfo)(detail::pi::cast<detail::RT::PiMem>(MemObject), | ||
CL_MEM_SIZE, sizeof(size_t), &BufSize, nullptr); | ||
auto Plugin = detail::getSyclObjImpl(SyclContext)->getPlugin(); | ||
Plugin.call<detail::PiApiKind::piMemGetInfo>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about adding some free functions to avoid explicitly extracting PI Plugin every time? Something like:
template <detail::PiApiKind API, typename... Args>
pi_call(const context &SYCLContext, Args...) {
auto Plugin = detail::getSYCLObjImpl(SYCLContext)->getPlugin();
Plugin.call<API>(Args...);
}
// overloads for device, event and other SYCL RT classes
// On call site:
pi_call<detail::PiApiKind::piMemGetInfo>(SyclContext, /* rest of the args */);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We wanted to avoid making such calls that takes an extra argument beyond the arguments of the function call.
It should look like a function call.
Also, what is the benefit over the existing call format?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlexeySachkov any comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, what is the benefit over the existing call format?
In most cases in this PR, call to a plugin performed only once, so suggested format can save you one line at most of call sites.
We wanted to avoid making such calls that takes an extra argument beyond the arguments of the function call.
It should look like a function call.
I'm not pushing towards my suggestion, this was just an idea. If the intent was to make this call as close as possible to actual call which will be performed under the hood, then let it be. I don't have objections against it
static std::vector<RT::PiEvent> | ||
getPiEvents(const std::vector<EventImplPtr> &EventImpls) { | ||
std::vector<RT::PiEvent> RetPiEvents; | ||
for (auto &EventImpl : EventImpls) | ||
RetPiEvents.push_back(EventImpl->getHandleRef()); | ||
return RetPiEvents; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can replace this with std::transform
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this file, the functionality is called at multiple locations..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this file, the functionality is called at multiple locations..
I will leave it up to you, then. Not sure that inserting std::transform
into this helper will help a lot, so, you can leave it as is
RT::PiResult Err = call_nocheck<PiApiOffset>(Args...); | ||
checkPiResult(Err); | ||
} | ||
// TODO: Make this private. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please resolve this TODO. Briefly looking at the PR, there should be no problems with that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@garimagu Please, resolve this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, just a few comment on comments.
Agree. @garimagu , would you please own that? I think this can go in a separate PR. |
@garimagu |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@smaslov-intel, @kbobrovs, it looks like implementation of plug-in interface matured enough to fill some of design unknowns. Could you update plug-in documentation and replace TBDs with the implemented design, please?
Agree. @garimagu , would you please own that? I think this can go in a separate PR.
Sure. I can update the design doc in PR #680 .
PI_CALL(piMemGetInfo)(detail::pi::cast<detail::RT::PiMem>(MemObject), | ||
CL_MEM_SIZE, sizeof(size_t), &BufSize, nullptr); | ||
auto Plugin = detail::getSyclObjImpl(SyclContext)->getPlugin(); | ||
Plugin.call<detail::PiApiKind::piMemGetInfo>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We wanted to avoid making such calls that takes an extra argument beyond the arguments of the function call.
It should look like a function call.
Also, what is the benefit over the existing call format?
@@ -78,6 +82,7 @@ class KernelProgramCache { | |||
|
|||
ProgramCacheT MCachedPrograms; | |||
KernelCacheT MKernelsPerProgramCache; | |||
PlatformImplPtr MPlatform; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have included the ptr instead of weak_ptr.
static std::vector<RT::PiEvent> | ||
getPiEvents(const std::vector<EventImplPtr> &EventImpls) { | ||
std::vector<RT::PiEvent> RetPiEvents; | ||
for (auto &EventImpl : EventImpls) | ||
RetPiEvents.push_back(EventImpl->getHandleRef()); | ||
return RetPiEvents; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this file, the functionality is called at multiple locations..
74d3112
to
9cc8977
Compare
Yes. I have resolved the ones I feel were clear enough to be resolved. Please feel free to point out anything I may have missed. |
PI_CALL(piEnqueueEventsWait)(Queue->getHandleRef(), RawEvents.size(), | ||
&RawEvents[0], &Event); | ||
std::vector<RT::PiEvent> RawEvents = getPiEvents(EventImpls); | ||
auto Plugin = Queue->getPlugin(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor. Why can't these lines be unified with lines 148 and 149 and moved from if?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should remove the getPiEvents() call out of the if condition.
But the Plugin needs to be taken from EventImpl and queue in the true and false path respectively.
sycl/include/CL/sycl/buffer.hpp
Outdated
@@ -181,8 +181,10 @@ class buffer { | |||
: Range{0} { | |||
|
|||
size_t BufSize = 0; | |||
PI_CALL(piMemGetInfo)(detail::pi::cast<detail::RT::PiMem>(MemObject), | |||
CL_MEM_SIZE, sizeof(size_t), &BufSize, nullptr); | |||
auto Plugin = detail::getSyclObjImpl(SyclContext)->getPlugin(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, do not use auto in such cases
See: https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm.. Are you asking me to remove auto keyword since the namespace is different and its an implementations specific class in sycl::buffer class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I'm asking you to do this because the type which getPlugin returns is not clear from the context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please apply to the whole patch. Some of the places where I see similar problem:
3 image_impl.hpp : 237, 355
4 queue_impl.hpp: 91
5 function_pointer.hpp: 83
6 device_impl.hpp: 60, 79, 113
7 kernel_program_cache.cpp: 34, 39
8 memory_manager.cpp: 28, 62, 105, 126, 145, 196, 231, 289, 314, 420, 476, 493, 508, 524, 539
9 platform_impl.cpp: 38, 214
10 program_impl.cpp: 71, 85, 147, 157, 195, 222, 261, 284, 292,
11 ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed the auto keyword from all locations.
I think we are mostly done here. Just a couple of comments. |
PI_CALL(piEnqueueEventsWait)(Queue->getHandleRef(), RawEvents.size(), | ||
&RawEvents[0], &Event); | ||
std::vector<RT::PiEvent> RawEvents = getPiEvents(EventImpls); | ||
auto Plugin = Queue->getPlugin(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should remove the getPiEvents() call out of the if condition.
But the Plugin needs to be taken from EventImpl and queue in the true and false path respectively.
sycl/include/CL/sycl/buffer.hpp
Outdated
@@ -181,8 +181,10 @@ class buffer { | |||
: Range{0} { | |||
|
|||
size_t BufSize = 0; | |||
PI_CALL(piMemGetInfo)(detail::pi::cast<detail::RT::PiMem>(MemObject), | |||
CL_MEM_SIZE, sizeof(size_t), &BufSize, nullptr); | |||
auto Plugin = detail::getSyclObjImpl(SyclContext)->getPlugin(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm.. Are you asking me to remove auto keyword since the namespace is different and its an implementations specific class in sycl::buffer class?
8456386
to
ff0911a
Compare
1901f8f
to
b602782
Compare
@bader . Is it possible to merge this first? This patch is ready to be merged and shouldn't be kept hanging for long. There are rebase conflicts everyday in this PR which require manual changes. |
@@ -81,6 +83,7 @@ class SYCLMemObjT : public SYCLMemObjI { | |||
|
|||
virtual ~SYCLMemObjT() = default; | |||
|
|||
const plugin &getPlugin() const { return MInteropContext->getPlugin(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you use the method? If no I suggest removing it.
The problem here is that MInteropContext can be/usually it's nullptr. It's not null when user creates a memory object using interop constructor only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot remove it, since it is used in sycl_mem_obj_t.hpp.
I have added an assert check on the pointer.
…ugin_Impl class. The changeset removes the family of PI_CALL macros, and replaces them with call API in a new plugin_impl class. The object of this class is stored in Platform class. When multiple platforms are associated with the same Plugin, they will have shared_ptr to the same plugin_impl object. Eg: An OpenCLPlugin can have multiple platforms attached to it. To effectively use this change, following additional fetaures/changes were added: - To make access to the plugin_impl class easy, getPlugin() API has been introduced to some *_impl classes, that eventually call the platform_impl->getPlugin() API. It has been included in context_impl, event_impl, kernel_impl, platform_impl, program_impl, queue_impl. Is not present in accessor_impl(Does not call PI), buffer_impl (Needs only in the constructor where context is passed), sampler_impl (Does not have a single context), stream_impl (Does not call PI), usm_impl (Has no member sycl_impl objects.) The API returns const reference to plugin_impl class. - Use of PiApiKind enum to uniquely identify the PI API. - PI API calls are done using : plugin_impl.call, plugin_impl.call_nocheck, plugin_impl.checkPiResult<Exception> - GlobalPlugin is a SharedPtr to a globally available Plugin for the use with Interoperability Constructors. It is used as a shared_ptr to avoid making copies of it. - Changes in *_info structs and get() methods. They now take a const plugin_impl& argument. - To make Pi Api calls by host to check if certain events are finished on a device, the plugin_impl class is needed. At such places, event_impl is passed instead of a PiEvent. Eg: memory_manager.cpp - The use of Global variable has been restricted to only the GlobalPlugin now. Signed-off-by: Garima Gupta <garima.gupta@intel.com>
Signed-off-by: Garima Gupta <garima.gupta@intel.com>
Signed-off-by: Garima Gupta <garima.gupta@intel.com>
memory leaks. The constructor using PiEvent is like a copy constructor earlier was doing a retain+release. We only need a release when destructor is called. Signed-off-by: Garima Gupta <garima.gupta@intel.com>
Signed-off-by: Garima Gupta <garima.gupta@intel.com>
kenrel_program_cache. Signed-off-by: Garima Gupta <garima.gupta@intel.com>
Removal of platform_impl from kernel_program_cache and addition of its parent (object of context_impl keeping teh instance of kernel_program_cache) context_impl* instead. Signed-off-by: Garima Gupta <garima.gupta@intel.com>
Signed-off-by: Garima Gupta <garima.gupta@intel.com>
Signed-off-by: Garima Gupta <garima.gupta@intel.com>
Signed-off-by: Garima Gupta <garima.gupta@intel.com>
Signed-off-by: Garima Gupta <garima.gupta@intel.com>
Signed-off-by: Garima Gupta <garima.gupta@intel.com>
Changed auto Plugin to const detail::plugin &Plugin Signed-off-by: Garima Gupta <garima.gupta@intel.com>
Signed-off-by: Garima Gupta <garima.gupta@intel.com>
Addition of a suggested change. Signed-off-by: Garima Gupta <garima.gupta@intel.com>
b602782
to
4bea0ee
Compare
@romanovvlad Please let me know of any pending issues you see. I will resolve all the conflicts and do one final rebase of the workspace. |
@Ruyk, @Alexander-Johnston are you okay if #1030 merged first? |
Yeah thats fine, we are still working on rebasing our PR on top of the latest changes |
Running tests with clang-cl would set the proper object extension, but clang and clang++ on Windows would use the wrong .o extension.
The changeset removes the family of PI_CALL macros, and replaces them with
call API in a new plugin_impl class. The object of this class is stored in
Platform class. When multiple platforms are associated with the same Plugin,
they will have shared_ptr to the same plugin_impl object. Eg: An OpenCLPlugin
can have multiple platforms attached to it.
To effectively use this change, following additional fetaures/changes
were added:
introduced to some *_impl classes, that eventually call the
platform_impl->getPlugin() API. It has been included in context_impl,
event_impl, kernel_impl, platform_impl, program_impl, queue_impl. Is not
present in accessor_impl(Does not call PI), buffer_impl (Needs only in the
constructor where context is passed), sampler_impl (Does not have a single
context), stream_impl (Does not call PI), usm_impl (Has no member sycl_impl
objects.) The API returns const reference to plugin_impl class.
plugin_impl.checkPiResult
Interoperability Constructors. It is used as a shared_ptr to avoid making
copies of it.
plugin_impl& argument.
device, the plugin_impl class is needed. At such places, event_impl is passed
instead of a PiEvent. Eg: memory_manager.cpp
Signed-off-by: Garima Gupta garima.gupta@intel.com
NOTE TO REVIEWERS:
This patch does not take care of the following.
*To add the config file mechanism to find attached Plugins, as has been mentioned in [SYCL][DOC] Update the SYCL Runtime Interface document to give some design updates #680
*To add better tracing and debugging.
*Format the code in Doxygen friendly-way and move the implementation into cpp files if possible.