Skip to content

[SYCL] enable_shared_from_this for kernel_bundle_impl #18899

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

Alexandr-Konovalov
Copy link
Contributor

This allow us to get rid of passing self to kernel_bundle_impl methods and to return raw pointer to kernel bundle instead of shared_ptr.

This allow us to get rid of passing self to kernel_bundle_impl methods
and to return raw pointer to kernel bundle instead of shared_ptr.
Comment on lines 619 to 620
std::shared_ptr<kernel_bundle_impl> Self =
const_cast<kernel_bundle_impl *>(this)->shared_from_this();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Temporary until tryGetSourceBasedKernel is updated, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I missed the idea. How is it related to tryGetSourceBasedKernel() and what is temporary?

@@ -938,13 +947,15 @@ class kernel_bundle_impl {
SelectedImage->get_ur_program_ref());

return std::make_shared<kernel_impl>(
Kernel, detail::getSyclObjImpl(MContext), SelectedImage, Self, ArgMask,
Kernel, detail::getSyclObjImpl(MContext), SelectedImage,
const_cast<kernel_bundle_impl *>(this)->shared_from_this(), ArgMask,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do

class kernel_bundle_impl : public std::enable_shared_from_this<kernel_bundle_impl> {
  using Base = std::enable_shared_from_this<kernel_bundle_impl>;
public
  std::shared_ptr<kernel_bundle_impl> shared_from_this() const {
    return const_cast<...>(this)->Base::shared_from_this();
  }
};

If compiler would complain about incomplete types, make it a template:

template <typename Self = kernel_bundle_impl>
... shared_from_this() const {
  Self *self = this;
  // use `self` instead of `this` to make template-type-dependent
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the idea is to have ugly cast in single place, right? Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually not sure if this or just dropping a bunch of consts is better. We're essentially lying here when we do a const_cast...

Comment on lines 1991 to 1993
#ifdef __INTEL_PREVIEW_BREAKING_CHANGES
detail::kernel_bundle_impl *KernelBundleImplPtr,
#else
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most things under source/detail could start using the raw-pointer version immediately, without this pre-processor dispatch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that's both more readable and faster.

Copy link
Contributor

@Bensuo Bensuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Graph-related changes LGTM!

@@ -1739,6 +1744,10 @@ class __SYCL_EXPORT handler {
/// called.
void setUserFacingNodeType(ext::oneapi::experimental::node_type Type);

#ifdef __INTEL_PREVIEW_BREAKING_CHANGES
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't this be done now? Same in the next file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, done.

@@ -619,7 +619,7 @@ class device_image_impl {

std::shared_ptr<kernel_impl> tryGetSourceBasedKernel(
std::string_view Name, const context &Context,
const std::shared_ptr<kernel_bundle_impl> &OwnerBundle,
std::shared_ptr<kernel_bundle_impl> &&OwnerBundle,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not raw pointer/reference instead? Is it because you want to keep the PR manageable in size and that subgraph of the codebase is left to a subsequent change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that's because I don't understand lifetime constrains for kernel. I mean, after switching to raw pointers we must guarantee that kernel created from kernel_bundle only used when kernel_bundle is alive.

sycl::kernel_bundle<sycl::bundle_state::executable> kb = /*...*/;
sycl::kernel k = kb.ext_oneapi_get_kernel("foo");

Is SYCL spec or sycl_ext_oneapi_kernel_compiler_spirv allow us to require this?

This constrain is important by kernel_bundle_impl::ext_oneapi_get_kernel(), for kernel_bundle_impl::tryGetKernel() there is no usage of kernel out out kernel_bundle scope. But having 2 verisons of device_image_impl::tryGetSourceBasedKernel() for shared_ptr and for raw pointers doesn't look attractive.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant is that you'll shared_from_this in a place where copy is created before this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant is that you'll shared_from_this in a place where copy is created before this change.

Yes, that's correct, previously shared_ptr was copied and now one is created with shared_from_this. And what do you suggest to do?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please find the updated version.

I don't like the idea to make a hole in the encapsulation, but we should definitely take into account that C++11 move semantics is a difficult subject.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should definitely take into account that C++11 move semantics is a difficult subject.

Exactly! No moves is just easier for everyone ;)

DevImgImpl->tryGetSourceBasedKernel(Name, MContext, Self,
DevImgImpl))
// move is performed only when SourceBasedKernel is not null
DevImgImpl->tryGetSourceBasedKernel(Name, MContext, std::move(Self),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And somewhat replying to my question above, I think it would make sense to modify tryGetSourceBasedKernel to accept raw reference as part of this PR. We'd be able to avoid the ugly Self that way.

Also, the current version is buggy - if we call these more than once during the loop then the second call is trying to std::move something that's already is in the "moved-out-from" state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the current version is buggy - if we call these more than once during the loop then the second call is trying to std::move something that's already is in the "moved-out-from" state.

Why? Self became empty only during tryGetSourceBasedKernel() call that return non-null. And when tryGetSourceBasedKernel() return non-null, there is no subsequent iteration. Probably I have to write more detailed comment.

Kernel, detail::getSyclObjImpl(MContext), SelectedImage, Self, ArgMask,
SelectedImage->get_ur_program_ref(), CacheMutex);
Kernel, detail::getSyclObjImpl(MContext), SelectedImage,
shared_from_this(), ArgMask, SelectedImage->get_ur_program_ref(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, kernel_impl should accept raw reference too, but I'm fine if that is left to a future change.

@@ -39,7 +39,7 @@ kernel_impl::kernel_impl(ur_kernel_handle_t Kernel, ContextImplPtr Context,

kernel_impl::kernel_impl(ur_kernel_handle_t Kernel, ContextImplPtr ContextImpl,
DeviceImageImplPtr DeviceImageImpl,
KernelBundleImplPtr KernelBundleImpl,
KernelBundleImplPtr &&KernelBundleImpl,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, since we need to make the change here anyway, then making it raw reference/pointer as part of this PR is preferrable.

// returns nullptr if Insert is false
detail::kernel_bundle_impl *
handler::getOrInsertHandlerKernelBundlePtr(bool Insert) const {
if (!impl->MKernelBundle && Insert) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think inverting this and making early return for Insert == false would be more readable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 507 to 508
std::shared_ptr<detail::kernel_bundle_impl> ExecKernelBundleImpPtr =
detail::getSyclObjImpl(ExecKernelBundle);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::shared_ptr<detail::kernel_bundle_impl> ExecKernelBundleImpPtr =
detail::getSyclObjImpl(ExecKernelBundle);
detail::kernel_bundle_impl &ExecKernelBundleImpPtr =
*detail::getSyclObjImpl(ExecKernelBundle);

I plan to update getSyclObjImpl to return a raw reference once I finish refactoring all the SYCL RT classes. Updating all the code as part of ongoing refactorings will make my later task of doing the change much easier.

If making this change would cause unnecessary complications, then it's fine to avoid it for now, but if the fix is simple and possible at this time, please do it.

Same below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I do not understand the lifetimes for the change you suggested. If ExecKernelBundleImpPtr became reference, why KernelBundleImpPtr is valid? ExecKernelBundle is out of scope,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both ExecKernelBundle and ExecKernelBundleImpPtr are in the same scope, and ExecKernelBundleImpPtr is the inner one, so it will be destroyed before the ExecKernelBundle. IIUC, it's not the ExecKernelBundleImpPtr that extends any lifetime, but a copy of it created when calling something with it. We can have an explicit shared_from_this (either here or at the callee) to make it clear where that lifetime extension happens.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not the ExecKernelBundleImpPtr that extends any lifetime, but a copy of it created when calling something with it.

Exactly! The lifetime is extended, because we pass ExecKernelBundleImpPtr to setHandlerKernelBundle(), where it's saved. But, if I understand you correctly, suggest to eliminate shared_ptr here and use only raw pointers. Then, only way to extend the scope is to pass to setHandlerKernelBundle() shared_from_this from the raw pointer. That is probably correct, but why is it better then the version with explicit shared_ptr?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because if we pass by raw ptr/ref, then we represent non-null property through type system. And if we pass everything by raw ptr/ref and avoid any passing of shared pointers (and only do shared_from_this when we do store a shared pointer to memory, like member data field), then it becomes impossible to accidentally create an unnecessary shared pointer copy.

If we do have some interfaces with raw ptr/rf and other with shared pointers, then we have two approaches and some people who are less experienced with the code base will make the wrong choice in new code, and will create unnecessary shared pointer copies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, explicit shared_from_this increases readability. Done.

Co-authored-by: aelovikov-intel <andrei.elovikov@intel.com>
@againull againull merged commit 74c7854 into intel:sycl Jun 17, 2025
37 of 39 checks passed
aelovikov-intel added a commit to aelovikov-intel/llvm that referenced this pull request Jun 26, 2025
…bundle_impl`

Follow-up for intel#18899.

Also adds proper `private_tag` argument for the ctor added in
intel#18949 that missed that.
aelovikov-intel added a commit to aelovikov-intel/llvm that referenced this pull request Jun 26, 2025
…bundle_impl`

Follow-up for intel#18899.

Also adds proper `private_tag` argument for the ctor added in
intel#18949 that missed that. Also pass
`span` by value while on it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants