-
Notifications
You must be signed in to change notification settings - Fork 769
[UR][CUDA][HIP] Allow adapter objects to block full adapter teardown. #17571
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
martygrant
merged 15 commits into
intel:sycl
from
aarongreig:aaron/reworkDestructorDependencies
Apr 10, 2025
Merged
[UR][CUDA][HIP] Allow adapter objects to block full adapter teardown. #17571
martygrant
merged 15 commits into
intel:sycl
from
aarongreig:aaron/reworkDestructorDependencies
Apr 10, 2025
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
In the cuda adapter the adapter struct itself is currently an extern global defined in adapter.cpp. This means fully tearing down the adapter is subject to the same destructor ordering as all other static and global variables, it's first in last out. This presents a problem because an application can declare a static sycl object like a buffer right up top before doing anything else, which results in the sycl object being destroyed after the cuda adapter struct. The UR spec doesn't put the onus on users to keep their parent object lifetimes in order, i.e. there is no statement about "the context you use to create a ur_mem_handle_t must not be released until after the mem_handle". It's assumed (by omission rather than explicitly) that adapters will have their objects keep a reference to any parent objects alive for the duration of their own lifetime. This change moves the cuda adapter structs ownership into a global shared_ptr, which allows child objects of the adapter to keep their own references to it alive past the point where its initial definition goes out of scope. Also adjusts how some other objects track parent object references so that the destructors correctly cascade back to the top: mem handle releases its context, which releases its adapter, which releases the platform + devices, etc. Fixes intel#17450
this is a wider problem with the adapters (and spec to some extent) which I plan to address, but I think it's fine for this to go through as a stand-alone patch for now |
npmiller
reviewed
Mar 21, 2025
againull
reviewed
Mar 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sycl changes LGTM
npmiller
approved these changes
Apr 9, 2025
npmiller
added a commit
to npmiller/llvm
that referenced
this pull request
Apr 9, 2025
`urDeviceRetain` and `urDeviceRelease` are no-ops, so this patch removes all uses. Uses of it in the context will be removed in intel#17571
againull
approved these changes
Apr 9, 2025
@intel/llvm-gatekeepers please merge |
kbenzie
pushed a commit
to oneapi-src/unified-runtime
that referenced
this pull request
Apr 10, 2025
In the cuda adapter the adapter struct itself is currently an extern global defined in adapter.cpp. This means fully tearing down the adapter is subject to the same destructor ordering as all other static and global variables, it's first in last out. This presents a problem because an application can declare a static sycl object like a buffer right up top before doing anything else, which results in the sycl object being destroyed after the cuda adapter struct. The UR spec doesn't put the onus on users to keep their parent object lifetimes in order, i.e. there is no statement about "the context you use to create a ur_mem_handle_t must not be released until after the mem_handle". It's assumed (by omission rather than explicitly) that adapters will have their objects keep a reference to any parent objects alive for the duration of their own lifetime. This change moves the cuda adapter structs ownership into a global shared_ptr, which allows child objects of the adapter to keep their own references to it alive past the point where its initial definition goes out of scope. Also adjusts how some other objects track parent object references so that the destructors correctly cascade back to the top: mem handle releases its context, which releases its adapter, which releases the platform + devices, etc. All of this also applies to the hip adapter, although it seems something in hip itself prevents this change from fixing the static_buffer_dtor test - see intel/llvm#17571 (comment) Fixes intel/llvm#17450
martygrant
pushed a commit
that referenced
this pull request
Apr 14, 2025
`urDeviceRetain` and `urDeviceRelease` are no-ops, so this patch removes all uses. Uses of it in the context will be removed in #17571
kbenzie
pushed a commit
to oneapi-src/unified-runtime
that referenced
this pull request
Apr 16, 2025
`urDeviceRetain` and `urDeviceRelease` are no-ops, so this patch removes all uses. Uses of it in the context will be removed in intel/llvm#17571
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
In the cuda adapter the adapter struct itself is currently an extern global defined in adapter.cpp. This means fully tearing down the adapter is subject to the same destructor ordering as all other static and global variables, it's first in last out. This presents a problem because an application can declare a static sycl object like a buffer right up top before doing anything else, which results in the sycl object being destroyed after the cuda adapter struct.
The UR spec doesn't put the onus on users to keep their parent object lifetimes in order, i.e. there is no statement about "the context you use to create a ur_mem_handle_t must not be released until after the mem_handle". It's assumed (by omission rather than explicitly) that adapters will have their objects keep a reference to any parent objects alive for the duration of their own lifetime.
This change moves the cuda adapter structs ownership into a global shared_ptr, which allows child objects of the adapter to keep their own references to it alive past the point where its initial definition goes out of scope. Also adjusts how some other objects track parent object references so that the destructors correctly cascade back to the top: mem handle releases its context, which releases its adapter, which releases the platform + devices, etc.
All of this also applies to the hip adapter, although it seems something in hip itself prevents this change from fixing the static_buffer_dtor test - see #17571 (comment)
Fixes #17450