-
Notifications
You must be signed in to change notification settings - Fork 784
[SYCL] Guard access to memory objects with a mutex #5231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -839,6 +839,10 @@ struct _pi_mem : _pi_object { | |||
size_t Size; | |||
}; | |||
|
|||
// Protects accesses to all the non-const member variables. Exclusive access | |||
// is required to modify any of these members. | |||
std::shared_mutex Mutex; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: I think eventually we may want to move the Mutex to base _pi_object as I expect every PI object needs this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to pi_object.
@@ -403,7 +403,6 @@ static bool PiDriverModuleProgramExtensionFound = false; | |||
// write (as it is OK for multiple threads to read the map without locking it). | |||
|
|||
pi_result _pi_mem::addMapping(void *MappedTo, size_t Offset, size_t Size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update comments in the header that the "this->Mutex" should be locked before calling this.
Maybe, in fact, inline these functions and remove these methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inlined these functions.
@@ -6001,6 +6008,10 @@ pi_result piEnqueueMemBufferCopy(pi_queue Queue, pi_mem SrcBuffer, | |||
pi_event *Event) { | |||
PI_ASSERT(SrcBuffer && DstBuffer, PI_INVALID_MEM_OBJECT); | |||
PI_ASSERT(Queue, PI_INVALID_QUEUE); | |||
|
|||
std::scoped_lock Lock(Queue->PiQueueMutex, SrcBuffer->Mutex, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "std::scoped_lock" will lock all mutex for exclusive use, which looks like unnecessary many times.
Can we adopt a more fine-grain approach, like this:
std::shared_lock L1(SrcBuffer->Mutex, std::defer_lock);
std::unique_lock L2(DstBuffer->Mutex, std::defer_lock);
std::unique_lock L3(Queue->Mutex, std::defer_lock);
std::scoped_lock lockall(L1, L2, L3);
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, thanks, fixed.
@@ -3595,13 +3593,15 @@ pi_result piMemGetInfo(pi_mem Mem, | |||
(void)ParamValueSize; | |||
(void)ParamValue; | |||
(void)ParamValueSizeRet; | |||
std::shared_lock Guard(Mem->Mutex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we even need this lock?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably you are right, removed for now. Can be added if necessary.
@@ -5762,6 +5770,7 @@ pi_result piEnqueueMemBufferRead(pi_queue Queue, pi_mem Src, | |||
PI_ASSERT(Src, PI_INVALID_MEM_OBJECT); | |||
PI_ASSERT(Queue, PI_INVALID_QUEUE); | |||
|
|||
std::scoped_lock Lock(Queue->PiQueueMutex, Src->Mutex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can only lock "Src" for shared use, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, fixed.
// TODO:: In the following 2 methods we may want to distinguish read access vs. | ||
// write (as it is OK for multiple threads to read the map without locking it). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
@@ -5805,8 +5792,6 @@ static pi_result enqueueMemCopyHelper(pi_command_type CommandType, | |||
pi_event *Event, bool PreferCopyEngine) { | |||
PI_ASSERT(Queue, PI_INVALID_QUEUE); | |||
PI_ASSERT(Event, PI_INVALID_EVENT); | |||
// Lock automatically releases when this goes out of scope. | |||
std::lock_guard<std::mutex> lock(Queue->PiQueueMutex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adjust the comment at line 5798
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adjusted.
@@ -5868,9 +5853,6 @@ static pi_result enqueueMemCopyRectHelper( | |||
PI_ASSERT(Region && SrcOrigin && DstOrigin && Queue, PI_INVALID_VALUE); | |||
PI_ASSERT(Event, PI_INVALID_EVENT); | |||
|
|||
// Lock automatically releases when this goes out of scope. | |||
std::lock_guard<std::mutex> lock(Queue->PiQueueMutex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adjust the comment at 5859
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adjusted.
std::shared_lock SrcLock(SrcBuffer->Mutex, std::defer_lock); | ||
std::unique_lock DstLock(DstBuffer->Mutex, std::defer_lock); | ||
std::unique_lock QueueLock(Queue->Mutex, std::defer_lock); | ||
std::scoped_lock LockAll(SrcLock, DstLock, QueueLock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is shorter, although I am not sure if it reads better:
std::shared_lock SrcLock(SrcBuffer->Mutex, std::defer_lock); | |
std::unique_lock DstLock(DstBuffer->Mutex, std::defer_lock); | |
std::unique_lock QueueLock(Queue->Mutex, std::defer_lock); | |
std::scoped_lock LockAll(SrcLock, DstLock, QueueLock); | |
std::shared_lock SrcLock(SrcBuffer->Mutex, std::defer_lock); | |
std::scoped_lock LockAll(SrcLock, DstBuffer->Mutex, Queue->Mutex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, used shorter version.
@@ -6317,7 +6337,7 @@ pi_result piEnqueueMemUnmap(pi_queue Queue, pi_mem MemObj, void *MappedPtr, | |||
|
|||
{ | |||
// Lock automatically releases when this goes out of scope. | |||
std::lock_guard<std::mutex> lock(Queue->PiQueueMutex); | |||
std::scoped_lock lock(Queue->Mutex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::scoped_lock lock(Queue->Mutex); | |
std::shared_lock lock(Queue->Mutex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, fixed.
@@ -6326,6 +6346,7 @@ pi_result piEnqueueMemUnmap(pi_queue Queue, pi_mem MemObj, void *MappedPtr, | |||
} | |||
} | |||
|
|||
std::shared_lock Guard(MemObj->Mutex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "memcpy" below potentially modifies the data of the MemObj, right?
std::shared_lock Guard(MemObj->Mutex); | |
std::scoped_lock Guard(MemObj->Mutex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, fixed.
@@ -206,6 +206,10 @@ struct _pi_object { | |||
// Level Zero doesn't do the reference counting, so we have to do. | |||
// Must be atomic to prevent data race when incrementing/decrementing. | |||
std::atomic<pi_uint32> RefCount; | |||
|
|||
// Protects accesses to all the non-const member variables. Exclusive access | |||
// is required to modify any of these members. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Please add examples here how to properly lock things (including multiple objects, and read/write), so that people use the same syntax more or less.
-
I think we should add a mode where single-threaded app doesn't pay the overhead of working with many mutexes. Please see how this can get organized. Maybe wrap the "std::shared_mutex" into a class that would redefine mutex operations to no-op under some env var.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added examples and single-threaded mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few more comments
// A single-threaded app has an opportunity to enable this mode to avoid | ||
// overhead from mutex locking. | ||
static const bool SingleThreadMode = [] { | ||
return std::getenv("SYCL_PI_LEVEL_ZERO_SINGLE_THREAD_MODE") != nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest that we distinguish "0" as disabling ST mode (it it convenient to switch between 0 and 1 when debugging)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just few comments about new env var
I was able to reproduce this failure by running make check-sycl-all multiple times, it can't be reproduced by running a single test or by single make check run. |
Please don't merge this PR yet, I am double checking SYCL :: Assert/assert_in_simultaneous_kernels.cpp failure. |
Actually, it looks like I can reproduce Assert/assert_in_simultaneous_kernels.cpp failure only with this PR. But it requires running make check-sycl-all many times. I have no idea how to analyze such type of failure. Please let me know if you have any ideas. It doesn't look to me that these changes themselves can cause a failure. More likely these changes change timing and this makes some existing problem to appear more often. I don't have an idea how to prove that. |
I suggest you try https://www.intel.com/content/www/us/en/developer/tools/oneapi/inspector.html to pinpoint the issue, |
Moved to a draft - drafts can't be merged. Artur, please, convert to a pull request when it's ready for merge. |
Fixed Assert/assert_in_simultaneous_kernels.cpp regression by this commit: 786dc4b |
@intel/dpcpp-doc-reviewers Could you please review doc changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc ok
This reverts commit 3321141.
No description provided.