-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL] Removed mutex leading to deadlock #889
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
constexpr size_t threadCount = 4; | ||
std::size_t data[size]; | ||
TestDeadLock<std::size_t, 1> Task(data, size); | ||
Task.execute(threadCount); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suppose a deadlock happens.
Will the test framework deal with it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, but I guess the task might be killed by lit infrastructure due to execution timeout.
Description contains to many details not really related to the issue.
wait on step 3 blocks the mutex, but waits because for host accessor A1 to be destructed. |
The deadlock appeared under following circumstances: 1) thread1: adds nodes to the graph for host accessor A1 to the buffer B; 2) thread2: adds nodes to the graph for host accessor A2 to the buffer B; 3) thread2: waits for host accessor A2 nodes to complete; 4) thread1: waits for host accessor A1 nodes to complete. On step 3 thread2 locks a mutex in `Scheduler::waitForEvent` and waits for destruction of host accessor A1. Actions on step 4 cannot be completed because thread1 waits for the mutex to be unlocked. Signed-off-by: Ivan Karachun <ivan.karachun@intel.com>
e93584e
to
5326535
Compare
Commit message was updated. |
Signed-off-by: Ivan Karachun <ivan.karachun@intel.com>
5326535
to
09c03e1
Compare
Signed-off-by: Ivan Karachun <ivan.karachun@intel.com>
Signed-off-by: Ivan Karachun <ivan.karachun@intel.com>
080a729
to
16d0b02
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
||
template <typename Func, typename... Args> | ||
void enqueue(Func func, Args... args) { | ||
MThreadPool.push_back(std::thread(func, args...)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor
MThreadPool.push_back(std::thread(func, args...)); | |
MThreadPool.emplace_back(std::forward(func), std::forward(args)...); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applied
18da51d
to
b55d511
Compare
Imagine the situation, when there is a global buffer, and several threads
trying to write some data in it. Each thread creates a host accessor to the
buffer and writes its id into the memory.
When a host accessor is created, SYCL RT creates two commands:
UpdateHostRequirement and EmptyCommand.
So SYCL execution graph will look like this:
... -> [EmpCmd T2] -> [UpdReq T2] -> [EmpCmd T1] -> [UpdReq T1] -> [AllocaBuf]
An EmptyCommand acts like mutex: it blocks all operations execution which were
created after host accessor creation while host accessors is "alive".
Let's assume that thread #2 first started an execution by enqueueing
UpdReq T2
command. But the command cannot be started since its execution is blocked by
EmpCmd T1
. Since thread #2 locked the mutex inScheduler::waitForEvent
,thread #1 hangs on this mutex, which leads to deadlock.
Signed-off-by: Ivan Karachun ivan.karachun@intel.com