Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Collection of fixes for various shutdown scenario race conditions #266
Collection of fixes for various shutdown scenario race conditions #266
Changes from 1 commit
6d8e717
fe20129
d8e4a89
f3ba622
af12ea6
1d569fd
c226aab
d0639f1
be59a73
315a21f
49e5ee5
772b4ae
d1f9d00
91367b0
38b22f1
2934cd2
bc16fb6
5f7013d
75646f3
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't m_itemInProgress be protected by a mutex, since it gets set on the worker thread, but gets checked in
Cancel
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding of the logic, is that it should be something like:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll write a separate note on why it's fine :) I was again thinking that you may be right, as it looks racy, but in fact it's not. I'd like to avoid unnecessarily locking here since it might impact perf within the worker thread queue. Deletion of items is done atomically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right --- there was a corner case where we did not wait if item is already popped for execution, but m_itemInProgress hasn't been assigned yet. I made this part of code thread-safe in the next commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this lockguard here? It makes me nervous to hold a lock while sleeping. Won't this also prevent the workerthread from getting the shutdown item off the queue until this completes (probably not a big deal, but a little scary in case that code changes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first I thought that you're correct. But on a second thought, in fact, I think it's probably a good thing. Because it totally eliminates my other concern about another task assigned with the same raw ptr:
Task is already executing in another thread here:
That section of code does not hold the lock. When we're done, we assigned nullptr to the item in progress. Cool. That unblocks the waiter.
Since our idle-waiter holds the lock - we prevent the worker thread to schedule a new item, because in order to schedule another item - it has to acquire the lock. This will cause up to 50ms hiccup to the scheduler, but since cancellation itself is a very infrequent operation and the worker is in a background thread - it won't affect performance in any way.
Since I hope we never cancel the shutdown item :D :D :D ... we should be fine. Yeah, it'd prevent the shutdown item to get scheduled if we're waiting on a cancellation of a current task preceding the shutdown item.
Let me know if it makes sense. If it doesn't, please let me know if I'm mistaken. I'm honestly willing to adjust this if this doesn't seem right to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reworked this slightly, but the comment remains the same - I kept the lock for the duration of idle-wait, to prevent another item from being scheduled. That is fully addressing my previous reliability concern - what if a new task gets allocated with the same thread ptr. Now we won't run into that condition at all. This may result in a hiccup of up to 50ms for the worker thread scheduling, but only in case when we're currently executing work item that's being cancelled. In other cases we do not block. My logical reasoning here:
Thus, this hypothetical 50ms 'excessive' wait on task cancellation is not impacting perf in any measurable way.