[8.0] Add a config flag to experiment with some work item prioritization changes in cases involving a lot of sync-over-async #103984
+178
−8
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Port of #103983 to 8.0
Some services that use a lot of sync-over-async were seen to experience stalls due to some priority inversion issues in work items that get queued to the thread pool. For instance, a work item W1 queues another work item W2 to the global queue and blocks waiting for a task to complete, where W2 would need to run in order to complete the task, but W2 is queued behind a number of other work items that operate like W1, and this sometimes leads to long-duration stalls. This change adds an experimental config option that when enabled, enqueues some kinds of work items to a new low-priority global queue that is checked after all other global queues. This was seen to help in some cases. The change is limited to CoreCLR for experimentation.
Customer Impact
Some 1p services that use a lot of sync-over-async are experiencing stalls due to some priority inversion issues in work items that get queued to the thread pool. In some situations, the services stall for a long time until a large number of worker threads are injected, which can take a long time. Due to having to handle existing synchronous APIs using async-only libraries, the sync-over-async is not easy to avoid. Increasing the rate of thread injection through config was not desirable due to other performance issues with high worker thread counts.
Regression?
No
Testing
Validated on a reduced test case provided by a customer.
Risk
Low:
However: