This is a design issue that I introduced during the latest refactoring without realizing it.
If the sync/consensus service subscribes to the networking events and then freezes for a while, the networking service will at some point stop polling the networking state machine.
Unfortunately, no longer polling the networking state machine means that we also don't detect request responses.
If the sync/consensus service is frozen because it is awaiting a networking request, then nothing will ever wake it up and there's a deadlock.
Marking as low priority because the sync/consensus services are never supposed to await networking requests. Preventing this deadlock is important because it can be hard to reason about this kind of thing when writing new code, but in practice said deadlock can't happen right now.
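To make the failure mode concrete, here is a toy model of the problem, not the actual code. `Network`, `BuggyBackground` and `step` are hypothetical names, and a bounded `std::sync::mpsc::sync_channel` stands in for the subscription channel. Once the channel is full, the loop returns early and never reaches the state machine, so a response that is already available is never delivered.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{SyncSender, TrySendError};

/// Hypothetical stand-in for the networking state machine:
/// just a queue of request responses that are ready to be picked up.
struct Network {
    ready_responses: VecDeque<u32>,
}

/// Hypothetical model of the background task with the problematic design.
struct BuggyBackground {
    network: Network,
    /// Bounded channel towards the subscriber (the sync/consensus service).
    events_tx: SyncSender<&'static str>,
    /// Event that could not be sent yet because the channel was full.
    pending_event: Option<&'static str>,
    /// Responses that were detected and handed back to the requester.
    delivered: Vec<u32>,
}

impl BuggyBackground {
    /// One iteration of the background loop.
    /// Returns `false` if the iteration stalled on the pending event.
    fn step(&mut self) -> bool {
        if let Some(event) = self.pending_event.take() {
            match self.events_tx.try_send(event) {
                Ok(()) => {}
                Err(TrySendError::Full(event)) => {
                    // The subscriber is not reading. We give up here,
                    // which means the state machine below is not polled.
                    self.pending_event = Some(event);
                    return false;
                }
                // Subscriber is gone: drop the event.
                Err(TrySendError::Disconnected(_)) => {}
            }
        }

        // Only reached when no event is stuck: poll the state machine.
        if let Some(response) = self.network.ready_responses.pop_front() {
            self.delivered.push(response);
        }
        true
    }
}
```

If the subscriber is itself waiting for that response before reading the next event, neither side can make progress.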
Stop accepting new requests from the API user while sending a subscription event is in progress. This requires creating a separate frontend -> background channel dedicated to requests.
Thanks to this, the queue of subscription events to send is still bounded, but no deadlock is possible.
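A minimal sketch of that idea, under simplifying assumptions: `Background`, `step` and the field names are hypothetical, requests complete as soon as the network is polled, and in this toy model events are only produced when a request is accepted. The point is the ordering: the network is always polled, and the dedicated request channel is only read when no event is waiting to be sent.

```rust
use std::collections::VecDeque;
use std::sync::mpsc::{Receiver, SyncSender, TrySendError};

/// Hypothetical model of the background task with a dedicated request channel.
struct Background {
    /// Separate frontend -> background channel dedicated to requests.
    requests_rx: Receiver<u32>,
    /// Bounded channel towards the subscriber (the sync/consensus service).
    events_tx: SyncSender<String>,
    /// At most one event waiting to be sent, so the queue stays bounded.
    pending_event: Option<String>,
    /// Requests handed to the (toy) network and not yet answered.
    in_flight: VecDeque<u32>,
    /// Responses that were detected and handed back to the requester.
    responses: Vec<u32>,
}

impl Background {
    /// One iteration of the background loop.
    fn step(&mut self) {
        // 1. Always poll the network, even if an event is stuck.
        //    Assumption: every in-flight request completes immediately.
        while let Some(id) = self.in_flight.pop_front() {
            self.responses.push(id);
        }

        // 2. Try to flush the pending event without blocking.
        if let Some(event) = self.pending_event.take() {
            if let Err(TrySendError::Full(event)) = self.events_tx.try_send(event) {
                // Subscriber is not reading: keep the event for later.
                self.pending_event = Some(event);
            }
        }

        // 3. Accept a new request only if no event send is in progress.
        if self.pending_event.is_none() {
            if let Ok(id) = self.requests_rx.try_recv() {
                self.in_flight.push_back(id);
                // Assumption of this model: accepting a request produces an event.
                self.pending_event = Some(format!("request {} accepted", id));
            }
        }
    }
}
```

With a frozen subscriber, requests that were already accepted still get their responses, while further requests simply wait in their own channel until the pending event has been sent.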
cc #1483 (comment)