Open
Description
If a sender sets a timeout on a transport request and does not receive a response in time, then today we make no attempt to inform the receiver that we no longer care about its response. This is particularly bad for stats requests: they may be timing out on one broken node, yet they continue to pile up there because that node has no way to know that these requests are now irrelevant and should not be processed.
A couple of possible solutions spring to mind:
- When the sender times out, it could send a task cancellation request to the receiver.
- The sender could indicate the timeout to the receiver, which could then implement its own local timeout-and-cancel behaviour.
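
To make the two options concrete, here is a minimal sketch of the receiver side in Python. It is a toy model, not the actual transport layer: `Receiver`, `Request`, `enqueue`, `cancel`, and `drain` are all hypothetical names invented for illustration, and it assumes the sender passes a relative timeout so that the receiver can compute a deadline on its own clock without needing synchronised clocks.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Request:
    request_id: int
    action: str
    # Absolute deadline on the receiver's own clock; None means no timeout.
    deadline: Optional[float] = None


class Receiver:
    """Hypothetical receiver that can drop work the sender no longer wants."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._pending: Dict[int, Request] = {}  # dicts preserve insertion order
        self.processed: List[int] = []
        self.skipped: List[int] = []

    def enqueue(self, request_id: int, action: str,
                timeout: Optional[float] = None) -> None:
        # Second option: the sender indicates its timeout, and the receiver
        # turns it into a local deadline.
        deadline = None if timeout is None else self._clock() + timeout
        self._pending[request_id] = Request(request_id, action, deadline)

    def cancel(self, request_id: int) -> bool:
        # First option: the sender explicitly cancels after timing out.
        # Returns False if the request is unknown or already handled.
        request = self._pending.pop(request_id, None)
        if request is None:
            return False
        self.skipped.append(request_id)
        return True

    def drain(self) -> None:
        # Handle everything queued, skipping requests whose deadline passed.
        now = self._clock()
        for request_id in list(self._pending):
            request = self._pending.pop(request_id)
            if request.deadline is not None and now >= request.deadline:
                self.skipped.append(request_id)
            else:
                self.processed.append(request_id)
```

In this sketch the two options are complementary: the local deadline needs no extra message but only takes effect when the receiver next looks at the request, whereas the explicit cancellation requires the broken node to still be able to receive and handle the cancel message.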