-
Notifications
You must be signed in to change notification settings - Fork 435
Add support for retries #1708
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for retries #1708
Conversation
Motivation: The `ClientRPCExecutor` currently ignores retry and hedging policies. This change adds support for retries. Modifications: - Add a retry executor and wire it up to the client rpc executor - Add a few missing state transitions to the broadcasts sequence Result: RPC can be retried under certain conditions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - I think the comments make the logic pretty straightforward to follow. I wonder if it would make sense to break each task up in small methods to make it a bit more palatable, but I don't have strong opinions to be honest.
Sources/GRPCCore/Call/Client/Internal/ClientRPCExecutor+RetryExecutor.swift
Outdated
Show resolved
Hide resolved
Sources/GRPCCore/Call/Client/Internal/ClientRPCExecutor+RetryExecutor.swift
Outdated
Show resolved
Hide resolved
| method: MethodDescriptor, | ||
| responseHandler: @Sendable @escaping (ClientResponse.Stream<Output>) async throws -> R | ||
| ) async throws -> R { | ||
| // There's quite a lot going on here... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the comments on this method, it makes things a lot easier to follow.
| _ id: _BroadcastSequenceStateMachine<Element>.Subscriptions.ID | ||
| ) -> OnCancelSubscription { | ||
| let (removed, continuation) = self.subscriptions.removeSubscriber(withID: id) | ||
| assert(removed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we removing this because it would sometimes fail or just because it seemed unnecessary to assert it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me. Just two minor comments
| if let timeout = self.timeout { | ||
| group.addTask { | ||
| let result = await Result { | ||
| try await Task.sleep(for: timeout, clock: .continuous) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about iOS use-cases here this might not be what people expect. If you execute a request and the user backgrounds your app you might still be interested in the result later but it might have timedout on the continuous clocked whereas it would still be okay on the suspending one.
It all depends on how the underlying transport handles this of course
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In gRPC timeouts are propagated to the server (we don't do that here, yet) which will also cancel the RPC so I'm not it makes sense to make the change. We also can't do background networking anyway so if you background the app your RPC is toast anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also can't do background networking anyway so if you background the app your RPC is toast anyway.
Is that really true? If you were to build a transport on top of URLSession one could do background networking with the right entitlements.
| let result = await Result { | ||
| // Check for cancellation; the RPC may have timed out in which case we should skip | ||
| // the response handler. | ||
| try Task.checkCancellation() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a bit odd to me. Why do we have to explicitly check here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because nothing above is throwing: in the case of cancellation we will have a response of some kind (it may be failed already, and if it is failed the failure is constrained to RPCError). This is our last chance to jump into a CancellationError before reaching into user code.
Co-authored-by: Gustavo Cairo <me@gustavocairo.com>
Motivation:
The
ClientRPCExecutorcurrently ignores retry and hedging policies. This change adds support for retries.Modifications:
Result:
RPC can be retried under certain conditions