Rewrite FailureDetector interface and implementations to also work with the multi-stage engine#15005
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #15005 +/- ##
============================================
+ Coverage 61.75% 63.31% +1.56%
- Complexity 207 1483 +1276
============================================
Files 2436 2731 +295
Lines 133233 153622 +20389
Branches 20636 23730 +3094
============================================
+ Hits 82274 97267 +14993
- Misses 44911 49002 +4091
- Partials 6048 7353 +1305
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
| * <p> | ||
| * This class doesn't currently implement any additional logic over BaseExponentialBackoffRetryFailureDetector and is | ||
| * retained for backward compatibility. |
There was a problem hiding this comment.
On second thought we could probably remove this class altogether since the class name is not exposed via the user configuration directly.
67f10b0 to
e8c1cfa
Compare
…th the multi-stage engine
e8c1cfa to
46616f0
Compare
46616f0 to
a8aabd3
Compare
| /** | ||
| * Check if a server that was previously detected as unhealthy is now healthy. | ||
| */ | ||
| public boolean retryUnhealthyServer(String instanceId) { |
There was a problem hiding this comment.
Oops, my bad, left over from a previous iteration with a different rewrite implementation. I've update it to private.
| for (Function<String, Boolean> unhealthyServerRetrier : _unhealthyServerRetriers) { | ||
| if (!unhealthyServerRetrier.apply(instanceId)) { | ||
| recovered = false; |
There was a problem hiding this comment.
I was just going to comment about this section when it was written the other way around, but it looks like you just changed it!
There was a problem hiding this comment.
Yeah with the previous implementation, it was possible for multiple listeners to mark the same server as healthy on a call to retryUnhealthyServer. I've updated the implementation so that the failure detector itself marks the server as healthy if all the listeners / retriers report that the server is healthy again (to be safe).
...ker/src/main/java/org/apache/pinot/broker/requesthandler/MultiStageBrokerRequestHandler.java
Outdated
Show resolved
Hide resolved
…th the multi-stage engine (apache#15005)
…th the multi-stage engine (apache#15005)
FailureDetectorinterface and implementations so that it works with both the engines.QueryRouter(v1) orQueryDispatcher(v2). In theFailureDetectorimplementation that retries the connection with an exponential backoff, we'll only re-add a server to the routing table if both the connections succeed - the Netty channel used for v1 queries as well as the gRPC channel used for v2 queries.FailureDetectorand listener based interface has been rewritten significantly in order to avoid multiple attempts to modify the broker routing table whenever a server is detected as healthy / unhealthy.GrpcBrokerRequestHandlerto make use of theFailureDetector.