Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed logic to attempt reconnections to same broker #414

Merged
merged 1 commit into from
Dec 12, 2020

Conversation

merlimat
Copy link
Contributor

Motivation

There is a problem with the re-connection logic introduced in #157.

The change added a logic to keep retrying to establish a TCP connection with broker up to the "operation timeout" (default 30seconds).

There are few issues with it:

  1. (minor) It's not checking that the error is indeed a TCP error (eg: it would retry on auth failures too)
  2. (major) After a TCP connection failure, reconnecting to the same broker is always the wrong approach, because the most likely outcome is that the next attempt will also fail and, worse, the IP might just be unresponsive and we will then have to wait for the full connection timeout time.

The correct solution after a connection failure is to re-do the topic lookup, since the topic will be moving to a different broker and we need to reconnect to the new broker asap.

The only time we can do this connection retry logic is for requests that are not specific to a particular broker (eg: lookup operations). In this case a quick retry on a connection failure will probably land the request on a different, healthy, broker.

@merlimat merlimat added this to the 0.4.0 milestone Dec 11, 2020
@merlimat merlimat self-assigned this Dec 11, 2020
@wolfstudy wolfstudy merged commit 0296890 into apache:master Dec 12, 2020
@merlimat merlimat deleted the fix-reconnection-attempts branch February 5, 2021 23:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants