-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge linkedin/go-zk changes upstream #122
base: master
Are you sure you want to change the base?
Conversation
* Fix the runtime of TestChildWatch and TestSetWatchers TestChildWatch was introduced to test what happens when the response is too large for the default allocated buffer size. It used to create 10k nodes which could take a very significant amount of time. The same effect can be achieved with far fewer nodes, significantly speeding up the test's runtime. The test now runs in 5s on my machine instead of sometimes multiple minutes... TestSetWatchers tested something similar, except that it checks that the outgoing setWatchers packet is broken up into multiple packets when it's too large. Using a similar trick we can generate names of specific lengths to test that the behavior is correct. It was also flaky because if your local ZK deployment is a little slow, deleting all the nodes can take longer than the session timeout, spuriously failing the test. This has also been fixed, and the test now runs in a little over 5 seconds as well, instead of failing. Finally, standardize the ZK server version checking to be a bit more flexible and friendlier towards future versions of ZooKeeper (note: the original implementation doesn't even work because the env variable name is incorrect... It `ZK_VERSION`, not `zk_version`) * Support persistent watches Implements the new persistent watch types introduced in 3.6, along with some utilities that are critical when implementing local caches. * Add metrics receiver and fix persistent watch EventQueue * Add error to RequestCompleted
…tchers (#2) Like the comment on Event.Zxid mentions, watch events now include the zxid as of ZK 3.9.0. Additionally, relaying pings to persistent watchers allows them to keep track of whether they are behind on consuming updates from ZK, particularly since the zxid is relayed.
This field can be used to track the end-to-end time in between receiving the event and actually processing it. Since events end up in a queue, it's hard to tell how far behind the consumer on the EventQueue is, and this timestamp can be used to track that.
This data structure has many other applications, especially when dealing with ZK.
* Upgrade HostProvider This change fixes the behavior of DNSHostProvider where it does not refresh its cached IP addresses that it resolves once on startup for the configured ZK servers. This new behavior more closely matches the Java client's behavior by randomly selecting an address after resolving the host. It slightly changes the semantics of `HostProvider` with an off-by-one, otherwise the `connect` loop could end up in a situation where it attempts to connect to a stale address. This is fixed by moving the backoff to _before_ getting the address, rather than _after_. * Bump linter version to support generics * Fix linter and integration test actions * Add docs
* Improve logging and error handling This overhauls the logging and error responses from the *Conn. Namely, it changes the logging backend to `log/slog` which makes it easier to manage the library's logging centrally (however the old functionally to provide a custom logger is retained). It also captures the reason why a specific connection to a ZK server was closed in the error that is propagated to all callers. This can be checked with `errors.Is`. This also adds an exponential backoff feature to the client, which will make it back off when reconnecting to a ZK server. * Remove logger option, it messes with the logging * Change default behavior of DNSHostProvider * Split implementation of HostProvider, fix up tests
This test was broken by the new error format returned by the connection
When adding a port to an IPv6 address, it must be wrapped with [...].
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #122 +/- ##
==========================================
- Coverage 75.94% 75.00% -0.95%
==========================================
Files 7 10 +3
Lines 1193 1452 +259
==========================================
+ Hits 906 1089 +183
- Misses 196 252 +56
- Partials 91 111 +20 ☔ View full report in Codecov by Sentry. |
@@ -69,7 +103,7 @@ func StartTestCluster(t *testing.T, size int, stdout, stderr io.Writer) (*TestCl | |||
|
|||
for serverN := 0; serverN < size; serverN++ { | |||
srvPath := filepath.Join(tmpPath, fmt.Sprintf("srv%d", serverN+1)) | |||
requireNoError(t, os.Mkdir(srvPath, 0700), "failed to make server path") | |||
requireNoErrorf(t, os.Mkdir(srvPath, 0700), "failed to make server path") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we pluck these changes into a separate diff since they do not change any substance to the code. Just trying to remove lines to review as change vs not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack makes sense! As i do that, any opposition to using testify to clean this up further ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have maintained using only the stdlib as it does have nice properties of not conflicting or dealing with any dependencies. I'm not against it longer term, but for this round of changes lets stick to stdlib.
I want to keep the scope of change tighter as we get to the meat of the changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
chipping away and doing small PRs like errors.Is will also not change behavior and good to clean up this larger PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK makes sense, it's not necessary, just a nicety. I can move some of the errors.Is
stuff but I did change the error signatures of some of these methods, and it does actually break if you don't do errors.Is
anymore
It may be more convenient to look at these changes sequentially:
zk.Event
which is set when the event is received. Useful for tracking how long it took to get to a specific event (especially relevant when consuming from persistent watches).log/slog
instead of providing a custom logger interface. Change theDialer
interface to something that's trivially implemented bytls.Dialer
andnet.Dialer
, making it much easier to provide client certificates when connecting to ZK. Improve the error handling by returning stacked errors. Especially convenient when trying to understand why a specific connection was closed. Note that this PR would require a new major version as it does introduce backwards incompatible changes.HostProvider
that more closely matches the Java client's implementation, i.e. it resolves the address on-demand rather than once statically. This is also backwards incompatible as it changes the semantics of theHostProvider
implementation to support sleeping after attempting to connect to the address returned byNext
, rather than before.