-
Notifications
You must be signed in to change notification settings - Fork 2.1k
cli: set a timeout for the initial connection ping #3663
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reminds me that I still wanted to look into this TODO as well;
cli/cli/command/cli.go
Lines 135 to 138 in d0df532
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious though; #3652 describes that the CLI hangs indefinitely; but there's already a 2 second timeout. I understand that this PR makes that timeout configurable, but wouldn't #3652 already timeout after 2 seconds? Or is the
unix://addition the actual fix here?Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the
unix://addition is the fix. The configurability wasn't in the original pr, it's something @vvoland asked for in review.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for those details (I probably should've read back the whole discussion above)
So some thoughts;
First of all, "the fix"; I think this was an incorrect assumption on my side. I added this check in b397391 / #2424, wich the assumption that a local socket would not really need a timeout (unless something was horribly wrong), and of course "horribly wrong" now appears to not be theoretical (per your ticket) 😂
I assume the same could also happen on a named pipe (
npipe://) on Windows orfd://(default on Linux for systemd created socket)? If so, we could be explicit;We should verify though if
cli.DockerEndpoint().Hostis guaranteed to always have a scheme (I know there's some "fuzzy" / "relaxed" handling of some values in our code-base, and we should verify that the values is at least normalized (scheme added if missing) early on). At a quick glance, I think that's the case though; callsgetServerHost(), which does normalising (lol, we need to get rid of some abstraction here and there);cli/cli/command/cli.go
Line 324 in a32cd16
As to making the timeout configurable; in general, I'm not against adding configurable options; that said, once added, it's pretty hard to (deprecate and) remove an option, so I also try to be conservative where possible. I realize the 2 seconds was just some arbitrary value ("2 seconds should be enough?"); for your use-case, are you planning on adjusting that default, or would the "fix" be enough (only needed to prevent a full hang, but default is "fine" otherwise)?
What I'm thinking is, that we could; split the fix from the feature; the fix (one line change, and appears to be low risk) could be candidate for backporting to 20.10 if we think that's good. For the feature (make the timeout configurable), we could keep that in draft (for now), and still decide to take it if it turns out there's good use-cases for overriding. Ultimately (also see my other comment above), I think it would be great if we didn't bother with pinging at all, unless we need it (in most cases that would be "the moment we need to connect to the API anyway").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
I backed out the WithInitializationTimeout setting. I also changed it to check "not ssh://".
ya, i'm pretty confident that this codepath always normalizes the Host to have a scheme:
cli/opts/hosts.go
Line 71 in a32cd16
speaking to some of the more high-level comments:
I still don't know how the user's Docker engine got into the state where it was hanging on the ping, but I think it's a good engineering value to be paranoid.
I really like the general Go convention of splitting "Connection" from "Client", where the Connection handles pings/healthchecks and the Client handles API types. grpc or kubernetes/client-go both separate them out this way. i think the Ping() here is a symptom of that more high-level problem of the client trying to do both in a single object. but appreciate that's a larger change.