Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/types: Support Unix sockets in NewURLS #12469

Merged
merged 2 commits into from
Aug 8, 2022
Merged

pkg/types: Support Unix sockets in NewURLS #12469

merged 2 commits into from
Aug 8, 2022

Conversation

trawler
Copy link

@trawler trawler commented Nov 12, 2020

Resolves #12450
This commits adds support to unix/unixs socket URLs, which currently
fail with the message "URL address does not have the form "host:port".
It also replaces the work started in #11747.

@cfc4n
Copy link
Contributor

cfc4n commented Nov 13, 2020

Etcd is a distributed key-vaule store databaes, Means that this is a system composed of multiple network servers, which must be non-stand-alone communication.

This is in conflict with unix socket. I don't think unix socket is necessary.

@asauber
Copy link

asauber commented Nov 18, 2020

Thank you for your work on this. I have a need for this in an embedded etcd backup validation service. It would be a huge help in that context when validating multiple backups at once.

@jnummelin
Copy link

Means that this is a system composed of multiple network servers, which must be non-stand-alone communication

There's two things we need to remember.

  1. Nothing prohibits (I don't see a valid use case for this though besides testing) you to run multiple etcd instances on a single node. In this case we should be able to use unix sockets.
  2. Etcd distinguishes client and peering addresses. So when I create a cluster for k8s I'd like each etcd node to use unix socket for client comms (for k8s api-server which is running on the same node) and real network address for peering. This way the client interface is not exposed on the network at all

@@ -29,9 +29,6 @@ func TestValidateURLsValueBad(t *testing.T) {
// bad port specification
"127.0.0.1:foo",
"127.0.0.1:",
// unix sockets not supported
"unix://",
"unix://tmp/etcd.sock",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, can you add the valid UDS URL test cases to TestNewURLsValue?

@wenjiaswe
Copy link
Contributor

cc @jingyih @gyuho @ptabor

@ptabor
Copy link
Contributor

ptabor commented Dec 15, 2020

On purpose: This would help with etcd e2e tests behavior

Etcd extensively uses unix:// communication in 'e2e' tests. The consequence of
lack of support for unix:///tmp/127.27.84.4:23432 naming schema leads to many
'abandoned' socket files staying in the source directories. This would allow to fix that problem !

On exact supported syntax (grpc naming scheme ?)

Does it mean etcd will start to support exact syntax of grpc Dial for unix:// namespace:

https://github.com/grpc/grpc/blob/master/doc/naming.md

Warning: It might be subtle in terms of backward compatiblity

If so, as I mentioned in kubernetes/kubernetes#97257 (comment), I have some doubts about backward compatiblity of this change.

Is unix://127.0.0.5:1456 still considered a file '127.0.0.5:1456' in working directory ?

https://github.com/grpc/grpc/blob/master/doc/naming.md suggests that "unix://127.0.0.5:1456" would be considered an absolute path, so put 127.0.0.5:1456 in the root directory:

in the second form, the path must be absolute (i.e., there will actually be three slashes, two prior to the path and another to begin the absolute path).

@caesarxuchao
Copy link

The grpc spec said "there will actually be three slashes" for the path to be considered as an absolute path, so unix://127.0.0.5:1456 should be considered as a related path.

I agree that we need more unit tests to verify the syntax and backwards compatibility concerns.

@trawler
Copy link
Author

trawler commented Dec 23, 2020

Not entirely sure why tests have failed for this PR, since I only added more unit testing and rebased it since last they ran.

@ptabor
Copy link
Contributor

ptabor commented Jan 8, 2021

The grpc spec said "there will actually be three slashes" for the path to be considered as an absolute path, so unix://127.0.0.5:1456 should be considered as a related path.

So this is kind of a 'third' format: officially not supported (documented) by grpc (as it has 2 slashes, so neither 0 nor 3) ,
that's good that it allows to stay backward compatible, but we should be:

  • be explicit about this in the methods documentation
  • have test coverage

Please also run:

git commit --amend
git push --force 

to trigger the re-testing.

@ptabor
Copy link
Contributor

ptabor commented Feb 16, 2021

In #12671 the test is related:
0b75fed#diff-46e9c3eaee9d6846f4cb50dae30fe8fb56db34b7155a36cc31881e85b29fafb1

that shows mapping between legacy "etcd" naming and grpc supported names.

@trawler -> Please rebase (that will also retrigger tests) to get attention to this PR.

@trawler
Copy link
Author

trawler commented Feb 16, 2021

@trawler -> Please rebase (that will also retrigger tests) to get attention to this PR.

done

@ptabor
Copy link
Contributor

ptabor commented Mar 7, 2021

I played a little bit with the fix, to migrate all etcd integration tests to use
unix:///tmp/testTempDir/127.0.0.1:32432
endpoints, but it was more challenging than I expected:

  • There are tests that speak v2 protocol over http. The standard transports does not know how to speak HTTP over UNIX socket, it particular its not clear what's 'host' in the address: unix:///tmp/testTempDir/127.0.0.1:32432
    [From URL parsing its '' (illegal), from certificate validation its '127.0.0.1'.
    There is special handling for unis://localdir:port files here:
    https://github.com/etcd-io/etcd/blob/792b7f57d3505b6c87fe46a6b9b4875de0c2c042/pkg/transport/transport.go#L69
    but it does not handles the potential names introduces by this PR.

  • We might have code in server itself that speak V2 over HTTP. For example:

    func (s *EtcdServer) publish(timeout time.Duration) {

  • In general raft speaks over HTTP in etcd.

For now I decided to workaround the 'test-sockets' location problem by changing the working directory of the test execution.

To declare support for such URLs, I think we need to have a dedicated testcase, e.g. in tests/integration/clientv3/connectivity/dial_test.go that:

  • creates 3 instance etcd instance using such URLs (to cover infra-cluster V2 functionalities or wait for deprecation )
  • connect to it using the URL, including the TLS case.

@ptabor ptabor closed this Mar 7, 2021
@ptabor ptabor reopened this Mar 7, 2021
@codecov-io
Copy link

Codecov Report

Merging #12469 (982a763) into master (73c50b8) will increase coverage by 20.74%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master   #12469       +/-   ##
===========================================
+ Coverage   48.88%   69.63%   +20.74%     
===========================================
  Files         378      413       +35     
  Lines       30871    32087     +1216     
===========================================
+ Hits        15092    22343     +7251     
+ Misses      13814     7822     -5992     
+ Partials     1965     1922       -43     
Flag Coverage Δ
all 69.63% <100.00%> (+20.74%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/types/urls.go 90.90% <100.00%> (+49.52%) ⬆️
etcdctl/ctlv3/command/watch_command.go 5.29% <0.00%> (-37.65%) ⬇️
server/proxy/grpcproxy/watch.go 65.03% <0.00%> (-7.98%) ⬇️
etcdctl/ctlv3/command/lease_command.go 62.50% <0.00%> (-5.69%) ⬇️
server/etcdmain/grpc_proxy.go 57.89% <0.00%> (-4.30%) ⬇️
server/proxy/httpproxy/director.go 62.85% <0.00%> (-2.86%) ⬇️
etcdctl/ctlv2/command/util.go 40.25% <0.00%> (-2.52%) ⬇️
etcdctl/ctlv3/command/printer_simple.go 65.68% <0.00%> (-0.99%) ⬇️
pkg/fileutil/dir_unix.go 100.00% <0.00%> (ø)
pkg/fileutil/lock_flock.go 0.00% <0.00%> (ø)
... and 272 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 73c50b8...982a763. Read the comment docs.

@stale
Copy link

stale bot commented Aug 11, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Aug 11, 2021
@kke
Copy link

kke commented Aug 26, 2021

Not stale, but does have a conflict now, re @trawler

@stale stale bot removed the stale label Aug 26, 2021
Karen Almog added 2 commits October 7, 2021 11:59
Resolves #12450
This commits adds support to unix/unixs socket URLs, which currently
fail with the message "URL address does not have the form "host:port".
It also replaces the work started in #11747.
@trawler
Copy link
Author

trawler commented Oct 7, 2021

rebased against upstream/main

@trawler
Copy link
Author

trawler commented Oct 7, 2021

@ptabor can you approve the workflows for this PR?

Copy link
Contributor

@gyuho gyuho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@serathius
Copy link
Member

Would be great if we could add some e2e tests(could be in separate PR). For example add more cases to those https://github.com/etcd-io/etcd/blob/main/tests/e2e/ctl_v3_grpc_test.go

@stale
Copy link

stale bot commented Jan 17, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jan 17, 2022
@kke
Copy link

kke commented Jan 17, 2022

This is still relevant.

@stale stale bot removed the stale label Jan 17, 2022
@stale
Copy link

stale bot commented Apr 17, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 17, 2022
@kke
Copy link

kke commented Apr 19, 2022

Still relevant.

@stale stale bot removed the stale label Apr 19, 2022
@stale
Copy link

stale bot commented Jul 30, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jul 30, 2022
@serathius
Copy link
Member

cc @ahrtr @spzala

@ahrtr
Copy link
Member

ahrtr commented Aug 8, 2022

Sorry for the late response. Could you rebase this PR again although there is no any conflict? @trawler

@stale stale bot removed the stale label Aug 8, 2022
Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thank you @trawler

But I'd suggest to add some e2e test cases just as @serathius pointed out, which of course can be addressed in a separate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Unix domain socket support