TCP sockets not closing properly when etcd is running in proxy mode.

Set up a small cluster of 4 nodes: **3 nodes + 1 proxy** (the config has ETCD_PROXY="on", which is not the default).

On the proxy node, check for open TCP sockets like this:

    netstat -n -p -a | fgrep etcd

When local clients on the proxy node connect to 127.0.0.1:2379, a new connection appears from an ephemeral TCP port on the proxy node to port 2379 on one of the other three working nodes. That is fine; it is the proxy behaviour in operation. **However, these proxy connections are not cleaned up properly when etcd is long-lived.** Checking with netstat as above shows more and more lines of output as more activity goes through the proxy. Over time the available file handles are exhausted, and eventually etcd refuses new connections:
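To make the growth easier to see than by eyeballing raw netstat output, the connections can be tallied per remote endpoint and TCP state. The following is only a rough sketch: it assumes the usual Linux net-tools column layout (proto, recv-q, send-q, local, foreign, state, PID/program), and the function names are made up for illustration.

```python
import subprocess
from collections import Counter


def count_etcd_sockets(netstat_output):
    """Count etcd-owned TCP sockets, keyed by (foreign address, state)."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected TCP row: proto recv-q send-q local foreign state pid/program
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        if not fields[6].endswith("/etcd"):
            continue
        counts[(fields[4], fields[5])] += 1
    return counts


def snapshot():
    """Run the same netstat command as above and tally its output."""
    result = subprocess.run(
        ["netstat", "-n", "-p", "-a"],
        capture_output=True, text=True, check=True,
    )
    return count_etcd_sockets(result.stdout)
```

Calling `snapshot()` every few minutes (as root, so that the PID/program column is populated) and comparing the totals should show whether the number of sockets towards port 2379 on the member nodes keeps climbing.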

    etcd: http: Accept error: accept tcp [::]:2379: accept4: too many open files; retrying in 5ms
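
The "too many open files" error means the process has hit its open-file limit, so it can help to track how close etcd is to that limit before it starts refusing connections. A minimal sketch, assuming a Linux `/proc` filesystem and that the etcd PID is already known; the helper names are hypothetical:

```python
import os


def parse_max_open_files(limits_text):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits text."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max open files", soft limit, hard limit, units
            soft = line.split()[3]
            return None if soft == "unlimited" else int(soft)
    return None


def fd_usage(pid):
    """Return (open descriptors, soft limit) for the given process."""
    used = len(os.listdir(f"/proc/{pid}/fd"))
    with open(f"/proc/{pid}/limits") as f:
        limit = parse_max_open_files(f.read())
    return used, limit
```

Reading another user's `/proc/<pid>/fd` normally requires root or the same user as the etcd process.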


This appears to be related to an old issue that has either come back or was never fully fixed in the first place: https://github.com/coreos/etcd/issues/1959

Restarting etcd gets it working again temporarily, but the file handles are gradually consumed again over time, requiring more restarts. It is not necessary to restart the entire cluster; **restarting the proxy node is sufficient, so this guy is certainly the culprit**.

Platform is CentOS-7 using the packaged etcd installed by "yum" and launched via "systemd" as follows:

    Name        : etcd
    Arch        : x86_64
    Version     : 3.2.7
    Release     : 1.el7
    Size        : 39 M
    Repo        : installed
    From repo   : extras
    Summary     : A highly-available key value store for shared configuration
    URL         : https://github.com/coreos/etcd
    License     : ASL 2.0
    Description : A highly-available key value store for shared configuration.

In the maps I see the following libraries are being used by the etcd process:

    /usr/lib64/libc-2.17.so
    /usr/lib64/libdl-2.17.so
    /usr/lib64/libpthread-2.17.so
    /usr/lib64/ld-2.17.so

These are all very standard CentOS system libraries. I doubt the bug is in a library, but at least you should be able to reproduce the same setup fairly easily. Many commenters on the older issues reported that a similar setup (3 nodes + 1 proxy) was the way to reproduce this problem, so it appears to happen quite consistently. We are running a bunch of web servers and a load balancer, sharing session data via etcd; these should be fairly simple read/write operations that can easily be simulated for testing. I'm guessing that the content of the data is irrelevant; a typical data block is approximately 1 kB.
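
For anyone trying to reproduce this, the workload could be approximated with something like the sketch below. It is only an assumption about the traffic: it uses the etcd v2 keys HTTP API (`/v2/keys/...`) through the local proxy at 127.0.0.1:2379, and the key prefix `session-test`, the iteration count, and the function names are all invented for illustration.

```python
import os
import urllib.parse
import urllib.request

PROXY = "http://127.0.0.1:2379"  # local proxy endpoint from this report


def build_put(key, value, base=PROXY):
    """Build a v2 keys API write (form-encoded 'value' field)."""
    data = urllib.parse.urlencode({"value": value}).encode()
    return urllib.request.Request(f"{base}/v2/keys/{key}", data=data, method="PUT")


def build_get(key, base=PROXY):
    """Build a v2 keys API read for the same key."""
    return urllib.request.Request(f"{base}/v2/keys/{key}", method="GET")


def run(iterations=1000, payload_size=1024):
    """Write and read back roughly 1 kB values through the proxy."""
    payload = os.urandom(payload_size // 2).hex()  # payload_size ASCII chars
    for i in range(iterations):
        key = f"session-test/{i % 50}"  # hypothetical key names
        for req in (build_put(key, payload), build_get(key)):
            with urllib.request.urlopen(req, timeout=5) as resp:
                resp.read()
```

Running `run()` on the proxy node while watching the netstat output should show whether each burst of requests leaves extra connections to the member nodes behind.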

TCP sockets not closing properly when etcd is running proxy mode. #9009