
Lower throughput while having more and more watchers #19064



Bug report criteria

What happened?

I am running an etcd throughput benchmark and observed that throughput drops as the number of watchers increases: from roughly 27,000 with no watchers to roughly 9,500 with 10 watchers.

How I conduct my benchmark

  • I use a fixed number of separate clients, each of which keeps sending a simple Txn that performs a Kubernetes-like optimistic creation.
  • At the same time, I launch a fixed number of watchers that watch the prefix under which the KVs are created.
  • I also compact every 10 seconds.
  • Key length is about 10 bytes; value length is 1301 bytes.

The full code can be found here:

Here is my result.

| watchers \ conc |    60    |    80    |   100    |   120    |   140    |
|        0        | 26765.07 | 27658.51 | 27951.77 | 27953.14 | 27954.7  |
|        1        | 18788.5  | 18431.04 | 16221.12 | 11767.03 | 15444.2  |
|        2        | 13639.76 | 14557.84 | 12761.36 | 12464.89 | 14349.55 |
|        3        | 13157.18 | 13431.09 | 11564.61 | 12073.8  | 13138.41 |
|        4        | 12520.72 | 10658.89 | 12019.56 | 10515.21 | 10127.3  |
|        5        | 11439.27 | 10491.39 | 12060.64 | 10877.8  | 10575.94 |
|        6        | 13070.41 | 10405.48 |  9658.23 | 11835.03 | 10982.19 |
|        7        | 12127.91 | 12062.77 | 10176.37 |  9965.35 | 10284.55 |
|        8        | 13128.63 | 11080.99 | 10346.09 | 10189.54 | 10012.19 |
|        9        |  9548.81 | 10232.87 |  9440.67 | 11225.33 |  9655.85 |
|       10        |  9449.93 |  9440.84 |  9908.77 |  9808.65 |  9530.57 |

What did you expect to happen?

I expected etcd to deliver the same throughput regardless of the number of watchers.

How can we reproduce it (as minimally and precisely as possible)?

I have a test script; use it together with the Go benchmark code.
You will need to set up an etcd instance yourself and make some small modifications to the script.

Anything else we need to know?

No response

Etcd version (please run commands below)


Etcd configuration (command line flags or environment variables)

        quota-backend-bytes: "8589934592" # 8Gi
        auto-compaction-retention: "120m"
        auto-compaction-mode: "periodic"

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

$ etcdctl member list -w table
    ETCDCTL_API=3 etcdctl \
        --cert /etc/kubernetes/pki/etcd/peer.crt \
        --key /etc/kubernetes/pki/etcd/peer.key \
        --cacert /etc/kubernetes/pki/etcd/ca.crt \
        --endpoints https://worker1:2379 member list -w table
|        ID        | STATUS  |  NAME   |       PEER ADDRS       |      CLIENT ADDRS      |
| 99b1b6bcd47e918c | started | worker1 | | |

$ etcdctl --endpoints=<member list> endpoint status -w table
    ETCDCTL_API=3 etcdctl \
        --cert /etc/kubernetes/pki/etcd/peer.crt \
        --key /etc/kubernetes/pki/etcd/peer.key \
        --cacert /etc/kubernetes/pki/etcd/ca.crt \
        --endpoints https://worker1:2379 endpoint status -w table
|       ENDPOINT       |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
| https://worker1:2379 | 99b1b6bcd47e918c |  3.5.10 |  1.0 GB |      true |         2 |     500011 |

Relevant log output

No response



Labels: area/performance, priority/important-longterm (Important over the long term, but may not be staffed and/or may need multiple releases to complete.)

