Lower throughput while having more and more watchers #19064

Open
@jokerwyt

Description

What happened?

I am running an etcd throughput benchmark and observe that throughput drops as the number of watchers increases.

How I run the benchmark

  • I use a fixed number of separate clients, each of which continuously sends a simple Txn that performs a Kubernetes-style optimistic create.
  • At the same time, I launch a fixed number of watchers on the prefix under which the keys are created.
  • I also compact every 10 seconds.
  • Key length is ~10 bytes; value length is 1301 bytes.

The full code can be found here: https://gist.github.com/jokerwyt/b29b5113d0a5f75f6d5621d05d627230

Here are my results (throughput for each combination of watcher count and concurrency):

watcher\conc          60         80        100        120        140
 0              26765.07   27658.51   27951.77   27953.14    27954.7
 1               18788.5   18431.04   16221.12   11767.03    15444.2
 2              13639.76   14557.84   12761.36   12464.89   14349.55
 3              13157.18   13431.09   11564.61    12073.8   13138.41
 4              12520.72   10658.89   12019.56   10515.21    10127.3
 5              11439.27   10491.39   12060.64    10877.8   10575.94
 6              13070.41   10405.48    9658.23   11835.03   10982.19
 7              12127.91   12062.77   10176.37    9965.35   10284.55
 8              13128.63   11080.99   10346.09   10189.54   10012.19
 9               9548.81   10232.87    9440.67   11225.33    9655.85
10               9449.93    9440.84    9908.77    9808.65    9530.57
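
To make the size of the drop easier to read off, here is a small sketch that computes the relative throughput loss against the 0-watcher baseline for each concurrency column. It uses only the 0-, 1- and 10-watcher rows copied from the table above. The helper name `dropPercent` is made up for illustration and is not part of the benchmark code.

```go
package main

import "fmt"

// dropPercent reports how far withWatchers falls below baseline, in percent.
// (Hypothetical helper, for illustration only.)
func dropPercent(baseline, withWatchers float64) float64 {
	return (baseline - withWatchers) / baseline * 100
}

func main() {
	// Concurrency columns and selected rows, copied from the results table.
	conc := []int{60, 80, 100, 120, 140}
	zero := []float64{26765.07, 27658.51, 27951.77, 27953.14, 27954.7}
	one := []float64{18788.5, 18431.04, 16221.12, 11767.03, 15444.2}
	ten := []float64{9449.93, 9440.84, 9908.77, 9808.65, 9530.57}

	for i, c := range conc {
		fmt.Printf("conc=%d: 1 watcher -%.1f%%, 10 watchers -%.1f%%\n",
			c, dropPercent(zero[i], one[i]), dropPercent(zero[i], ten[i]))
	}
}
```

Running it prints one line per concurrency level. The other rows can be added the same way to see where the curve flattens.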

What did you expect to happen?

I expect etcd to deliver the same throughput regardless of the number of watchers.

How can we reproduce it (as minimally and precisely as possible)?

I have a test script; use it together with the Go benchmark code.
You will need to set up an etcd instance yourself and make some small modifications to the script.

https://gist.github.com/jokerwyt/955a810bfe28b342f6ace11ba840e36c

Anything else we need to know?

No response

Etcd version (please run commands below)

3.5.10

Etcd configuration (command line flags or environment variables)

        quota-backend-bytes: "8589934592" # 8Gi
        auto-compaction-retention: "120m"
        auto-compaction-mode: "periodic"

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

$ etcdctl member list -w table
ytwu@worker1:~$ 
    ETCDCTL_API=3 etcdctl \
        --cert /etc/kubernetes/pki/etcd/peer.crt \
        --key /etc/kubernetes/pki/etcd/peer.key \
        --cacert /etc/kubernetes/pki/etcd/ca.crt \
        --endpoints https://worker1:2379 member list -w table
+------------------+---------+---------+------------------------+------------------------+
|        ID        | STATUS  |  NAME   |       PEER ADDRS       |      CLIENT ADDRS      |
+------------------+---------+---------+------------------------+------------------------+
| 99b1b6bcd47e918c | started | worker1 | https://10.10.1.4:2380 | https://10.10.1.4:2379 |
+------------------+---------+---------+------------------------+------------------------+

$ etcdctl --endpoints=<member list> endpoint status -w table
ytwu@worker1:~$ 
    ETCDCTL_API=3 etcdctl \
        --cert /etc/kubernetes/pki/etcd/peer.crt \
        --key /etc/kubernetes/pki/etcd/peer.key \
        --cacert /etc/kubernetes/pki/etcd/ca.crt \
        --endpoints https://worker1:2379 endpoint status -w table
+----------------------+------------------+---------+---------+-----------+-----------+------------+
|       ENDPOINT       |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+----------------------+------------------+---------+---------+-----------+-----------+------------+
| https://worker1:2379 | 99b1b6bcd47e918c |  3.5.10 |  1.0 GB |      true |         2 |     500011 |
+----------------------+------------------+---------+---------+-----------+-----------+------------+

Relevant log output

No response

    Labels

    area/performance
    priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.
