3-node etcd cluster with two nodes at high CPU usage #11012

Closed
@phosae

Description

System: AWS EC2, Linux version 4.14.128-112.105 x86_64
etcd version: 3.3.11

Startup configuration:

etcd --name m1 --initial-advertise-peer-urls http://etcd1:2380 \
--listen-peer-urls http://etcd1:2380 \
--listen-client-urls http://etcd1:2379 \
--advertise-client-urls http://etcd1:2379 --debug=true --log-output=stdout \
--auth-token=jwt,pub-key=./jwt_RS256.pub,priv-key=./jwt_RS256.rsa,sign-method=RS256 \
--data-dir /data/etcd --initial-cluster-token my-cluster \
--initial-cluster m1=http://etcd1:2380,m2=http://etcd2:2380,m3=http://etcd3:2380 \
--initial-cluster-state new

The startup commands for m2 and m3 are similar to m1's.

# we use gateway
/usr/local/bin/etcd gateway start --endpoints=etcd1:2379,etcd2:2379,etcd3:2379

top output (one row per node):

PID    USER  PR  NI  VIRT   RES    SHR    S  %CPU   %MEM  TIME+      COMMAND
10297  root  20  0   11.2g  94708  17812  R  394.0  0.6   963:57.14  etcd
7787   root  20  0   11.2g  91764  17760  S  393.4  0.6   1019:19    etcd
12748  root  20  0   11.1g  55472  17724  S  0.3    0.3   1:31.41    etcd

(pprof) top10
Showing nodes accounting for 65960ms, 87.63% of 75270ms total
Dropped 349 nodes (cum <= 376.35ms)
Showing top 10 nodes out of 72
      flat  flat%   sum%        cum   cum%
   18990ms 25.23% 25.23%    18990ms 25.23%  math/big.mulAddVWW
   14630ms 19.44% 44.67%    14630ms 19.44%  math/big.addMulVVW
   12920ms 17.16% 61.83%    12920ms 17.16%  math/big.subVV
   12220ms 16.23% 78.07%    47160ms 62.65%  math/big.nat.divLarge
    2470ms  3.28% 81.35%    17210ms 22.86%  math/big.basicSqr
    1150ms  1.53% 82.87%     1980ms  2.63%  runtime.scanobject
    1100ms  1.46% 84.34%     1100ms  1.46%  math/big.shlVU
     890ms  1.18% 85.52%      890ms  1.18%  runtime.memclrNoHeapPointers
     820ms  1.09% 86.61%      820ms  1.09%  math/big.greaterThan (inline)
     770ms  1.02% 87.63%      770ms  1.02%  math/big.shrVU
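
For anyone trying to reproduce this, a CPU profile like the one above could be collected roughly as follows. This is only a sketch: it assumes the Go pprof endpoints are served under /debug/pprof on the member's client URL (etcd serves them when started with --enable-pprof; whether --debug=true alone is enough on 3.3.11 is an assumption here) and that a Go toolchain is installed on the machine running the command. The helper names pprof_url and capture_profile are made up for illustration.

```shell
#!/bin/sh
# Build the CPU-profile URL for one etcd member.
#   $1: client endpoint, e.g. etcd1:2379
#   $2: sampling window in seconds (defaults to 30)
pprof_url() {
    endpoint="$1"
    seconds="${2:-30}"
    printf 'http://%s/debug/pprof/profile?seconds=%s\n' "$endpoint" "$seconds"
}

# Fetch a CPU profile from the member and open the interactive pprof prompt,
# where "top10" prints a table like the one in this report.
capture_profile() {
    go tool pprof "$(pprof_url "$1" "$2")"
}

# Example (not run here): capture_profile etcd1:2379 30
```

Running capture_profile against one busy member and the idle member, then comparing the two top10 tables, would show whether the math/big frames appear only on the busy nodes.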

This started the day after I changed the root password through the gateway:
echo "newpass"|etcdctl user passwd root --user="root:root" --interactive=false --endpoints="http://gateway:23790"
(though this may not be the direct cause)
