Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flaky TestRandomKillEtcd #8091

Closed
nolouch opened this issue Apr 18, 2024 · 1 comment · Fixed by #8102
Closed

Flaky TestRandomKillEtcd #8091

nolouch opened this issue Apr 18, 2024 · 1 comment · Fixed by #8102
Labels
type/ci The issue is related to CI.

Comments

@nolouch
Copy link
Contributor

nolouch commented Apr 18, 2024

Flaky Test

[FAIL]  pkg/utils/etcdutil TestRandomKillEtcd
err=exit status 1
{"level":"warn","ts":"2024-04-18T14:09:10.670102+0800","caller":"embed/config.go:622","msg":"Running http and grpc server on single port. This is not recommended for production."}
{"level":"info","ts":"2024-04-18T14:09:10.670201+0800","caller":"embed/etcd.go:120","msg":"configuring peer listeners","listen-peer-urls":["http://127.0.0.1:51738"]}
{"level":"info","ts":"2024-04-18T14:09:10.670292+0800","caller":"embed/etcd.go:130","msg":"configuring client listeners","listen-client-urls":["http://127.0.0.1:51739"]}
{"level":"info","ts":"2024-04-18T14:09:10.670371+0800","caller":"embed/etcd.go:309","msg":"starting an etcd server","etcd-version":"3.4.31","git-sha":"Not provided (use ./build instead of go build)","go-version":"go1.21.1","go-os":"darwin","go-arch":"arm64","max-cpu-set":1,"max-cpu-available":12,"member-initialized":false,"name":"test_etcd_5000","data-dir":"/var/folders/jz/fpm69cgn4wg1qbk6f1bfmsgr0000gn/T/TestRandomKillEtcd14626914/001","wal-dir":"","wal-dir-dedicated":"","member-dir":"/var/folders/jz/fpm69cgn4wg1qbk6f1bfmsgr0000gn/T/TestRandomKillEtcd14626914/001/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":100000,"max-wals":5,"max-snapshots":5,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://127.0.0.1:51738"],"listen-peer-urls":["http://127.0.0.1:51738"],"advertise-client-urls":["http://127.0.0.1:51739"],"listen-client-urls":["http://127.0.0.1:51739"],"listen-metrics-urls":[],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"test_etcd_5000=http://127.0.0.1:51738","initial-cluster-state":"new","initial-cluster-token":"etcd-cluster","quota-backend-bytes":2147483648,"max-request-bytes":1572864,"max-concurrent-streams":4294967295,"pre-vote":false,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","auto-compaction-mode":"","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":""}
2024-04-18 14:09:10.670425 W | pkg/fileutil: check file permission: directory "/var/folders/jz/fpm69cgn4wg1qbk6f1bfmsgr0000gn/T/TestRandomKillEtcd14626914/001" exist, but the permission is "drwxr-xr-x". The recommended permission is "-rwx------" to prevent possible unprivileged access to the data.
{"level":"info","ts":"2024-04-18T14:09:10.772585+0800","caller":"etcdserver/backend.go:80","msg":"opened backend db","path":"/var/folders/jz/fpm69cgn4wg1qbk6f1bfmsgr0000gn/T/TestRandomKillEtcd14626914/001/member/snap/db","took":"43.5085ms"}
{"level":"info","ts":"2024-04-18T14:09:10.882594+0800","caller":"etcdserver/raft.go:537","msg":"starting local member","local-member-id":"1c7ffe1c972c0d13","cluster-id":"4792318a9e26446a"}
{"level":"info","ts":"2024-04-18T14:09:10.882669+0800","caller":"raft/raft.go:1523","msg":"1c7ffe1c972c0d13 switched to configuration voters=()"}
{"level":"info","ts":"2024-04-18T14:09:10.882691+0800","caller":"raft/raft.go:706","msg":"1c7ffe1c972c0d13 became follower at term 0"}
{"level":"info","ts":"2024-04-18T14:09:10.8827+0800","caller":"raft/raft.go:389","msg":"newRaft 1c7ffe1c972c0d13 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]"}
{"level":"info","ts":"2024-04-18T14:09:10.882705+0800","caller":"raft/raft.go:706","msg":"1c7ffe1c972c0d13 became follower at term 1"}
{"level":"info","ts":"2024-04-18T14:09:10.882724+0800","caller":"raft/raft.go:1523","msg":"1c7ffe1c972c0d13 switched to configuration voters=(2053639353853021459)"}
{"level":"warn","ts":"2024-04-18T14:09:10.969629+0800","caller":"auth/store.go:1391","msg":"simple token is not cryptographically signed"}
{"level":"info","ts":"2024-04-18T14:09:11.030004+0800","caller":"etcdserver/quota.go:98","msg":"enabled backend quota with default value","quota-name":"v3-applier","quota-size-bytes":2147483648,"quota-size":"2.1 GB"}
{"level":"info","ts":"2024-04-18T14:09:11.056165+0800","caller":"etcdserver/server.go:831","msg":"starting etcd server","local-member-id":"1c7ffe1c972c0d13","local-server-version":"3.4.31","cluster-version":"to_be_decided"}
{"level":"info","ts":"2024-04-18T14:09:11.056793+0800","caller":"embed/etcd.go:251","msg":"now serving peer/client/metrics","local-member-id":"1c7ffe1c972c0d13","initial-advertise-peer-urls":["http://127.0.0.1:51738"],"listen-peer-urls":["http://127.0.0.1:51738"],"advertise-client-urls":["http://127.0.0.1:51739"],"listen-client-urls":["http://127.0.0.1:51739"],"listen-metrics-urls":[]}
{"level":"info","ts":"2024-04-18T14:09:11.069629+0800","caller":"etcdserver/server.go:697","msg":"started as single-node; fast-forwarding election ticks","local-member-id":"1c7ffe1c972c0d13","forward-ticks":9,"forward-duration":"900ms","election-ticks":10,"election-timeout":"1s"}
{"level":"info","ts":"2024-04-18T14:09:11.069703+0800","caller":"fileutil/purge.go:48","msg":"started to purge file","dir":"/var/folders/jz/fpm69cgn4wg1qbk6f1bfmsgr0000gn/T/TestRandomKillEtcd14626914/001/member/snap","suffix":"snap.db","max":5,"interval":"30s"}
{"level":"info","ts":"2024-04-18T14:09:11.069726+0800","caller":"fileutil/purge.go:48","msg":"started to purge file","dir":"/var/folders/jz/fpm69cgn4wg1qbk6f1bfmsgr0000gn/T/TestRandomKillEtcd14626914/001/member/snap","suffix":"snap","max":5,"interval":"30s"}
{"level":"info","ts":"2024-04-18T14:09:11.06973+0800","caller":"fileutil/purge.go:48","msg":"started to purge file","dir":"/var/folders/jz/fpm69cgn4wg1qbk6f1bfmsgr0000gn/T/TestRandomKillEtcd14626914/001/member/wal","suffix":"wal","max":5,"interval":"30s"}
{"level":"warn","ts":"2024-04-18T14:09:11.069805+0800","caller":"etcdserver/metrics.go:213","msg":"failed to get file descriptor usage","error":"cannot get FDUsage on darwin"}
{"level":"info","ts":"2024-04-18T14:09:11.069873+0800","caller":"embed/etcd.go:605","msg":"serving peer traffic","address":"127.0.0.1:51738"}
{"level":"info","ts":"2024-04-18T14:09:11.076887+0800","caller":"raft/raft.go:1523","msg":"1c7ffe1c972c0d13 switched to configuration voters=(2053639353853021459)"}
{"level":"info","ts":"2024-04-18T14:09:11.076924+0800","caller":"membership/cluster.go:395","msg":"added member","cluster-id":"4792318a9e26446a","local-member-id":"1c7ffe1c972c0d13","added-peer-id":"1c7ffe1c972c0d13","added-peer-peer-urls":["http://127.0.0.1:51738"]}
{"level":"info","ts":"2024-04-18T14:09:11.531734+0800","caller":"raft/raft.go:929","msg":"1c7ffe1c972c0d13 is starting a new election at term 1"}
{"level":"info","ts":"2024-04-18T14:09:11.531759+0800","caller":"raft/raft.go:719","msg":"1c7ffe1c972c0d13 became candidate at term 2"}
{"level":"info","ts":"2024-04-18T14:09:11.531796+0800","caller":"raft/raft.go:830","msg":"1c7ffe1c972c0d13 received MsgVoteResp from 1c7ffe1c972c0d13 at term 2"}
{"level":"info","ts":"2024-04-18T14:09:11.531804+0800","caller":"raft/raft.go:771","msg":"1c7ffe1c972c0d13 became leader at term 2"}
{"level":"info","ts":"2024-04-18T14:09:11.53181+0800","caller":"raft/node.go:327","msg":"raft.node: 1c7ffe1c972c0d13 elected leader 1c7ffe1c972c0d13 at term 2"}
{"level":"info","ts":"2024-04-18T14:09:11.552602+0800","caller":"etcdserver/server.go:2639","msg":"setting up initial cluster version","cluster-version":"3.4"}
{"level":"info","ts":"2024-04-18T14:09:11.565771+0800","caller":"etcdserver/server.go:2140","msg":"published local member to cluster through raft","local-member-id":"1c7ffe1c972c0d13","local-member-attributes":"{Name:test_etcd_5000 ClientURLs:[http://127.0.0.1:51739]}","request-path":"/0/members/1c7ffe1c972c0d13/attributes","cluster-id":"4792318a9e26446a","publish-timeout":"7s"}
{"level":"info","ts":"2024-04-18T14:09:11.565789+0800","caller":"embed/serve.go:100","msg":"ready to serve client requests"}
{"level":"info","ts":"2024-04-18T14:09:11.565985+0800","caller":"embed/serve.go:209","msg":"serving client traffic insecurely; this is strongly discouraged!","traffic":"grpc+http","address":"127.0.0.1:51739"}
{"level":"info","ts":"2024-04-18T14:09:11.584723+0800","caller":"membership/cluster.go:561","msg":"set initial cluster version","cluster-id":"4792318a9e26446a","local-member-id":"1c7ffe1c972c0d13","cluster-version":"3.4"}
{"level":"info","ts":"2024-04-18T14:09:11.584763+0800","caller":"api/capability.go:77","msg":"enabled capabilities for version","cluster-version":"3.4"}
{"level":"info","ts":"2024-04-18T14:09:11.584832+0800","caller":"etcdserver/server.go:2671","msg":"cluster version is updated","cluster-version":"3.4"}
{"level":"info","ts":"2024-04-18T14:09:11.607671+0800","caller":"raft/raft.go:1523","msg":"1c7ffe1c972c0d13 switched to configuration voters=(2053639353853021459 4859018849094025460)"}
{"level":"info","ts":"2024-04-18T14:09:11.607735+0800","caller":"membership/cluster.go:395","msg":"added member","cluster-id":"4792318a9e26446a","local-member-id":"1c7ffe1c972c0d13","added-peer-id":"436eb3e6292d40f4","added-peer-peer-urls":["http://127.0.0.1:51845"]}
{"level":"info","ts":"2024-04-18T14:09:11.607747+0800","caller":"rafthttp/peer.go:128","msg":"starting remote peer","remote-peer-id":"436eb3e6292d40f4"}
{"level":"info","ts":"2024-04-18T14:09:11.607762+0800","caller":"rafthttp/pipeline.go:71","msg":"started HTTP pipelining with remote peer","local-member-id":"1c7ffe1c972c0d13","remote-peer-id":"436eb3e6292d40f4"}
{"level":"info","ts":"2024-04-18T14:09:11.609916+0800","caller":"rafthttp/peer.go:134","msg":"started remote peer","remote-peer-id":"436eb3e6292d40f4"}
{"level":"info","ts":"2024-04-18T14:09:11.609931+0800","caller":"rafthttp/transport.go:327","msg":"added remote peer","local-member-id":"1c7ffe1c972c0d13","remote-peer-id":"436eb3e6292d40f4","remote-peer-urls":["http://127.0.0.1:51845"]}
{"level":"info","ts":"2024-04-18T14:09:11.609941+0800","caller":"etcdserver/server.go:2068","msg":"applied a configuration change through raft","local-member-id":"1c7ffe1c972c0d13","raft-conf-change":"ConfChangeAddNode","raft-conf-change-node-id":"436eb3e6292d40f4"}
{"level":"info","ts":"2024-04-18T14:09:11.61001+0800","caller":"rafthttp/stream.go:166","msg":"started stream writer with remote peer","local-member-id":"1c7ffe1c972c0d13","remote-peer-id":"436eb3e6292d40f4"}
{"level":"info","ts":"2024-04-18T14:09:11.610017+0800","caller":"rafthttp/stream.go:166","msg":"started stream writer with remote peer","local-member-id":"1c7ffe1c972c0d13","remote-peer-id":"436eb3e6292d40f4"}
{"level":"info","ts":"2024-04-18T14:09:11.610026+0800","caller":"rafthttp/stream.go:406","msg":"started stream reader with remote peer","stream-reader-type":"stream MsgApp v2","local-member-id":"1c7ffe1c972c0d13","remote-peer-id":"436eb3e6292d40f4"}
{"level":"info","ts":"2024-04-18T14:09:11.610147+0800","caller":"rafthttp/stream.go:406","msg":"started stream reader with remote peer","stream-reader-type":"stream Message","local-member-id":"1c7ffe1c972c0d13","remote-peer-id":"436eb3e6292d40f4"}
{"level":"warn","ts":"2024-04-18T14:09:11.722853+0800","caller":"embed/config.go:622","msg":"Running http and grpc server on single port. This is not recommended for production."}
{"level":"info","ts":"2024-04-18T14:09:11.722872+0800","caller":"embed/etcd.go:120","msg":"configuring peer listeners","listen-peer-urls":["http://127.0.0.1:51845"]}
{"level":"info","ts":"2024-04-18T14:09:11.722909+0800","caller":"embed/etcd.go:130","msg":"configuring client listeners","listen-client-urls":["http://127.0.0.1:51846"]}
{"level":"info","ts":"2024-04-18T14:09:11.722972+0800","caller":"embed/etcd.go:309","msg":"starting an etcd server","etcd-version":"3.4.31","git-sha":"Not provided (use ./build instead of go build)","go-version":"go1.21.1","go-os":"darwin","go-arch":"arm64","max-cpu-set":1,"max-cpu-available":12,"member-initialized":false,"name":"test_etcd_5000","data-dir":"/var/folders/jz/fpm69cgn4wg1qbk6f1bfmsgr0000gn/T/TestRandomKillEtcd14626914/002","wal-dir":"","wal-dir-dedicated":"","member-dir":"/var/folders/jz/fpm69cgn4wg1qbk6f1bfmsgr0000gn/T/TestRandomKillEtcd14626914/002/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":100000,"max-wals":5,"max-snapshots":5,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://127.0.0.1:51845"],"listen-peer-urls":["http://127.0.0.1:51845"],"advertise-client-urls":["http://127.0.0.1:51846"],"listen-client-urls":["http://127.0.0.1:51846"],"listen-metrics-urls":[],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"test_etcd_5000=http://127.0.0.1:51738,test_etcd_5000=http://127.0.0.1:51845","initial-cluster-state":"existing","initial-cluster-token":"etcd-cluster","quota-backend-bytes":2147483648,"max-request-bytes":1572864,"max-concurrent-streams":4294967295,"pre-vote":false,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","auto-compaction-mode":"","auto-compaction-retention":"0s","auto-compaction-interval":"0s","discovery-url":"","discovery-proxy":""}
2024-04-18 14:09:11.723021 W | pkg/fileutil: check file permission: directory "/var/folders/jz/fpm69cgn4wg1qbk6f1bfmsgr0000gn/T/TestRandomKillEtcd14626914/002" exist, but the permission is "drwxr-xr-x". The recommended permission is "-rwx------" to prevent possible unprivileged access to the data.
{"level":"info","ts":"2024-04-18T14:09:11.745736+0800","caller":"etcdserver/backend.go:80","msg":"opened backend db","path":"/var/folders/jz/fpm69cgn4wg1qbk6f1bfmsgr0000gn/T/TestRandomKillEtcd14626914/002/member/snap/db","took":"8.748417ms"}
{"level":"warn","ts":"2024-04-18T14:09:11.771637+0800","caller":"etcdserver/util.go:167","msg":"apply request took too long","took":"101.382708ms","expected-duration":"100ms","prefix":"read-only range ","request":"key:\"health\" ","response":"","error":"context deadline exceeded"}
{"level":"info","ts":"2024-04-18T14:09:11.771663+0800","caller":"traceutil/trace.go:152","msg":"trace[591211803] range","detail":"{range_begin:health; range_end:; }","duration":"101.415208ms","start":"2024-04-18T14:09:11.670242+0800","end":"2024-04-18T14:09:11.771657+0800","steps":["trace[591211803] 'agreement among raft nodes before linearized reading'  (duration: 101.381583ms)"]}
[2024/04/18 14:09:11.771 +08:00] [WARN] [retry_interceptor.go:62] ["retrying of unary invoker failed"] [target=etcd-endpoints://0x140004c2960/127.0.0.1:51739] [attempt=0] [error="rpc error: code = DeadlineExceeded desc = context deadline exceeded"]
[2024/04/18 14:09:11.771 +08:00] [WARN] [health_checker.go:194] ["etcd endpoint is unhealthy"] [endpoint=http://127.0.0.1:51739] [took=101.65425ms] [source=default-etcd-client]
{"level":"info","ts":"2024-04-18T14:09:11.781987+0800","caller":"embed/etcd.go:375","msg":"closing etcd server","name":"test_etcd_5000","data-dir":"/var/folders/jz/fpm69cgn4wg1qbk6f1bfmsgr0000gn/T/TestRandomKillEtcd14626914/002","advertise-peer-urls":["http://127.0.0.1:51845"],"advertise-client-urls":["http://127.0.0.1:51846"]}
{"level":"info","ts":"2024-04-18T14:09:11.782029+0800","caller":"embed/etcd.go:379","msg":"closed etcd server","name":"test_etcd_5000","data-dir":"/var/folders/jz/fpm69cgn4wg1qbk6f1bfmsgr0000gn/T/TestRandomKillEtcd14626914/002","advertise-peer-urls":["http://127.0.0.1:51845"],"advertise-client-urls":["http://127.0.0.1:51846"]}
--- FAIL: TestRandomKillEtcd (1.17s)
    testutil.go:117:
        	Error Trace:	/Users/nolouch/program/go/pd/pkg/utils/etcdutil/testutil.go:117
        	            				/Users/nolouch/program/go/pd/pkg/utils/etcdutil/testutil.go:77
        	            				/Users/nolouch/program/go/pd/pkg/utils/etcdutil/etcdutil_test.go:249
        	Error:      	Received unexpected error:
        	            	cannot fetch cluster info from peer urls: could not retrieve cluster information from the given URLs
        	Test:       	TestRandomKillEtcd
FAIL

Which jobs are failing

CI link

local env

Reason for failure (if possible)

Anything else

@nolouch nolouch added the type/ci The issue is related to CI. label Apr 18, 2024
@rleungx
Copy link
Member

rleungx commented Apr 18, 2024

Should be TestRandomKillEtcd?

@nolouch nolouch changed the title Flaky TestEtcdWithHangLeaderEnableCheck Flaky TestRandomKillEtcd Apr 19, 2024
ti-chi-bot bot pushed a commit that referenced this issue Apr 19, 2024
ref #7969, close #8091

Signed-off-by: husharp <jinhao.hu@pingcap.com>
nolouch pushed a commit to nolouch/pd that referenced this issue Apr 29, 2024
ref tikv#7969, close tikv#8091

Signed-off-by: husharp <jinhao.hu@pingcap.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/ci The issue is related to CI.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants