pd leader did not change after exec “/pd-ctl member leader transfer tc-pd-0” #8225

Closed
Lily2025 opened this issue May 30, 2024 · 2 comments · Fixed by #8226
@Lily2025

Bug Report

What did you do?

1. Transfer the PD leader from tc-pd-2 to tc-pd-0 with pd-ctl.
Old leader: tc-pd-2
[2024/05/29 19:11:42.631 +08:00] /pd-ctl member leader transfer tc-pd-0
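
One way to confirm the outcome of this step is to read back the current leader and compare it with the transfer target. The following is a minimal sketch, assuming the leader is available as a JSON object with `name` and `member_id` fields (the payload shape is an assumption, as are the helper names `leaderName` and `transferSucceeded`). The sample payload uses the member name and ID that appear in the pd-0 log in this report.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// pdMember holds only the fields this sketch needs.
// The JSON field names are an assumption about the leader payload.
type pdMember struct {
	Name     string `json:"name"`
	MemberID uint64 `json:"member_id"`
}

// leaderName extracts the leader's name from a JSON payload.
func leaderName(payload []byte) (string, error) {
	var m pdMember
	if err := json.Unmarshal(payload, &m); err != nil {
		return "", fmt.Errorf("decode leader payload: %w", err)
	}
	if m.Name == "" {
		return "", errors.New("leader payload has no name")
	}
	return m.Name, nil
}

// transferSucceeded reports whether the observed leader is the transfer target.
func transferSucceeded(payload []byte, target string) (bool, error) {
	name, err := leaderName(payload)
	if err != nil {
		return false, err
	}
	return name == target, nil
}

func main() {
	// Illustrative payload: tc-pd-2 is still the leader after the transfer attempt.
	payload := []byte(`{"name":"tc-pd-2","member_id":1848324175643653017}`)

	ok, err := transferSucceeded(payload, "tc-pd-0")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("transfer to tc-pd-0 succeeded: %v\n", ok)
}
```

With the sample payload, the check reports that the transfer did not succeed, which matches the behavior described in this report.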

What did you expect to see?

The leader transfer succeeds and tc-pd-0 becomes the PD leader.

What did you see instead?

The PD leader did not change after executing “/pd-ctl member leader transfer tc-pd-0”: tc-pd-2 was still the PD leader.

pd-0 logs:
[2024/05/29 19:11:43.071 +08:00] [INFO] [leadership.go:374] ["current leadership is deleted"] [revision=836215] [leader-key=/pd/7369932708090718560/leader] [purpose="leader election"]
[2024/05/29 19:11:43.071 +08:00] [ERROR] [client.go:162] ["region sync with leader meet error"] [error="[PD:grpc:ErrGRPCRecv]receive response error: rpc error: code = Canceled desc = context canceled"]
[2024/05/29 19:11:44.072 +08:00] [INFO] [server.go:1669] ["pd leader has changed, try to re-campaign a pd leader"]
[2024/05/29 19:11:44.072 +08:00] [INFO] [server.go:1706] ["start to campaign PD leader"] [campaign-leader-name=tc-pd-0]
[2024/05/29 19:11:44.072 +08:00] [INFO] [member.go:355] ["try to resign etcd leader to next pd-server"] [from=tc-pd-0] [to=]
[2024/05/29 19:11:44.072 +08:00] [INFO] [server.go:1519] ["leadership transfer starting"] [local-member-id=8f262d98e424de82] [current-leader-member-id=8f262d98e424de82] [transferee-member-id=19a6910b7f01b399]
[2024/05/29 19:11:44.072 +08:00] [INFO] [raft.go:1254] ["8f262d98e424de82 [term 6] starts to transfer leadership to 19a6910b7f01b399"]
[2024/05/29 19:11:44.072 +08:00] [INFO] [raft.go:1260] ["8f262d98e424de82 sends MsgTimeoutNow to 19a6910b7f01b399 immediately as 19a6910b7f01b399 already has up-to-date log"]
[2024/05/29 19:11:44.073 +08:00] [INFO] [raft.go:865] ["8f262d98e424de82 [term: 6] received a MsgVote message with higher term from 19a6910b7f01b399 [term: 7]"]
[2024/05/29 19:11:44.073 +08:00] [INFO] [raft.go:706] ["8f262d98e424de82 became follower at term 7"]
[2024/05/29 19:11:44.073 +08:00] [INFO] [raft.go:966] ["8f262d98e424de82 [logterm: 6, index: 1066131, vote: 0] cast MsgVote for 19a6910b7f01b399 [logterm: 6, index: 1066131] at term 7"]
[2024/05/29 19:11:44.073 +08:00] [INFO] [node.go:333] ["raft.node: 8f262d98e424de82 lost leader 8f262d98e424de82 at term 7"]
[2024/05/29 19:11:44.075 +08:00] [INFO] [node.go:327] ["raft.node: 8f262d98e424de82 elected leader 19a6910b7f01b399 at term 7"]
[2024/05/29 19:11:44.573 +08:00] [INFO] [server.go:1540] ["leadership transfer finished"] [local-member-id=8f262d98e424de82] [old-leader-member-id=8f262d98e424de82] [new-leader-member-id=19a6910b7f01b399] [took=500.94386ms]
[2024/05/29 19:11:44.573 +08:00] [ERROR] [server.go:1712] ["campaign PD leader meets error due to etcd error"] [campaign-leader-name=tc-pd-0] [error="[PD:server:ErrLeaderFrequentlyChange]leader tc-pd-0 frequently changed, leader-key is [/pd/7369932708090718560/leader]"]
[2024/05/29 19:11:44.577 +08:00] [INFO] [server.go:1878] ["server enable region storage"]
[2024/05/29 19:11:44.577 +08:00] [INFO] [server.go:1665] ["start to watch pd leader"] [pd-leader="name:\"tc-pd-2\" member_id:1848324175643653017 peer_urls:\"http://tc-pd-2.tc-pd-peer.testbed-glh-5rtws.svc:2380\" client_urls:\"http://tc-pd-2.tc-pd-peer.testbed-glh-5rtws.svc:2379\" "]
[2024/05/29 19:11:44.577 +08:00] [INFO] [client.go:104] ["region syncer start load region"]
[2024/05/29 19:11:44.577 +08:00] [INFO] [client.go:107] ["region syncer finished load regions"] [time-cost=700ns]
[2024/05/29 19:11:44.579 +08:00] [INFO] [leadership.go:317] ["watch channel is created"] [revision=836216] [leader-key=/pd/7369932708090718560/leader] [purpose="leader election"]
[2024/05/29 19:11:44.580 +08:00] [INFO] [client.go:157] ["server starts to synchronize with leader"] [server=tc-pd-0] [leader=tc-pd-2] [request-index=7850]
[2024/05/29 19:27:40.578 +08:00] [INFO] [raft.go:1348] ["8f262d98e424de82 [term 7] received MsgTimeoutNow from 19a6910b7f01b399 and starts an election to get leadership."]
[2024/05/29 19:27:40.578 +08:00] [INFO] [raft.go:719] ["8f262d98e424de82 became candidate at term 8"]
[2024/05/29 19:27:40.578 +08:00] [INFO] [raft.go:830] ["8f262d98e424de82 received MsgVoteResp from 8f262d98e424de82 at term 8"]
[2024/05/29 19:27:40.578 +08:00] [INFO] [raft.go:817] ["8f262d98e424de82 [logterm: 7, index: 1067227] sent MsgVote request to 19a6910b7f01b399 at term 8"]
[2024/05/29 19:27:40.578 +08:00] [INFO] [raft.go:817] ["8f262d98e424de82 [logterm: 7, index: 1067227] sent MsgVote request to 7ef08c298097903a at term 8"]
[2024/05/29 19:27:40.578 +08:00] [INFO] [node.go:333] ["raft.node: 8f262d98e424de82 lost leader 19a6910b7f01b399 at term 8"]
[2024/05/29 19:27:40.579 +08:00] [INFO] [raft.go:830] ["8f262d98e424de82 received MsgVoteResp from 19a6910b7f01b399 at term 8"]
[2024/05/29 19:27:40.579 +08:00] [INFO] [raft.go:1295] ["8f262d98e424de82 has received 2 MsgVoteResp votes and 0 vote rejections"]
[2024/05/29 19:27:40.579 +08:00] [INFO] [raft.go:771] ["8f262d98e424de82 became leader at term 8"]
[2024/05/29 19:27:40.579 +08:00] [INFO] [node.go:327] ["raft.node: 8f262d98e424de82 elected leader 8f262d98e424de82 at term 8"]
[2024/05/29 19:27:40.620 +08:00] [INFO] [leadership.go:374] ["current leadership is deleted"] [revision=836974] [leader-key=/pd/7369932708090718560/leader] [purpose="leader election"]
[2024/05/29 19:27:40.620 +08:00] [ERROR] [client.go:162] ["region sync with leader meet error"] [error="[PD:grpc:ErrGRPCRecv]receive response error: rpc error: code = Canceled desc = context canceled"]
[2024/05/29 19:27:41.620 +08:00] [INFO] [server.go:1669] ["pd leader has changed, try to re-campaign a pd leader"]

pd-2 logs:
[2024/05/29 19:11:43.075 +08:00] [INFO] [runner.go:122] ["stopping async task runner"] [name=heartbeat-async]
[2024/05/29 19:11:43.075 +08:00] [INFO] [runner.go:122] ["stopping async task runner"] [name=misc-async]
[2024/05/29 19:11:43.075 +08:00] [INFO] [runner.go:122] ["stopping async task runner"] [name=log-async]
[2024/05/29 19:11:43.075 +08:00] [INFO] [cluster.go:769] ["raft cluster is stopped"]
[2024/05/29 19:11:43.075 +08:00] [INFO] [tso.go:436] ["reset the timestamp in memory"] []
[2024/05/29 19:11:43.076 +08:00] [INFO] [server.go:1692] ["skip campaigning of pd leader and check later"] [server-name=tc-pd-2] [etcd-leader-id=10314982131224600194] [member-id=1848324175643653017]
[2024/05/29 19:11:43.279 +08:00] [INFO] [server.go:1692] ["skip campaigning of pd leader and check later"] [server-name=tc-pd-2] [etcd-leader-id=10314982131224600194] [member-id=1848324175643653017]
[2024/05/29 19:11:43.480 +08:00] [INFO] [server.go:1692] ["skip campaigning of pd leader and check later"] [server-name=tc-pd-2] [etcd-leader-id=10314982131224600194] [member-id=1848324175643653017]
[2024/05/29 19:11:43.550 +08:00] [INFO] [server.go:1540] ["leadership transfer finished"] [local-member-id=19a6910b7f01b399] [old-leader-member-id=19a6910b7f01b399] [new-leader-member-id=8f262d98e424de82] [took=500.093938ms]
[2024/05/29 19:11:43.682 +08:00] [INFO] [server.go:1692] ["skip campaigning of pd leader and check later"] [server-name=tc-pd-2] [etcd-leader-id=10314982131224600194] [member-id=1848324175643653017]
[2024/05/29 19:11:43.884 +08:00] [INFO] [server.go:1692] ["skip campaigning of pd leader and check later"] [server-name=tc-pd-2] [etcd-leader-id=10314982131224600194] [member-id=1848324175643653017]
[2024/05/29 19:11:44.076 +08:00] [INFO] [raft.go:1348] ["19a6910b7f01b399 [term 6] received MsgTimeoutNow from 8f262d98e424de82 and starts an election to get leadership."]
[2024/05/29 19:11:44.076 +08:00] [INFO] [raft.go:719] ["19a6910b7f01b399 became candidate at term 7"]
[2024/05/29 19:11:44.076 +08:00] [INFO] [raft.go:830] ["19a6910b7f01b399 received MsgVoteResp from 19a6910b7f01b399 at term 7"]
[2024/05/29 19:11:44.076 +08:00] [INFO] [raft.go:817] ["19a6910b7f01b399 [logterm: 6, index: 1066131] sent MsgVote request to 7ef08c298097903a at term 7"]
[2024/05/29 19:11:44.076 +08:00] [INFO] [raft.go:817] ["19a6910b7f01b399 [logterm: 6, index: 1066131] sent MsgVote request to 8f262d98e424de82 at term 7"]
[2024/05/29 19:11:44.076 +08:00] [INFO] [node.go:333] ["raft.node: 19a6910b7f01b399 lost leader 8f262d98e424de82 at term 7"]
[2024/05/29 19:11:44.077 +08:00] [INFO] [raft.go:830] ["19a6910b7f01b399 received MsgVoteResp from 8f262d98e424de82 at term 7"]
[2024/05/29 19:11:44.077 +08:00] [INFO] [raft.go:1295] ["19a6910b7f01b399 has received 2 MsgVoteResp votes and 0 vote rejections"]
[2024/05/29 19:11:44.077 +08:00] [INFO] [raft.go:771] ["19a6910b7f01b399 became leader at term 7"]
[2024/05/29 19:11:44.077 +08:00] [INFO] [node.go:327] ["raft.node: 19a6910b7f01b399 elected leader 19a6910b7f01b399 at term 7"]
[2024/05/29 19:11:44.086 +08:00] [INFO] [server.go:1704] ["start to campaign PD leader"] [campaign-leader-name=tc-pd-2]
[2024/05/29 19:11:44.087 +08:00] [INFO] [lease.go:66] ["lease granted"] [lease-id=3718161029788048748] [lease-timeout=3] [purpose="leader election"]
[2024/05/29 19:11:44.088 +08:00] [INFO] [leadership.go:181] ["check campaign resp"] [resp="{\"header\":{\"cluster_id\":17846199262841830238,\"member_id\":10314982131224600194,\"revision\":836216,\"raft_term\":7},\"succeeded\":true,\"responses\":[{\"Response\":{\"ResponsePut\":{\"header\":{\"revision\":836216}}}}]}"]
[2024/05/29 19:11:44.088 +08:00] [INFO] [leadership.go:190] ["write leaderData to leaderPath ok"] [leader-key=/pd/7369932708090718560/leader] [purpose="leader election"]
[2024/05/29 19:11:44.088 +08:00] [INFO] [server.go:1730] ["campaign PD leader ok"] [campaign-leader-name=tc-pd-2]
[2024/05/29 19:11:44.088 +08:00] [INFO] [server.go:1738] ["initializing the global TSO allocator"]
[2024/05/29 19:11:44.088 +08:00] [INFO] [tso.go:160] ["start to sync timestamp"] []
[2024/05/29 19:11:44.089 +08:00] [INFO] [lease.go:155] ["start lease keep alive worker"] [interval=1s] [purpose="leader election"]
[2024/05/29 19:11:44.091 +08:00] [INFO] [tso.go:220] ["sync and save timestamp"] [] [last=2024/05/29 19:11:43.944 +08:00] [last-saved=0001/01/01 00:00:00.000 +00:00] [save=2024/05/29 19:11:47.090 +08:00] [next=2024/05/29 19:11:44.090 +08:00]
[2024/05/29 19:11:44.094 +08:00] [INFO] [server.go:1876] ["server enable region storage"]
[2024/05/29 19:11:44.095 +08:00] [INFO] [server.go:1770] ["triggering the leader callback functions"]
[2024/05/29 19:11:44.103 +08:00] [INFO] [manager.go:185] ["resource group manager finishes initialization"]
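
The log excerpts above can be scanned mechanically when comparing the two nodes. The sketch below assumes each entry starts with the `[time] [LEVEL] [file:line] ["message"]` header seen in these excerpts and that the timestamp itself is not split across lines. It extracts only that header and ignores the trailing fields. The names `logEntry`, `parseEntries`, and `filterLevel` are hypothetical helpers, not part of PD.

```go
package main

import (
	"fmt"
	"regexp"
)

// logEntry holds the header of one log entry; trailing [key=value] fields are ignored.
type logEntry struct {
	Time    string
	Level   string
	Source  string
	Message string
}

// entryRe matches the entry header: timestamp, level, file:line, and quoted message.
// It is not anchored to line starts, so several entries on one line are all found.
var entryRe = regexp.MustCompile(
	`\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{2}:\d{2})\]\s+` +
		`\[([A-Z]+)\]\s+\[([^\]\s]+:\d+)\]\s+\["((?:[^"\\]|\\.)*)"\]`)

// parseEntries returns the headers of all log entries found in text, in order.
func parseEntries(text string) []logEntry {
	var entries []logEntry
	for _, m := range entryRe.FindAllStringSubmatch(text, -1) {
		entries = append(entries, logEntry{Time: m[1], Level: m[2], Source: m[3], Message: m[4]})
	}
	return entries
}

// filterLevel keeps only the entries with the given level, for example "ERROR".
func filterLevel(entries []logEntry, level string) []logEntry {
	var out []logEntry
	for _, e := range entries {
		if e.Level == level {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	// Two entries copied from the pd-0 excerpt, joined on a single line.
	sample := `[2024/05/29 19:11:44.072 +08:00] [INFO] [server.go:1706] ["start to campaign PD leader"] [campaign-leader-name=tc-pd-0] ` +
		`[2024/05/29 19:11:44.573 +08:00] [ERROR] [server.go:1712] ["campaign PD leader meets error due to etcd error"] [campaign-leader-name=tc-pd-0]`

	for _, e := range filterLevel(parseEntries(sample), "ERROR") {
		fmt.Printf("%s %s %s\n", e.Time, e.Source, e.Message)
	}
}
```

Filtering both nodes' logs by level or by message text (for example, entries mentioning "campaign") gives a shorter timeline to compare side by side.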

What version of PD are you using (pd-server -V)?

./pd-server -V
Release Version: v8.2.0-alpha
Edition: Community
Git Commit Hash: 59e29cc
Git Branch: heads/refs/tags/v8.2.0-alpha
UTC Build Time: 2024-05-16 11:39:07
2024-05-29T18:56:27.485+0800

Lily2025 added the type/bug label (The issue is confirmed as a bug.) on May 30, 2024
@Lily2025
Author

/assign HuSharp
/severity major

@Lily2025
Author

/assign rleungx

ti-chi-bot bot pushed a commit that referenced this issue May 30, 2024
close #8225

Signed-off-by: husharp <jinhao.hu@pingcap.com>
ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this issue May 30, 2024
close tikv#8225

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
ti-chi-bot bot pushed a commit that referenced this issue May 30, 2024
close #8225

Signed-off-by: husharp <jinhao.hu@pingcap.com>

Co-authored-by: husharp <jinhao.hu@pingcap.com>
ti-chi-bot bot pushed a commit that referenced this issue May 30, 2024
close #8225

Signed-off-by: husharp <jinhao.hu@pingcap.com>

Co-authored-by: husharp <jinhao.hu@pingcap.com>