DFE: Can't start worker normally when met worker suicide for master failover #10752

Open
ywqzzy opened this issue Mar 11, 2024 · 1 comment
Labels
area/dm Issues or PRs related to DM. severity/moderate type/bug The issue is confirmed as a bug.

ywqzzy commented Mar 11, 2024

What did you do?

  1. master-0 fails over to master-1.
  2. The previous worker exits on its own ("commits suicide") after the failover.
  3. A new worker is started, but it stops shortly afterwards.
  4. Step 3 repeats indefinitely.

What did you expect to see?

The new worker starts normally and keeps running.

[2024/03/11 09:31:07.053 +00:00] [INFO] [dm_jobmaster.go:113] ["initializing the dm jobmaster components"] [job_id=341546-sync]
[2024/03/11 09:31:07.053 +00:00] [INFO] [dm_jobmaster.go:153] ["recovering the dm jobmaster"] [job_id=341546-sync]
[2024/03/11 09:31:03.842 +00:00] [INFO] [executor_manager.go:238] ["check alive"] [exec=dm-job-341546-tiflow-executor-0.tiflow-prod-1372813089454524673-eks-ap-northeast-1-40dc86fe-473285a9]
[2024/03/11 09:30:58.343 +00:00] [INFO] [executor_manager.go:238] ["check alive"] [exec=dm-job-341546-tiflow-executor-0.tiflow-prod-1372813089454524673-eks-ap-northeast-1-40dc86fe-473285a9]
[2024/03/11 09:30:53.342 +00:00] [INFO] [executor_manager.go:238] ["check alive"] [exec=dm-job-341546-tiflow-executor-0.tiflow-prod-1372813089454524673-eks-ap-northeast-1-40dc86fe-473285a9]
[2024/03/11 09:30:50.200 +00:00] [INFO] [jobmanager.go:724] ["on worker online"] [id=341546-sync]
[2024/03/11 09:30:50.127 +00:00] [INFO] [server.go:593] ["peer connection received"] [senderID=shared-1708333393-204b8949-tiflow-master-0.tiflow-prod-1372813089454524673-eks-ap-northeast-1-40dc86fe-1285336b] [senderAdvertiseAddr=shared-1708333393-204b8949-tiflow-master-0.shared-1708333393-204b8949-tiflow-master-peer.tiflow-prod-1372813089454524673-eks-ap-northeast-1-40dc86fe.svc:10240] [addr=10.250.173.69:37812] [epoch=1]
[2024/03/11 09:30:50.015 +00:00] [INFO] [server.go:593] ["peer connection received"] [senderID=dm-job-341546-tiflow-executor-0.tiflow-prod-1372813089454524673-eks-ap-northeast-1-40dc86fe-473285a9] [senderAdvertiseAddr=dm-job-341546-tiflow-executor-0.dm-job-341546-tiflow-executor-peer.tiflow-prod-1372813089454524673-eks-ap-northeast-1-40dc86fe.svc:10241] [addr=10.250.172.57:45594] [epoch=1]
[2024/03/11 09:30:47.843 +00:00] [INFO] [executor_manager.go:238] ["check alive"] [exec=dm-job-341546-tiflow-executor-0.tiflow-prod-1372813089454524673-eks-ap-northeast-1-40dc86fe-473285a9]
[2024/03/11 09:30:46.999 +00:00] [INFO] [worker_creator.go:143] ["Dispatch Worker succeeded"] [job_id=dataflow-engine-job-manager] [args="{\"ProjectInfo\":{},\"WorkerID\":\"341546-sync\",\"MasterID\":\"dataflow-engine-job-manager\",\"WorkerType\":4,\"WorkerConfig\":\"<base64-encoded DM task config omitted; it embeds database credentials and TLS keys>\",\"WorkerEpoch\":617}"]
[2024/03/11 09:30:46.998 +00:00] [INFO] [task_runner.go:199] ["Launching task"] [id=341546-sync] [runtime-task-count=1]
[2024/03/11 09:30:46.898 +00:00] [INFO] [dm_jobmaster.go:89] ["new dm jobmaster"] [job_id=341546-sync]
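
The excerpt above is listed newest first, which makes the startup sequence harder to follow. The following is a minimal Python sketch for reading such an excerpt in chronological order. It assumes every entry begins with the bracketed timestamp format seen in the lines above; the function names `entry_time` and `chronological` are illustrative and not part of tiflow. Lines without that leading timestamp are skipped.

```python
import re
from datetime import datetime

# Leading timestamp as it appears in the log lines above,
# e.g. "[2024/03/11 09:30:46.898 +00:00]".
TS_RE = re.compile(
    r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{2}:\d{2})\]"
)


def entry_time(line):
    """Return the entry's timestamp, or None if the line has no leading timestamp."""
    m = TS_RE.match(line)
    if m is None:
        return None
    return datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S.%f %z")


def chronological(lines):
    """Return the timestamped lines sorted oldest first (stable for equal timestamps)."""
    stamped = [(entry_time(line), line) for line in lines]
    stamped = [(ts, line) for ts, line in stamped if ts is not None]
    return [line for ts, line in sorted(stamped, key=lambda pair: pair[0])]


# Three entries copied from the excerpt above, in their original (newest-first) order.
sample = [
    '[2024/03/11 09:31:07.053 +00:00] [INFO] [dm_jobmaster.go:153] ["recovering the dm jobmaster"] [job_id=341546-sync]',
    '[2024/03/11 09:30:50.200 +00:00] [INFO] [jobmanager.go:724] ["on worker online"] [id=341546-sync]',
    '[2024/03/11 09:30:46.898 +00:00] [INFO] [dm_jobmaster.go:89] ["new dm jobmaster"] [job_id=341546-sync]',
]

for line in chronological(sample):
    print(line)
```

Read oldest first, the sampled entries show the jobmaster being created, the worker coming online, and the jobmaster entering recovery.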

What did you see instead?

[2024/03/10 09:44:22.448 +00:00] [WARN] [server.go:297] ["handler not found"] [topic=heartbeat-ping-341546-sync]
[2024/03/10 09:44:22.448 +00:00] [WARN] [server.go:499] ["topic handler returned error"] [error="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled"] [errorVerbose="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20221009092201-b66cddb77c32/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/errors@v0.11.5-0.20221009092201-b66cddb77c32/normalize.go:164\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister.func1\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:230\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:252\ngithub.com/pingcap/tiflow/pkg/p2p.(*MessageServer).run.func2\n\tgithub.com/pingcap/tiflow/pkg/p2p/server.go:275\nruntime.goexit\n\truntime/asm_amd64.s:1598"]
[2024/03/10 09:44:22.448 +00:00] [WARN] [server.go:297] ["handler not found"] [topic=worker-status-change-req-dataflow-engine-job-manager-341546-sync]
[2024/03/10 09:44:22.448 +00:00] [WARN] [server.go:499] ["topic handler returned error"] [error="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled"] [errorVerbose="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20221009092201-b66cddb77c32/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/errors@v0.11.5-0.20221009092201-b66cddb77c32/normalize.go:164\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister.func1\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:230\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:252\ngithub.com/pingcap/tiflow/pkg/p2p.(*MessageServer).run.func2\n\tgithub.com/pingcap/tiflow/pkg/p2p/server.go:275\nruntime.goexit\n\truntime/asm_amd64.s:1598"]
[2024/03/10 09:44:22.448 +00:00] [WARN] [server.go:297] ["handler not found"] [topic=heartbeat-pong-dataflow-engine-job-manager-341546-sync]
[2024/03/10 09:44:22.448 +00:00] [WARN] [server.go:499] ["topic handler returned error"] [error="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled"] [errorVerbose="[CDC:ErrWorkerPoolHandleCancelled]workerpool handle is cancelled\ngithub.com/pingcap/errors.AddStack\n\tgithub.com/pingcap/errors@v0.11.5-0.20221009092201-b66cddb77c32/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/errors@v0.11.5-0.20221009092201-b66cddb77c32/normalize.go:164\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister.func1\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:230\ngithub.com/pingcap/tiflow/pkg/workerpool.(*defaultEventHandle).GracefulUnregister\n\tgithub.com/pingcap/tiflow/pkg/workerpool/pool_impl.go:252\ngithub.com/pingcap/tiflow/pkg/p2p.(*MessageServer).run.func2\n\tgithub.com/pingcap/tiflow/pkg/p2p/server.go:275\nruntime.goexit\n\truntime/asm_amd64.s:1598"]
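
To summarise which message topics were rejected, the warnings above can be tallied by topic. This is a minimal Python sketch, assuming the `["handler not found"]` message and the `[topic=...]` field appear exactly as in the lines above; the helper name `topics_without_handler` is illustrative and not part of tiflow.

```python
import re
from collections import Counter

# The topic field as it appears in the warnings above, e.g. "[topic=heartbeat-ping-341546-sync]".
TOPIC_RE = re.compile(r"\[topic=([^\]]+)\]")


def topics_without_handler(lines):
    """Count "handler not found" warnings per topic."""
    counts = Counter()
    for line in lines:
        if '["handler not found"]' not in line:
            continue  # skip every other kind of entry
        m = TOPIC_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


# The three "handler not found" warnings copied from the excerpt above.
sample = [
    '[2024/03/10 09:44:22.448 +00:00] [WARN] [server.go:297] ["handler not found"] [topic=heartbeat-ping-341546-sync]',
    '[2024/03/10 09:44:22.448 +00:00] [WARN] [server.go:297] ["handler not found"] [topic=worker-status-change-req-dataflow-engine-job-manager-341546-sync]',
    '[2024/03/10 09:44:22.448 +00:00] [WARN] [server.go:297] ["handler not found"] [topic=heartbeat-pong-dataflow-engine-job-manager-341546-sync]',
]

for topic, count in sorted(topics_without_handler(sample).items()):
    print(count, topic)
```

In this excerpt the heartbeat ping, heartbeat pong, and worker status change topics for job 341546-sync each appear once.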

Versions of the cluster

DM version (run dmctl -V or dm-worker -V or dm-master -V):

7.5.0

Downstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

7.5.0

ywqzzy added the labels area/dm and type/bug on Mar 11, 2024.
ywqzzy changed the title from "can't start worker normally when met worker suicide for master failover" to "[DFE]can't start worker normally when met worker suicide for master failover" on Mar 11, 2024.
ywqzzy changed the title from "[DFE]can't start worker normally when met worker suicide for master failover" to "DFE: Can't start worker normally when met worker suicide for master failover" on Mar 11, 2024.

fubinzh commented Jun 18, 2024

/severity moderate
