Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM worker can't exit in CI #5089

Closed
Tracked by #2246
lance6716 opened this issue Apr 1, 2022 · 19 comments · Fixed by #5427 or #6213
Closed
Tracked by #2246

DM worker can't exit in CI #5089

lance6716 opened this issue Apr 1, 2022 · 19 comments · Fixed by #5427 or #6213
Assignees
Labels
area/dm Issues or PRs related to DM. component/test Unit tests and integration tests component. severity/minor type/bug The issue is confirmed as a bug.

Comments

@lance6716
Copy link
Contributor

Which jobs are flaking?

IT-ha_cases3_1

Which test(s) are flaking?

DM IT

Jenkins logs or GitHub Actions link

[2022-03-31T09:47:44.177Z] kill dm-worker2
[2022-03-31T09:47:44.177Z] kill: sending signal to 1344 failed: No such process
[2022-03-31T09:47:44.177Z] [Thu Mar 31 17:47:44 CST 2022] <<<<<< START DM-WORKER on port 8263, config: /home/jenkins/agent/workspace/dm_ghpr_integration_test/go/src/github.com/pingcap/tiflow/dm/tests/ha_cases3_1/conf/dm-worker2.toml >>>>>>
[2022-03-31T09:47:44.438Z] dmctl test cmd: "pause-task test"
[2022-03-31T09:47:47.719Z] got=3 expected=3
[2022-03-31T09:47:47.974Z] dmctl test cmd: "resume-task test"
[2022-03-31T09:47:47.974Z] got=3 expected=3
[2022-03-31T09:49:39.369Z] use sync_diff_inspector to check increment data
[2022-03-31T09:49:39.369Z] check diff successfully
[2022-03-31T09:49:39.369Z] [Thu Mar 31 17:49:29 CST 2022] <<<<<< finish test_isolate_master_and_worker >>>>>>
[2022-03-31T09:49:39.369Z] 5 dm-master alive
[2022-03-31T09:49:39.369Z] 6 dm-worker alive
[2022-03-31T09:49:39.369Z] 0 dm-syncer alive
...
[2022-03-31T09:50:11.859Z] wait process dm-worker.test exit...
[2022-03-31T09:50:12.794Z] wait process dm-worker.test exit...
[2022-03-31T09:50:13.728Z] wait process dm-worker.test exit...
[2022-03-31T09:50:14.661Z] wait process dm-worker.test exit...
[2022-03-31T09:50:14.661Z] process dm-worker.test didn't exit after 30 seconds

DM worker log :
worker.log

Anything else we need to know

  • Does this test exist for other branches as well?

  • Has there been a high frequency of failure lately?

@lance6716 lance6716 added the component/test Unit tests and integration tests component. label Apr 1, 2022
@lance6716 lance6716 added the area/dm Issues or PRs related to DM. label Apr 1, 2022
@lance6716

This comment was marked as resolved.

@lance6716
Copy link
Contributor Author

lance6716 commented May 15, 2022

[2022-05-15T12:16:29.803Z] [2022/05/15 18:42:42.885 +08:00] [INFO] [main.go:90] ["got signal to exit"] [signal=terminated]
...
[2022-05-15T12:16:29.804Z] [2022/05/15 18:42:43.172 +08:00] [WARN] [join.go:105] ["keepalive with master goroutine paused"] []

the log didn't contain "dm-worker exit"

cant_exit.txt from https://ci2.pingcap.net/blue/organizations/jenkins/dm_ghpr_integration_test/detail/dm_ghpr_integration_test/5696/pipeline/204

@lance6716
Copy link
Contributor Author

@D3Hunter
Copy link
Contributor

@D3Hunter D3Hunter reopened this May 19, 2022
@lance6716
Copy link
Contributor Author

[2022-06-13T05:01:11.943Z] process dm-worker.test didn't exit after 120 seconds, current processlist: jenkins     9647       0 10 12:58 ?        00:00:15 /home/jenkins/agent/workspace/dm_ghpr_integration_test/go/src/github.com/pingcap/tiflow/dm/bin/dm-worker.test -test.coverprofile=/tmp/dm_test/cov.openapi.worker.8262.1655096321.out DEVEL --worker-addr=0.0.0.0:8262 --advertise-addr=127.0.0.1:8262 --log-file=/tmp/dm_test/openapi/worker1/log/dm-worker.log -L=debug --config=/home/jenkins/agent/workspace/dm_ghpr_integration_test/go/src/github.com/pingcap/tiflow/dm/tests/openapi/conf/dm-worker1.toml
...
[2022-06-13T05:01:11.943Z] curl: (7) Failed connect to 127.0.0.1:8262; Connection refused

https://ci2.pingcap.net/blue/organizations/jenkins/dm_ghpr_integration_test/detail/dm_ghpr_integration_test/6687/pipeline

@lance6716
Copy link
Contributor Author

Forget to show STAT in process 😂

@D3Hunter
Copy link
Contributor

D3Hunter commented Jun 14, 2022

current processlist: jenkins 9647 0 10 12:58 ? 00:00:15 /home/jenkins/agent/w

seems a zombie process, but it's ppid is 0, should be 1 normally, rare but possible, see https://stackoverflow.com/questions/37577931/how-can-i-kill-a-zombie-process-whose-parent-is-pid-0

@lance6716
Copy link
Contributor Author

lance6716 commented Jun 14, 2022

[2022-06-14T09:22:50.852Z] process dm-worker.test didn't exit after 120 seconds, current processlist: jenkins   8752  9.1  0.4 8664040 985212 ?      Sl   17:20   0:13 /home/jenkins/agent/workspace/dm_ghpr_integration_test/go/src/github.com/pingcap/tiflow/dm/bin/dm-worker.test -test.coverprofile=/tmp/dm_test/cov.openapi.worker.8262.1655198421.out DEVEL --worker-addr=0.0.0.0:8262 --advertise-addr=127.0.0.1:8262 --log-file=/tmp/dm_test/openapi/worker1/log/dm-worker.log -L=debug --config=/home/jenkins/agent/workspace/dm_ghpr_integration_test/go/src/github.com/pingcap/tiflow/dm/tests/openapi/conf/dm-worker1.toml

https://ci2.pingcap.net/blue/organizations/jenkins/dm_ghpr_integration_test/detail/dm_ghpr_integration_test/6748/pipeline

S    interruptible sleep (waiting for an event to complete)
l    is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)

@lance6716
Copy link
Contributor Author

output.txt

this is dumped from macOS, the process stat is S.

@lance6716
Copy link
Contributor Author

output.txt

this is dumped from macOS, the process stat is S.

syncer is still running, not knowing the worker should be canceled.

@buchuitoudegou
Copy link
Contributor

 process dm-worker.test didn't exit after 120 seconds, current processlist: jenkins     9244 11.3  0.5 8344328 1105780 ?     Sl   13:06   0:16 /home/jenkins/agent/workspace/dm_ghpr_integration_test/go/src/github.com/pingcap/tiflow/dm/bin/dm-worker.test -test.coverprofile=/tmp/dm_test/cov.openapi.worker.8263.1656565586.out DEVEL --worker-addr=0.0.0.0:8263 --advertise-addr=127.0.0.1:8263 --log-file=/tmp/dm_test/openapi/worker2/log/dm-worker.log -L=debug --

https://ci2.pingcap.net/blue/organizations/jenkins/dm_ghpr_integration_test/detail/dm_ghpr_integration_test/7331/pipeline

ti-chi-bot added a commit that referenced this issue Jul 4, 2022
@D3Hunter
Copy link
Contributor

D3Hunter commented Jul 4, 2022

met again
https://ci2.pingcap.net/blue/organizations/jenkins/dm_ghpr_integration_test/detail/dm_ghpr_integration_test/7418/pipeline

should unify process exit check check_process_exit to wait_process_exit

@lance6716
Copy link
Contributor Author

lance6716 commented Jul 8, 2022

goroutine stack by dlv

(dlv) grs -t
  Goroutine 1 - User: /usr/lib/go/src/testing/testing.go:1487 testing.(*T).Run (0x195fd10) [chan receive 460277h24m26.932054638s]
         0  0x00000000017b40b6 in runtime.gopark
             at /usr/lib/go/src/runtime/proc.go:362
         1  0x000000000177cd05 in runtime.chanrecv
             at /usr/lib/go/src/runtime/chan.go:577
         2  0x000000000177c798 in runtime.chanrecv1
             at /usr/lib/go/src/runtime/chan.go:440
         3  0x000000000195fd10 in testing.(*T).Run
             at /usr/lib/go/src/testing/testing.go:1487
         4  0x000000000196325a in testing.runTests.func1
             at /usr/lib/go/src/testing/testing.go:1839
         5  0x000000000195e654 in testing.tRunner
             at /usr/lib/go/src/testing/testing.go:1439
         6  0x00000000019630a5 in testing.runTests
             at /usr/lib/go/src/testing/testing.go:1837
         7  0x0000000001960c92 in testing.(*M).Run
             at /usr/lib/go/src/testing/testing.go:1719
         8  0x0000000005fea8ca in main.main
             at _testmain.go:703
         9  0x00000000017b3cf2 in runtime.main
             at /usr/lib/go/src/runtime/proc.go:250
        10  0x00000000017e8621 in runtime.goexit
             at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 2 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [force gc (idle) 460277h22m11.428647095s]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x00000000017b3f4d in runtime.goparkunlock
            at /usr/lib/go/src/runtime/proc.go:367
        2  0x00000000017b3f4d in runtime.forcegchelper
            at /usr/lib/go/src/runtime/proc.go:301
        3  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 3 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [GC sweep wait]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x000000000179d9d7 in runtime.goparkunlock
            at /usr/lib/go/src/runtime/proc.go:367
        2  0x000000000179d9d7 in runtime.bgsweep
            at /usr/lib/go/src/runtime/mgcsweep.go:297
        3  0x0000000001793586 in runtime.gcenable.func1
            at /usr/lib/go/src/runtime/mgc.go:177
        4  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 4 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [GC scavenge wait]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x000000000179b7e5 in runtime.goparkunlock
            at /usr/lib/go/src/runtime/proc.go:367
        2  0x000000000179b7e5 in runtime.bgscavenge
            at /usr/lib/go/src/runtime/mgcscavenge.go:364
        3  0x0000000001793526 in runtime.gcenable.func2
            at /usr/lib/go/src/runtime/mgc.go:178
        4  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 5 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [finalizer wait]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x00000000017925f3 in runtime.goparkunlock
            at /usr/lib/go/src/runtime/proc.go:367
        2  0x00000000017925f3 in runtime.runfinq
            at /usr/lib/go/src/runtime/mfinal.go:177
        3  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 6 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [GC worker (idle)]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x0000000001795665 in runtime.gcBgMarkWorker
            at /usr/lib/go/src/runtime/mgc.go:1207
        2  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 7 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [GC worker (idle)]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x0000000001795665 in runtime.gcBgMarkWorker
            at /usr/lib/go/src/runtime/mgc.go:1207
        2  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 8 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [GC worker (idle)]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x0000000001795665 in runtime.gcBgMarkWorker
            at /usr/lib/go/src/runtime/mgc.go:1207
        2  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 18 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [GC worker (idle)]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x0000000001795665 in runtime.gcBgMarkWorker
            at /usr/lib/go/src/runtime/mgc.go:1207
        2  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 19 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [GC worker (idle)]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x0000000001795665 in runtime.gcBgMarkWorker
            at /usr/lib/go/src/runtime/mgc.go:1207
        2  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 20 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [GC worker (idle)]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x0000000001795665 in runtime.gcBgMarkWorker
            at /usr/lib/go/src/runtime/mgc.go:1207
        2  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 21 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [GC worker (idle)]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x0000000001795665 in runtime.gcBgMarkWorker
            at /usr/lib/go/src/runtime/mgc.go:1207
        2  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 22 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [GC worker (idle)]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x0000000001795665 in runtime.gcBgMarkWorker
            at /usr/lib/go/src/runtime/mgc.go:1207
        2  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 23 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [GC worker (idle)]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x0000000001795665 in runtime.gcBgMarkWorker
            at /usr/lib/go/src/runtime/mgc.go:1207
        2  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 24 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [GC worker (idle)]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x0000000001795665 in runtime.gcBgMarkWorker
            at /usr/lib/go/src/runtime/mgc.go:1207
        2  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 34 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [GC worker (idle)]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x0000000001795665 in runtime.gcBgMarkWorker
            at /usr/lib/go/src/runtime/mgc.go:1207
        2  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 35 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [GC worker (idle)]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x0000000001795665 in runtime.gcBgMarkWorker
            at /usr/lib/go/src/runtime/mgc.go:1207
        2  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 50 - User: /home/lance/go/pkg/mod/github.com/golang/glog@v1.0.0/glog.go:882 github.com/golang/glog.(*loggingT).flushDaemon (0x39e876b) [chan receive 460277h22m11.430890977s]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x000000000177cd05 in runtime.chanrecv
            at /usr/lib/go/src/runtime/chan.go:577
        2  0x000000000177c7d8 in runtime.chanrecv2
            at /usr/lib/go/src/runtime/chan.go:445
        3  0x00000000039e876b in github.com/golang/glog.(*loggingT).flushDaemon
            at /home/lance/go/pkg/mod/github.com/golang/glog@v1.0.0/glog.go:882
        4  0x00000000039e4dda in github.com/golang/glog.init.0.func1
            at /home/lance/go/pkg/mod/github.com/golang/glog@v1.0.0/glog.go:410
        5  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 52 - User: /home/lance/go/pkg/mod/go.opencensus.io@v0.23.0/stats/view/worker.go:276 go.opencensus.io/stats/view.(*worker).start (0x4724917) [select]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x00000000017c55d2 in runtime.selectgo
            at /usr/lib/go/src/runtime/select.go:328
        2  0x0000000004724917 in go.opencensus.io/stats/view.(*worker).start
            at /home/lance/go/pkg/mod/go.opencensus.io@v0.23.0/stats/view/worker.go:276
        3  0x000000000472343a in go.opencensus.io/stats/view.init.0.func1
            at /home/lance/go/pkg/mod/go.opencensus.io@v0.23.0/stats/view/worker.go:34
        4  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 75 - User: /usr/lib/go/src/runtime/sigqueue.go:151 os/signal.signal_recv (0x17e44cf) (thread 892018)
        0  0x00000000017eac23 in runtime.futex
            at /usr/lib/go/src/runtime/sys_linux_amd64.s:553
        1  0x00000000017ad3b6 in runtime.futexsleep
            at /usr/lib/go/src/runtime/os_linux.go:66
        2  0x0000000001782ca5 in runtime.notetsleep_internal
            at /usr/lib/go/src/runtime/lock_futex.go:182
        3  0x0000000001782dc5 in runtime.notetsleepg
            at /usr/lib/go/src/runtime/lock_futex.go:236
        4  0x00000000017e44cf in os/signal.signal_recv
            at /usr/lib/go/src/runtime/sigqueue.go:151
        5  0x0000000001a24845 in os/signal.loop
            at /usr/lib/go/src/os/signal/signal_unix.go:23
        6  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 83 - User: ./dm/cmd/dm-worker/main_test.go:70 github.com/pingcap/tiflow/dm/cmd/dm-worker.TestRunMain (0x5fe5a9c) [select 460277h24m26.936209329s]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x00000000017c55d2 in runtime.selectgo
            at /usr/lib/go/src/runtime/select.go:328
        2  0x0000000005fe5a9c in github.com/pingcap/tiflow/dm/cmd/dm-worker.TestRunMain
            at ./dm/cmd/dm-worker/main_test.go:70
        3  0x000000000195e654 in testing.tRunner
            at /usr/lib/go/src/testing/testing.go:1439
        4  0x000000000195fe08 in testing.(*T).Run.func1
            at /usr/lib/go/src/testing/testing.go:1486
        5  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 84 - User: /usr/lib/go/src/runtime/sema.go:71 sync.runtime_SemacquireMutex (0x17e3e85) [semacquire 460277h24m11.444348431s]
         0  0x00000000017b40b6 in runtime.gopark
             at /usr/lib/go/src/runtime/proc.go:362
         1  0x00000000017c6af3 in runtime.goparkunlock
             at /usr/lib/go/src/runtime/proc.go:367
         2  0x00000000017c6af3 in runtime.semacquire1
             at /usr/lib/go/src/runtime/sema.go:144
         3  0x00000000017e3e85 in sync.runtime_SemacquireMutex
             at /usr/lib/go/src/runtime/sema.go:71
         4  0x000000000180ff1d in sync.(*Mutex).lockSlow
             at /usr/lib/go/src/sync/mutex.go:162
         5  0x000000000180fc25 in sync.(*Mutex).Lock
             at /usr/lib/go/src/sync/mutex.go:81
         6  0x0000000005fae73e in github.com/pingcap/tiflow/dm/dm/worker.(*Server).doClose
             at ./dm/dm/worker/server.go:454
         7  0x0000000005faec05 in github.com/pingcap/tiflow/dm/dm/worker.(*Server).Close
             at ./dm/dm/worker/server.go:484
         8  0x0000000005fe51a5 in github.com/pingcap/tiflow/dm/cmd/dm-worker.main
             at ./dm/cmd/dm-worker/main.go:97
         9  0x0000000005fe5e6c in github.com/pingcap/tiflow/dm/cmd/dm-worker.TestRunMain.func3
             at ./dm/cmd/dm-worker/main_test.go:66
        10  0x00000000017e8621 in runtime.goexit
             at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 85 - User: /home/lance/go/pkg/mod/gopkg.in/natefinch/lumberjack.v2@v2.0.0/lumberjack.go:379 gopkg.in/natefinch/lumberjack%2ev2.(*Logger).millRun (0x1e3bf65) [chan receive 460277h24m26.937234569s]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x000000000177cd05 in runtime.chanrecv
            at /usr/lib/go/src/runtime/chan.go:577
        2  0x000000000177c7d8 in runtime.chanrecv2
            at /usr/lib/go/src/runtime/chan.go:445
        3  0x0000000001e3bf65 in gopkg.in/natefinch/lumberjack%2ev2.(*Logger).millRun
            at /home/lance/go/pkg/mod/gopkg.in/natefinch/lumberjack.v2@v2.0.0/lumberjack.go:379
        4  0x0000000001e3c1ba in gopkg.in/natefinch/lumberjack%2ev2.(*Logger).mill.func1.1
            at /home/lance/go/pkg/mod/gopkg.in/natefinch/lumberjack.v2@v2.0.0/lumberjack.go:390
        5  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 86 - User: /usr/lib/go/src/runtime/proc.go:362 runtime.gopark (0x17b40b6) [select 460277h24m26.937472422s]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x00000000017c55d2 in runtime.selectgo
            at /usr/lib/go/src/runtime/select.go:328
        2  0x00000000017c9ddb in runtime.ensureSigM.func1
            at /usr/lib/go/src/runtime/signal_unix.go:973
        3  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 87 - User: /usr/lib/go/src/runtime/sema.go:56 sync.runtime_Semacquire (0x17e3d65) [semacquire 460277h24m11.445530348s]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x00000000017c6af3 in runtime.goparkunlock
            at /usr/lib/go/src/runtime/proc.go:367
        2  0x00000000017c6af3 in runtime.semacquire1
            at /usr/lib/go/src/runtime/sema.go:144
        3  0x00000000017e3d65 in sync.runtime_Semacquire
            at /usr/lib/go/src/runtime/sema.go:56
        4  0x000000000181257c in sync.(*WaitGroup).Wait
            at /usr/lib/go/src/sync/waitgroup.go:136
        5  0x0000000005fae7f8 in github.com/pingcap/tiflow/dm/dm/worker.(*Server).doClose
            at ./dm/dm/worker/server.go:462
        6  0x0000000005faec05 in github.com/pingcap/tiflow/dm/dm/worker.(*Server).Close
            at ./dm/dm/worker/server.go:484
        7  0x0000000005fe5548 in github.com/pingcap/tiflow/dm/cmd/dm-worker.main.func2
            at ./dm/cmd/dm-worker/main.go:91
        8  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 126 - User: /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/balancer_conn_wrappers.go:112 google.golang.org/grpc.(*ccBalancerWrapper).watcher (0x210e6f0) [select]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x00000000017c55d2 in runtime.selectgo
            at /usr/lib/go/src/runtime/select.go:328
        2  0x000000000210e6f0 in google.golang.org/grpc.(*ccBalancerWrapper).watcher
            at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/balancer_conn_wrappers.go:112
        3  0x000000000210e55a in google.golang.org/grpc.newCCBalancerWrapper.func1
            at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/balancer_conn_wrappers.go:73
        4  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 128 - User: /home/lance/go/pkg/mod/go.etcd.io/etcd/client/v3@v3.5.2/client.go:199 go.etcd.io/etcd/client/v3.(*Client).autoSync (0x2acf649) [select]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x00000000017c55d2 in runtime.selectgo
            at /usr/lib/go/src/runtime/select.go:328
        2  0x0000000002acf649 in go.etcd.io/etcd/client/v3.(*Client).autoSync
            at /home/lance/go/pkg/mod/go.etcd.io/etcd/client/v3@v3.5.2/client.go:199
        3  0x0000000002ad45fa in go.etcd.io/etcd/client/v3.newClient.func5
            at /home/lance/go/pkg/mod/go.etcd.io/etcd/client/v3@v3.5.2/client.go:444
        4  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 162 - User: /usr/lib/go/src/runtime/sema.go:71 sync.runtime_SemacquireMutex (0x17e3e85) [semacquire 460277h24m11.446358812s]
         0  0x00000000017b40b6 in runtime.gopark
             at /usr/lib/go/src/runtime/proc.go:362
         1  0x00000000017c6af3 in runtime.goparkunlock
             at /usr/lib/go/src/runtime/proc.go:367
         2  0x00000000017c6af3 in runtime.semacquire1
             at /usr/lib/go/src/runtime/sema.go:144
         3  0x00000000017e3e85 in sync.runtime_SemacquireMutex
             at /usr/lib/go/src/runtime/sema.go:71
         4  0x000000000180ff1d in sync.(*Mutex).lockSlow
             at /usr/lib/go/src/sync/mutex.go:162
         5  0x000000000180fc25 in sync.(*Mutex).Lock
             at /usr/lib/go/src/sync/mutex.go:81
         6  0x0000000005faf845 in github.com/pingcap/tiflow/dm/dm/worker.(*Server).stopSourceWorker
             at ./dm/dm/worker/server.go:545
         7  0x0000000005fa354e in github.com/pingcap/tiflow/dm/dm/worker.(*Server).KeepAlive
             at ./dm/dm/worker/join.go:111
         8  0x0000000005faad7c in github.com/pingcap/tiflow/dm/dm/worker.(*Server).doStartKeepAlive
             at ./dm/dm/worker/server.go:259
         9  0x0000000005faacba in github.com/pingcap/tiflow/dm/dm/worker.(*Server).startKeepAlive.func1
             at ./dm/dm/worker/server.go:254
        10  0x00000000017e8621 in runtime.goexit
             at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 178 - User: /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/http2_client.go:1572 google.golang.org/grpc/internal/transport.(*http2Client).keepalive (0x2059614) [select]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x00000000017c55d2 in runtime.selectgo
            at /usr/lib/go/src/runtime/select.go:328
        2  0x0000000002059614 in google.golang.org/grpc/internal/transport.(*http2Client).keepalive
            at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/http2_client.go:1572
        3  0x0000000002049c3a in google.golang.org/grpc/internal/transport.newHTTP2Client.func6
            at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/http2_client.go:360
        4  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 179 - User: /usr/lib/go/src/net/fd_posix.go:55 net.(*netFD).Read (0x1c0e551) [IO wait]
         0  0x00000000017b40b6 in runtime.gopark
             at /usr/lib/go/src/runtime/proc.go:362
         1  0x00000000017ac757 in runtime.netpollblock
             at /usr/lib/go/src/runtime/netpoll.go:522
         2  0x00000000017e1749 in internal/poll.runtime_pollWait
             at /usr/lib/go/src/runtime/netpoll.go:302
         3  0x0000000001857fbd in internal/poll.(*pollDesc).wait
             at /usr/lib/go/src/internal/poll/fd_poll_runtime.go:83
         4  0x0000000001859e15 in internal/poll.(*pollDesc).waitRead
             at /usr/lib/go/src/internal/poll/fd_poll_runtime.go:88
         5  0x0000000001859e15 in internal/poll.(*FD).Read
             at /usr/lib/go/src/internal/poll/fd_unix.go:167
         6  0x0000000001c0e551 in net.(*netFD).Read
             at /usr/lib/go/src/net/fd_posix.go:55
         7  0x0000000001c2eb91 in net.(*conn).Read
             at /usr/lib/go/src/net/net.go:183
         8  0x0000000001c54d25 in net.(*TCPConn).Read
             at <autogenerated>:1
         9  0x000000000196c4bb in bufio.(*Reader).Read
             at /usr/lib/go/src/bufio/bufio.go:236
        10  0x000000000184dd7e in io.ReadAtLeast
             at /usr/lib/go/src/io/io.go:331
        (truncated)
  Goroutine 180 - User: /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/controlbuf.go:407 google.golang.org/grpc/internal/transport.(*controlBuffer).get (0x2037a8d) [select]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x00000000017c55d2 in runtime.selectgo
            at /usr/lib/go/src/runtime/select.go:328
        2  0x0000000002037a8d in google.golang.org/grpc/internal/transport.(*controlBuffer).get
            at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/controlbuf.go:407
        3  0x000000000203867b in google.golang.org/grpc/internal/transport.(*loopyWriter).run
            at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/controlbuf.go:534
        4  0x0000000002049a06 in google.golang.org/grpc/internal/transport.newHTTP2Client.func3
            at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/http2_client.go:415
        5  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 195 - User: /usr/lib/go/src/runtime/sema.go:71 sync.runtime_SemacquireMutex (0x17e3e85) [semacquire 460277h24m11.44894535s]
         0  0x00000000017b40b6 in runtime.gopark
             at /usr/lib/go/src/runtime/proc.go:362
         1  0x00000000017c6af3 in runtime.goparkunlock
             at /usr/lib/go/src/runtime/proc.go:367
         2  0x00000000017c6af3 in runtime.semacquire1
             at /usr/lib/go/src/runtime/sema.go:144
         3  0x00000000017e3e85 in sync.runtime_SemacquireMutex
             at /usr/lib/go/src/runtime/sema.go:71
         4  0x000000000180ff1d in sync.(*Mutex).lockSlow
             at /usr/lib/go/src/sync/mutex.go:162
         5  0x000000000180fc25 in sync.(*Mutex).Lock
             at /usr/lib/go/src/sync/mutex.go:81
         6  0x0000000005faf365 in github.com/pingcap/tiflow/dm/dm/worker.(*Server).setSourceStatus
             at ./dm/dm/worker/server.go:521
         7  0x0000000005fb0215 in github.com/pingcap/tiflow/dm/dm/worker.(*Server).handleSourceBound
             at ./dm/dm/worker/server.go:575
         8  0x0000000005facdc9 in github.com/pingcap/tiflow/dm/dm/worker.(*Server).observeSourceBound
             at ./dm/dm/worker/server.go:383
         9  0x0000000005faa6cc in github.com/pingcap/tiflow/dm/dm/worker.(*Server).Start.func1.3
             at ./dm/dm/worker/server.go:181
        10  0x0000000005faa579 in github.com/pingcap/tiflow/dm/dm/worker.(*Server).Start.func1.10
             at ./dm/dm/worker/server.go:187
        (truncated)
  Goroutine 201 - User: /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/transport.go:191 google.golang.org/grpc/internal/transport.(*recvBufferReader).readClient (0x207009e) [select 460277h24m11.449826126s]
         0  0x00000000017b40b6 in runtime.gopark
             at /usr/lib/go/src/runtime/proc.go:362
         1  0x00000000017c55d2 in runtime.selectgo
             at /usr/lib/go/src/runtime/select.go:328
         2  0x000000000207009e in google.golang.org/grpc/internal/transport.(*recvBufferReader).readClient
             at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/transport.go:191
         3  0x000000000206fb45 in google.golang.org/grpc/internal/transport.(*recvBufferReader).Read
             at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/transport.go:171
         4  0x0000000002072446 in google.golang.org/grpc/internal/transport.(*transportReader).Read
             at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/transport.go:485
         5  0x000000000184dd7e in io.ReadAtLeast
             at /usr/lib/go/src/io/io.go:331
         6  0x0000000002072365 in io.ReadFull
             at /usr/lib/go/src/io/io.go:350
         7  0x0000000002072365 in google.golang.org/grpc/internal/transport.(*Stream).Read
             at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/transport.go:469
         8  0x0000000002127495 in google.golang.org/grpc.(*parser).recvMsg
             at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/rpc_util.go:559
         9  0x0000000002128228 in google.golang.org/grpc.recvAndDecompress
             at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/rpc_util.go:690
        10  0x0000000002128cba in google.golang.org/grpc.recv
             at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/rpc_util.go:756
        (truncated)
  Goroutine 202 - User: /home/lance/go/pkg/mod/go.etcd.io/etcd/client/v3@v3.5.2/lease.go:539 go.etcd.io/etcd/client/v3.(*lessor).deadlineLoop (0x2adeaa9) [select]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x00000000017c55d2 in runtime.selectgo
            at /usr/lib/go/src/runtime/select.go:328
        2  0x0000000002adeaa9 in go.etcd.io/etcd/client/v3.(*lessor).deadlineLoop
            at /home/lance/go/pkg/mod/go.etcd.io/etcd/client/v3@v3.5.2/lease.go:539
        3  0x0000000002adb87a in go.etcd.io/etcd/client/v3.(*lessor).KeepAlive.func1.2
            at /home/lance/go/pkg/mod/go.etcd.io/etcd/client/v3@v3.5.2/lease.go:300
        4  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 203 - User: /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/stream.go:355 google.golang.org/grpc.newClientStreamWithParams.func4 (0x2140b6c) [select 460277h24m26.944425119s]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x00000000017c55d2 in runtime.selectgo
            at /usr/lib/go/src/runtime/select.go:328
        2  0x0000000002140b6c in google.golang.org/grpc.newClientStreamWithParams.func4
            at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/stream.go:355
        3  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 204 - User: /home/lance/go/pkg/mod/go.etcd.io/etcd/client/v3@v3.5.2/lease.go:579 go.etcd.io/etcd/client/v3.(*lessor).sendKeepAliveLoop (0x2adef29) [select]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x00000000017c55d2 in runtime.selectgo
            at /usr/lib/go/src/runtime/select.go:328
        2  0x0000000002adef29 in go.etcd.io/etcd/client/v3.(*lessor).sendKeepAliveLoop
            at /home/lance/go/pkg/mod/go.etcd.io/etcd/client/v3@v3.5.2/lease.go:579
        3  0x0000000002addf59 in go.etcd.io/etcd/client/v3.(*lessor).resetRecv.func3
            at /home/lance/go/pkg/mod/go.etcd.io/etcd/client/v3@v3.5.2/lease.go:489
        4  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 280 - User: /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/http2_client.go:1572 google.golang.org/grpc/internal/transport.(*http2Client).keepalive (0x2059614) [select]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x00000000017c55d2 in runtime.selectgo
            at /usr/lib/go/src/runtime/select.go:328
        2  0x0000000002059614 in google.golang.org/grpc/internal/transport.(*http2Client).keepalive
            at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/http2_client.go:1572
        3  0x0000000002049c3a in google.golang.org/grpc/internal/transport.newHTTP2Client.func6
            at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/http2_client.go:360
        4  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
  Goroutine 281 - User: /usr/lib/go/src/net/fd_posix.go:55 net.(*netFD).Read (0x1c0e551) [IO wait]
         0  0x00000000017b40b6 in runtime.gopark
             at /usr/lib/go/src/runtime/proc.go:362
         1  0x00000000017ac757 in runtime.netpollblock
             at /usr/lib/go/src/runtime/netpoll.go:522
         2  0x00000000017e1749 in internal/poll.runtime_pollWait
             at /usr/lib/go/src/runtime/netpoll.go:302
         3  0x0000000001857fbd in internal/poll.(*pollDesc).wait
             at /usr/lib/go/src/internal/poll/fd_poll_runtime.go:83
         4  0x0000000001859e15 in internal/poll.(*pollDesc).waitRead
             at /usr/lib/go/src/internal/poll/fd_poll_runtime.go:88
         5  0x0000000001859e15 in internal/poll.(*FD).Read
             at /usr/lib/go/src/internal/poll/fd_unix.go:167
         6  0x0000000001c0e551 in net.(*netFD).Read
             at /usr/lib/go/src/net/fd_posix.go:55
         7  0x0000000001c2eb91 in net.(*conn).Read
             at /usr/lib/go/src/net/net.go:183
         8  0x0000000001c54d25 in net.(*TCPConn).Read
             at <autogenerated>:1
         9  0x000000000196c4bb in bufio.(*Reader).Read
             at /usr/lib/go/src/bufio/bufio.go:236
        10  0x000000000184dd7e in io.ReadAtLeast
             at /usr/lib/go/src/io/io.go:331
        (truncated)
  Goroutine 282 - User: /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/controlbuf.go:407 google.golang.org/grpc/internal/transport.(*controlBuffer).get (0x2037a8d) [select]
        0  0x00000000017b40b6 in runtime.gopark
            at /usr/lib/go/src/runtime/proc.go:362
        1  0x00000000017c55d2 in runtime.selectgo
            at /usr/lib/go/src/runtime/select.go:328
        2  0x0000000002037a8d in google.golang.org/grpc/internal/transport.(*controlBuffer).get
            at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/controlbuf.go:407
        3  0x000000000203867b in google.golang.org/grpc/internal/transport.(*loopyWriter).run
            at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/controlbuf.go:534
        4  0x0000000002049a06 in google.golang.org/grpc/internal/transport.newHTTP2Client.func3
            at /home/lance/go/pkg/mod/google.golang.org/grpc@v1.46.2/internal/transport/http2_client.go:415
        5  0x00000000017e8621 in runtime.goexit
            at /usr/lib/go/src/runtime/asm_amd64.s:1571
[39 goroutines]

reproduced in local !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/dm Issues or PRs related to DM. component/test Unit tests and integration tests component. severity/minor type/bug The issue is confirmed as a bug.
Projects
None yet
4 participants