Keep alive for inactive worker/scheduler connection #2524

Closed
fjetter opened this issue Feb 12, 2019 · 7 comments · Fixed by #2907

Comments

@fjetter
Member

fjetter commented Feb 12, 2019

I noticed that the connection between workers and the scheduler (and effectively between all Server subclasses using handle_stream) is kept open to continuously listen for incoming requests. This connection is used solely to listen for stream_handler requests, while the heartbeat, for instance, uses a different connection from the connection pool. This means that during a longer period of cluster inactivity, the primary connection sits idle. The problem with idle connections is that some external tools are inclined to kill them (for instance, we're using haproxy and kill inactive connections after 1h), which forces the worker to reconnect and re-register with the scheduler every hour during inactivity.

I was wondering whether this is an issue for anybody else, and whether it would be desirable to either send a keep-alive over the open connection every X seconds (a simple message to keep the connection active) or even send the heartbeat over this connection. I believe the first option could be implemented fairly easily in handle_stream, while the latter is a bit more involved. In either case, I wanted to get some feedback before implementing anything.

(Dedicated worker-scheduler connection creation: see here; reconnecting when the connection is broken: see here.)

@mrocklin
Member

Thank you for the excellently worded issue, and my apologies for the delay in response.

I'm hearing two possible solutions:

  1. Move the heartbeat to the long-running connection.

    I suspect that the current reason for the heartbeat being on a separate connection is to trigger a reconnect if something goes wrong with the long-running connection. However, as you point out in #2525 (Race condition between worker heartbeat and reconnect), we're already handling this, so perhaps this reason is not sufficient (or is, in fact, problematic).

    If so, then moving the heartbeat to the long-running connection seems like a good idea.

  2. Add a second keep-alive route to the server (probably in the handlers in core.py::Server) and add a periodic callback that sends a message every minute or so. This is easy to do and has low impact, but slightly increases complexity.

Either is fine. If you're interested and have the time to investigate option 1, I think that would be best, but it's also more work.

@lr4d
Contributor

lr4d commented Jul 16, 2019

I've been taking a look at the codebase and found the function set_tcp_timeout(stream), which appears to do what @fjetter suggests: it enables TCP keep-alives, sending a packet every x seconds if the connection is idle.

Hence, it seems like TCP keep-alives should already be enabled on scheduler-worker connections.

Is there anything I may be missing here?
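For reference, here is a minimal sketch of what enabling TCP keep-alive on a raw socket involves, using the standard Linux socket options. The function name and defaults are illustrative (chosen to match the values in the debug log in the next comment) and are not a copy of distributed's set_tcp_timeout:

import socket

def enable_tcp_keepalive(sock, idle=10, interval=2, nprobes=10):
    # Ask the kernel to probe an otherwise idle connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Seconds the connection may sit idle before the first probe is sent
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    # Seconds between subsequent probes
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    # Number of unanswered probes before the kernel drops the connection
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, nprobes)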

@lr4d
Contributor

lr4d commented Jul 16, 2019

I have confirmed that TCP keep-alives are being sent on the long-running connection by looking at the TCP traffic with the heartbeat disabled.

scheduler_1_4aa062432c65 | distributed.comm.tcp - DEBUG - Setting TCP keepalive: nprobes=10, idle=10, interval=2
worker_1_ac8a9b1b2af2 | distributed.comm.tcp - DEBUG - Setting TCP keepalive: nprobes=10, idle=10, interval=2

The scheduler is at tcp://172.22.0.2:8786:

root@f07be9f53108:/# tcpdump -pn "host 172.22.0.2 and tcp"
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
15:57:47.361159 IP 172.22.0.4.46782 > 172.22.0.2.8786: Flags [.], ack 1838822340, win 229, options [nop,nop,TS val 48374592 ecr 48373568], length 0
15:57:47.361326 IP 172.22.0.2.8786 > 172.22.0.4.46782: Flags [.], ack 1, win 235, options [nop,nop,TS val 48374592 ecr 48373568], length 0
15:57:47.361359 IP 172.22.0.2.8786 > 172.22.0.4.46782: Flags [.], ack 1, win 235, options [nop,nop,TS val 48374592 ecr 48373568], length 0
15:57:47.361385 IP 172.22.0.4.46782 > 172.22.0.2.8786: Flags [.], ack 1, win 229, options [nop,nop,TS val 48374592 ecr 48373568], length 0
15:57:57.600890 IP 172.22.0.4.46782 > 172.22.0.2.8786: Flags [.], ack 1, win 229, options [nop,nop,TS val 48375616 ecr 48374592], length 0
15:57:57.600986 IP 172.22.0.2.8786 > 172.22.0.4.46782: Flags [.], ack 1, win 235, options [nop,nop,TS val 48375616 ecr 48374592], length 0
15:57:57.601001 IP 172.22.0.2.8786 > 172.22.0.4.46782: Flags [.], ack 1, win 235, options [nop,nop,TS val 48375616 ecr 48374592], length 0
15:57:57.601025 IP 172.22.0.4.46782 > 172.22.0.2.8786: Flags [.], ack 1, win 229, options [nop,nop,TS val 48375616 ecr 48375616], length 0
15:58:07.808206 IP 172.22.0.2.8786 > 172.22.0.4.46782: Flags [.], ack 1, win 235, options [nop,nop,TS val 48376640 ecr 48375616], length 0
15:58:07.808296 IP 172.22.0.4.46782 > 172.22.0.2.8786: Flags [.], ack 1, win 229, options [nop,nop,TS val 48376640 ecr 48375616], length 0
15:58:07.816519 IP 172.22.0.4.46782 > 172.22.0.2.8786: Flags [.], ack 1, win 229, options [nop,nop,TS val 48376641 ecr 48375616], length 0
15:58:07.816616 IP 172.22.0.2.8786 > 172.22.0.4.46782: Flags [.], ack 1, win 235, options [nop,nop,TS val 48376641 ecr 48376640], length 0
15:58:18.047129 IP 172.22.0.2.8786 > 172.22.0.4.46782: Flags [.], ack 1, win 235, options [nop,nop,TS val 48377664 ecr 48376640], length 0
15:58:18.047129 IP 172.22.0.4.46782 > 172.22.0.2.8786: Flags [.], ack 1, win 229, options [nop,nop,TS val 48377664 ecr 48376641], length 0
15:58:18.047184 IP 172.22.0.4.46782 > 172.22.0.2.8786: Flags [.], ack 1, win 229, options [nop,nop,TS val 48377664 ecr 48376641], length 0
15:58:18.047232 IP 172.22.0.2.8786 > 172.22.0.4.46782: Flags [.], ack 1, win 235, options [nop,nop,TS val 48377664 ecr 48377664], length 0
15:58:28.286598 IP 172.22.0.4.46782 > 172.22.0.2.8786: Flags [.], ack 1, win 229, options [nop,nop,TS val 48378688 ecr 48377664], length 0
15:58:28.286623 IP 172.22.0.2.8786 > 172.22.0.4.46782: Flags [.], ack 1, win 235, options [nop,nop,TS val 48378688 ecr 48377664], length 0
15:58:28.286652 IP 172.22.0.4.46782 > 172.22.0.2.8786: Flags [.], ack 1, win 229, options [nop,nop,TS val 48378688 ecr 48377664], length 0
15:58:28.286745 IP 172.22.0.2.8786 > 172.22.0.4.46782: Flags [.], ack 1, win 235, options [nop,nop,TS val 48378688 ecr 48378688], length 0

@StephanErb
Contributor

Sending TCP keep-alives ensures that firewalls and networking equipment do not close the Dask connection. HAProxy, however, closes the connection if there is no traffic at the application layer, so it will close the connection even with TCP keep-alives enabled (https://stackoverflow.com/questions/32634980/haproxy-closes-long-living-tcp-connections-ignoring-tcp-keepalive).

Matthew's proposal to move the heartbeat to the long-running connection sounds sane and should solve this problem.
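To illustrate, a hypothetical HAProxy configuration along these lines would drop the proxied stream after an hour without application-layer data, no matter how many TCP keep-alive probes are exchanged (values are made up for the example):

defaults
    mode tcp
    timeout client 1h   # close if the client side sends no data for 1h
    timeout server 1h   # close if the server side sends no data for 1h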

@lr4d
Contributor

lr4d commented Jul 17, 2019

thanks @StephanErb

@mrocklin
Member

We could solve this by moving the heartbeat as suggested, but it seems like this might have other effects?

As an alternative, maybe just a periodic callback that sends a trivial message across every minute or so? We could make a new route that did nothing and send an operation that just hits that route.
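A minimal sketch of that idea, using plain asyncio streams rather than distributed's comm machinery (the handler and function names are hypothetical; the actual change landed in #2907 and may look different):

import asyncio

KEEP_ALIVE_INTERVAL = 60  # seconds; "every minute or so"

async def server_handler(reader, writer):
    # Server side: a route that does nothing useful; merely receiving traffic
    # keeps application-layer proxies such as HAProxy from closing the stream.
    while True:
        line = await reader.readline()
        if not line:
            break
        if line.strip() == b"keep-alive":
            writer.write(b"ok\n")
            await writer.drain()
    writer.close()

async def worker_keep_alive(writer):
    # Worker side: periodically send a trivial message over the long-running
    # connection, analogous to adding a PeriodicCallback in the worker.
    while True:
        await asyncio.sleep(KEEP_ALIVE_INTERVAL)
        writer.write(b"keep-alive\n")
        await writer.drain()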

mrocklin added a commit to mrocklin/distributed that referenced this issue Jul 29, 2019
This is effectively a heartbeat, but much simpler and less frequent than
our current heartbeats

Fixes dask#2524
@mrocklin
Member

Would this work? #2907

mrocklin added a commit that referenced this issue Aug 2, 2019
This is effectively a heartbeat, but much simpler and less frequent than
our current heartbeats

Fixes #2524