
Plain OSError (EHOSTUNREACH) escapes TCPServer._close() on abrupt client disconnect — unhandled exception in client_connected_cb + propagates into the ASGI app #361

Description

@joanfabregat

Summary

When a client peer becomes unreachable mid-connection (e.g. the client host vanishes — common on Kubernetes when a client pod is force-deleted while keep-alive connections are open), await self.writer.wait_closed() in TCPServer._close() raises a plain OSError ([Errno 113] No route to host, EHOSTUNREACH). The except tuple at tcp_server.py:121-127 only catches ConnectionError subclasses (+ RuntimeError, CancelledError), so the exception escapes — with two distinct manifestations.

CPython only maps ECONNRESET / EPIPE / ESHUTDOWN / ECONNABORTED / ECONNREFUSED to ConnectionError subclasses. EHOSTUNREACH and ENETUNREACH stay plain OSError, and ETIMEDOUT maps to TimeoutError, which is an OSError subclass but not a ConnectionError. All three therefore slip through.
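The errno-to-exception mapping can be checked directly, since constructing OSError with an errno and a message returns the matching subclass where one exists. This is a small standalone sketch, not Hypercorn code; the `mapping` name is purely illustrative.

```python
import errno

# errno names discussed above; all of these exist on Linux.
NAMES = (
    "ECONNRESET", "EPIPE", "ECONNABORTED", "ECONNREFUSED",
    "EHOSTUNREACH", "ENETUNREACH", "ETIMEDOUT",
)

mapping = {}
for name in NAMES:
    # OSError(errno, strerror) returns the subclass CPython associates
    # with that errno, or a plain OSError when there is none.
    exc = OSError(getattr(errno, name), name)
    mapping[name] = (type(exc).__name__, isinstance(exc, ConnectionError))

for name, (cls, is_conn) in mapping.items():
    print(f"{name:13} -> {cls:22} ConnectionError subclass: {is_conn}")
```

Only the rows where the last column is True would be caught by an except clause limited to ConnectionError subclasses.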

Manifestation 1 — unhandled exception in the connection task

_close() is awaited from the finally: block of TCPServer.run(), which is outside run()'s own except OSError: pass. The exception therefore escapes the client_connected_cb task and asyncio reports it via the loop exception handler:

Unhandled exception in client_connected_cb
Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/asyncio/streams.py", line 364, in wait_closed
OSError: [Errno 113] No route to host

Observed as a burst of 84 of these within ~2 s — one per open keep-alive connection — when the client pod was force-deleted.
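The control flow behind this manifestation can be sketched in isolation: an exception raised inside a finally: block is not covered by the except clause of the same try statement. The functions below (`failing_close`, `run_like`, `main`) are hypothetical stand-ins that only mirror the shape described above; they are not the actual TCPServer code.

```python
import asyncio
import errno


async def failing_close() -> None:
    # Stand-in for _close(): pretend wait_closed() hit an unreachable peer.
    raise OSError(errno.EHOSTUNREACH, "No route to host")


async def run_like() -> None:
    try:
        # An error raised in the body is swallowed by the except clause...
        raise OSError(errno.ECONNRESET, "Connection reset by peer")
    except OSError:
        pass
    finally:
        # ...but an error raised here is outside that except clause.
        await failing_close()


async def main():
    try:
        await run_like()
    except OSError as exc:
        return exc  # what the connection task would die with
    return None


escaped = asyncio.run(main())
print("escaped:", repr(escaped))
```

In the real server nothing plays the role of `main()`'s except clause, so the exception ends the connection task and reaches the loop exception handler.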

Manifestation 2 — propagates into the ASGI application

The same unprotected _close() is also reached from protocol_send()'s Closed branch (tcp_server.py:87-88) while the app is sending a response. The OSError then propagates backwards through the application's send call chain and surfaces inside the ASGI app as if the app itself failed (captured by Starlette's error middleware / error trackers like Sentry):

OSError: [Errno 113] No route to host
  ...
  starlette/responses.py:167 in __call__
  hypercorn/protocol/http_stream.py:200 in app_send
  hypercorn/protocol/http_stream.py:247 in _send_closed
  hypercorn/protocol/h11.py:151 in stream_send
  hypercorn/protocol/h11.py:289 in _maybe_recycle
  hypercorn/asyncio/tcp_server.py:88 in protocol_send
  hypercorn/asyncio/tcp_server.py:120 in _close

Suggested fix

Add OSError to the except tuple in _close(), consistent with run()'s existing except OSError: pass and with the intent of the # Already closed comment. Since the three ConnectionError subclasses currently listed are themselves OSError subclasses, the tuple collapses to:

        try:
            self.writer.close()
            await self.writer.wait_closed()
        except (
            OSError,
            RuntimeError,
            asyncio.CancelledError,
        ):
            pass  # Already closed

Possibly related: the RawData branch of protocol_send() (tcp_server.py:85) catches (ConnectionError, RuntimeError) around writer.drain() — a drain() to an unreachable host can presumably raise the same plain OSError there too.

This is the same pattern previously fixed for ConnectionAbortedError (#134) and asyncio.CancelledError (#172), and currently open for TimeoutError (#342) — EHOSTUNREACH is the next variant.

Environment

  • Hypercorn 0.18.0 (code path unchanged on current main), asyncio worker
  • Python 3.11, Linux (Kubernetes)
  • FastAPI/Starlette app

Reproduction

Open HTTP keep-alive connections from a client, then make the client host unreachable abruptly (on Kubernetes: kubectl delete pod <client> --grace-period=0 --force; on bare Linux: drop the route / power off the peer). Each open connection produces the unhandled exception on teardown. For a unit-level repro, a stream writer whose wait_closed() raises OSError(errno.EHOSTUNREACH, "No route to host") exercises the same path without needing a vanishing host.
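A minimal sketch of the unit-level repro, written without importing Hypercorn. `UnreachableWriter` and `close_writer` are hypothetical stand-ins, and `CURRENT_CATCH` is an approximation based on the description in this issue (ConnectionError is used in place of the individual subclasses), not a copy of the real tuple.

```python
import asyncio
import errno


class UnreachableWriter:
    """Hypothetical stand-in for asyncio.StreamWriter (not Hypercorn code)."""

    def close(self) -> None:
        pass

    async def wait_closed(self) -> None:
        # Same error the issue reports from a vanished peer.
        raise OSError(errno.EHOSTUNREACH, "No route to host")


# Approximation of the existing tuple, per the issue text.
CURRENT_CATCH = (ConnectionError, RuntimeError, asyncio.CancelledError)
# Tuple from the suggested fix.
PROPOSED_CATCH = (OSError, RuntimeError, asyncio.CancelledError)


async def close_writer(writer, catch) -> None:
    # Mirrors the shape of the try/except in _close().
    try:
        writer.close()
        await writer.wait_closed()
    except catch:
        pass  # Already closed


async def escapes(catch) -> bool:
    """Return True if the OSError gets past the given except tuple."""
    try:
        await close_writer(UnreachableWriter(), catch)
    except OSError:
        return True
    return False


current_escapes = asyncio.run(escapes(CURRENT_CATCH))
proposed_escapes = asyncio.run(escapes(PROPOSED_CATCH))
print("current tuple lets it escape:", current_escapes)
print("proposed tuple lets it escape:", proposed_escapes)
```

A real regression test would presumably patch the writer on a TCPServer instance instead, but the except-tuple behaviour it exercises is the same.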

Happy to submit a PR for this if the approach looks right.
