Description
Bug report
Bug description:
Short story:
- Socket is closed in
_SelectorTransport.__del__
, so it doesn't own the fd anymore. - Anything else in the process can create a new file descriptor at this point.
- If the transport is gc-resurrected to call
close()
, it can then remove the reader/writer fromloop._selector
here or here for a fd it doesn't control anymore, causing access to that fd to hang.
Long story:
- Introduce a GC cycle, such that an
asyncio.selector_events._SelectorTransport
and the object owning it are collected at the same time. - The transport
__del__
will be called, closing the socket, freeing its file descriptor, but not performing any other cleanup. - The owning object, being a good citizen, resurrects the transport during gc to have
close()
called on it properly. In the case of this issue, an asyncio task was created to close it later. (I definitely don't love seeingasyncio.create_task
in a__del__
method, but I expect that's not the only way to trigger this) - At this point, anything in your asyncio runloop can open a file descriptor. In my case, it was
asyncio.create_subprocess_exec
callingos.pidfd_create()
inPidfdChildWatcher.add_child_handler
. If you're lucky (or running a lot of tasks at once), the new fd will be the same as the resurrected socket's fd. - Now
transport.close()
can run any time later, which will callloop._selector.remove_child_watcher(fd)
, despite not really owning the fd anymore. - Now the pidfd has been removed from
loop._selector
, soawait proc.wait()
will hang forever. Or whatever asyncio stream you were trying to use got removed from the selector and you won't be able to wait on it.
I definitely think the non-stdlib code in play here was unsound, but I don't think that should trigger such a cursed result in the stdlib, so I'd rather defend against this in asyncio.
I think this is the minimum patch to mitigate the issue, to prevent the remove_child_reader
and remove_child_writer
calls by breaking their pre-conditions. I'm not sure if it's worth trying to interact with the loop or selector from __del__
to clean up the fd. (Selecting on a missing fd is probably handled somewhere else anyway?)
--- a/Lib/asyncio/selector_events.py
+++ b/Lib/asyncio/selector_events.py
@@ -871,6 +871,8 @@ def close(self):
def __del__(self, _warn=warnings.warn):
if self._sock is not None:
_warn(f"unclosed transport {self!r}", ResourceWarning, source=self)
+ self._closing = True
+ self._buffer.clear()
self._sock.close()
if self._server is not None:
self._server._detach(self)
Another consideration is that asyncio is using a cached self._sock_fd
here without really knowing if the underlying socket was closed and fd was recycled or not. I think this isn't the first time the cached fd has caused a surprising behavior, see also #88968
CPython versions tested on:
3.12
Operating systems tested on:
Linux
Linked PRs
Metadata
Metadata
Assignees
Projects
Status