-
-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
shutdown_default_executor / wait_for_tstate_lock deadlock (?) after Ctrl+C #111358
Comments
Hi, I would like to take that issue! |
@Yaro1 -- You also claimed another issue. Can you choose which one you'd like to tackle first? |
@gvanrossum I will take first this one #109564 |
tl;dr: The first "extra" keyboard interrupt is because the loop is waiting for the executor to shutdown. The second "extra" keyboard interrupt is because the Python interpreter will wait for all non-daemon threads to join. Waiting for
|
loop.run_until_complete( | |
loop.shutdown_default_executor(constants.THREAD_JOIN_TIMEOUT)) |
cpython/Lib/asyncio/base_events.py
Line 613 in 94ec2b9
self._default_executor.shutdown(wait=True) |
And ThreadPoolExecutor.shutdown(wait=True)
will wait for all of its threads to join. This is the code location which is "hanging".
Setting the wait flag to False
will allow the loop to finish and exit, and the KeyboardInterrupt
will be promptly shown to the user on the first SIGINT. Because the loop is no longer blocked by the executor shutdown, we need one less keyboard interrupt; just one more SIGINT is needed to stop the thread since the Python main thread will still wait for all non-daemon threads to join.
Cannot cancel concurrent.futures.Future
The final stack trace you see is from concurrent.futures.ThreadPoolExecutor
, you can run this without asyncio
and try to interrupt it to see the same error and hanging behaviour:
import concurrent.futures
import time
with concurrent.futures.ThreadPoolExecutor(1) as pool:
t = pool.submit(time.sleep, 10)
In the asyncio
example, The first SIGINT is indeed caught by the asyncio runner. It tries to cancel the ThreadPoolExecutor
created future, but concurrent.futures.Future
s cannot actually be cancelled once they have started. You will see the chained asyncio.Future
attempt this but the concurrent.futures.Future.cancel()
call will return False
.
cpython/Lib/concurrent/futures/_base.py
Lines 364 to 369 in e16d4ed
def cancel(self): | |
"""Cancel the future if possible. | |
Returns True if the future was cancelled, False otherwise. A future | |
cannot be cancelled if it is running or has already completed. | |
""" |
Thanks, @ordinary-jamie! Putting it in terms I can understand:
It appears the author of the code didn't think this through (or a change was made that broke this). A future is created and waited for before we join the thread -- the timeout is only used for joining the thread, but the future doesn't complete until the So how do we fix this? It looks like Jamie proposes to change the But that doesn't seem to be what the author of the code intended either. If there's a task running in the executor that's almost finished, we just abandon it. The author likely intended to wait up to 300 seconds for running tasks, and then give up. Setting So what to do? Maybe we should just use async with asyncio.timeout(timeout):
await future and we should probably only wait for the thread if we don't time out (and print the warning whenever we do). This solution can be backported to 3.12 and 3.11, which have the fancy timeout. |
Sorry for the confusion, I intended that to just be a demo of the shutdown blocking the keyboard interrupt from surfacing
Agreed! I can work on a fix :) |
@xbeastx If you read the PR you'll see that it doesn't really fix your issue -- it fixes a legitimate bug in the timeout handling, but you'll still need three |
Thank you for the response! Let's imagine a real program that call some blocking function (let's say it calls "input") in thread pool executor, and the user tries to abort it. How could he imagine the logic of "three ctrl+c", even if this in itself is not strange =) So from my point of view, an asyncio program here should behave in the same way as a non-async one. import time
def blocking():
print("sleep start")
time.sleep(1000)
print("sleep end")
def main():
blocking()
if __name__ == '__main__':
main() With one Ctrl+C it produces: $ python3.11 test.py
sleep start
^CTraceback (most recent call last):
File ".../test.py", line 15, in <module>
main()
File ".../test.py", line 11, in main
blocking()
File ".../test.py", line 6, in blocking
time.sleep(1000)
KeyboardInterrupt But I would agree that problems is more broader than it and technically asyncio code more equals something like this: import time
from concurrent.futures import ThreadPoolExecutor
def blocking():
print("sleep start")
time.sleep(1000)
print("sleep end")
def main():
with ThreadPoolExecutor(max_workers=1) as executor:
try:
future = executor.submit(blocking)
future.result()
except KeyboardInterrupt:
print("KeyboardInterrupt")
raise
if __name__ == '__main__':
main() Which leads to twice-ctrl+c logic =)
Those the problem is that there is no clean way to cancel the thread task in the threadpool executor? |
Such a real program should not use the default executor. The default executor exists because certain C library APIs (notable |
And yes, if you want to be able to kill threads, you have to bark up another tree. There have been endless discussions about that but the conclusion is always the same -- it's unsafe, it can get the runtime in an unknown state, so please do something else. |
…ult_executor (pythonGH-115622) (cherry picked from commit 53d5e67) Co-authored-by: Jamie Phan <jamie@ordinarylab.dev>
Bug report
Bug description:
Blocking function running in thread executor will produce deadlock on shutdown after Ctrl+C pressed.
Here is minimal reproducing example:
Steps to reproduce:
(Tested on 3.9<=python<=3.12 on Ubuntu 22.04 all have same behaviour)
Output:
Additional:
Found a blog post https://www.roguelynn.com/words/asyncio-sync-and-threaded/ with looks like similar behavior back in 2019 where someone is trying to workaround this.
CPython versions tested on:
3.9, 3.10, 3.11, 3.12
Operating systems tested on:
Linux
Linked PRs
The text was updated successfully, but these errors were encountered: