Feature Design Discussion: Blocking Analysis #12649

Open
@jhance

We would like to facilitate a migration of a large synchronous codebase (formerly Python 2) to asyncio. Our codebase will probably always contain a mixture of asyncio and synchronous code.

However, asyncio can be crippled if someone accidentally adds a call that blocks the event loop, such as time.sleep, a synchronous network operation, or even subprocess.run. To combat this, we are banning time.sleep and replacing it with something like this:

def blocking_sleep(t: float) -> None:
    assert_no_running_loop()
    time.sleep(t)

This covers the runtime case: instead of blocking the event loop at runtime, we raise an exception, so our tests are likely to catch such issues. But not all of our code has tests, and we would like to catch these issues in mypy as well. One could imagine that it is fairly simple to use a mypy plugin or a similar analysis tool to prevent things like this:

async def foo() -> None:
    blocking_sleep(10)

But the call to blocking_sleep may not be directly within the function foo and we still want to catch this.
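
For reference, the runtime check could be sketched as follows. This is only a guess at what assert_no_running_loop might do, assuming it is built on asyncio.get_running_loop(), which raises RuntimeError when no loop is running in the current thread. The helper function is a hypothetical name used to show the indirect case.

```python
import asyncio
import time


def assert_no_running_loop() -> None:
    """Raise if the current thread is running an event loop (assumed behavior)."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so blocking here is safe.
        return
    raise RuntimeError("blocking call made while an event loop is running")


def blocking_sleep(t: float) -> None:
    assert_no_running_loop()
    time.sleep(t)


def helper() -> None:
    # Hypothetical synchronous helper; nothing in its signature says it blocks.
    blocking_sleep(0.01)


async def foo() -> None:
    # The blocking call is one level removed, but the runtime check still fires.
    helper()
```

Calling blocking_sleep from plain synchronous code works as before, while asyncio.run(foo()) raises RuntimeError. The check only fires on code paths that are actually executed, which is why a static check is still wanted.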

One solution would be to make a monad-like construct Blocking like this:

from mypy_extensions import Blocking

def blocking_sleep(t: float) -> Blocking[None]:
    assert_no_running_loop()
    time.sleep(t)

The semantics would be simple: a call to a blocking callable is an error unless it occurs within a function that is itself blocking. The only way to call it from outside a blocking function would be through something like asyncio.to_thread or some utility like allow_unsafe_blocking_call().
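
As an illustration of the asyncio.to_thread escape hatch, here is a minimal sketch (Python 3.9+). The body of assert_no_running_loop is an assumption, not the actual helper. The point is that the worker thread has no running event loop, so the runtime check passes and the loop itself is never blocked.

```python
import asyncio
import time


def assert_no_running_loop() -> None:
    """Assumed implementation: raise if this thread is running an event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return  # no running loop in this thread
    raise RuntimeError("blocking call made while an event loop is running")


def blocking_sleep(t: float) -> None:
    assert_no_running_loop()
    time.sleep(t)


async def foo() -> str:
    # Run the blocking call in a worker thread; the event loop stays responsive.
    await asyncio.to_thread(blocking_sleep, 0.01)
    return "finished"
```

Under the proposed semantics, a checker would presumably treat passing blocking_sleep to asyncio.to_thread as an allowed use, since the function is handed over rather than called directly from async code.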

It would probably be feasible to annotate only a few functions, e.g. blocking_sleep, and write a tool that automatically propagates the Blocking annotations and rewrites the code. This would work. The main problem is that the Blocking annotations may be annoying to developers and cause friction until more of the codebase is async.

The upside of this approach is that it facilitates a migration to asyncio through a relatively simple substitution: blocking functions become async, and calls to blocking functions get an await. One might note that this is mostly because annotating blocking-ness is essentially isomorphic to annotating async-ness, but annotating blocking-ness is much safer than a bulk conversion to asyncio because it has no runtime effect.
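
The substitution described above might look like this in practice. The function names are hypothetical, and the Blocking annotation is mentioned only in comments because it does not exist today.

```python
import asyncio
import time


# Before: synchronous code. Under the proposal, both functions would carry
# a Blocking[...] return annotation (hypothetical, so shown here as plain str).
def fetch_blocking(delay: float) -> str:
    time.sleep(delay)
    return "done"


def caller_blocking() -> str:
    return fetch_blocking(0.01)


# After: each blocking function becomes async, and each call gets an await.
async def fetch_async(delay: float) -> str:
    await asyncio.sleep(delay)
    return "done"


async def caller_async() -> str:
    return await fetch_async(0.01)
```

The set of functions that need the annotation is the same set that would need to become async, which is what makes the rewrite mechanical.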

A second solution would be a new mypy pass that, given a few root functions like blocking_sleep and other known blocking functions, automatically propagates blocking-ness through the call graph and raises errors on invalid call sites. This has the advantage that we don't have to annotate everything. One disadvantage is that you can't tell whether a function is blocking from reading it (although mypy could perhaps expose a query interface for this). Error reporting may also be more difficult, because we would need to produce some sort of call stack showing where a function could block. We may also need a way to tell mypy that it is wrong, e.g. by annotating a function as explicitly NotBlocking or similar.
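
The propagation idea can be sketched on a toy call graph. This is not how a mypy pass would be written (it would work on mypy's own representation of the program); it is a plain-dict illustration, and find_blocking_errors and its parameters are hypothetical names.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def find_blocking_errors(
    calls: Dict[str, List[str]],   # caller -> callees
    roots: Set[str],               # known blocking functions
    async_funcs: Set[str],         # functions defined with "async def"
    not_blocking: Set[str],        # explicit NotBlocking-style overrides
) -> Tuple[Dict[str, List[str]], List[List[str]]]:
    # Invert the graph so we can walk from callees up to their callers.
    callers: Dict[str, List[str]] = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, []).append(caller)

    # chain[f] is one call path from f down to a root blocking function.
    chain: Dict[str, List[str]] = {root: [root] for root in roots}
    errors: List[List[str]] = []
    queue = deque(sorted(roots))
    while queue:
        callee = queue.popleft()
        for caller in callers.get(callee, []):
            if caller in not_blocking:
                continue  # the user told the analysis this one is fine
            if caller in async_funcs:
                # Invalid call site: report the chain that makes it blocking.
                errors.append([caller] + chain[callee])
                continue
            if caller not in chain:
                # A sync caller of a blocking function is itself blocking.
                chain[caller] = [caller] + chain[callee]
                queue.append(caller)
    return chain, errors
```

For a graph where async foo calls helper and helper calls blocking_sleep, the error list contains the chain foo, helper, blocking_sleep, which is the kind of call stack the error message would need to show.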

I believe some form of this feature will be required for asyncio to work on our large codebase. To integrate it well with builtins and typeshed we would need a PEP, but I think it is fine to discuss here what the ideal state would be without worrying about whether a PEP is necessary (internally we may use a mypy plugin or similar to rewrite typeshed definitions that should be blocking).
