Fine-grained incremental step does lots of spurious stat() calls #5747

Open
@gvanrossum

While working on #5745 I instrumented mypy/fscache.py and discovered that a lot of spurious os.stat() calls are made in fine-grained-incremental runs on behalf of find_module() (in mypy/modulefinder.py). This is invoked from is_module() (in mypy/build.py) whose only call site is in all_imported_modules_in_file(). This is used by compute_dependencies() to disambiguate from X import Y -- it needs to know whether Y is a submodule of X or just some object (like a function or class) defined in X. This in turn happens during an incremental call to load_graph() made from update_module_isolated() (in mypy/server/update.py), which is intended to recreate the mypy.build.State object for a specific module that is known to be changed in a run.
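To make the disambiguation concrete, here is a minimal sketch of the kind of check involved. It is not mypy's find_module(): the helper name looks_like_submodule and its candidate list are hypothetical, and it ignores typeshed, -stubs packages and py.typed markers entirely. It only illustrates why answering "is Y a submodule of X?" costs filesystem lookups on every search path entry.

```python
import os


def looks_like_submodule(search_paths, package, name):
    """Hypothetical helper: guess whether `from package import name`
    refers to a submodule by probing the filesystem."""
    for base in search_paths:
        pkg_dir = os.path.join(base, *package.split("."))
        candidates = [
            os.path.join(pkg_dir, name + ".pyi"),
            os.path.join(pkg_dir, name + ".py"),
            os.path.join(pkg_dir, name, "__init__.pyi"),
            os.path.join(pkg_dir, name, "__init__.py"),
        ]
        # Each isfile() call below is one stat() on the real filesystem.
        if any(os.path.isfile(c) for c in candidates):
            return True
    return False
```

When Y is an ordinary object such as a class, every candidate on every search path entry is probed and none is found, so the negative case is the expensive one.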

For example, the line from typing import Dict ends up calling stat() for the following files in my setup:

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/typing-stubs
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/typing/py.typed
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/typing/Dict/py.typed
/Library/Python/3.6/site-packages/typing-stubs
/Library/Python/3.6/site-packages/typing/py.typed
/Library/Python/3.6/site-packages/typing/Dict/py.typed
/Users/guido/Library/Python/3.6/lib/python/site-packages/typing-stubs
/Users/guido/Library/Python/3.6/lib/python/site-packages/typing/py.typed
/Users/guido/Library/Python/3.6/lib/python/site-packages/typing/Dict/py.typed
typing
/Users/guido/src/mypy/typing
/Users/guido/src/mypy/mypy/typeshed/stdlib/3.6/typing
/Users/guido/src/mypy/mypy/typeshed/stdlib/3.5/typing
/Users/guido/src/mypy/mypy/typeshed/third_party/3.5/typing
/Users/guido/src/mypy/mypy/typeshed/stdlib/3/typing
/Users/guido/src/mypy/mypy/typeshed/third_party/3/typing
/Users/guido/src/mypy/mypy/typeshed/stdlib/2and3/typing
/Users/guido/src/mypy/mypy/typeshed/third_party/2and3/typing
/usr/local/lib/mypy/typing

Note that all these stat() calls are cached by fscache.py, but the cache is flushed at the end of each incremental step, so each incremental step does each of these once. (Also note that the current directory seems to appear twice on the search path, once as '', once as its absolute path.)
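The caching behavior described above can be modeled with a toy cache. This is a sketch under stated assumptions, not the code in mypy/fscache.py: the class name StatCache, the real_calls counter and the choice to cache failed lookups as None are all illustrative.

```python
import os


class StatCache:
    """Toy model of a stat() cache that is flushed between steps."""

    def __init__(self):
        self._cache = {}
        self.real_calls = 0  # number of calls that actually hit os.stat()

    def stat(self, path):
        """Return the os.stat_result for path, or None if it cannot be stat'ed."""
        if path not in self._cache:
            self.real_calls += 1
            try:
                self._cache[path] = os.stat(path)
            except OSError:
                # Remember the failure too, so repeated misses stay cheap.
                self._cache[path] = None
        return self._cache[path]

    def flush(self):
        """Forget everything, as would happen at the end of a step."""
        self._cache.clear()
```

Within one step, repeated lookups of the same path cost a single real call. After flush(), the next lookup of that path hits the filesystem again, which is why each incremental step repeats the whole list once.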

I came up with a tentative fix, but it needs work (to account for typeshed) and since this is pre-existing behavior I decided to separate it from #5745.

It's not the end of the world, obviously, but I think this ends up doing ~20 stat() calls for each from ... import statement (though only in modified files), and since we're looking for a strategy to allow using Watchman instead of calling stat() for each source file, I think we might want to do something about this. (I also happen to know that from ... import is hugely popular in the Dropbox code base.) I'll assume this is low priority until I have determined how many stat() calls this does for the typical real Dropbox use case.
