Description
While working on #5745 I instrumented `mypy/fscache.py` and discovered that a lot of spurious `os.stat()` calls are made in fine-grained incremental runs on behalf of `find_module()` (in `mypy/modulefinder.py`). This is invoked from `is_module()` (in `mypy/build.py`), whose only call site is in `all_imported_modules_in_file()`. This is used by `compute_dependencies()` to disambiguate `from X import Y` -- it needs to know whether `Y` is a submodule of `X` or just some object (like a function or class) defined in `X`. This in turn happens during an incremental call to `load_graph()` made from `update_module_isolated()` (in `mypy/server/update.py`), which is intended to recreate the `mypy.build.State` object for a specific module that is known to be changed in a run.
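To make the `from X import Y` ambiguity concrete, here is a minimal sketch of the kind of filesystem check needed to decide whether `Y` is a submodule. It is not mypy's actual code: `is_submodule` and the list of candidate file names are assumptions for illustration only.

```python
import os


def is_submodule(package_dir: str, name: str) -> bool:
    """Hypothetical helper: does `name` look like a submodule of the
    package stored in `package_dir`?

    Every candidate checked here costs one stat() call, which is why
    resolving each `from X import Y` touches the filesystem.
    """
    candidates = [
        os.path.join(package_dir, name + ".py"),
        os.path.join(package_dir, name + ".pyi"),
        os.path.join(package_dir, name, "__init__.py"),
        os.path.join(package_dir, name, "__init__.pyi"),
    ]
    return any(os.path.isfile(path) for path in candidates)
```

If none of the candidates exists, `Y` is treated as an ordinary object defined in `X`, but the `stat()` calls have already been made.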
For example, the line `from typing import Dict` ends up calling `stat()` for the following files in my setup:
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/typing-stubs
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/typing/py.typed
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/typing/Dict/py.typed
/Library/Python/3.6/site-packages/typing-stubs
/Library/Python/3.6/site-packages/typing/py.typed
/Library/Python/3.6/site-packages/typing/Dict/py.typed
/Users/guido/Library/Python/3.6/lib/python/site-packages/typing-stubs
/Users/guido/Library/Python/3.6/lib/python/site-packages/typing/py.typed
/Users/guido/Library/Python/3.6/lib/python/site-packages/typing/Dict/py.typed
typing
/Users/guido/src/mypy/typing
/Users/guido/src/mypy/mypy/typeshed/stdlib/3.6/typing
/Users/guido/src/mypy/mypy/typeshed/stdlib/3.5/typing
/Users/guido/src/mypy/mypy/typeshed/third_party/3.5/typing
/Users/guido/src/mypy/mypy/typeshed/stdlib/3/typing
/Users/guido/src/mypy/mypy/typeshed/third_party/3/typing
/Users/guido/src/mypy/mypy/typeshed/stdlib/2and3/typing
/Users/guido/src/mypy/mypy/typeshed/third_party/2and3/typing
/usr/local/lib/mypy/typing
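The pattern in this listing can be reproduced with a short sketch. It only mirrors the probes observed above (three per site-packages directory, one per remaining search-path entry) and is not taken from `mypy/modulefinder.py`; the function name `candidate_probes` and its parameters are hypothetical.

```python
import posixpath
from typing import List


def candidate_probes(module: str, name: str,
                     site_packages: List[str],
                     other_dirs: List[str]) -> List[str]:
    """Hypothetical reconstruction of the paths probed for
    `from <module> import <name>`, based only on the observed listing."""
    probes: List[str] = []
    for sp in site_packages:
        # Stub-only package, then py.typed markers for the package
        # and for the would-be subpackage.
        probes.append(posixpath.join(sp, module + "-stubs"))
        probes.append(posixpath.join(sp, module, "py.typed"))
        probes.append(posixpath.join(sp, module, name, "py.typed"))
    for d in other_dirs:
        # An empty entry ('') yields the bare relative name.
        probes.append(posixpath.join(d, module))
    return probes
```

With three site-packages directories and ten other search-path entries, as in the listing, this yields 3 * 3 + 10 = 19 probes for a single import line.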
Note that all these `stat()` calls are cached by `fscache.py`, but the cache is flushed at the end of each incremental step, so each incremental step does each of these once. (Also note that the current directory seems to appear twice on the search path, once as `''`, once as its absolute path.)
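The effect of flushing can be illustrated with a toy cache. This is a simplified sketch, not the real `fscache.py`: the class `StatCache`, its `real_calls` counter, and the choice to cache failures as `None` are all assumptions made for the example.

```python
import os
from typing import Dict, Optional


class StatCache:
    """Toy stat() cache (hypothetical; not mypy's implementation)."""

    def __init__(self) -> None:
        self._cache: Dict[str, Optional[os.stat_result]] = {}
        self.real_calls = 0  # number of stat() calls that hit the OS

    def stat(self, path: str) -> Optional[os.stat_result]:
        if path in self._cache:
            return self._cache[path]
        self.real_calls += 1
        try:
            result: Optional[os.stat_result] = os.stat(path)
        except OSError:
            result = None  # remember failures too, so misses are cached
        self._cache[path] = result
        return result

    def flush(self) -> None:
        """Drop all entries, as happens at the end of an incremental step."""
        self._cache.clear()
```

Within one step, repeated lookups of the same path cost a single real `stat()`. After `flush()`, the next step pays for every path again.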
I came up with a tentative fix, but it needs work (to account for typeshed) and since this is pre-existing behavior I decided to separate it from #5745.
It's not the end of the world, obviously, but I think this ends up doing ~20 `stat()` calls for each `from import` (though only in modified files), and since we're looking for a strategy to allow using Watchman instead of calling `stat()` for each source file, I think we might want to do something about this. (I also happen to know that `from import` is hugely popular in the Dropbox code base.) I'll assume this is low priority until I have determined how many `stat()` calls this does for the typical real Dropbox use case.
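One rough way to get such a count, independent of the instrumentation in `fscache.py`, is to wrap `os.stat` temporarily and tally calls per path. This is a sketch under the assumption that the code being measured calls `os.stat` through the `os` module; the name `count_stat_calls` is hypothetical.

```python
import os
from collections import Counter
from contextlib import contextmanager
from typing import Iterator
from unittest import mock


@contextmanager
def count_stat_calls() -> Iterator[Counter]:
    """Temporarily replace os.stat with a counting wrapper."""
    counts: Counter = Counter()
    real_stat = os.stat

    def counting_stat(path, *args, **kwargs):
        # File descriptors (ints) are counted as-is; paths are normalized
        # with os.fspath so str and PathLike arguments share a key.
        key = path if isinstance(path, int) else os.fspath(path)
        counts[key] += 1
        return real_stat(path, *args, **kwargs)

    with mock.patch.object(os, "stat", counting_stat):
        yield counts
```

Running one incremental step inside `with count_stat_calls() as counts:` and then inspecting `sum(counts.values())` or `counts.most_common()` would show how many calls are made and which paths dominate.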