-
-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split fine_grained_deps out into its own .deps.json cache file #4906
Conversation
The goal here is to decouple loading of of the AST trees from the data cache and the fine-grained dependencies. This means that a regular incremental run can use cache files produced with --cache-fine-grained without needing to load the fine-grained dependencies. (To enable this, we drop "cache_fine_grained" from the OPTIONS_AFFECTING_CACHE that are checked when validating metadata. If we require a fine-grained cache but don't have one, that will be caught by the lack of deps_mtime.) More importantly, it will enable loading dependencies without needing to parse the data json, which is a good optimization for for a planned patch to load data caches on demand in fine-grained mode.
Failures look like a dict iteration order nondeterminism bug |
Oh no it was so much dumber than that. |
(And filed the crashes as #4909) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just a few minor things. Note that this increases the number of files in the cache when using fine-grained incremental mode by about 50%, which adds some overhead due to additional file system operations (especially on macOS where file system operations can be kind of slow), but this seems like a reasonable trade-off as it buys us faster daemon startup.
Feel free to merge after you've addressed the comments.
mypy/build.py
Outdated
('data_json', str), # path of <id>.data.json | ||
('deps_json', Optional[str]), # path of <id>.deps.json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also add comment with hint about what the ...deps.json
files is used for (fine-grained deps?).
@@ -413,7 +415,8 @@ def default_lib_path(data_dir: str, | |||
# silent mode or simply not found. | |||
|
|||
|
|||
def cache_meta_from_dict(meta: Dict[str, Any], data_json: str) -> CacheMeta: | |||
def cache_meta_from_dict(meta: Dict[str, Any], | |||
data_json: str, deps_json: Optional[str]) -> CacheMeta: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a docstring and describe arguments, including deps_json
?
mypy/build.py
Outdated
A tuple with the file names to be used for the meta JSON and the | ||
data JSON, respectively. | ||
A tuple with the file names to be used for the meta JSON, the | ||
data JSON, and the deps JSON, respectively. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe call the deps JSON "fine-grained deps JSON" to make it clear that this is different from module dependencies?
deps_mtime = None | ||
if manager.options.cache_fine_grained: | ||
assert meta.deps_json | ||
deps_mtime = getmtime(meta.deps_json) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the file system cache?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't use the cache for meta files at all, currently
mypy/build.py
Outdated
if not atomic_write(deps_json, deps_str, '\n'): | ||
manager.log("Error writing deps JSON file {}".format(deps_json)) | ||
return interface_hash, None | ||
raise CompileError(["Error writing deps JSON file {}".format(deps_json)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This raises after return -- looks like the raise statement is redundant.
[file mypy.ini.2] | ||
[[mypy] | ||
cache_fine_grained = True | ||
-- Nothing should get rechecked |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment seems to conflict with what's below.
The goal here is to decouple loading of of the AST trees from the data
cache and the fine-grained dependencies. This means that a regular
incremental run can use cache files produced with --cache-fine-grained
without needing to load the fine-grained dependencies.
(To enable this, we drop "cache_fine_grained" from the OPTIONS_AFFECTING_CACHE
that are checked when validating metadata. If we require a fine-grained cache
but don't have one, that will be caught by the lack of deps_mtime.)
More importantly, it will enable loading dependencies without needing
to parse the data json, which is a good optimization for for a planned
patch to load data caches on demand in fine-grained mode.