Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split fine_grained_deps out into its own .deps.json cache file #4906

Merged
merged 4 commits into from
Apr 17, 2018

Conversation

msullivan
Copy link
Collaborator

The goal here is to decouple loading of of the AST trees from the data
cache and the fine-grained dependencies. This means that a regular
incremental run can use cache files produced with --cache-fine-grained
without needing to load the fine-grained dependencies.
(To enable this, we drop "cache_fine_grained" from the OPTIONS_AFFECTING_CACHE
that are checked when validating metadata. If we require a fine-grained cache
but don't have one, that will be caught by the lack of deps_mtime.)

More importantly, it will enable loading dependencies without needing
to parse the data json, which is a good optimization for for a planned
patch to load data caches on demand in fine-grained mode.

The goal here is to decouple loading of of the AST trees from the data
cache and the fine-grained dependencies. This means that a regular
incremental run can use cache files produced with --cache-fine-grained
without needing to load the fine-grained dependencies.
(To enable this, we drop "cache_fine_grained" from the OPTIONS_AFFECTING_CACHE
that are checked when validating metadata. If we require a fine-grained cache
but don't have one, that will be caught by the lack of deps_mtime.)

More importantly, it will enable loading dependencies without needing
to parse the data json, which is a good optimization for for a planned
patch to load data caches on demand in fine-grained mode.
@msullivan
Copy link
Collaborator Author

Failures look like a dict iteration order nondeterminism bug

@msullivan
Copy link
Collaborator Author

Oh no it was so much dumber than that. deps_mtime = getmtime(meta.data_json) almost always works...

@msullivan
Copy link
Collaborator Author

(And filed the crashes as #4909)

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a few minor things. Note that this increases the number of files in the cache when using fine-grained incremental mode by about 50%, which adds some overhead due to additional file system operations (especially on macOS where file system operations can be kind of slow), but this seems like a reasonable trade-off as it buys us faster daemon startup.

Feel free to merge after you've addressed the comments.

mypy/build.py Outdated
('data_json', str), # path of <id>.data.json
('deps_json', Optional[str]), # path of <id>.deps.json
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add comment with hint about what the ...deps.json files is used for (fine-grained deps?).

@@ -413,7 +415,8 @@ def default_lib_path(data_dir: str,
# silent mode or simply not found.


def cache_meta_from_dict(meta: Dict[str, Any], data_json: str) -> CacheMeta:
def cache_meta_from_dict(meta: Dict[str, Any],
data_json: str, deps_json: Optional[str]) -> CacheMeta:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a docstring and describe arguments, including deps_json?

mypy/build.py Outdated
A tuple with the file names to be used for the meta JSON and the
data JSON, respectively.
A tuple with the file names to be used for the meta JSON, the
data JSON, and the deps JSON, respectively.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe call the deps JSON "fine-grained deps JSON" to make it clear that this is different from module dependencies?

deps_mtime = None
if manager.options.cache_fine_grained:
assert meta.deps_json
deps_mtime = getmtime(meta.deps_json)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the file system cache?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't use the cache for meta files at all, currently

mypy/build.py Outdated
if not atomic_write(deps_json, deps_str, '\n'):
manager.log("Error writing deps JSON file {}".format(deps_json))
return interface_hash, None
raise CompileError(["Error writing deps JSON file {}".format(deps_json)])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This raises after return -- looks like the raise statement is redundant.

[file mypy.ini.2]
[[mypy]
cache_fine_grained = True
-- Nothing should get rechecked
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment seems to conflict with what's below.

@msullivan msullivan merged commit 66bf773 into master Apr 17, 2018
@msullivan msullivan deleted the deps-file branch April 17, 2018 19:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants