-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better handling for caches in skyfunctions #23101
Comments
One potential solution: a way to get a cache from Skyframe, giving a root skykey, such that when the skykey is invalidated any entries added to the cache are also invalidated. This would prevent the above problems because when the platform BUILD file or the transition bzl file are changed, those skykeys are dirtied, and that propagates to the ConfiguredTargetKey for the ConfiguredTargetFunction call that actually uses these caches. This would require more state and more tracking from Skyframe, but would avoid the current fix, which is to entirely clear these caches between builds. |
Pinging @haxorz and @gregestren for thoughts. |
@aoeui Shahan can you please advise? |
@katre Can you explain why the simple approach of using a Skyframe node for each computation doesn't work? By construction, seems like it'd address the correctness concern. Is the issue that it wouldn't give you the memory properties you want? Looks like the code you linked uses a mix of weak and soft caches. |
@haxorz: The reason we've increased the use of state-machine producers for I suspect that Shahan knows more about this than I do. I'd be very happy to use the existing caching skyframe brings rather than adding new systems on top of it. |
Regarding weak and soft caches: I doubt we're being systematic here, and could probably do a lot to better tune these caches and how they handle references. I think this is actually another point in favor of having a generic system to get a cache from skyframe rather than adding them in an ad hoc manner throughout the system. |
I think each weak vs. soft choice was consciously considered. Doesn't mean the current choices are right but I'd be curious about the original rationales. Honestly, the surrounding code has changed so much since these caches were added that it's not remotely clear they still represent ideal implementation logic. I'm super-curious if a Skyframe-native caching approach could work for all the reasons discussed here. Acknowledged about extensive edges. Maybe there are ways to mitigate that by combining some keys or only introducing them when needed (especially |
We have several caches used currently in skyfunction implementations, with many specifically, used directly or indirectly from
ConfiguredTargetFunction
. Some examples are:StarlarkTransitionCache
BuildConfigurationKeyCache
PlatformMappingValue.parserCache
StarlarkDefinedConfigTransition.ruleTransitionCache
Each of these are used to cache expansive computations that may be repeated multiple times in the same build (ie, in one
bazel build
call). Each of these would ideally also persist from build to build in case the same computation is needed across builds.We've also seen several classes of error arising from the files used to determine cache values being changed between builds, but the cache not being invalidated:
platform
weren't re-computed when the underlying BUILD file changedThere may well be more of these.
The text was updated successfully, but these errors were encountered: