-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
bpart: Turn on invalidation for guard->defined transitions #57615
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Keno
added a commit
that referenced
this pull request
Mar 3, 2025
This implements the optimizations I promised in #57602 by checking in invalidation whether or not the information that inference sees changes. The primary situation in which this is useful is avoiding an invalidation for `export` flag changes or changes of which module a binding is being imported from that do not change what the actual binding being imported is. Base itself uses these patterns sparingly, so the bootstrap impact over #57615 is minimal (though non-zero).
Keno
added a commit
that referenced
this pull request
Mar 3, 2025
This implements the optimizations I promised in #57602 by checking in invalidation whether or not the information that inference sees changes. The primary situation in which this is useful is avoiding an invalidation for `export` flag changes or changes of which module a binding is being imported from that do not change what the actual binding being imported is. Base itself uses these patterns sparingly, so the bootstrap impact over #57615 is minimal (though non-zero).
On master, when we invalidate a CodeInstance, we also invalidate the entire associated MethodInstance. However, this is highly problematic, because we have a lot of CodeInstances that are associated with `getproperty(::Module, ::Symbol)` through constant propagation. If one of these CodeInstances gets invalidated (e.g. because the resolution of const-propagated M.s binding changed), it would invalidate essentially the entire world. Prevent this by re-checking the forward edges list to make sure that the code instance we're invalidating is actually in there.
This addresses one of the last remaining TODOs of the binding partition work by performing invalidations when bindings transition from being undefined to being defined. This in particular finally addresses the performance issue that #54733 was intended to address (the issue was closed when we merged the mechanism, but it had so far been turned off). Turning on the invalidations themselves were always easy (a one line deletion). What is harder is making sure that the additional invalidations don't take extra time. To this end, we add two additional flags, one on Bindings, and one on methods. The flag on bindings tells us whether any method scan has so far found an implicit (not tracked in ->backedges) reference to this binding in any method body. The insight here is that most undefined bindings will not have been referenced previously (because they did not exist), so with a simple one bit saturating counter of the number of edges that would exist (if we did store them), we can fast-path the invalidation. However, this is not quite sufficient, as people often do things like: ``` foo() = bar() bar() = ... ... ``` which, without further improvements would incur an invalidation upon the definition of `bar`. The second insight (and what the flag on `Method` is for) is that we don't actually need to scan the method body until there is something to invalidate (i.e. until some `CodeInstance` has been created for the method). By defering the scanning until the first time that inference accesses the lowered code (with a flag to only do it once), we can easily avoid invalidation in the above scenario (while still invalidating if `foo()` was called before the definition of `bar`). As a further bonus, this also speeds up bootstrap by about 20% (putting us about back to where we used to be before the full bpart change) by skipping unnecessary invalidations even for non-guard transitions. Finally, this does not yet turn on inference's ability to infer guard partitions as `Union{}`. The reason for this is that such partitions can be replaced by backdated constants without invalidation. However, as soon as we remove the backdated const mechanism, this PR will allow us to turn on that change, further speeding up inference (by cutting off inference on branches known to error due to missing bindings).
Keno
added a commit
that referenced
this pull request
Mar 3, 2025
This implements the optimizations I promised in #57602 by checking in invalidation whether or not the information that inference sees changes. The primary situation in which this is useful is avoiding an invalidation for `export` flag changes or changes of which module a binding is being imported from that do not change what the actual binding being imported is. Base itself uses these patterns sparingly, so the bootstrap impact over #57615 is minimal (though non-zero).
Keno
added a commit
that referenced
this pull request
Mar 3, 2025
This implements the optimizations I promised in #57602 by checking in invalidation whether or not the information that inference sees changes. The primary situation in which this is useful is avoiding an invalidation for `export` flag changes or changes of which module a binding is being imported from that do not change what the actual binding being imported is. Base itself uses these patterns sparingly, so the bootstrap impact over #57615 is minimal (though non-zero).
Keno
added a commit
that referenced
this pull request
Mar 3, 2025
Our implicit edge tracking for bindings does not explicitly store any edges for bindings in the *current* module. The idea behind this is that this is a good time-space tradeoff for validation, because substantially all binding references in a module will be to its defining module, while the total number of methods within a module is limited and substantially smaller than the total number of methods in the entire system. However, we have an issue where the code that stores these edges and the invalidation code disagree on which module is the *current* one. The edge storing code was using the module in which the method was defined, while the invalidation code was using the one in which the MethodTable is defined. With these being misaligned, we can miss necessary invalidations. Both options are in principle possible, but I think the former is better, because the module in which the method is defined is also the module that we are likely to have a lot of references to (since they get referenced implicitly by just writing symbols in the code). However, this presents a problem: We don't actually have a way to iterate all the methods defined in a particular module, without just doing the brute force thing of scanning all methods and filtering. To address this, build on the deferred scanning code added in #57615 to also add any scanned modules to an explicit list in `Module`. This costs some space, but only proportional to the number of defined methods, (and thus proportional to the written source code). Note that we don't actually observe any issues in the test suite on master due to this bug. However, this is because we are grossly over-invalidating, which hides the missing invalidations from this issue (#57617).
Keno
added a commit
that referenced
this pull request
Mar 3, 2025
This implements the optimizations I promised in #57602 by checking in invalidation whether or not the information that inference sees changes. The primary situation in which this is useful is avoiding an invalidation for `export` flag changes or changes of which module a binding is being imported from that do not change what the actual binding being imported is. Base itself uses these patterns sparingly, so the bootstrap impact over #57615 is minimal (though non-zero).
Keno
added a commit
that referenced
this pull request
Mar 3, 2025
Our implicit edge tracking for bindings does not explicitly store any edges for bindings in the *current* module. The idea behind this is that this is a good time-space tradeoff for validation, because substantially all binding references in a module will be to its defining module, while the total number of methods within a module is limited and substantially smaller than the total number of methods in the entire system. However, we have an issue where the code that stores these edges and the invalidation code disagree on which module is the *current* one. The edge storing code was using the module in which the method was defined, while the invalidation code was using the one in which the MethodTable is defined. With these being misaligned, we can miss necessary invalidations. Both options are in principle possible, but I think the former is better, because the module in which the method is defined is also the module that we are likely to have a lot of references to (since they get referenced implicitly by just writing symbols in the code). However, this presents a problem: We don't actually have a way to iterate all the methods defined in a particular module, without just doing the brute force thing of scanning all methods and filtering. To address this, build on the deferred scanning code added in #57615 to also add any scanned modules to an explicit list in `Module`. This costs some space, but only proportional to the number of defined methods, (and thus proportional to the written source code). Note that we don't actually observe any issues in the test suite on master due to this bug. However, this is because we are grossly over-invalidating, which hides the missing invalidations from this issue (#57617).
Keno
added a commit
that referenced
this pull request
Mar 3, 2025
This implements the optimizations I promised in #57602 by checking in invalidation whether or not the information that inference sees changes. The primary situation in which this is useful is avoiding an invalidation for `export` flag changes or changes of which module a binding is being imported from that do not change what the actual binding being imported is. Base itself uses these patterns sparingly, so the bootstrap impact over #57615 is minimal (though non-zero).
Keno
added a commit
that referenced
this pull request
Mar 3, 2025
Our implicit edge tracking for bindings does not explicitly store any edges for bindings in the *current* module. The idea behind this is that this is a good time-space tradeoff for validation, because substantially all binding references in a module will be to its defining module, while the total number of methods within a module is limited and substantially smaller than the total number of methods in the entire system. However, we have an issue where the code that stores these edges and the invalidation code disagree on which module is the *current* one. The edge storing code was using the module in which the method was defined, while the invalidation code was using the one in which the MethodTable is defined. With these being misaligned, we can miss necessary invalidations. Both options are in principle possible, but I think the former is better, because the module in which the method is defined is also the module that we are likely to have a lot of references to (since they get referenced implicitly by just writing symbols in the code). However, this presents a problem: We don't actually have a way to iterate all the methods defined in a particular module, without just doing the brute force thing of scanning all methods and filtering. To address this, build on the deferred scanning code added in #57615 to also add any scanned modules to an explicit list in `Module`. This costs some space, but only proportional to the number of defined methods, (and thus proportional to the written source code). Note that we don't actually observe any issues in the test suite on master due to this bug. However, this is because we are grossly over-invalidating, which hides the missing invalidations from this issue (#57617).
Keno
added a commit
that referenced
this pull request
Mar 3, 2025
This implements the optimizations I promised in #57602 by checking in invalidation whether or not the information that inference sees changes. The primary situation in which this is useful is avoiding an invalidation for `export` flag changes or changes of which module a binding is being imported from that do not change what the actual binding being imported is. Base itself uses these patterns sparingly, so the bootstrap impact over #57615 is minimal (though non-zero).
Keno
added a commit
that referenced
this pull request
Mar 3, 2025
Our implicit edge tracking for bindings does not explicitly store any edges for bindings in the *current* module. The idea behind this is that this is a good time-space tradeoff for validation, because substantially all binding references in a module will be to its defining module, while the total number of methods within a module is limited and substantially smaller than the total number of methods in the entire system. However, we have an issue where the code that stores these edges and the invalidation code disagree on which module is the *current* one. The edge storing code was using the module in which the method was defined, while the invalidation code was using the one in which the MethodTable is defined. With these being misaligned, we can miss necessary invalidations. Both options are in principle possible, but I think the former is better, because the module in which the method is defined is also the module that we are likely to have a lot of references to (since they get referenced implicitly by just writing symbols in the code). However, this presents a problem: We don't actually have a way to iterate all the methods defined in a particular module, without just doing the brute force thing of scanning all methods and filtering. To address this, build on the deferred scanning code added in #57615 to also add any scanned modules to an explicit list in `Module`. This costs some space, but only proportional to the number of defined methods, (and thus proportional to the written source code). Note that we don't actually observe any issues in the test suite on master due to this bug. However, this is because we are grossly over-invalidating, which hides the missing invalidations from this issue (#57617).
Keno
added a commit
that referenced
this pull request
Mar 4, 2025
Our implicit edge tracking for bindings does not explicitly store any edges for bindings in the *current* module. The idea behind this is that this is a good time-space tradeoff for validation, because substantially all binding references in a module will be to its defining module, while the total number of methods within a module is limited and substantially smaller than the total number of methods in the entire system. However, we have an issue where the code that stores these edges and the invalidation code disagree on which module is the *current* one. The edge storing code was using the module in which the method was defined, while the invalidation code was using the one in which the MethodTable is defined. With these being misaligned, we can miss necessary invalidations. Both options are in principle possible, but I think the former is better, because the module in which the method is defined is also the module that we are likely to have a lot of references to (since they get referenced implicitly by just writing symbols in the code). However, this presents a problem: We don't actually have a way to iterate all the methods defined in a particular module, without just doing the brute force thing of scanning all methods and filtering. To address this, build on the deferred scanning code added in #57615 to also add any scanned modules to an explicit list in `Module`. This costs some space, but only proportional to the number of defined methods, (and thus proportional to the written source code). Note that we don't actually observe any issues in the test suite on master due to this bug. However, this is because we are grossly over-invalidating, which hides the missing invalidations from this issue (#57617).
Updated version merged in #57625. |
KristofferC
pushed a commit
that referenced
this pull request
Mar 4, 2025
Our implicit edge tracking for bindings does not explicitly store any edges for bindings in the *current* module. The idea behind this is that this is a good time-space tradeoff for validation, because substantially all binding references in a module will be to its defining module, while the total number of methods within a module is limited and substantially smaller than the total number of methods in the entire system. However, we have an issue where the code that stores these edges and the invalidation code disagree on which module is the *current* one. The edge storing code was using the module in which the method was defined, while the invalidation code was using the one in which the MethodTable is defined. With these being misaligned, we can miss necessary invalidations. Both options are in principle possible, but I think the former is better, because the module in which the method is defined is also the module that we are likely to have a lot of references to (since they get referenced implicitly by just writing symbols in the code). However, this presents a problem: We don't actually have a way to iterate all the methods defined in a particular module, without just doing the brute force thing of scanning all methods and filtering. To address this, build on the deferred scanning code added in #57615 to also add any scanned modules to an explicit list in `Module`. This costs some space, but only proportional to the number of defined methods, (and thus proportional to the written source code). Note that we don't actually observe any issues in the test suite on master due to this bug. However, this is because we are grossly over-invalidating, which hides the missing invalidations from this issue (#57617). (cherry picked from commit 274d80e)
serenity4
pushed a commit
to serenity4/julia
that referenced
this pull request
May 1, 2025
This implements the optimizations I promised in JuliaLang#57602 by checking in invalidation whether or not the information that inference sees changes. The primary situation in which this is useful is avoiding an invalidation for `export` flag changes or changes of which module a binding is being imported from that do not change what the actual binding being imported is. Base itself uses these patterns sparingly, so the bootstrap impact over JuliaLang#57615 is minimal (though non-zero).
serenity4
pushed a commit
to serenity4/julia
that referenced
this pull request
May 1, 2025
Our implicit edge tracking for bindings does not explicitly store any edges for bindings in the *current* module. The idea behind this is that this is a good time-space tradeoff for validation, because substantially all binding references in a module will be to its defining module, while the total number of methods within a module is limited and substantially smaller than the total number of methods in the entire system. However, we have an issue where the code that stores these edges and the invalidation code disagree on which module is the *current* one. The edge storing code was using the module in which the method was defined, while the invalidation code was using the one in which the MethodTable is defined. With these being misaligned, we can miss necessary invalidations. Both options are in principle possible, but I think the former is better, because the module in which the method is defined is also the module that we are likely to have a lot of references to (since they get referenced implicitly by just writing symbols in the code). However, this presents a problem: We don't actually have a way to iterate all the methods defined in a particular module, without just doing the brute force thing of scanning all methods and filtering. To address this, build on the deferred scanning code added in JuliaLang#57615 to also add any scanned modules to an explicit list in `Module`. This costs some space, but only proportional to the number of defined methods, (and thus proportional to the written source code). Note that we don't actually observe any issues in the test suite on master due to this bug. However, this is because we are grossly over-invalidating, which hides the missing invalidations from this issue (JuliaLang#57617).
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This addresses one of the last remaining TODOs of the binding partition work by performing invalidations when bindings transition from being undefined to being defined. This in particular finally addresses the performance issue that #54733 was intended to address (the issue was closed when we merged the mechanism, but it had so far been turned off). Turning on the invalidations themselves were always easy (a one line deletion). What is harder is making sure that the additional invalidations don't take extra time.
To this end, we add two additional flags, one on Bindings, and one on methods. The flag on bindings tells us whether any method scan has so far found an implicit (not tracked in ->backedges) reference to this binding in any method body. The insight here is that most undefined bindings will not have been referenced previously (because they did not exist), so with a simple one bit saturating counter of the number of edges that would exist (if we did store them), we can fast-path the invalidation.
However, this is not quite sufficient, as people often do things like:
which, without further improvements would incur an invalidation upon the definition of
bar
.The second insight (and what the flag on
Method
is for) is that we don't actually need to scan the method body until there is something to invalidate (i.e. until someCodeInstance
has been created for the method). By defering the scanning until the first time that inference accesses the lowered code (with a flag to only do it once), we can easily avoid invalidation in the above scenario (while still invalidating iffoo()
was called before the definition ofbar
).As a further bonus, this also speeds up bootstrap by about 20% (putting us about back to where we used to be before the full bpart change) by skipping unnecessary invalidations even for non-guard transitions.
Finally, this does not yet turn on inference's ability to infer guard partitions as
Union{}
. The reason for this is that such partitions can be replaced by backdated constants without invalidation. However, as soon as we remove the backdated const mechanism, this PR will allow us to turn on that change, further speeding up inference (by cutting off inference on branches known to error due to missing bindings).