Skip to content

Be more selective when invalidating code instances #57617

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

Keno
Copy link
Member

@Keno Keno commented Mar 3, 2025

On master, when we invalidate a CodeInstance, we also invalidate the entire associated MethodInstance. However, this is highly problematic, because we have a lot of CodeInstances that are associated with getproperty(::Module, ::Symbol) through constant propagation. If one of these CodeInstances gets invalidated (e.g. because the resolution of const-propagated M.s binding changed), it would invalidate essentially the entire world.

Prevent this by re-checking the forward edges list to make sure that the code instance we're invalidating is actually in there.

@Keno Keno requested a review from vtjnash March 3, 2025 08:27
@Keno Keno added the backport 1.12 Change should be backported to release-1.12 label Mar 3, 2025
@Keno Keno force-pushed the kf/nogetpropertyinvalidate branch from 0360167 to ae2914a Compare March 3, 2025 08:34
On master, when we invalidate a CodeInstance, we also invalidate the
entire associated MethodInstance. However, this is highly problematic,
because we have a lot of CodeInstances that are associated with
`getproperty(::Module, ::Symbol)` through constant propagation.
If one of these CodeInstances gets invalidated (e.g. because
the resolution of const-propagated M.s binding changed), it
would invalidate essentially the entire world.

Prevent this by re-checking the forward edges list to make sure
that the code instance we're invalidating is actually in there.
@Keno Keno force-pushed the kf/nogetpropertyinvalidate branch from ae2914a to 4ef7c99 Compare March 3, 2025 08:46
Keno added a commit that referenced this pull request Mar 3, 2025
Our implicit edge tracking for bindings does not explicitly store
any edges for bindings in the *current* module. The idea behind
this is that this is a good time-space tradeoff for validation,
because substantially all binding references in a module will be
to its defining module, while the total number of methods within
a module is limited and substantially smaller than the total number
of methods in the entire system.

However, we have an issue where the code that stores these edges
and the invalidation code disagree on which module is the *current*
one. The edge storing code was using the module in which the method
was defined, while the invalidation code was using the one in which
the MethodTable is defined. With these being misaligned, we can
miss necessary invalidations.

Both options are in principle possible, but I think the former is
better, because the module in which the method is defined is also
the module that we are likely to have a lot of references to (since
they get referenced implicitly by just writing symbols in the code).

However, this presents a problem: We don't actually have a way to
iterate all the methods defined in a particular module, without just
doing the brute force thing of scanning all methods and filtering.

To address this, build on the deferred scanning code added in #57615
to also add any scanned modules to an explicit list in `Module`. This
costs some space, but only proportional to the number of defined methods,
(and thus proportional to the written source code).

Note that we don't actually observe any issues in the test suite on
master due to this bug. However, this is because we are grossly
over-invalidating, which hides the missing invalidations from this
issue (#57617).
@Keno
Copy link
Member Author

Keno commented Mar 3, 2025

The test failure here turned out to be a bigger bug (#57625). The PR in #57625 contains all the commits, so we may want to merge them together.

Keno added a commit that referenced this pull request Mar 3, 2025
Our implicit edge tracking for bindings does not explicitly store
any edges for bindings in the *current* module. The idea behind
this is that this is a good time-space tradeoff for validation,
because substantially all binding references in a module will be
to its defining module, while the total number of methods within
a module is limited and substantially smaller than the total number
of methods in the entire system.

However, we have an issue where the code that stores these edges
and the invalidation code disagree on which module is the *current*
one. The edge storing code was using the module in which the method
was defined, while the invalidation code was using the one in which
the MethodTable is defined. With these being misaligned, we can
miss necessary invalidations.

Both options are in principle possible, but I think the former is
better, because the module in which the method is defined is also
the module that we are likely to have a lot of references to (since
they get referenced implicitly by just writing symbols in the code).

However, this presents a problem: We don't actually have a way to
iterate all the methods defined in a particular module, without just
doing the brute force thing of scanning all methods and filtering.

To address this, build on the deferred scanning code added in #57615
to also add any scanned modules to an explicit list in `Module`. This
costs some space, but only proportional to the number of defined methods,
(and thus proportional to the written source code).

Note that we don't actually observe any issues in the test suite on
master due to this bug. However, this is because we are grossly
over-invalidating, which hides the missing invalidations from this
issue (#57617).
Keno added a commit that referenced this pull request Mar 3, 2025
Our implicit edge tracking for bindings does not explicitly store
any edges for bindings in the *current* module. The idea behind
this is that this is a good time-space tradeoff for validation,
because substantially all binding references in a module will be
to its defining module, while the total number of methods within
a module is limited and substantially smaller than the total number
of methods in the entire system.

However, we have an issue where the code that stores these edges
and the invalidation code disagree on which module is the *current*
one. The edge storing code was using the module in which the method
was defined, while the invalidation code was using the one in which
the MethodTable is defined. With these being misaligned, we can
miss necessary invalidations.

Both options are in principle possible, but I think the former is
better, because the module in which the method is defined is also
the module that we are likely to have a lot of references to (since
they get referenced implicitly by just writing symbols in the code).

However, this presents a problem: We don't actually have a way to
iterate all the methods defined in a particular module, without just
doing the brute force thing of scanning all methods and filtering.

To address this, build on the deferred scanning code added in #57615
to also add any scanned modules to an explicit list in `Module`. This
costs some space, but only proportional to the number of defined methods,
(and thus proportional to the written source code).

Note that we don't actually observe any issues in the test suite on
master due to this bug. However, this is because we are grossly
over-invalidating, which hides the missing invalidations from this
issue (#57617).
Keno added a commit that referenced this pull request Mar 3, 2025
Our implicit edge tracking for bindings does not explicitly store
any edges for bindings in the *current* module. The idea behind
this is that this is a good time-space tradeoff for validation,
because substantially all binding references in a module will be
to its defining module, while the total number of methods within
a module is limited and substantially smaller than the total number
of methods in the entire system.

However, we have an issue where the code that stores these edges
and the invalidation code disagree on which module is the *current*
one. The edge storing code was using the module in which the method
was defined, while the invalidation code was using the one in which
the MethodTable is defined. With these being misaligned, we can
miss necessary invalidations.

Both options are in principle possible, but I think the former is
better, because the module in which the method is defined is also
the module that we are likely to have a lot of references to (since
they get referenced implicitly by just writing symbols in the code).

However, this presents a problem: We don't actually have a way to
iterate all the methods defined in a particular module, without just
doing the brute force thing of scanning all methods and filtering.

To address this, build on the deferred scanning code added in #57615
to also add any scanned modules to an explicit list in `Module`. This
costs some space, but only proportional to the number of defined methods,
(and thus proportional to the written source code).

Note that we don't actually observe any issues in the test suite on
master due to this bug. However, this is because we are grossly
over-invalidating, which hides the missing invalidations from this
issue (#57617).
Keno added a commit that referenced this pull request Mar 4, 2025
Our implicit edge tracking for bindings does not explicitly store any
edges for bindings in the *current* module. The idea behind this is that
this is a good time-space tradeoff for validation, because substantially
all binding references in a module will be to its defining module, while
the total number of methods within a module is limited and substantially
smaller than the total number of methods in the entire system.

However, we have an issue where the code that stores these edges and the
invalidation code disagree on which module is the *current* one. The
edge storing code was using the module in which the method was defined,
while the invalidation code was using the one in which the MethodTable
is defined. With these being misaligned, we can miss necessary
invalidations.

Both options are in principle possible, but I think the former is
better, because the module in which the method is defined is also the
module that we are likely to have a lot of references to (since they get
referenced implicitly by just writing symbols in the code).

However, this presents a problem: We don't actually have a way to
iterate all the methods defined in a particular module, without just
doing the brute force thing of scanning all methods and filtering.

To address this, build on the deferred scanning code added in #57615 to
also add any scanned modules to an explicit list in `Module`. This costs
some space, but only proportional to the number of defined methods, (and
thus proportional to the written source code).

Note that we don't actually observe any issues in the test suite on
master due to this bug. However, this is because we are grossly
over-invalidating, which hides the missing invalidations from this issue
(#57617).
@Keno
Copy link
Member Author

Keno commented Mar 4, 2025

Updated version merged in #57625.

@Keno Keno closed this Mar 4, 2025
KristofferC pushed a commit that referenced this pull request Mar 4, 2025
Our implicit edge tracking for bindings does not explicitly store any
edges for bindings in the *current* module. The idea behind this is that
this is a good time-space tradeoff for validation, because substantially
all binding references in a module will be to its defining module, while
the total number of methods within a module is limited and substantially
smaller than the total number of methods in the entire system.

However, we have an issue where the code that stores these edges and the
invalidation code disagree on which module is the *current* one. The
edge storing code was using the module in which the method was defined,
while the invalidation code was using the one in which the MethodTable
is defined. With these being misaligned, we can miss necessary
invalidations.

Both options are in principle possible, but I think the former is
better, because the module in which the method is defined is also the
module that we are likely to have a lot of references to (since they get
referenced implicitly by just writing symbols in the code).

However, this presents a problem: We don't actually have a way to
iterate all the methods defined in a particular module, without just
doing the brute force thing of scanning all methods and filtering.

To address this, build on the deferred scanning code added in #57615 to
also add any scanned modules to an explicit list in `Module`. This costs
some space, but only proportional to the number of defined methods, (and
thus proportional to the written source code).

Note that we don't actually observe any issues in the test suite on
master due to this bug. However, this is because we are grossly
over-invalidating, which hides the missing invalidations from this issue
(#57617).

(cherry picked from commit 274d80e)
serenity4 pushed a commit to serenity4/julia that referenced this pull request May 1, 2025
Our implicit edge tracking for bindings does not explicitly store
any edges for bindings in the *current* module. The idea behind
this is that this is a good time-space tradeoff for validation,
because substantially all binding references in a module will be
to its defining module, while the total number of methods within
a module is limited and substantially smaller than the total number
of methods in the entire system.

However, we have an issue where the code that stores these edges
and the invalidation code disagree on which module is the *current*
one. The edge storing code was using the module in which the method
was defined, while the invalidation code was using the one in which
the MethodTable is defined. With these being misaligned, we can
miss necessary invalidations.

Both options are in principle possible, but I think the former is
better, because the module in which the method is defined is also
the module that we are likely to have a lot of references to (since
they get referenced implicitly by just writing symbols in the code).

However, this presents a problem: We don't actually have a way to
iterate all the methods defined in a particular module, without just
doing the brute force thing of scanning all methods and filtering.

To address this, build on the deferred scanning code added in JuliaLang#57615
to also add any scanned modules to an explicit list in `Module`. This
costs some space, but only proportional to the number of defined methods,
(and thus proportional to the written source code).

Note that we don't actually observe any issues in the test suite on
master due to this bug. However, this is because we are grossly
over-invalidating, which hides the missing invalidations from this
issue (JuliaLang#57617).
@KristofferC KristofferC removed the backport 1.12 Change should be backported to release-1.12 label Jun 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants