-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[red-knot] Combine terminal statement support with statically known branches #15817
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not currently working because of visibility constraint simplification:
def _(cond: bool):
x = "a"
if cond:
x = "b"
# ← {0}
if True:
return
# ← {1}
# ← {2}
reveal_type(x) # revealed: Literal["a"]
At point {1}
, we're marking the x = "b"
binding as non-visible (by setting a visibility constraint of ALWAYS_FALSE
).
But at point {2}
, after we merge the two flows back together (the then
flow and the artifically inserted else
flow), we simplify the result relative to point {0}
. That sees that there weren't any new bindings of x
, and resets the visibility of x = "b"
back to what it was at point {0}
, forgetting the unreachability that we just introduced.
So I think I need to skip the simplification step when a flow contains a terminal statement.
// should be completely overwritten by the snapshot we're merging in. If the other snapshot | ||
// is unreachable, we should return without merging. | ||
if !snapshot.reachable { | ||
// As an optimization, if we know statically that either of the snapshots is always |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commenting out these two if
clauses is how we verify that this is truly an optimization — we should get the same results for the tests with and without it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I verified that we can indeed remove these checks and all tests pass! I'm not seeing a detectable performance improvement in the benchmark from including these lines; perhaps that just suggests conditional terminals aren't common enough for it to show up? It definitely seems like this should be faster in cases where it does apply.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not seeing a detectable performance improvement in the benchmark from including these lines
I think it might be that the merge step is now faster: if the other snapshot's reachability is ALWAYS_FALSE
, then the visibility of all of its bindings should also be ALWAYS_FALSE
. (I think there were cases before TDD normalization where we wouldn't be able to see that in the structure of the visibility constraint.) Merge will iterate through all of the bindings and AND their visibility constraints, but ANDing with ALWAYS_FALSE
is one of the fast-path returns.
To be clear, it's a hunch — I haven't backed any of ☝️ with data!
This sounds right. The simplification step is a bit of a performance hack. I think it could be eliminated if we used BDDs instead of syntax trees to represent visibility constraints (since BDDs self-simplify). But I think it should never be wrong to skip the simplification, just potentially hurt performance. Which in this case shouldn't be too bad, since it would only occur when there's a terminal statement in the branch, which won't be the common case. |
CodSpeed Performance ReportMerging #15817 will degrade performances by 7.96%Comparing Summary
Benchmarks breakdown
|
Running |
I'm marking this as ready for review to get 👀 on it. Just like the TDD patch that this builds on, I'm quite confused by the codspeed findings. |
CodSpeed says it's a regression on the incremental benchmark, not the cold benchmark. Is your hyperfine testing on black and tomllib testing incremental performance (that is, make an insignificant/comment change to one file and test how fast we re-check incrementally), or cold-check performance? Incremental regression generally suggests we are creating more Salsa-cached values that have to be revalidated on incremental re-check, or making that revalidation of cached values (rather than the actual semantic indexing and type inference itself) more expensive in some way. |
hyperfine is testing cold performance, but I was seeing the regression locally on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great!
I think it's worth putting some time-boxed effort (on the scale of a few hours) into looking into the regression here, but I don't think it should block the PR; I don't see anything obviously inefficient here, and this is what we need in order to get the right semantics. It's a better use of optimization effort to look broadly for the best ROI than to focus narrowly on a specific regression.
crates/red_knot_python_semantic/resources/mdtest/terminal_statements.md
Outdated
Show resolved
Hide resolved
// should be completely overwritten by the snapshot we're merging in. If the other snapshot | ||
// is unreachable, we should return without merging. | ||
if !snapshot.reachable { | ||
// As an optimization, if we know statically that either of the snapshots is always |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I verified that we can indeed remove these checks and all tests pass! I'm not seeing a detectable performance improvement in the benchmark from including these lines; perhaps that just suggests conditional terminals aren't common enough for it to show up? It definitely seems like this should be faster in cases where it does apply.
crates/red_knot_python_semantic/resources/mdtest/exception/basic.md
Outdated
Show resolved
Hide resolved
Two things I like doing when investigating performance issues are run
|
// should be completely overwritten by the snapshot we're merging in. If the other snapshot | ||
// is unreachable, we should return without merging. | ||
if !snapshot.reachable { | ||
// As an optimization, if we know statically that either of the snapshots is always |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not seeing a detectable performance improvement in the benchmark from including these lines
I think it might be that the merge step is now faster: if the other snapshot's reachability is ALWAYS_FALSE
, then the visibility of all of its bindings should also be ALWAYS_FALSE
. (I think there were cases before TDD normalization where we wouldn't be able to see that in the structure of the visibility constraint.) Merge will iterate through all of the bindings and AND their visibility constraints, but ANDing with ALWAYS_FALSE
is one of the fast-path returns.
To be clear, it's a hunch — I haven't backed any of ☝️ with data!
crates/red_knot_python_semantic/resources/mdtest/exception/basic.md
Outdated
Show resolved
Hide resolved
## Bindings after a terminal statement are unreachable | ||
|
||
Any bindings introduced after a terminal statement are unreachable, and are considered not visible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the new test case showing that bindings after a terminal statement are considered not visible. https://github.com/astral-sh/ruff/pull/15817/files#r1941996689
Note that this means we're currently implementing the "least helpful" option in #15797. (I think that's still okay for this PR, just pointing out that this will change depending on how we decide to handle unreachable code)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I would put a TODO on that unresolved-reference
diagnostic below.
I'm a little worried about the difficulty of implementing "more useful" options for checking unreachable code, but we can leave that as a separate problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I would put a TODO on that
unresolved-reference
diagnostic below.
Done
I'm a little worried about the difficulty of implementing "more useful" options for checking unreachable code, but we can leave that as a separate problem.
Are you worried that this PR makes it more difficult? Or just that it's on the list of things to tackle sooner rather than later in case it requires large changes to the design?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not really so much that this PR makes it more difficult, just that I'm not sure how much this approach will end up having to change. I don't think it's a reason not to merge this. I am curious if you have a rough sense of how we might go about implementing "check unreachable code as if it were reachable" while still preserving (as I think we must) "unreachable branches never merge back to outer control flow". That is, fixing the TODO you just added, without having that unreachable assignment become visible in the outer flow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am curious if you have a rough sense of how we might go about implementing "check unreachable code as if it were reachable" while still preserving (as I think we must) "unreachable branches never merge back to outer control flow".
I'd say we'd either need to track multiple visibility constraints for each binding, or multiple "current flow states" — both being ways to represent the visibility that each binding has now, and what it would reset to at the next merge point.
But I'm also not sure that's what we'd want to implement — if we want to "check unreachable code as if it were reachable", I'm not sure that should reset at merge points. e.g. if someone inserted a return
statement for debugging, I don't see a difference in UX between:
def _(cond: bool):
if cond:
x = 1
return
reveal_type(x) # revealed: Literal[1]
and
def _(cond: bool):
if cond:
x = 1
return
reveal_type(x) # revealed: Literal[1]
And so if we want to treat these both the same, I'd say we'd go for an option that controls the visibility of new bindings: "always true" if we want to check unreachable code as if it were reachable, and "current reachability" if not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say we'd either need to track multiple visibility constraints for each binding, or multiple "current flow states" — both being ways to represent the visibility that each binding has now, and what it would reset to at the next merge point.
Yeah makes sense.
But I'm also not sure that's what we'd want to implement — if we want to "check unreachable code as if it were reachable", I'm not sure that should reset at merge points.
Not sure either. I feel like checking unreachable code as if it were reachable is kind of an unprincipled approach that may not have a sensible and consistent semantics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case, I think the semantics implemented in this PR are a good step forward, and we should go ahead with them for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in this file are easier to see with "Hide whitespace"
For my future self, a ruff/crates/red_knot/Cargo.toml Line 29 in d47088c
|
After figuring out #15817 (comment), I get identical ingredient counts for
|
@dcreager can you tell me a bit more about what performance investigation you did other than comparing ingredient numbers? Did you try to compare the verbose output of watching tomllib? Did you compare two recorded profiles? I'm asking because an 8% regression is huge, especially considering we don't even know where it's coming from. For comparison, our biggest win on the incremental benchmark is #15763, and it's only 15%. This PR "eats up" 50% of that improvement. My main worry is: It's hard to figure out the root cause today, but it will even be harder to win back this regression in the future if nothing obvious shows up in benchmarks or profiles today. |
I went ahead and ran One main finding: We now call |
I ran the benchmark test under valgrind and used the I also had added some
That looks like an Could this be sampling bias due to
That suggests that the cause might be this change — we might be recording more complex visibility constraints for a noticeable number of bindings, which would take more time to evaluate than the |
Thanks for the extra explanation. It does show that you spent a fair amount of time investigating! Thanks for doing that.
I think that could partially explain the regression. It means that the queries evaluating visibility constraints have more dependencies and marking each dependency as "green" is a non-zero cost. |
This example from @sharkdp shows how terminal statements can appear in statically known branches: #15676 (comment)
We now use visibility constraints to track reachability, which allows us to model this correctly. There are two related changes as a result:
New bindings are not assumed to be visible; they inherit the current "scope start" visibility, which effectively means that new bindings are visible if/when the current flow is reachable
When simplifying visibility constraints after branching control flow, we only simplify if none of the intervening branches included a terminal statement. That is, earlier unaffected bindings are only actually unaffected if all branches make it to the merge point.