-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NewOptimizer] Better handling in the presence of select value #26969
Conversation
Why does it do that? The current inliner (correctly) decides that it is more optimal to inline this function without doing union-splitting (partly because of this counter-example case). Fwiw, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks probably valid. I dislike doing jump-threading during inlining, but I suppose I'll have to live with it for now.
# Performs minimal backwards inference to catch a couple of interesting, common cases | ||
function minimal_backinf(compact, constraints, unconstrained_types, argexprs) | ||
for i = 2:length(argexprs) | ||
isa(argexprs[i], SSAValue) || continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lift argexprs[i]
into a local variable will allow the compiler to infer this and generate better (and less) code
end | ||
|
||
# Performs minimal backwards inference to catch a couple of interesting, common cases | ||
function minimal_backinf(compact, constraints, unconstrained_types, argexprs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add type annotations
c = find_constraint(v1, constraints) | ||
if c !== nothing | ||
refined = egal_tfunc(c, compact_exprtype(compact, v2)) | ||
if !(ut ⊑ refined) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be better to preserve the inference type, unless we've proven that we have a narrower bound (instead of checking if we're certain we don't have a wider bound):
if refined ⊑ ut
Nothing in this PR does jump threading. |
Oh, I didn't notice before how much work this is doing for no benefit. The actual issue here is that the new inliner is doing union-splitting in the wrong order. It should attempt normal inlining first before considering whether other optimizations can apply to split an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather fix the actual bug here (wrong order of passes) rather than this
I can take a look at that, but I don't think it actually fixes the performance problem without this. |
Took a look, we force_noinline any function specialized on union arguments, so there's no difference (because any function with union arguments will never be eligible for inlining). However, the comment in the source says this is due to bugs in type intersection, so I'll see what happens if we remove that restriction. |
The benchmarks contain code like this: ``` x::Union{Nothing, Int} result += ifelse(x === nothing, 0, x) ``` which, perhaps somewhat ironically is quite a bit harder on the new optimizer than an equivalent code sequence using ternary operators. The reason for this is that ifelse gets inferred as `Union{Int, Nothing}`, creating a phi node of that type, which then causes a union split + that the optimizer can't really get rid of easily. What this commit does is add some local improvements to help with the situation. First, it adds some minimal back inference during inlining. As a result, when inlining decides to unionsplit `ifelse(x === nothing, 0, x::Union{Nothing, Int})`, it looks back at the definition of `x === nothing`, realizes it's constrained by the union split and inserts the appropriate boolean constant. Next, a new `type_tightening_pass` goes back and annotates more precise types for the inlinined `select_value` and phi nodes. This is sufficient to get the above code to behave reasonably and should hopefully fix the performance regression on the various union sum benchmarks seen in #26795.
56efc7c
to
8f3b986
Compare
Rerunning CI, if it looks good, let's merge. |
@vtjnash had a couple review comments above which I was planning to address today. I'll start a nanosoldier run with this version on the benchmark PR before that though. |
Discussing with @vtjnash, he really doesn't like this approach and would prefer to take the performance regressions for now and before the 1.0 release merge ifelse/select_value into one and teach inference to be smarter about this. In the meantime I'll rebase the two bug fix commits in here and submit them separately. |
Extract PiNode insertion code from #26969
Superseded by the referencing pull requests. Let's keep the branch around for a while for reference though. |
The benchmarks contain code like this:
which, perhaps somewhat ironically is quite a bit harder
on the new optimizer than an equivalent code sequence
using ternary operators. The reason for this is that
ifelse gets inferred as
Union{Int, Nothing}
, creatinga phi node of that type, which then causes a union split +
that the optimizer can't really get rid of easily. What this
commit does is add some local improvements to help with the
situation. First, it adds some minimal back inference during
inlining. As a result, when inlining decides to unionsplit
ifelse(x === nothing, 0, x::Union{Nothing, Int})
, it looksback at the definition of
x === nothing
, realizes it's constrainedby the union split and inserts the appropriate boolean constant.
Next, a new
type_tightening_pass
goes back and annotates more precisetypes for the inlinined
select_value
and phi nodes. This is sufficientto get the above code to behave reasonably and should hopefully fix
the performance regression on the various union sum benchmarks
seen in #26795.