-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[semantic-arc-opts] Eliminate all copy_value from guaranteed argument… #19690
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[semantic-arc-opts] Eliminate all copy_value from guaranteed argument… #19690
Conversation
@swift-ci smoke benchmark |
@swift-ci test |
@swift-ci test compiler performance |
Build comment file:No performance and code size changesHow to read the dataThe tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.If you see any unexpected regressions, you should consider fixing the regressions before you merge the PR. Noise: Sometimes the performance results (not code size!) contain false alarms. Unexpected regressions which are marked with '(?)' are probably noise. If you see regressions which you cannot explain you can try to run the benchmarks again. If regressions still show up, please consult with the performance team (@eeckstein). Hardware Overview
|
edd6ccc
to
d2aef5c
Compare
Got curious and looked at some of the output. Made a small change that ups the number of rr that we remove from the stdlib from < 200 to > 7000. Lets see if that gives us some boost. |
@swift-ci smoke benchmark |
1 similar comment
@swift-ci smoke benchmark |
(and I already see a bunch of rr traffic we can get rid of from objc code in the stdlib). |
@swift-ci smoke benchmark |
Build comment file:Performance: -O
Code size: -O
Performance: -Osize
Code size: -Osize
Performance: -Onone
Code size: Swift libraries
How to read the dataThe tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.If you see any unexpected regressions, you should consider fixing the regressions before you merge the PR. Noise: Sometimes the performance results (not code size!) contain false alarms. Unexpected regressions which are marked with '(?)' are probably noise. If you see regressions which you cannot explain you can try to run the benchmarks again. If regressions still show up, please consult with the performance team (@eeckstein). Hardware Overview
|
I think this is good enough. I am going to merge what I have. |
@swift-ci smoke test |
@atrick can I get a review? |
Or actually, I need to clean it up first. |
d2aef5c
to
637d7eb
Compare
@swift-ci smoke test |
5 similar comments
@swift-ci smoke test |
@swift-ci smoke test |
@swift-ci smoke test |
@swift-ci smoke test |
@swift-ci smoke test |
Thanks. I haven't reviewed it yet but it's more than promising. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great. It should unlock the benefits of simple copy propagation in ownership SIL.
If the intention here is to stage in this PR before changing the ownership map kind stuff, that's ok, but my review comments are always "as-is"...
lib/SIL/SILOwnershipVerifier.cpp
Outdated
bool swift::getUnderlyingBorrowIntroducers(SILValue inputValue, | ||
SmallVectorImpl<SILValue> &out) { | ||
SmallVector<SILValue, 32> worklist; | ||
worklist.emplace_back(inputValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[comment] I think it's fine if you always use emplace_back, but some people might say it's silly because you're just calling the same SILValue copy constructor as push_back.
NumEliminatedInsts += 2; | ||
} // end anonymous namespace | ||
|
||
bool SemanticARCOptVisitor::visitBeginBorrowInst(BeginBorrowInst *bbi) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[question] I thought we were using all-caps for acronyms: BBI
? I don't actually care, but if not, there will be quite a lot of code to rewrite.
SmallVector<EndBorrowInst *, 16> endBorrows; | ||
for (auto *op : bbi->getUses()) { | ||
auto *user = op->getUser(); | ||
switch (user->getKind()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] I don't understand why a switch statement is used here.
default: | ||
// Make sure that this operand can accept our arguments kind. | ||
auto map = op->getOwnershipKindMap(); | ||
if (map.canAcceptKind(kind)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[design] I still don't understand what an OwnershipKindMap is for. I think it obscures the meaning of the code. Why not just:
op->canAcceptOwnership(kind)
Also, this check doesn't make sense to me for forwarding instructions, without teaching the optimization to look through all of those. If the forwarding instruction is not taking ownership before, we don't want to implicitly change it's meaning by changing the ownership of its incoming value.
This is what would make sense to me, instead of the map and "can accept" check.
// Is this a point-in-time use that cannot extend the lifetime if it's
// operand value?
if (op->isNonConsuming())
continue;
return false
/// Stashes a value for an iterator for a scope. This instruction/any | ||
/// instructions after instruction in the block/any instructions in | ||
/// other blocks can all be deleted safely. | ||
struct ForwardInstIteratorGuard { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[style] This is a substantial abstraction that I would only ever use to hack
around old code instead of just fixing those APIs.
I generally think that transformation utilities that delete instructions (or blocks)
should be responsible for returning a valid iterator. This is just a
style comment...
for (auto ii = begin(), nextI = ii, ie = end(); ii != ie; ii = nextI) {
nextI = std::next(ii)
// Ususally some large number of conditions...
if (someBailout)
continue
// And any number of possible transformations...
if (canTransform) {
nextI = transformInst(ii);
continue
}
}
// Ok, this constraint can take something owned as live. Lets | ||
// see if it can also take something that is guaranteed. If it | ||
// can not, then we bail. | ||
if (!map.canAcceptKind(ValueOwnershipKind::Guaranteed)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[design] I really can't mentally map the code that's using these physical maps to concepts that I normally reason about. i.e. If I wanted to know how this would work on a concrete SIL example, I would have to construct a test case by hand and run it through the code. (that's the problem with too much indirection). I think this is what the code is trying to do (eliding some details), but not sure:
switch (operand.ownershipEffect()):
case Consuming:
if (isa<Destroy>(operand.getUser()))
continue;
return false;
case NonConsuming:
// A point-in-time use can always handle +1 or +0.
continue;
case Forwarding:
// This instruction takes ownership of the copy. We don't currently
// recurse through them.
return false;
}
If we wanted to constrain some non-consuming users to only handle owned
values, then we could introduce a separate "use constraint" API for
that. But I suspect we don't want to allow that anyway.
Andy and I discussed about this offline. There are a few things here that we need before this can land:
|
Sounds good. I think we should have a separate API to identify "borrow projections" (struct_extract, ref_element_addr), although they may be hidden behind other forwarding instructions (casts), so we probably can't just ignore forwarding instructions when we remove borrow scopes. |
637d7eb
to
dbeb2d5
Compare
…s that have all uses that can accept values with guaranteed ownership. This takes advantage of my restricting Semantic ARC Opts to run only on the stdlib since we know it passes the ownership verifier. Using that I reimplemented this optimization in a more robust, elegant, general way. Specifically, we now will eliminate any SILGen copy_value on a guaranteed argument all of whose uses are either destroy_values or instructions that can accept guaranteed parameters. To be clear: This means that the value must be consumed in the same function only be destroy_value. Since we know that the lifetime of the guaranteed parameter will be larger than any owned value created/destroyed in the body, we do not need to check that the copy_value's destroy_value set is joint post-dominated by the set of end_borrows. That will have to be added to turn this into a general optimization.
Specifically, now the optimizer can take the following code: sil @foo: $@convention(thin) (@guaranteed NativeObjectPair) -> () { bb0(%0 : @guaranteed $NativeObjectPair): %1 = struct_extract %0 : $NativeObjectPair, #NativeObjectPair.obj1 %2 = copy_value %1 : $Builtin.NativeObject %3 = begin_borrow %2 : $Builtin.NativeObject %foo = function_ref ... apply %foo(%3) : $@convention(thin) (@guaranteed Builtin.NativeObject) -> () end_borrow %3 : $Builtin.NativeObject destroy_value %2 : $Builtin.NativeObject ... } and eliminate all ownership instructions. I.e.: sil @foo: $@convention(thin) (@guaranteed NativeObjectPair) -> () { bb0(%0 : @guaranteed $NativeObjectPair): %1 = struct_extract %0 : $NativeObjectPair, #NativeObjectPair.obj1 %foo = function_ref .... apply %foo(%1) ... } Before this the optimizer could only eliminate 200 insts in the stdlib. With this change, we now can eliminate ~7000.
dbeb2d5
to
099cae5
Compare
@swift-ci smoke test |
…s that have all uses that can accept values with guaranteed ownership.
This takes advantage of my restricting Semantic ARC Opts to run only on the
stdlib since we know it passes the ownership verifier. Using that I
reimplemented this optimization in a more robust, elegant, general
way. Specifically, we now will eliminate any SILGen copy_value on a guaranteed
argument all of whose uses are either destroy_values or instructions that can
accept guaranteed parameters. To be clear: This means that the value must be
consumed in the same function only be destroy_value.
Since we know that the lifetime of the guaranteed parameter will be larger than
any owned value created/destroyed in the body, we do not need to check that the
copy_value's destroy_value set is joint post-dominated by the set of end_borrows.
That will have to be added to turn this into a general optimization.
I wrote this as a prototype over the weekend.
rdar://problem/43569988