Skip to content

Conversation

@tlively
Copy link
Member

@tlively tlively commented Mar 28, 2024

Previously we transformed return calls in inlined function bodies into normal
calls followed by branches out to the caller code. Similarly, when inlining a
return_call callsite, we simply added a return after the body inlined at the
callsite. These transformations would have been correct if the semantics of
return calls were to call and then return, but they are not correct for the
actual semantics of returning and then calling.

The previous implementation is observably incorrect for return calls inside try
blocks, where the previous implementation would run the inlined body within the
try block, but the proper semantics would be to run the inlined body outside the
try block.

Fix the problem by tranforming inlined return calls to branches followed by
calls rather than as calls followed by branches. For the case of inlined return
call callsites, insert branches out of the original body of the caller and
inline the body of the callee as a sibling of the original caller body. For the
other case of return calls appearing in inlined bodies, translate the return
calls to branches out to calls inserted as siblings of the original inlined
body.

In both cases, it would have been convenient to use multivalue block return to
send call parameters along the branches to the calls, but unfortunately in our
IR that would have required tuple-typed scratch locals to unpack the tuple of
operands at the call sites. It is simpler to just use locals to propagate the
operands in the first place.

@tlively tlively requested a review from kripken March 28, 2024 01:14
@tlively
Copy link
Member Author

tlively commented Mar 28, 2024

@kripken
Copy link
Member

kripken commented Mar 28, 2024

I haven't had time to review in detail yet, but a quick comment: I think you need to also fix the interpreter on return_call at this time, as if we just fix Inlining then it will be out of sync with execution and the fuzzer will find errors.

@tlively
Copy link
Member Author

tlively commented Mar 28, 2024

Yes, working on it now :)

auto var = builder->addVar(getFunction(), sig.params[i]);
branchBlock->list.push_back(
builder->makeLocalSet(var, curr->operands[i]));
curr->operands[i] = builder->makeLocalGet(var, sig.params[i]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can perhaps use ir/localize.h. That would avoid putting things in locals unnecessarily (it will leave things with no effects in place). But if it's more work maybe not worth it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will look into it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this would be slightly tricky to use. The result of ChildLocalizer::getReplacement may or may not contain the parent (in this case the call), but we need to know where the parent is so we can replace it with a branch. We could add a new getChildrenReplacment method that returns a block with only the sets, perhaps.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pushed this change, PTAL.

}

void walk(Expression*& curr) {
PostWalker<Updater>::walk(curr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than these two lines, can just implement visitFunction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh no, because it's not the entire function that gets walked, just the block containing the inlined body.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks, that's what I was missing...

return;
}

Block* body = curr->cast<Block>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why must the function body be a block?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the block we create to hold the inlined contents in the parent. We always create a block in case there are any returns in the inlined contents that would need to be transformed into branches out of that block.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure you are passing the block? The inliner is crashing on this example.

(module
 (import "env" "h" (func $h (result i32)))
 (func $g (param $x i32) (result i32)
  (if (result i32) (local.get $x)
   (then
    (return_call $h)
   )
   (else
    (i32.const 17)
   )
  )
 )
 (func (export "f") (param $x i32) (result i32)
  (call $g (local.get $x))
 )
)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this comment was incorrect. I fixed it though, so the latest version of this branch should be good. I just tested this example locally and it worked fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, the fix is in #6451.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, good catch. I've moved it to the correct PR now.

// inlined body. The branch label will be set later when we create branch
// targets for the calls.
Block* childBlock = ChildLocalizer(curr, getFunction(), *module, options)
.getChildrenReplacement();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should avoid adding temp vars when there is no need - is there a test that shows that? (I seem to see locals created all the time but I probably just missed it.)

;; CHECK-NEXT: (block
;; CHECK-NEXT: (block $__return_call
;; CHECK-NEXT: (block
;; CHECK-NEXT: (br $__return_call)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lack of local.set here demonstrates that we successfully skipped using a local for this parameter. The test output still has a local because of the next inlining iteration.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks. Yeah, that local.set has two causes in these tests now made this hard to see...

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with fuzzing + landing at the same time as the other PR.

;; CHECK-NEXT: (block
;; CHECK-NEXT: (block $__return_call
;; CHECK-NEXT: (block
;; CHECK-NEXT: (br $__return_call)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks. Yeah, that local.set has two causes in these tests now made this hard to see...

tlively added 3 commits April 4, 2024 09:13
Previously we transformed return calls in inlined function bodies into normal
calls followed by branches out to the caller code. Similarly, when inlining a
`return_call` callsite, we simply added a `return` after the body inlined at the
callsite. These transformations would have been correct if the semantics of
return calls were to call and then return, but they are not correct for the
actual semantics of returning and then calling.

The previous implementation is observably incorrect for return calls inside try
blocks, where the previous implementation would run the inlined body within the
try block, but the proper semantics would be to run the inlined body outside the
try block.

Fix the problem by tranforming inlined return calls to branches followed by
calls rather than as calls followed by branches. For the case of inlined return
call callsites, insert branches out of the original body of the caller and
inline the body of the callee as a sibling of the original caller body. For the
other case of return calls appearing in inlined bodies, translate the return
calls to branches out to calls inserted as siblings of the original inlined
body.

In both cases, it would have been convenient to use multivalue block return to
send call parameters along the branches to the calls, but unfortunately in our
IR that would have required tuple-typed scratch locals to unpack the tuple of
operands at the call sites. It is simpler to just use locals to propagate the
operands in the first place.
tlively added 2 commits April 5, 2024 12:23
We previously interpreted return calls as calls followed by returns, but that is
not correct both because it grows the size of the execution stack and because it
runs the called functions in the wrong context, which can be observable in the
case of exception handling.

Update the interpreter to handle return calls correctly by adding a new
`RETURN_CALL_FLOW` that behaves like a return, but carries the arguments and
reference to the return-callee rather than normal return values.
`callFunctionInternal` is updated to intercept this flow and call return-called
functions in a loop until a function returns with some other kind of flow.

Pull in the upstream spec tests return_call.wast, return_call_indirect.wast, and
return_call_ref.wast with light editing so that we parse and validate them
successfully.

* Handle return calls in wasm-ctor-eval (#6464)

When an evaluated export ends in a return call, continue evaluating the
return-called function. This requires propagating the parameters, handling the
case that the return-called function might be an import, and fixing up local
indices in case the final function has different parameters than the original
function.

* Update effects.h to handle return calls correctly (#6470)

As far as their surrounding code is concerned return calls are no different from
normal returns. It's only from a caller's perspective that a function containing
a return call also has the effects of the return-callee. To model this more
precisely in EffectAnalyzer, stash the throw effect of return-callees on the
side and only merge it in at the end when analyzing the effects of a full
function body.
@tlively tlively enabled auto-merge (squash) April 8, 2024 17:10
@tlively tlively merged commit ad09739 into main Apr 8, 2024
@tlively tlively deleted the return-call-inlining branch April 8, 2024 17:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants