[RuntimeAsync] Clean up and document JIT parts#3063
[RuntimeAsync] Clean up and document JIT parts#3063jakobbotsch merged 11 commits intodotnet:feature/async2-experimentfrom
Conversation
This reverts commit 0267fdcd8022b5ebfef49ee64ba845052f9d4c38.
|
cc @VSadov |
| // Even if these are address exposed we expect them to be dead at | ||
| // suspension points. TODO: It would be good to somehow verify these | ||
| // aren't obviously live, if the JIT creates live ranges that span a | ||
| // suspension point then this makes it quite hard to diagnose that. |
There was a problem hiding this comment.
I assume this comment reflects current implementation and will change when "zero out ref/ref-like" is implemented.
There was a problem hiding this comment.
Correct (this comment isn't new, this function was just moved)
|
|
||
| // Now walk the IR for all the blocks that contain async2 calls. Keep track | ||
| // of liveness and outstanding LIR edges as we go; the LIR edges that cross | ||
| // async2 calls are additional live variables that must be spilled. |
There was a problem hiding this comment.
Can we end up spilling byrefs here?
There was a problem hiding this comment.
Yes, but with the existing scheme that was basically undefined behavior. With the new scheme we'll have to substitute zero for these edges.
There was a problem hiding this comment.
We should probably also have a small wording change in the spec that indicates that IL stack temporaries containing byrefs are also zeroed around async calls.
There was a problem hiding this comment.
Right. IL stack is the same as locals. Maybe that can be unified somehow in the spec.
There was a problem hiding this comment.
Here I was more concerned about JIT itself introducing byref values that cross awaits.
I do not have a good sense of how early async rewrite happens. If it happens very early we only need to be concerned with what is in IL, and after capture/restore is introduced, it will disallow optimizations that keeping byref temps to locals across awaits.
Although I still wonder what happens if byref temps do not point to locals - like point to a threadstatic.
There was a problem hiding this comment.
Importer transformations must run before this, since we rely on them, so perhaps need some care about introducing byref locals.
It has this for example, but not sure if can be a problem:
// Exact method descriptor as passed in
ctxTree = gtNewLclvNode(impInlineRoot()->info.compTypeCtxtArg, TYP_I_IMPL);There was a problem hiding this comment.
Here I was more concerned about JIT itself introducing byref values that cross awaits.
CSE is one source of these (CSE is also utilized by loop hoisting). That's why CSE has special logic that makes byref CSEs unavailable across suspension points.
Loop opts introduced in .NET 9 probably also needs some handling now that I think about it.
Object stack allocation is also disabled for async2 because it introduces stack allocations and takes the address of it. With some work we could do object stack allocation in cases where all definitions/uses of the stack allocated object are visible.
I wouldn't be surprised if there are other cases, but I would be somewhat surprised if these cases are tree temps/LIR edges and not locals. In any case, if we do this through the LIR edge path we will hit this assert when we try to lay out the continuation:
runtimelab/src/coreclr/jit/async.cpp
Line 405 in 94f8e5a
I do not have a good sense of how early async rewrite happens. If it happens very early we only need to be concerned with what is in IL, and after capture/restore is introduced, it will disallow optimizations that keeping byref temps to locals across awaits.
The async rewrite happens very late. If this happened early, it would inhibit most optimizations to a significant degree because it introduces spurious data flow due to resumption. Currently most optimizations get to optimize the function as if there is no suspension/resumption and the fact that it is an async method is not even visible to them.
It means we need to teach the JIT in various places that it cannot introduce byrefs that live across suspension points, and it cannot take the address of a local across a suspension point.
Although I still wonder what happens if byref temps do not point to locals - like point to a threadstatic.
We will hit the assert above.
It has this for example, but not sure if can be a problem:
// Exact method descriptor as passed in ctxTree = gtNewLclvNode(impInlineRoot()->info.compTypeCtxtArg, TYP_I_IMPL);
TYP_I_IMPL is a native pointer, so this shouldn't be a problem.
There was a problem hiding this comment.
The async rewrite happens very late. If this happened early, it would inhibit most optimizations to a significant degree because it introduces spurious data flow due to resumption.
Right, even though await formally destroys byrefs/reflikes, it has no effect on byvals, which are the majority.
It makes sense to do the transform late so that value-based optimizations continue working, but need somehow prevent anything that would rely on byrefs being stable across await.
| } | ||
|
|
||
| if (gcRefsCount > 0) | ||
| return resumeBB; |
There was a problem hiding this comment.
Perhaps worth assigning null to data array fields after deserializing them. The dispatcher will keep the current continuation alive until we return (since it needs .Next). That may take a while, but we no longer need the arrays, so no need to keep them alive.
There was a problem hiding this comment.
Since this is mostly refactoring, it would be ok to just add a comment or put this onto the "suggestions" pile.
There was a problem hiding this comment.
I think we can make it so that DispatchContinuations doesn't keep the strong reference to continuation beyond the resumption call.
There was a problem hiding this comment.
VSadov
left a comment
There was a problem hiding this comment.
LGTM. Some parts are much more clear now!
d060bbb
into
dotnet:feature/async2-experiment
CLASSID_SYSTEM_BYTEthat was added early as part of async2 but no longer usedNo functional changes.