Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial interpreter wire-in #112202

Merged

Conversation

janvorli
Copy link
Member

@janvorli janvorli commented Feb 5, 2025

This change adds initial support for a new interpreter to coreclr. The interpreter will have a compilation phase that generates an IR byte code and will behave as a JIT. This change allows the runtime to understand that the generated code is an interpreter IR and when runtime attempts to execute such a method, it calls into the ExecuteInterpretedMethod. This will call into the interpreter execution code once the actual interpreter is merged in.
The change uses the StubPrecode and FixupPrecode as a mean to call the InterpreterStub with a special register loaded with a pointer to the IR byte code of the methods and TransitionFrame to get the arguments from. So instead of MethodDesc, the stub data holds the IR address and instead of the usual generated code target, it holds the InterpreterStub. There is a small twist for FixupPrecode. This precode has two parts ensuring that we load the MethodDesc into the special register only at the first call when we need to know what method we are going to compile. In follow up calls, it invokes the target directly without loading that register. For interpreter, it is always kept in the state of going through the register loading path so that the interpreter execution knows what to run.
The plan is to use AltJit mechanism to load the interpreter during its development. That allows us to interpret only a subset of the methods and fall back to JIT for the ones we cannot interpret yet.

This change adds initial support for a new interpreter to coreclr. The
interpreter will have a compilation phase that generates an IR byte code
and will behave as a JIT. This change allows the runtime to understand
that the generated code is an interpreter IR and when runtime attempts
to execute such a method, it calls into the ExecuteInterpretedMethod.
This will call into the interpreter execution code once the actual
interpreter is merged in.
The change uses the StubPrecode and FixupPrecode as a mean to call the
InterpreterStub with a special register loaded with a pointer to the
IR byte code of the methods and TransitionFrame to get the arguments
from. So instead of MethodDesc, the stub data holds the IR address and
instead of the usual generated code target, it holds the
InterpreterStub. There is a small twist for FixupPrecode. This precode
has two parts ensuring that we load the MethodDesc into the special
register only at the first call when we need to know what method we are
going to compile. In follow up calls, it invokes the target directly
without loading that register. For interpreter, it is always kept in the
state of going through the register loading path so that the interpreter
execution knows what to run.
The plan is to use AltJit mechanism to load the interpreter during its
development. That allows us to interpret only a subset of the methods
and fall back to JIT for the ones we cannot interpret yet.
@janvorli janvorli added this to the 10.0.0 milestone Feb 5, 2025
@janvorli janvorli requested a review from jkotas February 5, 2025 19:41
@janvorli janvorli self-assigned this Feb 5, 2025
@janvorli
Copy link
Member Author

janvorli commented Feb 5, 2025

cc: @BrzVlad

Copy link
Contributor

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

Found by PR feedback, instead of InterpretedCodeAddressFlag, it was
using 1.
@jkotas
Copy link
Member

jkotas commented Feb 5, 2025

For interpreter, it is always kept in the state of going through the register loading path so that the interpreter execution knows what to run.

You have to atomically switch both the code pointer and the data when going into the steady state. How are you going to do it atomically so that there are no race conditions?

I think we may need to have a new type of precode or precode-like structure for interpreted code. When you are going into steady state, you allocate a new one, initialize both the data and code pointers in it, and then atomically replace the entrypoint code pointer to point to it.

src/coreclr/clrfeatures.cmake Show resolved Hide resolved
return methodDesc;
}

PTR_CodeHeader pCodeHeader = PTR_CodeHeader(EEJitManager::GetCodeHeaderFromStartAddress(methodDesc & ~InterpretedCodeAddressFlag));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CodeHeader is specific to JITed code. We have different CodeHeader-like structure for R2R code.

Interpreted code is fundamentally different type of code. Should we have a new CodeHeader-like structure for it instead of reusing the JITed code CodeHeader (perhaps with associated code manager)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was originally thinking I would go that way, but when I was considering various things about stack walking etc, it seems it would just unnecessarily complicate matters. It seems that the JITted code header / JITed code manager fits well the the interpreted code too. The plan is to use the same format of EH info and so the only thing that is not needed for the interpreter is the unwind info which can be always NULL. After all, the interpreter IR can be also seen as "jitted" code, just the format is not CPU executable.
So I'd like to keep it this way for now. If we find some downsides that I cannot foresee now, it should be straightforward to switch to what you've said.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stack walking

Stackwalking is 100% behind code manager abstraction today. I assume that we are going to have a new code manager for the interpreter. Is that correct? I do not think we want to have interpreter support to be intertwined with regular code manager.

There are other things that the JITed code manager takes care of that we should not need for the interpreter. For example, the nibble map should not be needed for the interpreter code since we should be always able to find the method that we are interpreting from the interpreter state.

I think we should be setting up the interpreter as a code kind that it independent on the existing JIT and R2R code kinds so that it can evolve and diverge as much as it needs to independently on the rest of the system.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a separate code manager in my tentative plans in the docs from the experiment, but when I was thinking about detail, it seemed that it would be 90% the same as the existing JIT one (and those would mostly be just things we don't need rather than things to add / change) and I have convinced myself that it is better to stick with using it for the .NET 10 timeframe instead of cloning / sharing most of its code.

Also, since we are using the AltJit mechanism for the bringup, plugging in a different code manager into the related code paths would add quite a lot of churn.
And at many places we would need to add switching between the two code managers or somehow abstract them and when I looked at it, it was far from straightforward.
For example, the interpreter compilation phase uses the same code as JIT in the jitinterface.cpp. That code operates explicitly with EEJitManager.

If we decide to create a separate manager though, I'd prefer doing that at a stage when we have the stack walking and EH for the interpreted code done, which should be in April. Then it would be much clearer which functionality of the EEJitManager we don't need and which we needed to update.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidwrighton Do you have an opinion on whether we want to hack support for interpreter code into the existing EEJitManager or whether we want to create a new IJitManager implementation for interpreted code?

My opinion is that we should have a new IJitManager implementation for interpreted code. It does not need to be added in this PR, but we should be setting this up with assumption that it will exist in near feature. For example, introduce new struct InterpretedCode that the interpreted code start with instead of reusing JITed code CodeHeader (it does not have to be at negative offset indirection like it is the case for JITed code CodeHeader - the interpreted code blob can start with it).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My current view is that I actually expect very rare changes to the EEJitManager if any at all. It seems it provides the functionality we need as is. Obviously it manages more stuff than we need, like the unwindinfo, debug info or the nibble map. And the heaps that it manages are executable while the interpreter would not need that.
I am not trying to push for it to stay the manager for the interpreter code forever, but it simplifies the current work on all the other changes, having confidence that the manager is a stable thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janvorli @jkotas I agree with @janvorli that we shouldn't need all THAT many changes to the EEJitManager to make this all work, but I'm very suspicious about intertwining the changes we DO have with whatever we've got here. In particular things like sharing the same code heap for the interpreter IR and the jitted code seems like a spectacularly poor idea, and I suspect we may actually want to have a form of unwind data for the interpreter ir at some point that is managed by the code manager

I see this as a somewhat separate work-stream from the bring up the interpreter effort though, which implies we can get someone other than @janvorli or @BrzVlad to do it. My first thought is to start with the idea that we will have 2 separate instances of the EEJitManager object, and then detangle from there. Probably create a base class EEDynamicCodeManager that we can use to refer to both of them, and then slowly peel the two managers into separate chunks. (While we're at it I'd like to re-organize the codeman.cpp file, as it is currently very poorly organized. I'd like to see it split into codeman.cpp for the stuff which is common to all jit managers, dynamiccodeman.cpp, which is used for the stuff the JIT and Interpreter share, and then jitcodeman.cpp/interpretercodeman.cpp for the jit/interpreter specific bits. This is all a bit pie-in-the-sky but that's my ideal world.)

@janvorli
Copy link
Member Author

janvorli commented Feb 6, 2025

You have to atomically switch both the code pointer and the data when going into the steady state. How are you going to do it atomically so that there are no race conditions?

I think we may need to have a new type of precode or precode-like structure for interpreted code. When you are going into steady state, you allocate a new one, initialize both the data and code pointers in it, and then atomically replace the entrypoint code pointer to point to it.

Hmm, I originally kept using the MethodDesc in the precode "MethodDesc" slot, but moved to having the byte code pointer there later when I have realized that I have no way to retrieve the bytecode for MethodDesc, so the interpreter would need to keep some kind of lookup table. I haven't realized the change introduced this problem.
I think that it may be solved without the atomic switch though by:

  • ensuring that the bytecode pointer is written into the precode MethodDesc slot first and the InterpreterStub into the target slot second (ensuring proper ordering of the writes)
  • adding a check in the ThePreStub to see whether the MethodTable has the InterpretedCodeAddressFlag set and redirecting it to the ExecuteInterpretedMethod in such case.

But this is just a quick idea.

Overall, it seemed it would be nice to not to have to create another precode when there already is the "temporary" one. But if that doesn't sound like a problematic overhead, then I can always create a new StubPrecode for it. And if we wanted to differentiate that StubPrecode from the regular one for some reason, then we can do it via the "Type" field we already have in it - we use it
as the NDirectImportPrecode.

@jkotas
Copy link
Member

jkotas commented Feb 6, 2025

ensuring that the bytecode pointer is written into the precode MethodDesc slot first and the InterpreterStub into the target slot second (ensuring proper ordering of the writes)

Having two writes and two reads in specific order in the code does not mean that they are going to be executed in that order by the hardware on weaker memory architectures like Arm64. I think we would have to account for any actual orderings of reads/writes, or have somebody with PhD in memory models create carefully crafted lock-free code to guarantee the ordering. In future, I expect that we will want to switch to optimized precompiled transition thunks for the interpreter code where possible. Dealing with this race condition in the transition thunks would not be pretty.

if that doesn't sound like a problematic overhead, then I can always create a new StubPrecode for it

I do not see a problem with creating StubPrecode after compiling an interpreted method. I think it would be the best solution.

@janvorli
Copy link
Member Author

janvorli commented Feb 6, 2025

Having two writes and two reads in specific order in the code does not mean that they are going to be executed in that order by the hardware on weaker memory architectures like Arm64. I think we would have to account for any actual orderings of reads/writes

That's what I've actually meant by saying "ensuring proper ordering of writes". But anyways, creating a new precode for the interpreter code will be nicer than having to solve races.

@janvorli
Copy link
Member Author

janvorli commented Feb 6, 2025

@jkotas I have added a commit modifying the change to use a new precode for the interpreter in the way you have suggested

@janvorli janvorli force-pushed the initial-new-interpreter-wire-in-final branch from 66dfe94 to 084c862 Compare February 6, 2025 14:19
@janvorli janvorli force-pushed the initial-new-interpreter-wire-in-final branch from 084c862 to bd9f36b Compare February 6, 2025 14:21
Also enable the interpreter in debug / checked builds only by default
@steveisok steveisok requested review from kg and lateralusX February 6, 2025 20:58
Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once the osx-x64 build break is fixed.

We should plan on doing the refactoring of the code manager like @davidwrighton suggested before it starts getting out of control.

@janvorli janvorli merged commit cee8434 into dotnet:main Feb 7, 2025
94 checks passed
@janvorli janvorli deleted the initial-new-interpreter-wire-in-final branch February 7, 2025 08:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants