Skip to content

Move InterpMethod* to the code header #114214

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

janvorli
Copy link
Member

@janvorli janvorli commented Apr 3, 2025

The InterpMethod pointer is currently stored at the beginning of the IR bytecode of the interpreter. This change moves it to the code header.

I have also noticed that we don't need the CORINFO_FLG_INTERPRETER anymore, as there is a simple way without it after the recent code managed refactoring, so I've made that simple change in this PR too.

Most of the changes that add the setInterpMethod were generated by the ThunkGenerator tool.

The InterpMethod pointer is currently stored at the beginning of the IR bytecode
of the interpreter. This change moves it to the code header.
@janvorli janvorli added this to the 10.0.0 milestone Apr 3, 2025
@janvorli janvorli requested review from kg, cshung, BrzVlad and jkotas April 3, 2025 15:03
@janvorli janvorli self-assigned this Apr 3, 2025
@Copilot Copilot AI review requested due to automatic review settings April 3, 2025 15:03
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the handling of the interpreter’s method pointer by moving it from the IR bytecode to the code header and removing the now‐unused CORINFO_FLG_INTERPRETER flag. Key changes include:

  • Moving the storage of the InterpMethod pointer to the code header via new accessor methods.
  • Removing CORINFO_FLG_INTERPRETER checks and updating related conditional logic.
  • Propagating the new setInterpMethod API across multiple components such as JIT interfaces, SuperPMI tools, and AOT support.

Reviewed Changes

Copilot reviewed 18 out of 19 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/coreclr/vm/jitinterface.h/cpp Added and implemented setInterpMethod to update the InterpMethod pointer.
src/coreclr/vm/interpexec.cpp Updated interpreter execution to retrieve the InterpMethod via the new code header.
src/coreclr/vm/codeman.h Moved the InterpMethod pointer into the code header with new setter/getter methods.
src/coreclr/tools/* Updated SuperPMI, AOT, and JIT interface code to incorporate the new setInterpMethod method.
src/coreclr/inc/{corjit.h, corinfo.h, icorjitinfoimpl_generated.h} Updated interfaces and removed the unused interpreter flag.
src/coreclr/interpreter/eeinterp.cpp Modified code generation logic to leverage setInterpMethod instead of embedding the pointer.
Files not reviewed (1)
  • src/coreclr/tools/Common/JitInterface/ThunkGenerator/ThunkInput.txt: Language not supported
Comments suppressed due to low confidence (2)

src/coreclr/vm/interpexec.cpp:43

  • Ensure that the removal of the offset adjustment for the instruction pointer is intentional and that the overall interpreter execution correctly accounts for the new code header layout.
ip = pFrame->startIp;

src/coreclr/tools/Common/JitInterface/CorInfoImpl_generated.cs:2515

  • Consider adding a clarifying comment to document that setInterpMethod is intentionally not implemented because it is never called by the JIT, to assist future maintainers.
throw new NotImplementedException("setInterpMethod");

@@ -1778,6 +1778,10 @@ void MyICJI::reportFatalError(CorJitResult result)
jitInstance->mc->cr->recReportFatalError(result);
}

void MyICJI::setInterpMethod(void *pMethod)
Copy link
Preview

Copilot AI Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If setInterpMethod is not expected to perform any operation in this context, please add a comment inside the method body to clarify that its empty implementation is intentional.

Copilot uses AI. Check for mistakes.

@jkotas
Copy link
Member

jkotas commented Apr 3, 2025

I am wondering whether it is the best data layout.

InterpMethod is logically an interpreter method prolog. You cannot start interpreting the method without it since it has the size of locals.

I think it would make the most sense to store InterpMethod inline as the first thing in the code (ie it would be the method entrypoint) and the byte code would immediately follow.

CodeHeader is meant to be used for rarely use data, like GCInfo/UnwindInfo/EHInfo/DebugInfo. InterpMethod is not rarely used, so it should not be attached via CodeHeader.

InterpMethod should not need to contain MethodDesc* field. It is another piece of rarely used data that it is stored in code header already.

@janvorli
Copy link
Member Author

janvorli commented Apr 3, 2025

@jkotas my view was that the code header contains metadata for the code, thus it seemed reasonable to have the InterpMethod pointer in it. It cleaned up need to adjust offsets to the actual IR when reporting them to debugger / eh. But I can see that the adjustments can be done when we generate the code to pretend the InterpMethod pointer is really part of the code and then we would not have to do them at other places.

As for storing it inline as you suggest, that was something I've suggested to @BrzVlad initially. I liked the fact that we would not need to do yet another allocation to get memory for it. But Vlad preferred to have just the pointer there. @BrzVlad can you please share why you preferred that way? I don't remember the details.

@jkotas
Copy link
Member

jkotas commented Apr 3, 2025

my view was that the code header contains metadata for the code

The code header structure is intentionally designed to separate hot and rarely used data. We even experimented at one point with moving the code header completely out-of-line (at the cost of slower access to code header - similar to how AOT images are structured) to achieve better code density.

@janvorli
Copy link
Member Author

janvorli commented Apr 3, 2025

Other option would be to let the InterpreterCodeHeader contain the InterpMethod* / possibly the whole InterpMethod besides the pointer to the real code header. That would preserve the locality of the InterpMethod with the code and yet logically separate it from the byte code stream.

@jkotas
Copy link
Member

jkotas commented Apr 3, 2025

I think about the InterpMethod as a logical part of the byte stream. It is the first thing you need to interpret when executing a method, it is like a special PROLOG instruction.

@BrzVlad
Copy link
Member

BrzVlad commented Apr 3, 2025

On mono, the InterpMethod is a really big struct, with a lot of uncommon fields that most methods never need. While we will probably keep it cleaner for the CoreCLR port, it is still expected to expand quite heavily in size. I don't think it is a good idea to just have so much unused data inlined in the code stream. Given the format of the InterpMethod will keep changing as the implementation goes on, I decided to just store the pointer for now. Long term, my view was that we will have a few fields inlined (like allocaSize and pDataItems) and the rest would be stored in a now optional InterpMethod structure in the code header.

Personally, I don't really view the InterpMethod fields as a prologue that is interpreted, rather metadata for the code. To me it seems more intuitive for code to actually start at offset 0. However, I think it is a good idea to have allocaSize and pDataItems inlined, while the rest of InterpMethod pointer to be stored in the header or somewhere. At the moment there is nothing relevant stored in the InterpMethod (I agree that methodHnd is useless), but this should soon change, at least with the EH support. If we follow this approach, I lean towards not having to adjust the IP in the debugger, EH and other places, so we would consider this data as part of prolog as @jkotas suggested. So this means that the ip of an interpreted method will never be at offset 0 for example, but I don't see why that would matter.

@jkotas
Copy link
Member

jkotas commented Apr 3, 2025

At the moment there is nothing relevant stored in the InterpMethod (I agree that methodHnd is useless), but this should soon change, at least with the EH support.

I expect that the EH metadata are going to be stored in the InterpreterRealCodeHeader. Is that right?

@BrzVlad
Copy link
Member

BrzVlad commented Apr 3, 2025

I was not talking about the EH metadata coming from IL code. Mainly, I was referring to an array mapping each clause index to an offset on the interpreter stack. EH would use this information to store certain data on the stack of the method that needs to invoke a clause and also several interpreter EH related opcodes would make use of this.

@jkotas
Copy link
Member

jkotas commented Apr 3, 2025

Right, it is the kind of metadata that RealCodeHeader contains today (for the JITed code at least).

@janvorli
Copy link
Member Author

janvorli commented Apr 3, 2025

Ok, let me close this PR now and wait until the InterpMethod becomes more or less final and then decide what approach we will take.

@janvorli janvorli closed this Apr 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants