Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Initial draft of the composite R2R format #25344

Closed
wants to merge 12 commits into from

Conversation

trylek
Copy link
Member

@trylek trylek commented Jun 24, 2019

I have put together some initial thoughts towards the composite R2R format. As the 3.1 format spec hasn't yet been merged in, this change is a bit confusing as it includes the older commits; I assume I'll rebase it once I manage to merge in the 3.1 change.

Thanks for patience

Tomas

Copy link

@mjsabby mjsabby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the write up.

Somewhat related to the format or implementation details, but a point to think about -- would we put the shared framework under such a composite image. If not, why not? The last I looked, assemblies in the shared framework (TPAList?) always take precedence than the application folder ones, if that's true then it aligns with your goal of always looking into the composite file first.


The perf improvements for composite R2R files can stem from the following facts:

* The composite R2R file forms a tightly-bound version bubble where an individual assembly cannot be accidentally replaced on disk so we don't need to emit any type layout checks.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your statement seems to indicate that this is better than version bubble across multiple PE files. Is the sole reason here because we feel that screwing up is now harder because it would be a bug in the tool that generates the single PE file versus in the multiple file case the developer could mess up the files on disk? To me they both seem the same, i.e. if we can remove it here we could remove it on the multiple file case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it depends on what you refer to by "messing up". For the sake of simplicity let's assume the compiler is flawless so it doesn't "mess up" just by itself. When it generates a composite file, it's a correct runnable composite file, period. It can only become invalid when tampered with by binary patching. I believe most software publishers and deployment tools address this by means of various security hashes, checksums, envelopes or digital signatures.

In the "multi-file composite R2R file" that we more or less have today - a number of assemblies compiled into the same number of R2R output files just relaxing inlining and other limits, directly embedding field offsets and the like - is much more vulnerable to replacing a given component file with a wrong file, e.g. a slightly mismatched version. The combined absence of R2R resiliency checks and any file-level consistency checks creates a hole that can cause extremely tricky bugs (e.g. slightly shifted field offsets are a big fun to debug, I know what I'm talking about).

I believe the reason why Michal is working on the PR below is the general opinion that accidental replacement of a file with a mismatched version, be it due to a bug in some deployment script, due to some weird non-deterministic condition like an antivirus holding a certain image file locked during reinstallation or some weird condition while unpacking a "single-file self-extracting package" into a temporary location, is a much higher probability and is less catered for by means ensuring consistency of individual files as mentioned above. I may be wrong and I should probably emphasize that I'm not trying to represent or interpret any official company position on this subject.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@trylek This paragraph starts with "The perf improvements for composite R2R files can stem from the following facts", but the points that follow are more about deployment robustness - not about performance. I think it is really what @mjsabby is getting at.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon rereading I agree the section (which now resides in the design file) is somewhat misleading in multiple aspects - the type layout checks are omitted even in the multi-file large version bubble case, the MVID checks probably have minimum perf effect and more or less the only meaningful argument is the generics but even there the actual perf benefit is hard to demonstrate. I guess I should reword this to more focus on actual code optimizations like embedding field offsets or future support for direct calls.

The perf improvements for composite R2R files can stem from the following facts:

* The composite R2R file forms a tightly-bound version bubble where an individual assembly cannot be accidentally replaced on disk so we don't need to emit any type layout checks.
* Similarly, the runtime doesn't need to check reference assembly MVID's within the tightly-bound version bubble represented by the composite file.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is this done? At assembly load time? Would you really want to elide these checks? Are they expensive? Again curios how a single file benefits this case. I'm exceedingly happy that it may, but would like the document to explain more the reason why.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a new change that @MichalStrehovsky is code reviewing right now in a separate PR #25281. Yes, these should be assembly load-time GUID consistency checks. I'm not saying we mustn't be performing them for combined R2R, I just mentioned I don't see them as necessary because we usually don't take software precautions against tampering with a file - that's what we use other deployment options like security envelopes for - but we normally take software precautions against locating a wrong file. For the check itself, it's probably not super-expensive, but with your 1500 assemblies in a composite file, who knows... :-)

* For composite files, we shouldn't be checking MVID's of reference assemblies within the tight version bubble represented by the single file.
* We need to improve generic instantiation lookup algorithm to be able to locate all instantiations emitted into the composite file.
* CoreCLR runtime will need to parse the manifest metadata (and possibly the `READYTORUN_SECTION_SIMPLE_ASSEMBLY_NAME_LOOKUP` section) and set up lookup structures to facilitate fast assembly lookup for available types and method entrypoints.
* The runtime will need to start consulting the `READYTORUN_SECTION_ASSEMBLIES` table to locate MSIL metadata within the composite executable - this logic should definitely take precedence to arbitrary files on the disk to satisfy consistency guarantees.

# PE Headers and CLI Headers
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it is implicit but all PE sections will be merged. One consequence for debuggers and profilers will be searching through the Debug Directory. All such tools I'm familiar with read through the first two entries and rely on the order. I believe both PerfView and WPA expect max 2 entries (one for MSIL, one for precompiled code) so no action required other than maybe a mental note that if we decide to get fancy with how we merge pe sections.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the precompiled code, I believe that should get merged automatically by the compiler. For the MSIL it's somewhat more tricky - I think it should be no problem to put all MSIL metadata in the same section - in fact it's probably the logical thing to do - but it's not immediately obvious whether it will automatically fix the entire problem - e.g. if PerfView expects the entire MSIL section to be a single MSIL metadata block, it still wouldn't work.


* In single-input R2R PE files the generic instantiations generated for the input MSIL assembly are placed in `READYTORUN_SECTION_INSTANCE_METHOD_ENTRYPOINTS`. In composite R2R files, this section represents all instance entrypoints emitted within the composite build (i.e. generic instantiations needed by any of the input assemblies). With the proposed change to make the "input MSIL" represent the manifest metadata, we should be able to easily refer to all required reference assemblies in the signature encodings even without the manifest module override (**note:** this is a minute technical detail).

### Required CoreCLR runtime changes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This does not belong to the file format spec.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in the introduction, this first version was a bit of a mess due to combining two kinds of docs. After writing up some of that things became somewhat clearer to me. For now I have added a secondary doc readytorun-composite-format-design I seeded with the "design notes" parts I removed from the format md.


**TODO:** I suspect that `READYTORUN_SECTION_DEBUG_INFO`, `READYTORUN_SECTION_METHODCALL_THUNKS` and / or `READYTORUN_SECTION_INLINING_INFO` may also require changes specific to the composite R2R file format.

## READYTORUN_SECTION_SIMPLE_ASSEMBLY_NAME_LOOKUP (v3.2+)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a performance optimization. We should include it only once we have the data that it is useful and improves performance significantly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I have removed this section for now, after all it was very short. Will update in 7th commit.

struct READYTORUN_SECTION_ASSEMBLIES_ENTRY
{
DWORD CorHeaderRva; // RVA of the input MSIL metadata COR header
DWORD AvailableTypesRva; // RVA of the available types table
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type used for RVA + SIZE is IMAGE_DATA_DIRECTORY

It is likely that the number of per-assembly entries will grow. It may be nice to use the same ID+ IMAGE_DATA_DIRECTORY like what is used in the regular R2R header today.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, updated in 7th commit.

trylek added 6 commits June 24, 2019 20:25
I have updated the various enumerations and tables. I didn't try
to go overboard w.r.t. understanding tables I'm not yet familiar
with, most notably the diagnostic, inlining and profiling tables.

Thanks

Tomas
I have rebased the composite R2R format changes against the
merged in 3.1 format doc and I fixed two small formatting bugs
I spotted in the rebase.

Thanks

Tomas
@trylek trylek force-pushed the CompositeR2RFormat branch from 29ef1ef to 0e5bf59 Compare June 24, 2019 18:35
@trylek
Copy link
Member Author

trylek commented Jun 24, 2019

@mjsabby - to answer your "global" question regarding framework in the R2R file, right now we only support one flavor of large version bubble where everything (i.e. including framework) is the same version bubble. As a natural consequence the initial implementation of composite R2R will probably also unconditionally include framework. Immediately everything will have tens to hundreds of MB which is probably OK for you but for the general audience (and probably even for you) this will merit some longer-term cleanup.

I believe that ultimately we should start supporting multiple large version bubbles and / or arbitrary combinations of large and small version bubbles so that we could e.g. have a "shared framework" as one version bubble (does it ring a bell w.r.t. the .NET Native shared library?), perhaps some "extended framework" on top of that and then the app itself, arbitrarily either combined or split whichever suits better a particular customer's needs w.r.t. performance, deployment, componentization and other applicable metrics.

I can imagine that even you might be able to improve perf by some careful code reuse (either w.r.t. framework or your custom code) at least due to reducing the working set and maybe module loads. One thing that's not yet clear to me is how to efficiently locate references in the composite R2R case - i.o.w. how to quickly figure out that some hypothetical "SharedFramework.r2r.dll" is the place to look for the System.Runtime assembly. That's something I'll need to look into with the app model team, @swaroop-sridhar may also have some ideas based on his single-file work.

Thanks for your detailed feedback!

Tomas

@jkotas
Copy link
Member

jkotas commented Jun 24, 2019

The last I looked, assemblies in the shared framework (TPAList?) always take precedence than the application folder ones,

FWIW, that is not correct statement today.

But yes, having the common framework to be one composite image is something definitely on our radar.

@@ -0,0 +1,94 @@
ReadyToRun File Format - Composite Format Design Notes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing that out, moved in 8th commit.

};
```

**TODO:** I suspect that `READYTORUN_SECTION_DEBUG_INFO`, `READYTORUN_SECTION_METHODCALL_THUNKS` and / or `READYTORUN_SECTION_INLINING_INFO` may also require changes specific to the composite R2R file format.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

READYTORUN_SECTION_DEBUG_INFO is functionally okay to just merge all the R2R code debug info into a single table mapping back to IL offsets / locals since it maps from native offset to offset within IL bodies, variable numbers. The debug engine will need to do the work of making sense of the merged assembly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed READYTORUN_SECTION_DEBUG_INFO from the 'suspect list' per your suggestion in 8th commit.


```C++
struct READYTORUN_SECTION_ASSEMBLIES_ENTRY
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant to make this look like this:

    DWORD                   NumberOfSections;

    // Array of sections follows. The array entries are sorted by Type
    // READYTORUN_SECTION   Sections[];

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I apologize, I'm still not fully getting what you're proposing.

a) We can make the sections a jagged array so that each component assembly can potentially have a different NumberOfSections. That seems to correspond to what you're suggesting - but then the table will either require an additional indirection level for the section array, or the individual per-assembly entries would have different lengths and would require linear scanning that is generally perf-concerning.

b) We can make the sections a two-dimensional array but that would correspond to a global value for NumberOfSections (that should, in fact, correspond to a global set of section type ID's). Say, if we put NumberOfSections = 3 (corresponding to method entrypoints, available types and a new previously nonexistent section for the MSIL metadata), READYTORUN_SECTION elements 0-2 would correspond to AssemblyRef #1, 3-5 to AssemblyRef #2, 6-8 to AssemblyRef #3 and so on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just occurred to me that yet another way of approaching this would be dual encoding - in the presence of the COMPOSITE flag, the appropriate sections like available types would automatically switch over to a different mode where there's the additional indirection to cater for multiple per component-assembly entries. The advantage of this would be not having the appropriate tables unused in composite mode and not having to introduce a new ASSEMBLIES table, the downside is naturally the dual encoding.

Copy link
Member

@jkotas jkotas Jun 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make the sections a jagged array so that each component assembly can potentially have a different NumberOfSections

This is what I meant.

the individual per-assembly entries would have different lengths and would require linear scanning that is generally perf-concerning.

This is the same structure as what is used in R2R header today. The number of entries is small and this is a top level table that won't be scanned that often.

The advantage of it is that it is easy to version (add and remove from). I think it is pretty safe bet that there are going to be more than the 3 things that are here now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, but if we put this in the READYTORUN_SECTION_ASSEMBLIES table, it will have the number of entries equal to the number of component assemblies in the composite R2R image i.e. 1500 in Mukul's case. I concur with your overall strategy that we should leave perf optimizations for the future, when we have solid framework for measuring the actual benefits of particular changes, but isn't linear scanning of 1500 assemblies too much of an initial perf risk w.r.t. things like locating a particular MSIL metadata block for a given module override?

But perhaps you're right that the runtime is likely to mitigate some of these by means of various caches and the like and we should initially strive to make the implementation simple so that it's primarily easy to reason about semantically. Perf is a bonus we can always add later assuming we keep it in mind so that some of our design decisions don't block it.

Copy link
Member

@jkotas jkotas Jun 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect that runtime will need to hash the 1500 entries during startup anyway to understand what is loaded. Hashing 1500 entries is no big deal. It will take like 10 miliseconds. Note that we will be replacing opening of 1500 files in the current state with hashing 1500 entries.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. One other interesting angle is @MichalStrehovsky's change where he's independently adding a precursor of what we're just discussing - a per component-assembly table.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, upon second thinking it might be best to leave that alone, after all that's mostly required for external files, not for the composite file, as discussed in other places on this PR. Originally, when I had a straight struct, I was imagining adding Michal's MVID as one of the fields, but that makes sense no longer.

@@ -4,10 +4,11 @@ ReadyToRun File Format
Revisions:
* 1.1 - [Jan Kotas](https://github.com/jkotas) - 2015
* 3.1 - [Tomas Rylek](https://github.com/trylek) - 2019
* 3.2 - [Tomas Rylek](https://github.com/trylek) - 2019 - *I'm not clear on whether we need to bump this up to 4.0*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe we'll need to bump to 4.0 unless a 3.0 format file is no longer loadable with our R2R loader.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I was rather concerned by the fact that, as the 3.1 loader doesn't know about the new proposed COMPOSITE flag, it wouldn't recognize the composite file for what it is and load it incorrectly. I think this has mostly become moot with @jkotas` suggestion to drop the global COR header in the composite case as in such case a legacy loader has no way of trying to make sense of the new file.

@@ -119,6 +125,7 @@ enum ReadyToRunSectionType
READYTORUN_SECTION_PROFILEDATA_INFO = 111, // Added in V2.2
READYTORUN_SECTION_MANIFEST_METADATA = 112, // Added in V2.3
READYTORUN_SECTION_ATTRIBUTEPRESENCE = 113, // Added in V3.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ATTRIBUTEPRESENCE section is questionable in composite mode as implemented now, but it could be tweaked to be viable by adjusting the hashing function to include which assembly was a token associated with.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what we're also discussing here with @jkotas right now - basically some of the R2R sections need to be per-assembly in the composite case. One special case there is the input MSIL metadata that previously used to be the global singleton addressed by the COM descriptor directory; newly this also needs to be per component assembly. It would be ideal to end up with a generally usable and sufficiently performant design covering all the necessary tables.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I marked the available types, method entrypoints and the MSIL metadata as eligible; and I added a TODO for some I was unsure about; ATTRIBUTEPRESENCE is a big todo right now by itself, thanks for pointing it out. According to our current discussion with JanK we should probably add a jagged array of per component assembly sections to the composite file. As a potential alternative approach I suggested dual encoding - switching over the encoding for a known subset of the R2R sections to adopt the per-assembly behavior, most likely by means of an additional indirection.

trylek added 3 commits June 25, 2019 00:34
In view of JanK's comments about 'pleasant reading in plaintext mode'
I added manual linebreaks for some annoyingly long lines. I added
a few minor clarifications and fixes (e.g. there was a copy-paste bug
in the description of instance entrypoints that incorrectly referred
to available types). As I mentioned on a related e-mail thread,
I would love to figure out how to merge this in as a first cut in this
direction we can further iterate on, having a pending PR for several
months plain sucks.

Thanks

Tomas
@trylek
Copy link
Member Author

trylek commented Oct 1, 2019

I have reread my proposed change and I fixed several smaller inconsistencies I noticed as my doc has again entered discussion recently. I would appreciate if we could define the needed changes before this initial proposal is merged in so that others like Swaroop can build on it and describe the design for the expected runtime behavior, assembly loader logic, generic lookup etc.

@trylek trylek requested a review from swaroop-sridhar October 1, 2019 00:44

**Disclaimer:** The manifest metadata is a new feature that hasn't shipped yet; it involves straightforward adaptation of a fragile nGen technology to ReadyToRun images as an expedite means for enabling new functionality (larger version bubble support). The precise details of this encoding are still work in progress and likely to further evolve.
**Disclaimer:** The manifest metadata is a new feature that hasn't shipped yet; it involves
straightforward adaptation of a fragile nGen technology to ReadyToRun images as an expedite
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
straightforward adaptation of a fragile nGen technology to ReadyToRun images as an expedite
straightforward adaptation of a fragile NGen technology to ReadyToRun images as an expedite

@trylek
Copy link
Member Author

trylek commented Oct 29, 2019

@jkotas - as we'll need to close pending PR's prior to the consolidation, could you please let me know whether you're fine with me merging this in? Or should I just close the review with the assumption that we'll introduce the doc later in the combined repo? AFAIK I have addressed all feedback you previously shared.

@jkotas
Copy link
Member

jkotas commented Oct 29, 2019

I am not a big fan of merging half-baked changes to master. My preference would be to close it here and open it in a new repo once we are ready to start working on this again.

@trylek
Copy link
Member Author

trylek commented Oct 29, 2019

Works for me, thanks.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants