-
Notifications
You must be signed in to change notification settings - Fork 5k
Expand host_runtime_contract to get assembly names and resolve their paths #100503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…g path fields. Modified the host to read the values into these structures
Tagging subscribers to this area: @vitek-karas, @agocke, @VSadov |
@elinor-fung a couple of items I'm wondering about:
|
char** dirs; | ||
}; | ||
|
||
struct probing_path_properties |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this rather be a callback?
A performance flaw of the current system is that the host has to push paths to the runtime upfront. It introduces a lot of unnecessary copying and parsing. If we are redoing how this works, we should address this performance flaw. Pushing a list of full paths and base-names maybe a bit more efficient than the current system, but it still leaves a lot on the table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My initial thought was no callback, since the runtime requires all the paths on startup and would immediately have to call it. However, avoiding the extra copying would definitely be better than avoiding the extra call back to the host and would answer the question of the memory between host and runtime (for the properties, it was intentionally all allocated by the runtime).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Runtime does not fundamentally require all paths on startup. All that the runtime does with the paths during startup is to store them in a hashtable with simple name as the key.
Maintaining this hashtable on the host side would open a number of opportunities for memory optimizations. For example, we do not need to materialize all full paths eagerly. Instead, we can remember a few common prefixes and materialize the full path from the common prefix and suffix on demand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this hashtable
This hashtable needs to be case-insensitive (assembly binding is case-insensitive). Case-insensitive hashing and comparisons are tricky. The host does not seem to have the infrastructure for it.
So maybe the best design is to push the simple assembly names from the host eagerly, let the runtime maintain the hash table, but then use a callback to resolve the simple name into a full path. It would avoid potential mismatches in case-insensitive behavior for non-ascii file names, etc.
struct host_runtime_contract | ||
{ | ||
size_t size; | ||
|
||
// Context for the contract. Pass to functions taking a contract context. | ||
void* context; | ||
|
||
char* entry_assembly; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Should any new things be added to the end of this structure so that it grows in backward compatible way?
}); | ||
|
||
// Add to the TPA. | ||
tpa_list << tmp; | ||
} | ||
} | ||
|
||
tpa_items->assembly_count = static_cast<uint32_t>(name_path_map.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a number of apps out there that host coreclr.dll directly using TRUSTED_PLATFORM_ASSEMBLIES
and other key/value pairs. We had the direct coreclr.dll hosting documented at learning.microsoft.com for a while. There are scenarios where hosting via host policy is too inflexible.
Are we going to break the existing coreclr.dll hosting model based on TRUSTED_PLATFORM_ASSEMBLIES
and other key/value pairs or are we keeping it around?
If we are keeping it around, we may want to keep corerun intact so that we get some test coverage for it. corerun is our non-shipping test-only host and it is fine for it to leave some perf on table.
If we are going to break it, it will be a significant breaking change that will require a lot of work to push through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we going to break the existing coreclr.dll hosting model based on TRUSTED_PLATFORM_ASSEMBLIES and other key/value pairs or are we keeping it around?
We do not plan to break it. The runtime should continue to respect the key/value pairs (but hostpolicy / default hosting won't use it) and getting the properties (AppContext or hostfxr) should continue to work (but on demand).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not plan to break it.
Ok, we should think about having some test coverage. corerun
test host would be a good place for it - keeping it intact should be the easiest option.
As we have discussed before, it would be useful to agree on the design of the public APIs that we are going to introduce to address the scenario first. If this change is going to fix some of the performance issues of the current hosting scheme, that sounds good. I am not sure how it helps with #99883. |
}); | ||
|
||
// Add to the TPA. | ||
tpa_list << tmp; | ||
} | ||
} | ||
|
||
tpa_items->assembly_count = static_cast<uint32_t>(name_path_map.size()); | ||
tpa_items->assembly_filepaths = new char*[tpa_items->assembly_count]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, there is a lot of apps and libraries that depend on the existing documented key-value pairs in appcontext:
- https://learn.microsoft.com/en-us/dotnet/core/dependency-loading/default-probing
- https://stackoverflow.com/questions/75314798/is-there-a-net-framework-equivalent-to-cores-appcontext-getdatatrusted-platf
Are we going to break these or are these key/value pairs going to continue to work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The plan is not to break anything. I gather from @elinor-fung that the host_runtime_contract usage is intended to be gradual.
char** dirs; | ||
}; | ||
|
||
struct probing_path_properties |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming nit: I would not use "probing". One of the goals of the .NET Core redesign of hosting was to avoid spurious probing for files on disk. The current implementation is a bit half-way-there, but realistically that's more of a bug than feature. TPA is already on the no-probing plan. And native and resource files are actually also no-probing inside hostpolicy, we just condense that information into the list of paths when calling into CoreCLR. Now I don't know if this is the time to change it (it can be a breaking change for some apps), but at least in terms of naming I would like to be clear that "We don't want to probe in the runtime" (note that hostpolicy does some amount of probing, but it does it so that the runtime doesn't have to).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I anticipated having to rename things :-).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to add here that we have an existing structure called probe_paths_t
. We have to make sure that whatever we end up naming our probing paths stuff is clear enough to be differentiated from probe_paths_t
. To not make the code more complicated to understand :)
|
||
struct probing_path_properties | ||
{ | ||
trusted_platform_assemblies trusted_platform_assemblies; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another naming nit: If we are redoing how this works, we should drop "trusted" and "platform" from the names of new things.
This name is left-over from Silverlight. There is nothing trusted or platform about assemblies in this list. If we go with the callback, a good name can be resolve_assembly_to_path
- the callback would be native equivalent of existing managed API https://learn.microsoft.com/dotnet/api/system.runtime.loader.assemblydependencyresolver.resolveassemblytopath .
output->append(item.second.resolved_path); | ||
output->push_back(PATH_SEPARATOR); | ||
} | ||
// Now that we have the complete list, store it in the host_runtime_contract and as a ; delimted string. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to store TPA list as an array instead of a string with delimiter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding, we want to either stick to the string or use both to keep backwards compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming I understand the feedback correctly, both in the runtime. What we'll angle to do is make the host by default use the callback method as opposed to passing in TRUSTED_PLATFORM_ASSEMBLIES
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the new model, the long TRUSTED_PLATFORM_ASSEMBLIES
string with all paths concatenated should be only created on demand when the user code asks for it; and we should make sure that it does not happen in any of our mainline scenarios.
…m assembly names and the ability to resolve their paths
Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it. |
Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it. |
This change continues down the path of making the host_runtime_contract structure the way the runtime can query information about the host.
get_assemblies
: function that gets the list of assemblies (name+extension) known by the host.destroy_assemblies
: function that cleans up the what was returned by get_assemblies when you're done with itresolve_assembly_to_path
: function that when given an assembly name (no extension) will return the full path known by the host.entry_assembly
: string containing the full path to the designated entry assemblyContributes to Allow modification of entry assembly and probing after startup hooks are run #99883