Skip to content

Comments

[wasm][coreclr] Improve primary reverse thunk lookup#124730

Open
radekdoulik wants to merge 1 commit intodotnet:mainfrom
radekdoulik:clr-wasm-improve-reverse-thunks
Open

[wasm][coreclr] Improve primary reverse thunk lookup#124730
radekdoulik wants to merge 1 commit intodotnet:mainfrom
radekdoulik:clr-wasm-improve-reverse-thunks

Conversation

@radekdoulik
Copy link
Member

The previous reverse thunk lookup keyed on AssemblyFQName.Hash() ^ token, which could produce undetectable collisions — tokens can change across builds while the display name stays the same, so a hash match doesn't guarantee the correct thunk. This already happened after #124303 landed and #123377 started failing.

This change switches the primary key to MVID.Hash() ^ token. The MVID changes with every compilation, so we can detect whether the tokens used to build the key are still valid. After a hash match, the lookup now verifies the MVID and token exactly. The fallback path similarly verifies the full fallback source string after its hash lookup.

Also added diagnostic logging.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request improves the WASM/CoreCLR reverse thunk lookup mechanism to prevent undetectable hash collisions that were occurring after recent changes to the runtime. The previous implementation used AssemblyFQName.Hash() ^ token as the lookup key, which could produce collisions when tokens changed across builds while the assembly name remained the same. The fix switches to using MVID.Hash() ^ token as the primary key and adds exact verification of both the MVID and token after a hash match to detect stale entries.

Changes:

  • Replaced assembly fully-qualified name with MVID (Module Version ID) in the primary hash key computation, ensuring the key changes with every compilation
  • Added exact verification of MVID string and token after hash lookups to detect collisions
  • Added exact verification of fallback source string after fallback hash lookups
  • Added diagnostic logging to aid in debugging reverse thunk lookup issues

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/tasks/WasmAppBuilder/coreclr/PInvokeCollector.cs Captures MVID from each module for use in hash key generation
src/tasks/WasmAppBuilder/coreclr/PInvokeTableGenerator.cs Generates MVID array and embeds Token, MVIDIndex, and FallbackSource in ReverseThunkMapValue for runtime verification
src/coreclr/vm/wasm/callhelpers.hpp Adds Token, MVIDIndex, and FallbackSource fields to ReverseThunkMapValue structure
src/coreclr/vm/wasm/helpers.cpp Implements MVID-based key creation and adds exact verification logic for both primary and fallback lookups, plus diagnostic logging
src/coreclr/vm/wasm/callhelpers-reverse.cpp Generated file containing the MVID array and updated reverse thunk entries with new verification fields

var fsName = FixedSymbolName(cb, Log);

return $" {{ {cb.Token ^ HashString(cb.AssemblyFQName)}, {HashString(cb.Key)}, {{ &MD_{fsName}, (void*)&Call_{cb.EntrySymbol} }} }} /* alternate key source: {cb.Key} */";
return $" {{ {cb.Token ^ HashString(cb.MVID)}, {HashString(cb.Key)}, {{ &MD_{fsName}, (void*)&Call_{cb.EntrySymbol}, {cb.Token}, {mvidIndexMap[cb.MVID]}, \"{cb.Key}\" }} }}";
Copy link

Copilot AI Feb 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cb.Key value is embedded in the generated C++ string literal without escaping. If the key contains special characters like quotes or backslashes (e.g., from compiler-generated method names or unusual namespaces), the generated C++ code will be invalid. The key should be escaped using EscapeLiteral(cb.Key) instead of directly embedding cb.Key in the string.

Suggested change
return $" {{ {cb.Token ^ HashString(cb.MVID)}, {HashString(cb.Key)}, {{ &MD_{fsName}, (void*)&Call_{cb.EntrySymbol}, {cb.Token}, {mvidIndexMap[cb.MVID]}, \"{cb.Key}\" }} }}";
return $" {{ {cb.Token ^ HashString(cb.MVID)}, {HashString(cb.Key)}, {{ &MD_{fsName}, (void*)&Call_{cb.EntrySymbol}, {cb.Token}, {mvidIndexMap[cb.MVID]}, \"{EscapeLiteral(cb.Key)}\" }} }}";

Copilot uses AI. Check for mistakes.
@jkotas
Copy link
Member

jkotas commented Feb 22, 2026

Do we need to bother with matching tokens at all in the temporary call helpers solution? Corelib and other assemblies are changing all the time, so the checked in call helpers map is going to use the fallback path like 99% of the time.

Once we teach crossgen how to generate this map. I expect that we are going to use the optimized native R2R hashtable to store these. We won't use the token matching algorithm used by call helpers currently.

@radekdoulik
Copy link
Member Author

Do we need to bother with matching tokens at all in the temporary call helpers solution?

Not really, it was nice exercise with copilot to fix the broken primary keys. The other parts are more useful from this view point. The validity checks and logging. That would be useful to investigate the problem we had, it was painful to find out what broke.

@pavelsavara pavelsavara added the arch-wasm WebAssembly architecture label Feb 23, 2026
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to 'arch-wasm': @lewing, @pavelsavara
See info in area-owners.md if you want to be subscribed.

@jkotas
Copy link
Member

jkotas commented Feb 23, 2026

Not really,

Can we delete the code that deals with MVIDs and tokens from this PR then?

Copy link
Member

@AaronRobinsonMSFT AaronRobinsonMSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Jan regarding the fallback path being the most common case. Let's see what we can remove here.

const char* pszLookupName = pMD->GetMethodTable()->GetFullyQualifiedNameInfo(&pszLookupNamespace);
LOG((LF_STUBS, LL_INFO100000, "WASM lookupThunk pMD: %s.%s::%s\n", pszLookupNamespace ? pszLookupNamespace : "", pszLookupName, pMD->GetName()));
}
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#endif
#endif // LOGGING

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arch-wasm WebAssembly architecture area-VM-coreclr

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants