Skip to content

[embind] Use a single invoker mechanism #24577

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 17, 2025

Conversation

RReverser
Copy link
Collaborator

This is the next step in refactorings I started back in #20383 to merge all Embind generic method callers into a single mechanism. I never got around to landing this PR because I kept trying to split it into even smaller changes / steps, never found a good way to do so due to how entangled they were, got frustrated and left it alone.

However, aside from being a nice-to-have optimisation or cleanup refactoring, it's also an easy way to deal with issues like #24547 which makes it a bit more pressing. Without this PR, we'd have to duplicate the emval_as+emval_as_int64+emval_as_uint64 fix from #13889 for emval_call and emval_call_method as well, but after this PR everything goes through the same centralised implementation where we can deal with all Embind types in a consistent manner.

There are some more things that I'd like to clean up here (including some of the complexity added by this very own PR), but in order to try and keep review scope smaller, I'm going to submit them separately.

Fixes #24547.

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't say I understand all of this but look like a good direction. I leave @brendandahl to do a full review.

@@ -305,26 +271,38 @@ var LibraryEmVal = {

#if !DYNAMIC_EXECUTION
var argN = new Array(argCount);
var invokerFunction = (obj, func, destructorsRef, args) => {
var invokerFunction = (handle, methodName, destructorsRef, args) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make more sense to call the first argument receiver? Or is handle (the type of the receiver I guess?) more descriptive here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handle as in generally EM_VAL. If the invoker kind is "method" then it's a handle of the receiver, if it's a "function" or a "constructor" then it's a handle of the function itself, and if it's a cast, then it's just null... A handle seemed like the most generic name.

@RReverser
Copy link
Collaborator Author

The wasm2js failure looks a bit suspicious:

error: undefined symbol: _emscripten_tempret_set (referenced by $setTempRet0, referenced by setTempRet0, referenced by _emval_invoke_i64, referenced by root reference (e.g. compiled C/C++ code))
warning: To disable errors for undefined symbols use -sERROR_ON_UNDEFINED_SYMBOLS=0
warning: __emscripten_tempret_set may need to be added to EXPORTED_FUNCTIONS if it arrives from a system library
Error: Aborting compilation due to previous errors
em++: error: '/root/emsdk/node/22.16.0_64bit/bin/node /root/project/tools/compiler.mjs -' failed (returned 1)

I guess it comes from makeReturn64, but then if the usage is injected automatically, shouldn't the dependency be added automatically as well by jsify? @sbc100

@sbc100
Copy link
Collaborator

sbc100 commented Jun 16, 2025

Yes, _emscripten_tempret_set is listed as a native dependency of $setTempRet.

If you build with EMCC_DEBUG=1 should should see something like /tmp/emscripten_temp/tmpz6ttjgoflibemscripten_js_symbols.so being passed to the linker which is a map of JS dependencies.

However this only handles native->JS dependencies, but what we have here is a case of the JS compiler itself generating calls to a JS function (effectively a JS->JS dependency), but I guess its being injected somehow too late (and the native link has already been done).

If you can share a repro case I can take a look.

@RReverser
Copy link
Collaborator Author

If you can share a repro case I can take a look.

I don't have a smaller one, I've only guessed what's happening by searching for keywords from the error message for now.

@sbc100
Copy link
Collaborator

sbc100 commented Jun 16, 2025

Does it fix the issue if you make the condition on line 536 of jsifier.js match the one on 314. It looks like they should be the same condition but they are not.

@RReverser
Copy link
Collaborator Author

No, still the same. I even tried making that dep unconditional just in case, but nothing:

.\test\runner wasm2js1.test_embind_unsigned
Running test_core: (1 tests)
test_embind_unsigned (test_core.wasm2js1) ... error: undefined symbol: _emscripten_tempret_set (referenced by $setTempRet0, referenced by setTempRet0, referenced by $ExitStatus, referenced by root reference (e.g. compiled C/C++ code))

Worth noting that setTempRet0 itself is discovered fine, so that condition shouldn't matter - the linker is complaining specifically about _emscripten_tempret_set.

Looks like that one is implemented in assembly, maybe it needs to be somehow exposed to wasm2js conversion?

@RReverser
Copy link
Collaborator Author

It's all the weirder that wasm2js0.test_embind_unsigned and wasm2js2.test_embind_unsigned pass fine but wasm2js1.test_embind_unsigned fails with said linker error.

Aren't those digits at the end just optimisation levels?

@sbc100
Copy link
Collaborator

sbc100 commented Jun 17, 2025

yes, you can see the setTempRet0 dependency is supposed to be added on line 530 (deps.push('setTempRet0')). Not that this needs to happen before the native link happens and the end result is that the native dependencies of setTempRet0 should be passed to wasm-ld with --export-if-defined=XX (for each native JS dependency).

@RReverser
Copy link
Collaborator Author

Sorry, I still don't understand.

yes, you can see the setTempRet0 dependency is supposed to be added on line 530 (deps.push('setTempRet0')).

But then why even making it unconditional doesn't fix the issue? And why is setTempRet0 the problem if it's added successfully already, just its own native dep isn't? And why does this not happen in wasm2js0 and wasm2js2?

I don't understand the mechanism here at all, so I have more questions than answers 😅

@sbc100
Copy link
Collaborator

sbc100 commented Jun 17, 2025

Sorry, I still don't understand.

yes, you can see the setTempRet0 dependency is supposed to be added on line 530 (deps.push('setTempRet0')).

But then why even making it unconditional doesn't fix the issue? And why is setTempRet0 the problem if it's added successfully already, just its own native dep isn't?

Answering just this part: We process the JS libraries files twice. Once in --symbol-only mode where we extract dependencies. We use this dependency information for figure out which native symbols we might need to --export-if-defined.

I think this issue is that while setTempRet0 is added to the deps here, it never actually ends up in DEFAULT_LIBRARY_FUNCS_TO_INCLUDE out on the python side. I think this is likely we use when searching for which native symbols to explicitly export with --export-if-defined

The code for this is here:

emscripten/tools/link.py

Lines 3104 to 3128 in 542dc42

if (not settings.SIDE_MODULE or settings.ASYNCIFY) and not shared.SKIP_SUBPROCS:
js_info = get_js_sym_info()
if not settings.SIDE_MODULE:
js_syms = js_info['deps']
if settings.LINKABLE:
for native_deps in js_syms.values():
settings.REQUIRED_EXPORTS += native_deps
else:
def add_js_deps(sym):
if sym in js_syms:
native_deps = js_syms[sym]
if native_deps:
settings.REQUIRED_EXPORTS += native_deps
for sym in settings.DEFAULT_LIBRARY_FUNCS_TO_INCLUDE:
add_js_deps(sym)
for sym in js_info['extraLibraryFuncs']:
add_js_deps(sym)
for sym in settings.EXPORTED_RUNTIME_METHODS:
add_js_deps(shared.demangle_c_symbol_name(sym))
for sym in settings.EXPORTED_FUNCTIONS:
add_js_deps(shared.demangle_c_symbol_name(sym))
if settings.ASYNCIFY:
settings.ASYNCIFY_IMPORTS_EXCEPT_JS_LIBS = settings.ASYNCIFY_IMPORTS[:]
settings.ASYNCIFY_IMPORTS += ['*.' + x for x in js_info['asyncFuncs']]

Specifically I think we are missing a call to add_js_deps for getTempRet0.

And why does this not happen in wasm2js0 and wasm2js2?

I imagine this is because at O0 getTempRet0 and/or its dependencies are exported for other reasons.

@RReverser
Copy link
Collaborator Author

I imagine this is because at O0 getTempRet0 and/or its dependencies are exported for other reasons.

That would make sense, but then why does O2 pass? Shouldn't it be more restrictive than O1 basically always?

Could you help me fix this one please? I still don't understand what the fix should look like.

@RReverser
Copy link
Collaborator Author

Wait it passed after a rebase, wat? Nothing relevant changed upstream that I can see.

@RReverser
Copy link
Collaborator Author

The CI failure other.test_jspi_async_function seems unrelated, likely a bad combination of #24578 + #24550 - I guess they raced on CI. @brendandahl

@RReverser RReverser requested a review from brendandahl June 17, 2025 10:26
@RReverser
Copy link
Collaborator Author

@brendandahl Can you please fix CI? I don't think I can merge until then (unless someone with force-merge rights wants to do it).

@RReverser
Copy link
Collaborator Author

Can you please fix CI?

For #24550 I mean.

@sbc100
Copy link
Collaborator

sbc100 commented Jun 17, 2025

Can you please fix CI?

For #24550 I mean.

#24590

This is the next step in refactorings I started back in emscripten-core#20383 to merge all Embind generic method callers into a single mechanism. I never got around to landing this PR because I kept trying to split it into even smaller changes / steps, never found a good way to do so due to how entangled they were, got frustrated and left it alone.

However, aside from being a nice-to-have optimisation or cleanup refactoring, it's also an easy way to deal with issues like emscripten-core#24547 which makes it a bit more pressing. Without this PR, we'd have to duplicate the `emval_as`+`emval_as_int64`+`emval_as_uint64` fix from emscripten-core#13889 for `emval_call` and `emval_call_method` as well, but after this PR everything goes through the same centralised implementation where we can deal with all Embind types in a consistent manner.

There are some more things that I'd like to clean up here (including some of the complexity added by this very own PR), but in order to try and keep review scope smaller, I'm going to submit them separately.

Fixes emscripten-core#24547.
@RReverser RReverser enabled auto-merge (squash) June 17, 2025 17:06
@RReverser
Copy link
Collaborator Author

FAIL [1.119s]: test_fetch_redirect (test_browser.browser64_4gb)

Given how often it flakes lately, it really should use a local server instead of relying on httpbin.org... @sbc100 can you merge manually please? I already did rerun and it takes forever only to run into another flake.

@sbc100
Copy link
Collaborator

sbc100 commented Jun 17, 2025

FAIL [1.119s]: test_fetch_redirect (test_browser.browser64_4gb)

Given how often it flakes lately, it really should use a local server instead of relying on httpbin.org... @sbc100 can you merge manually please? I already did rerun and it takes forever only to run into another flake.

Agreed. Would you open a bug for that?

@sbc100 sbc100 disabled auto-merge June 17, 2025 20:19
@sbc100 sbc100 merged commit d8dda21 into emscripten-core:main Jun 17, 2025
28 of 30 checks passed
@RReverser RReverser deleted the single-caller branch June 17, 2025 20:54
@RReverser
Copy link
Collaborator Author

Thanks, and done: #24597

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

val.call<T>() can't return 64-bit integers
3 participants