Description
So for the last six months or so, the top priority of the loaders work has been fixing the last bug standing in the way of stability, where the separate thread we spawn to run module customization hooks is unintentionally duplicated for each application worker thread rather than shared between all of them. A (stable) fix might now be almost at hand, but it was raised that we should perhaps reevaluate whether the intended behavior is what we want. As I see things, we have three options, with varying pros and cons:
-
Keep things as they are, where there’s one hooks thread per application thread.
- Pro: Already built; the hooks are running in fully isolated environments separate from each other, similar to the isolation provided by the worker threads. (Note that other methods of isolation might be on the horizon such as
NodeRealm
,ShadowRealm
or module compartments.) - Con: Wasteful of memory (an application with a main plus N worker threads will have N+1 hooks threads); and duplicative for many common use cases such as transpilers, where the same work such as transpiling a file might happen repetitively in each hooks thread rather than the result/state being shared between all of them.
- Pro: Already built; the hooks are running in fully isolated environments separate from each other, similar to the isolation provided by the worker threads. (Note that other methods of isolation might be on the horizon such as
-
Run the hooks in a dedicated thread that supports all threads in the process, with a separate chain of hooks for each application thread.
- Pro: Conserves memory as compared with multiple hooks threads; sharing of state across all application threads can reduce processing for otherwise repeated work, or enable use cases like mocks that need to maintain state at the process level.
- Con: Hooks would need to be intentional to keep state per thread rather than per process.
-
Move the module customization hooks back on the main thread and change the
resolve
andload
hooks to be synchronous. (A big reason that the threads are off-thread currenly is so that async hooks can affect sync CommonJSrequire
calls.) Similar to (or could be combined with/replaced by) @joyeecheung’s proposal for synchronous main thread hooks.- Pro: Could achieve many (all?) of the goals of Joyee’s proposal, of providing a supported API to replace CommonJS monkey-patching; avoids the overhead of spawning a separate hooks thread; probably easier to develop hooks for, assuming that isolation is not desired.
- Con: All hooks would need to be sync functions; no isolation of hooks code from application code.
I’m not sure if it would be technically feasible to support async hooks for import
and sync hooks for require
; and on a product level I don’t think we want separate hooks APIs for require
and import
. If others diagree with either of those assessments then I guess we could have fourth and/or fifth options to consider. I honestly don’t have a preference between any of the options; I just want to get this API stable, ideally before Node 23 in October.
Aside from the overall “which option do you prefer” question, some other questions I have for the folks who have been with us on this journey:
-
How necessary is it for the
resolve
andload
hooks to be async? Do you commonly write asyncresolve
and/orload
hooks, where refactoring it to be sync would be difficult or impossible? (We are considering taking the “syncify async work via a separate thread” code from the hooks thread and splitting it off into a userland package that libraries could use as a general utility. We might also provide some kind of sync network request API similar toreadFileSync
.) -
How necessary is isolation for hooks code, and on what level is it desired? Even the current API doesn’t isolate hooks from other hooks running in the same chain; they’re only isolated from application code.
@nodejs/loaders plus the participants of #103: @cspotcode @giltayar @arcanis @bumblehead @bizob2828 plus participants of nodejs/node#53332: @VoltrexKeyva @Flarna @alan-agius4