esm: refactor hooks – add loader instances #49315

Open. Wants to merge 1 commit into base: main
7 changes: 7 additions & 0 deletions lib/internal/modules/esm/default_loader.js
@@ -0,0 +1,7 @@
'use strict';

const { defaultLoad } = require('internal/modules/esm/load');
const { defaultResolve } = require('internal/modules/esm/resolve');

exports.resolve = defaultResolve;
exports.load = defaultLoad;
Member

If we create a new file we might need to add it to the startup snapshot (see #45849). Alternatively is there a way to stick this in an existing file?

Contributor Author

The reason I put this in its own file is so that the url property actually led to a valid file that represented the base "loader".

It can live in literally any file that can be treated as a loader (one that exports load, resolve, etc.). This code is the relevant block:

const defaultLoader = 'internal/modules/esm/default_loader';
this.addCustomLoader(`node:${defaultLoader}`, require(defaultLoader));

That way, if we ever expose a list of loaders or anything like that, the url property is what you see above.
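To illustrate the point about the url property, here is a minimal userland sketch, not the PR's implementation. The `LoaderRegistry` class, its `list()` method, and the stand-in default hooks are all hypothetical; the only thing taken from the thread is that the built-in loader is registered through the same path as any other loader, under a `node:` URL.

```javascript
// Hypothetical stand-ins for Node's internal defaultResolve/defaultLoad.
const defaultLoader = {
  resolve: (specifier) => ({ url: `file:///${specifier}` }),
  load: (url) => ({ format: 'module', source: `// source of ${url}` }),
};

class LoaderRegistry {
  #instances = [];

  constructor() {
    // The built-in loader goes through the same code path as custom ones.
    this.add('node:internal/modules/esm/default_loader', defaultLoader);
  }

  add(url, { resolve, load }) {
    this.#instances.push({ __proto__: null, url, resolve, load });
  }

  // Hypothetical listing API; the thread only says one might exist someday.
  list() {
    return this.#instances.map((instance) => instance.url);
  }
}

const registry = new LoaderRegistry();
registry.add('file:///custom-loader.mjs', { resolve() {}, load() {} });
console.log(registry.list());
```

With this shape, the listing shows the default loader first under its `node:` URL, followed by the custom loader, with no special casing for the built-in entry.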

99 changes: 45 additions & 54 deletions lib/internal/modules/esm/hooks.js
@@ -54,7 +54,6 @@ const {
} = require('internal/util');

const {
defaultResolve,
throwIfInvalidParentURL,
} = require('internal/modules/esm/resolve');
const {
@@ -87,45 +86,40 @@ let importMetaInitializer;
// [2] `validate...()`s throw the wrong error

class Hooks {
#chains = {
Contributor

Why remove these? If you need them to start out empty, you can still leave the fields (resolve etc.) and also keep the code docs.

Contributor Author

So that the built-in hooks are not treated as any kind of "special" object; they now work just like any other loader, and you can see them being added in the constructor. This ensures that if we attach any data to a loader instance, the default Node loader gets it too. Right now that data is just url.

Contributor

Ah, yes. I figured that :) I think you can still leave everything else here and just initialise the lists as empty 🙂

Contributor Author

Ah yes true 😄 I can do that. Part of me considered that hard-coding the hook names in a bunch of places might be less clean… but it's clearer and that works for me 🎉

/**
* Prior to ESM loading. These are called once before any modules are started.
* @private
* @property {KeyedHook[]} globalPreload Last-in-first-out list of preload hooks.
*/
globalPreload: [],

/**
* Phase 1 of 2 in ESM loading.
* The output of the `resolve` chain of hooks is passed into the `load` chain of hooks.
* @private
* @property {KeyedHook[]} resolve Last-in-first-out collection of resolve hooks.
*/
resolve: [
{
fn: defaultResolve,
url: 'node:internal/modules/esm/resolve',
},
],

/**
* Phase 2 of 2 in ESM loading.
* @private
* @property {KeyedHook[]} load Last-in-first-out collection of loader hooks.
*/
load: [
{
fn: require('internal/modules/esm/load').defaultLoad,
url: 'node:internal/modules/esm/load',
},
],
};
#loaderInstances = [];
#chains = {};

// Cache URLs we've already validated to avoid repeated validation
#validatedUrls = new SafeSet();

allowImportMetaResolve = false;

constructor() {
const defaultLoader = 'internal/modules/esm/default_loader';
this.addCustomLoader(`node:${defaultLoader}`, require(defaultLoader));
}

#rebuildChain(name) {
const chain = this.#chains[name] = [];
let i = 0;
for (const instance of this.#loaderInstances) {
Contributor

Can't use for…of :( gotta use a plain ol' for

Contributor
@JakobJingleheimer, Aug 25, 2023

frowning about no for…of

💯 😢 😆
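The thread does not spell out why `for…of` is off limits. One commonly cited reason, stated here as an assumption rather than something this PR says, is that `for…of` over an array looks up `Array.prototype[Symbol.iterator]`, which user code can replace, whereas an index-based loop does not touch it. A small sketch of the difference:

```javascript
const items = ['a', 'b', 'c'];
const originalIterator = Array.prototype[Symbol.iterator];

// Simulate user code tampering with the array iterator.
Array.prototype[Symbol.iterator] = function* () {
  yield 'hijacked';
};

const viaForOf = [];
for (const item of items) viaForOf.push(item);

const viaIndex = [];
for (let i = 0; i < items.length; i++) viaIndex.push(items[i]);

// Restore the iterator so the rest of the program is unaffected.
Array.prototype[Symbol.iterator] = originalIterator;

console.log(viaForOf); // [ 'hijacked' ]
console.log(viaIndex); // [ 'a', 'b', 'c' ]
```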

if (typeof instance[name] !== 'function') {
continue;
}
chain.push({
Contributor

Gotta use primordials for these sorts of things: ArrayPrototypePush.

Contributor Author

Derp thanks 😄
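For context on the primordials request: Node's real primordials are internal and not reachable from user code, so the following is only a userland approximation of the idea. The `ArrayPrototypePush` helper below is a hypothetical re-creation that captures `Array.prototype.push` once, so later tampering with the prototype does not affect it.

```javascript
// Captured once, before any other code runs. This helper only mimics the
// shape of Node's internal primordial of the same name.
const { push } = Array.prototype;
const ArrayPrototypePush = (array, ...values) => Reflect.apply(push, array, values);

// Simulate user code replacing the method afterwards.
Array.prototype.push = function () {
  return 0; // silently drops every value
};

const viaMethod = [];
viaMethod.push('x');

const viaPrimordial = [];
ArrayPrototypePush(viaPrimordial, 'x');

// Restore the original method.
Array.prototype.push = push;

console.log(viaMethod.length, viaPrimordial.length); // 0 1
```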

Member

Suggested change
- chain.push({
+ ArrayPrototypePush(chain, {
+   __proto__: null,

loader: instance,
fn: instance[name],
next: chain[i++ - 1],
});
}
}

#rebuildChains() {
Contributor
@JakobJingleheimer, Aug 25, 2023

This is a performance loss. I think it would be better for there to be only rebuildChains and for it to accept a list of chain names to rebuild, like

#rebuildChains(...names) {
  if (!names.length) {
    names = new Array('initialize','load','resolve'); // [1]
  }

  const shouldRebuild = { __proto__: null };

  for (let n = names.length - 1; n > -1; n--) {
    const name = names[n];
    shouldRebuild[name] = true;
    this.#chains[name].length = 0; // [2] Might need a primordial?
  }

  for (
    let i = 0,
        l = this.#loaderInstances.length;
    i < l;
    i++
  ) {
    const {
      initialize,
      load,
      resolve,
    } = this.#loaderInstances[i];

    if (shouldRebuild.initialize) { /* … */ }
    if (shouldRebuild.load) { /* … */ }
    if (shouldRebuild.resolve) { /* … */ }
  }
}
this.#rebuildChains('resolve'); // Rebuild only `resolve` chain
this.#rebuildChains('load','resolve'); // Rebuild `load` and `resolve`
this.#rebuildChains(); // Rebuild all chains

[1] I believe this (new Array(…) instead of […]) will invoke an optimisation in V8 to choose the most appropriate specialised C++ array from the start and also to pre-allocate exactly the space for these 3 items. If it doesn't, there is no need for the otherwise uncommon use of new Array() instead of [].

[2] We don't need to re-create the array, only empty it. For that, setting its length to 0 is far better.
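A runnable sketch of the selective rebuild idea, under stated assumptions: the `Hooks` class here is a stripped-down stand-in, `addLoader` and `chainUrls` are hypothetical helpers for the demo, and plain `.push` is used where Node core would use primordials. It also loops over the requested names directly instead of using a `shouldRebuild` lookup, which is a simplification of the snippet above rather than the reviewer's exact proposal.

```javascript
const HOOK_NAMES = ['initialize', 'load', 'resolve'];

class Hooks {
  #loaderInstances = [];
  #chains = { __proto__: null, initialize: [], load: [], resolve: [] };

  addLoader(instance) {
    this.#loaderInstances.push(instance);
    this.#rebuildChains(); // no names given: rebuild every chain
  }

  #rebuildChains(...names) {
    if (names.length === 0) names = HOOK_NAMES;

    // Empty the selected chains in place rather than re-creating them.
    for (let n = 0; n < names.length; n++) {
      this.#chains[names[n]].length = 0;
    }

    for (let i = 0; i < this.#loaderInstances.length; i++) {
      const loader = this.#loaderInstances[i];
      for (let n = 0; n < names.length; n++) {
        const name = names[n];
        if (typeof loader[name] !== 'function') continue;
        const chain = this.#chains[name];
        chain.push({
          __proto__: null,
          loader,
          fn: loader[name],
          next: chain[chain.length - 1], // undefined for the first entry
        });
      }
    }
  }

  // Hypothetical accessor, for demonstration only.
  chainUrls(name) {
    return this.#chains[name].map((entry) => entry.loader.url);
  }
}

const hooks = new Hooks();
hooks.addLoader({ url: 'node:default', resolve() {}, load() {} });
hooks.addLoader({ url: 'file:///a.mjs', resolve() {} });
console.log(hooks.chainUrls('resolve')); // [ 'node:default', 'file:///a.mjs' ]
console.log(hooks.chainUrls('load')); // [ 'node:default' ]
```

Because the chains are emptied with `length = 0`, any code holding a reference to a chain array keeps seeing the same array object after a rebuild.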

Contributor Author

I think we can make this happen 😄 Probably a bit of a µ-optimization considering how often #rebuildChains is called and the size of N… but it's not a problem to make it a little bit faster. If the actual chain execution was affected I would be more concerned, considering that is called on every import/require.

Contributor

Mm, it likely won't break the performance bank. But it's easy to avoid the de-op, so why not 🙂

this.#rebuildChain('globalPreload');
Member

Let’s wait to land this after #49144.

Contributor Author

Works for me 🎉

this.#rebuildChain('resolve');
this.#rebuildChain('load');
}

/**
* Import and register custom/user-defined module loader hook(s).
* @param {string} urlOrSpecifier
@@ -164,25 +158,26 @@ class Hooks {
emitExperimentalWarning(
'`globalPreload` is planned for removal in favor of `initialize`. `globalPreload`',
);
ArrayPrototypePush(this.#chains.globalPreload, { __proto__: null, fn: globalPreload, url });
}
if (resolve) {
const next = this.#chains.resolve[this.#chains.resolve.length - 1];
ArrayPrototypePush(this.#chains.resolve, { __proto__: null, fn: resolve, url, next });
}
if (load) {
const next = this.#chains.load[this.#chains.load.length - 1];
ArrayPrototypePush(this.#chains.load, { __proto__: null, fn: load, url, next });
}
return initialize?.(data);
const instance = {
__proto__: null,
url,
globalPreload,
initialize,
resolve,
load,
};
ArrayPrototypePush(this.#loaderInstances, instance);
this.#rebuildChains();
return initialize?.(data, { __proto__: null, id: instance.id, url });
}

/**
* Initialize `globalPreload` hooks.
*/
initializeGlobalPreload() {
const preloadScripts = [];
for (let i = this.#chains.globalPreload.length - 1; i >= 0; i--) {
for (const chainEntry of this.#chains.globalPreload) {
const { MessageChannel } = require('internal/worker/io');
const channel = new MessageChannel();
const {
@@ -193,10 +188,8 @@ class Hooks {
insidePreload.unref();
insideLoader.unref();

const {
fn: preload,
url: specifier,
} = this.#chains.globalPreload[i];
const preload = chainEntry.fn;
const specifier = chainEntry.loader.url;

const preloaded = preload({
port: insideLoader,
@@ -789,11 +782,9 @@ function pluckHooks({
function nextHookFactory(current, meta, { validateArgs, validateOutput }) {
// First, prepare the current
const { hookName } = meta;
const {
fn: hook,
url: hookFilePath,
next,
} = current;
Comment on lines 784 to -796
Contributor

Did anything change here? It looks like it's now just less readable

Contributor Author

Yes. The previous structure of chain:

type Fn = (arg0: any, context: any, next: Fn) => any;

interface ChainItem {
  fn: Fn
  url: string;
}

type Chain = ChainItem[];

The new structure of chain:

type Fn = (arg0: any, context: any, next: Fn) => any;

interface LoaderInstance {
  load?: Fn;
  resolve?: Fn;
  initialize?: Fn;
  url: string;
}

interface ChainItem {
  fn: Fn
  loader: LoaderInstance;
}

type Chain = ChainItem[];
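To show how a chain item in the new shape might be consumed, here is a simplified sketch. It is not Node's `nextHookFactory`: `buildChain` and `run` are hypothetical helpers, validation is omitted, and the loaders are toy examples. The point is only that the hook's file path is now read through `item.loader.url` instead of directly from the chain item.

```javascript
// Build a chain for one hook name from a list of loader instances.
function buildChain(loaders, hookName) {
  const chain = [];
  for (let i = 0; i < loaders.length; i++) {
    const loader = loaders[i];
    if (typeof loader[hookName] !== 'function') continue;
    chain.push({
      __proto__: null,
      fn: loader[hookName],
      loader,
      next: chain[chain.length - 1], // previously registered hook, if any
    });
  }
  return chain;
}

// Run one chain item; its hook receives a `next` that delegates down the chain.
function run(item, arg, context) {
  const next = (nextArg, nextContext = context) => {
    if (item.next === undefined) {
      throw new Error(`No next hook after ${item.loader.url}`);
    }
    return run(item.next, nextArg, nextContext);
  };
  return item.fn(arg, context, next);
}

const loaders = [
  { url: 'node:default', resolve: (specifier) => ({ url: `file:///${specifier}` }) },
  {
    url: 'file:///alias-loader.mjs',
    resolve: (specifier, context, next) =>
      next(specifier === 'app' ? 'src/main.js' : specifier, context),
  },
];

const chain = buildChain(loaders, 'resolve');
// Last-in-first-out: execution starts at the most recently added loader.
const result = run(chain[chain.length - 1], 'app', {});
console.log(result.url); // file:///src/main.js
```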


const { next, fn: hook, loader } = current;
const { url: hookFilePath } = loader;

// ex 'nextResolve'
const nextHookName = `next${