Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trying to load two modules with the same URL throws an empty object #51694

Open
bpstrngr opened this issue Feb 7, 2024 · 8 comments
Open

Trying to load two modules with the same URL throws an empty object #51694

bpstrngr opened this issue Feb 7, 2024 · 8 comments
Labels
esm Issues and PRs related to the ECMAScript Modules implementation. loaders Issues and PRs related to ES module loaders

Comments

@bpstrngr
Copy link

bpstrngr commented Feb 7, 2024

In a loader module I'm handling sources with omitted type import attribute on JSON imports,
by assigning context.importAttributes.type in the load hook, while caching the outputs.

When the attributes are consistently omitted from JSON imports, or consistently aren't, this works well.
But when the import signature changes on one occasion, the process breaks
with an empty error object ({}) on returning the cached output.

I can't tell the reason for this, for as far as the loader is concerned,
the load hook output should be idempotent whether the input context.importAttributes.type is defined or not:

[Object: null prototype] {
  format: 'json',
  responseURL: 'file:///home/ranger/loaded.json',
  source: <Buffer 7b 22 61 22 3a 31 7d 0a>,
  shortCircuit: true
}

Here's a loader module to reproduce:

// loader.js
export const address = new URL(import.meta.url).pathname;
export const file = address.replace(/.*\//, "");

// register loader
if (
  !globalThis.window &&
  process.execArgv.some((flag) =>
    new RegExp("^--import[= ][^ ]*" + file).test(flag),
  )
)
  Promise.all(
    ["module", "worker_threads"].map((module) => import(module)),
  ).then(
    ([{ register }, { MessageChannel, isMainThread }]) =>
      isMainThread &&
      [new MessageChannel(), import.meta.url].reduce(
        ({ port1, port2 }, parentURL) =>
          register(address, parentURL, {
            data: { socket: port2 },
            transferList: [port2],
          }) || port1.on("message", console.log),
      ),
  );

function initialize({ socket }) {
  socket.postMessage(" Registered loader module:\n " + import.meta.url + ".");
}

const load = function (source, context, next) {
  if (this[source])
    // return cached value.
    return console.log(this[source]) || this[source];
  if (source.endsWith("json")) context.importAttributes.type = "json";
  return next(source, context).then(
    (context) =>
      // cache output
      (this[source] = Object.assign(context, { shortCircuit: true })),
  );
}.bind({});

export { initialize, load };

and here's a file with inconsistent attribute signatures to load
(in my real use case the inconsistency was in separate modules, one importing the other):

// loaded.js
 import json1 from "./loaded.json";
 import json2 from "./loaded.json" with {type:"json"};

console.log(json1===json2);

// loaded.json
{"a":1}

Result with node 21.1.0 (only notable difference is <Buffer > being empty in nextLoad's output already on the first import):
node --watch --import=./loader.js ./loaded.js

 Registered loader module:
 file:///home/ranger/loader.js.
(node:1498) ExperimentalWarning: Importing JSON modules is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
[Object: null prototype] {
  format: 'json',
  responseURL: 'file:///home/ranger/loaded.json',
  source: <Buffer >,
  shortCircuit: true
}

node:internal/process/esm_loader:40
      internalBinding('errors').triggerUncaughtException(
                                ^
{}

Node.js v21.1.0
Failed running './loaded.js'
@bpstrngr
Copy link
Author

bpstrngr commented Feb 7, 2024

May not seem like a huge deal at first,
but if i intend to allow defaulting to json import attribute for imports of json extensions,
which is arguably reasonable,
if someone commits this inconsistency in the future, they would face this very hardly tracable error again (took a while for me to pinpoint where the problem occurs along the import chain).

@bpstrngr
Copy link
Author

bpstrngr commented Feb 7, 2024

loosely related to nodejs/loaders#163 , in being my second encounter with an untracable error message from beyond the "load" hook in a load sequence.

@GeoffreyBooth GeoffreyBooth transferred this issue from nodejs/loaders Feb 7, 2024
@GeoffreyBooth
Copy link
Member

@nodejs/loaders

@GeoffreyBooth GeoffreyBooth added esm Issues and PRs related to the ECMAScript Modules implementation. loaders Issues and PRs related to ES module loaders labels Feb 7, 2024
@aduh95
Copy link
Contributor

aduh95 commented Feb 13, 2024

I think that's the documented behavior:

node/doc/api/module.md

Lines 529 to 532 in ee5cfcb

Import type attributes are part of the cache key for saving loaded modules into
the internal module cache. The `resolve` hook is responsible for returning an
`importAttributes` object if the module should be cached with different
attributes than were present in the source code.

@bpstrngr
Copy link
Author

bpstrngr commented Feb 13, 2024

ah so the error actually is that import statements are statically analysed beyond "load", and it isn't matching the cached respective "resolve" entry? or, more likely, i figure it rather happens not after the "load" hook at all, but before the respective subsequent "resolve" hook, where the new signature mismatches with the cached value (so no explicit static analysis involved).

does it have to throw an empty object at that point though? can't it spell out "import signature for {source} in {parentUrl} at {line:character} doesn't match the cached entry to resolve."?

or if the intention behind including it in the cache key is that it should resolve different (even if the same) files by the given specifier, shouldn't it just differentiate and proceed with the hooks? which will then determine if they want to re-equivocate them from their own cache or sthing?

@aduh95
Copy link
Contributor

aduh95 commented Feb 13, 2024

export const address = new URL(import.meta.url).pathname;
export const file = address.replace(/.*\//, "");

// register loader
if (
  !globalThis.window &&
  process.execArgv.some((flag) =>
    new RegExp("^--import[= ][^ ]*" + file).test(flag),
  )
)

You probably already know this, and I suppose it's your way for protesting lack of a good API in Node.js, but this is horrendous – and very fragile. Please, I'm begging you, don't do that in real code.

does it have to throw an empty object at that point though? can't it spell out "import signature for {source} in {parentUrl} at {line:character} doesn't match the cached entry to resolve."?

The empty object is definitely not supposed to happen, if someone wants to investigate that that'd be awesome.

or if the intention behind including it in the cache key is that it should resolve different (even if the same) files by the given specifier, shouldn't it just differentiate and proceed with the hooks? which will then determine if they want to re-equivocate them from their own cache or sthing?

Both imports should resolve to the same module, by spec, so loading a different module is not a supported use-case. Resolving modules is the mission of the resolve hook.

@bpstrngr
Copy link
Author

bpstrngr commented Feb 13, 2024

haha you may assume more of me, either in smarts that i know what you mean or in stubbornness that i'm protesting; i don't see the fragility there, is it about the regexp? maybe the [^ ]* could be a bit more accurate... not really sure how to improve on it from top of my head. sorry about the distress, i'm all ears if you can specify what else makes it "fragile". edit: i see the "+file" could also be problematic, with some crazy url provided... and it not being escaped in the first place lol... that's noted. i shall just match the file path from the flag and check for equality.

Thanks for confirming the rest.

ps. if the specifier alone should exhaustively indicate module identity, i wonder why the attribute is part of the key. maybe it's just a hash of the arguments.. still my interpretation of import attributes was that it wouldn't be impossible for a file to contain ambiguous format (in a possible future where it's used for more than json, to load ambiguous syntaxes such as binaries, archives, symbolic links, vanilla vs type-annotations when native support lands for that, etc). anyway that's wild speculation, it's perfectly reasonable to assume one file stays in one format.
What's important is that in conclusion according to the spec you quoted then, the loader should let the resolve hook decide what importAttributes to return, and not fail before entering it. all clear.

@aduh95
Copy link
Contributor

aduh95 commented Feb 13, 2024

i don't see the fragility there, is it about the regexp?

Yes, file names can contain characters that have a special meaning in the context of a regex.
It's also fragile in the sense it can only detect --import=./relative/path/to/module.js, and breaks in other scenario (e.g. it won't catch any of --import ./relative/path/to/module.js, --import=somePackage/subpath/export, --import=#subpath/imports) and if Node.js ever introduces a an alias for --import, it won't detect it either.
All that to say: you should not try to detect if a file was --imported, it's not a supported use case. If you think you have a good use case, you should open an issue to propose adding a new API.

ps. if the specifier alone should exhaustively indicate module identity, i wonder why the attribute is part of the key. maybe it's just a hash of the arguments..

If you take the following code:

await Promise.all({
  import('someSpecifier'),
  import('someSpecifier', { with: { type: 'json' }}),
]);

You can imagine the race condition where we know someSpecifer can only resolve to a JSON module or a non-JSON module, but because load is async, we can't know which until one has succeeded, so we have to use the import attributes as part of the resolve cache.

@aduh95 aduh95 changed the title Nondescript failure over json import attribute inconsistency Trying to load two modules with the same URL throws an empty object Feb 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
esm Issues and PRs related to the ECMAScript Modules implementation. loaders Issues and PRs related to ES module loaders
Projects
None yet
Development

No branches or pull requests

3 participants