Description
There are 2 leading proposals for ESM loaders and how to chain them.
Similarities
Both approaches:
- consolidate hooks into:
  - `resolve()`: finding the source (equivalent to the current experimental `resolve()`); returns {Object}
    - `format?`: see esm → getFormat
    - `url`: same as current (see esm → resolve)
  - `load()`: supplying the source (a combination of the current experimental `getFormat()`, `getSource()`, and `transformSource()`); returns {Object}
    - `format`: same as current (see esm → getFormat)
    - `source`: same as current (see esm → getSource)
- Hooks of the same kind (`resolve` and `load`) are chained:
  - all `resolve()`s are executed (resolve1, resolve2, …resolveN)
  - all `load()`s are executed (load1, load2, …loadN)
Differences
Next()
This approach is originally detailed in #36396.
Hooks are called in reverse order (last first): a hook's 3rd argument would be a `next()` function, which is a reference to the previous loader hook. For example, given 3 loaders, `unpkg`, `http-to-https`, and `cache-buster` (`cache-buster` is the final loader in the chain): `cache-buster` invokes `http-to-https`, which in turn invokes `unpkg` (which itself invokes Node's default):
`cache-buster` ← `http-to-https` ← `unpkg` ← Node's default
The user must actively connect the chain or it (likely) fails: if a hook does not call `next`, "the loader short-circuits the chain and no further loaders are called".
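The nesting described above can be sketched as follows. This is only an illustration under assumptions: `composeNextChain`, `makeHook`, `shortCircuit`, and the `(specifier, context, next)` hook shape are hypothetical names chosen here, not part of either proposal or of Node.js.

```javascript
// Hypothetical helper: builds the nested call chain for one kind of hook.
// `hooks` is in declaration order; `defaultHook` stands in for Node's default.
function composeNextChain(hooks, defaultHook) {
  // Start from the default and wrap it with each hook in turn, so the
  // last-declared hook ends up outermost and is therefore called first.
  return hooks.reduce(
    (next, hook) => (specifier, context) => hook(specifier, context, next),
    (specifier, context) => defaultHook(specifier, context),
  );
}

const calls = []; // records invocation order for illustration

const defaultResolve = async (specifier) => {
  calls.push('default');
  return { url: specifier };
};

// Toy hook factory: records its name, then delegates to `next`.
const makeHook = (name) => async (specifier, context, next) => {
  calls.push(name);
  return next(specifier, context);
};

// Toy hook that never calls `next`: every earlier loader (and the default)
// is skipped.
const shortCircuit = async (specifier) => {
  calls.push('short-circuit');
  return { url: `mock:${specifier}` };
};

const resolve = composeNextChain(
  [makeHook('unpkg'), makeHook('http-to-https'), makeHook('cache-buster')],
  defaultResolve,
);
```

Calling `resolve(...)` here runs `cache-buster` first and Node's default last; swapping in `shortCircuit` as the final loader means nothing declared before it runs.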
Done()
This approach was also proposed in #36396 (in this comment).
The guiding principle of this approach is the principle of least knowledge.
Hooks are called in the order they're declared/listed, the return of each hook is fed as the input to the next hook, and each hook is called automatically (unless short-circuited):
`unpkg` → `http-to-https` → `cache-buster`
(If none of the supplied loaders output valid values, node's default loader/hook is invoked, enabling a hook to potentially handle only part and avoid re-implementing the native functionality node already provides via the default hook.)
Hooks have a `done` argument, used in rare circumstances to short-circuit the chain.
Additionally, this proposal includes a polymorphic return:
hook returns | continue? | result | scenario |
---|---|---|---|
`done(validValue)` | no | use `validValue` as final value (skipping any remaining loaders) | module mocking (automated tests) |
`false` | no | skip file (use empty string for value?) | file is not needed in current circumstances |
invalid value | no | throw and abort instead (current behaviour) | user error |
nullish | yes | loader did nothing: continue to next | loader isn't for the file type |
valid value | yes | pass value to next loader | expected use |
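To make the table rows concrete, here is a minimal standalone sketch of a chain runner that handles each kind of return. It rests on assumptions: `runChain`, `isValid`, and the two-argument `(interimResult, done)` hook shape are hypothetical simplifications, the validity check is invented for illustration, and the empty-string result for `false` follows the table's tentative suggestion rather than a settled decision.

```javascript
// Hypothetical validity check; the proposal does not define one.
const isValid = (value) =>
  typeof value === 'object' && value !== null && typeof value.source === 'string';

// Runs hooks in declaration order, feeding each valid return to the next hook.
async function runChain(hooks, initial) {
  let interim = initial;
  let finalValue;
  let shortCircuited = false;
  const done = (value) => {
    finalValue = value;
    shortCircuited = true;
  };

  for (const hook of hooks) {
    const returned = await hook(interim, done);
    if (shortCircuited) break; // done(validValue): skip remaining loaders
    if (returned === false) return { source: '' }; // skip file (assumed value)
    if (returned == null) continue; // nullish: loader did nothing
    if (!isValid(returned)) throw new TypeError('invalid hook return'); // user error
    interim = { ...interim, ...returned }; // valid value: pass to next loader
  }

  return shortCircuited ? finalValue : interim;
}
```

A hook that returns nothing is simply skipped, so a loader that is not meant for a given file type needs no special handling.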
Examples
--loader https-loader \
--loader mock-loader \
--loader coffee-loader \
--loader other-loader
Resulting in `https-loader` being invoked first, `mock-loader` second, etc, and node's internal `defaultLoader` last.
For illustrative purposes, I've separated `resolve` and `load` hooks into different code blocks, but they would actually appear in the same module IRL.
Resolve hook chain
HTTPS Loader
const httpProtocols = new Set([
  'http:',
  'https:',
]);

/**
 * @param {Object} interimResult The result from the previous loader
 *   (if any previous loader returned anything).
 * @param {string} [interimResult.format='']
 * @param {string} [interimResult.url='']
 * @param {Object} context
 * @param {string} context.originalSpecifier The original value of the import
 *   specifier
 * @param {string?} context.parentUrl
 * @param {function} defaultResolver The built-in Node.js resolver (handles
 *   built-in modules like `fs`, npm packages, etc)
 * @param {function(finalResult?)} done A short-circuit function to break the
 *   resolve hook chain
 * @returns {false|{format?: string, url: string}?} If participating, the hook
 *   resolves with a `url` and optionally a `format`
 */
export async function resolve(
  interimResult,
  // context,
  // defaultResolver,
  // done,
) {
  let url;
  try {
    url = new URL(interimResult.url);
    // there is a protocol and it's not one this loader supports: step aside.
    if (!httpProtocols.has(url.protocol)) return;
  } catch (err) {
    // specifier does not meet conditions for this loader; step aside.
    // (determineWhetherShouldHandle is a placeholder that is not defined here)
    if (!determineWhetherShouldHandle(interimResult)) return;
  }

  return {
    url: '…',
  };
}
Mock Loader
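export async function resolve(
  interimResult,
  context,
  defaultResolver,
  // done,
) {
  let url;
  // Use the interim url if it is already absolute; otherwise fall back to
  // Node's default resolver.
  try { url = new URL(interimResult.url) }
  catch (err) { url = new URL(defaultResolver(interimResult /* , … */).url) }

  // `generation` is state maintained elsewhere by the mocking library (not shown)
  url.searchParams.set('__quibble', generation);

  return {
    url: url.toString(),
  };
}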
Load hook chain
HTTPS Loader
// Assumption: `get` is the one from Node's https module (the original does
// not show the import).
import { get } from 'https';

const contentTypeToFormat = new Map([
  ['text/coffeescript', 'coffeescript'],
  ['application/node', 'commonjs'],
  ['application/vnd.node.node', 'commonjs'],
  ['application/javascript', 'javascript'],
  ['text/javascript', 'javascript'],
  ['application/json', 'json'],
  // …
]);
/**
* @param {Object} interimResult The result from the previous loader (if any
* previous loader returned anything)
* @param {string} [interimResult.format=''] Potentially a transient value. If
* the resolve chain settled with a `format`, that is the initial value here.
* @param {string|ArrayBufferView|TypedArray} [interimResult.source='']
* @param {Object} context
* @param {Array} context.conditions
* @param {string?} context.parentUrl
* @param {string} context.resolvedUrl The module's resolved url (as
* determined by the resolve hook chain).
* @param {Function} defaultLoader The built-in Node.js loader (handles file
* and data URLs).
* @param {Function} done A terminating function to break the load hook chain;
* done accepts a single argument, which is used for the final result of the
* load hook chain.
*/
export async function load(
  interimResult,
  { resolvedUrl },
  // defaultLoader,
  // done,
) {
  if (interimResult.source) return; // step aside (content already retrieved)

  const url = new URL(resolvedUrl);
  if (!httpProtocols.has(url.protocol)) return; // step aside

  // Fetch the module over the network and map its content-type to a format.
  const result = await new Promise((res, rej) => {
    get(resolvedUrl, (rsp) => {
      const format = contentTypeToFormat.get(rsp.headers['content-type']);
      let source = '';
      rsp.on('data', (chunk) => source += chunk);
      rsp.on('end', () => res({
        format,
        source,
      }));
      rsp.on('error', (err) => rej(err));
    });
  });

  return result;
}
Mock Loader
export async function load(
  interimResult,
  { resolvedUrl },
  defaultLoader,
  // done,
) {
  // Only participate when the resolve hook tagged this url for mocking.
  const isQuibbly = (new URL(resolvedUrl)).searchParams.get('__quibble');
  if (!isQuibbly) return;

  const mock = defaultLoader(urlToMock); // or some runtime-supplied mock
  return { source: mock };
}
CoffeeScript Loader
// Assumptions: `extname` comes from Node's path module and `coffee` is the
// `coffeescript` package (the original does not show these imports).
import { extname } from 'path';
import coffee from 'coffeescript';

const exts = new Set([
  '.coffee',
  '.coffee.md',
  '.litcoffee',
]);

export async function load(
  interimResult,
  context,
  defaultLoader,
  // done,
) {
  if (
    !!interimResult.format
    && interimResult.format !== 'coffeescript'
  ) return; // step aside

  const ext = extname(context.resolvedUrl);
  if (!exts.has(ext)) return; // step aside

  // Reuse source from a previous loader, else read it via the default loader.
  const rawSource = interimResult.source || defaultLoader(
    {
      format: 'coffeescript', // defaultLoader currently doesn't actually care
    },
    context
  ).source;
  const transformedSource = coffee.compile(rawSource.toString(), {
    whateverOptionSpecifies: 'module'
  });

  return {
    format: 'module',
    source: transformedSource,
  };
}
Updates to ESMLoader.load()
class ESMLoader {
  async load(resolvedUrl, moduleContext, resolvedFormat = '') {
    const context = {
      ...moduleContext,
      resolvedUrl,
    };
    let shortCircuited = false; // should we support calling done with no arg?
    let finalResult;
    let format = resolvedFormat;
    let source = '';

    function done(result) {
      finalResult = result;
      shortCircuited = true;
    }

    // this.loaders holds the user-supplied load hooks in declaration order
    for (const loader of this.loaders) {
      const tmpResult = await loader(
        { format, source },
        context,
        defaultLoader,
        done,
      );

      if (shortCircuited) break;
      if (tmpResult == null) continue; // loader opted out
      if (tmpResult === false) {
        finalResult = { source: '' };
        break;
      }

      if (tmpResult?.format != null) format = tmpResult.format;
      if (tmpResult?.source != null) source = tmpResult.source;
    }

    // neither short-circuited nor skipped: use the accumulated result
    finalResult ??= { format, source };

    // various existing result checks and error throwing
  }
}
Concerns Raised
Next()
- This creates an Inception-like pattern, which could confuse users: loaders would be specified in a different sequence than they are called, as loaders are called in a nested manner: the final loader calls the previous, and the previous calls its previous, etc, all the way back to the beginning.
- The `next` function does not behave as many current, well-known implementations behave (ex JavaScript's native generator's `next` is the inverse order to this one's, and not calling ExpressJS's route-handler's `next` does not break the chain).
- Requires the user to have specific knowledge: `next` is effectively required (not calling `next` will likely lead to adverse/undesirable behaviour, and in many cases, break in very confusing ways).
- Unit testing is more difficult (requiring spying in almost all cases, whereas `done` needs spying very rarely and, likely, only by more advanced users).
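The unit-testing concern can be illustrated with two toy hooks that do the same job. This is a sketch under assumptions: `doneStyleResolve`, `nextStyleResolve`, and `makeSpy` are hypothetical names, and the hook signatures are simplified from the ones discussed above.

```javascript
// Done-style hook: upgrades http: URLs to https:, otherwise steps aside.
async function doneStyleResolve(interimResult) {
  if (!interimResult.url.startsWith('http:')) return; // step aside
  return { url: interimResult.url.replace(/^http:/, 'https:') };
}

// Next-style hook with the same behaviour: it must call `next` to get a value.
async function nextStyleResolve(specifier, context, next) {
  const result = await next(specifier, context);
  if (!result.url.startsWith('http:')) return result;
  return { url: result.url.replace(/^http:/, 'https:') };
}

// Hand-rolled spy standing in for whatever spying library a test would use.
function makeSpy(impl) {
  const spy = async (...args) => {
    spy.calls.push(args);
    return impl(...args);
  };
  spy.calls = [];
  return spy;
}
```

A test for `doneStyleResolve` only passes an input object and inspects the return value, while a test for `nextStyleResolve` has to supply something like `makeSpy(async (specifier) => ({ url: specifier }))` as `next` and usually also checks that it was called.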
Done()
- ~~This could potentially cause issue for APMs (does the `next` approach also?)~~ After chatting with @bengl, it seems like this is not an issue as V8 exposes what they need.
- ~~A hook that unintentionally does not return / returns nullish might be difficult to track down~~ I believe this was resolved in the previous issue discussion?