-
Notifications
You must be signed in to change notification settings - Fork 29.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add API for dumping or inspecting the consolidated CJS and EJS module graph #52180
Comments
I welcome and approve this proposal, for what it matters, I don’t welcome the type suggestion … it’s not super hard to have a consistent naming convention, and we have a super simple Boolean like value of the thank you. |
amend: by mandatory I mean in the module case, commonjs can be the OR default if module is not matched. |
It kind of depends on what you mean. For example, imported CJS have both a |
I think with the current data structure it is probably only possible to provide Also I guess this isn't necessarily tied to |
I think the important bit of the feature request is setting the stage for introspection or debugging. So from that angle, I think we'd already be adding a lot of value even if the we limit things to Also the reason I was asking about whether a source file could have separate ESM and CommonJS representations is that it would mean that the module identifier alone would be insufficient to uniquely identify that node. If that's the case, the entries in the @WebReflection as for naming things, I have no strong opinions but wanted to get an idea that could be easily understood out there. |
That is not supposed to happen. Though I think there's always the possibilities of bugs (both the module loaders are heavily undertested but it seems the world just have been living with them okay...). |
"when in Rome" I think the type should reflect the parsing goal and its source:
I don’t think exist other use cases, but I agree just module or commonjs would already be a great starting point. edit in case it wasn’t clear, these types are not for the importer, rather for the imported module as graph … to showcase how modules get imported or bootstrapped. |
I think it would make more sense to put the the type of reference in the edges, not in the nodes. So a.cjs (commonjs) ->(require) -> b.mjs (module) or a.cjs (commonjs) -> (import) -> b.mjs (module) can both be represented without duplication. The type name can just follow the "format" enum we have internally for the ESM "translators" (or an extended version of it, because the addons are currently represented by the commonjs format internally). |
Yesterday I updated the example data structure to differentiate top-level and dependency-level I've taken a stab at encoding it as a type definition: type ModuleReferenceImport = "import";
type ModuleReferenceRequire = "require";
type ModuleReferenceKind = ModuleReferenceImport | ModuleReferenceRequire;
interface ModuleReference {
readonly reference_type: ModuleReferenceKind;
readonly specifier: string;
readonly resolved_url: string;
// I think we probably also need this since two otherwise similar imports
// could have different behavior based on the import atributes.
// This could also be a `ReadonlyMap`.
readonly import_attributes?: Readonly<ImportAttributes>;
}
type ModuleTypeESM = "module";
type ModuleTypeCommonJS = "commonjs";
type ModuleTypeJSON = "json";
type ModuleTypeAddon = "addon";
type ModuleTypeBuiltin = "builtin";
type ModuleType =
| ModuleTypeESM
| ModuleTypeCommonJS
| ModuleTypeJSON
| ModuleTypeAddon
| ModuleTypeBuiltin;
interface Module {
readonly module_type: ModuleType;
readonly url: string;
readonly module_requests: ReadonlyArray<ModuleReference>;
} |
The type definition looks great. Some modification I'd suggest:
|
Thanks @joyeecheung I'll edit to get the terminology aligned. |
Somewhat related: with #52219 it should be possible to just implement this in the user land. There might still be value to have some sort of "standardized" format in core. But the universal hooks allows creating a prototype in the user land. |
Hope you don't mind if we keep this open for the time being. I know that a 'universal loader' will be a hotly debated topic 😉. I love the idea of being able to solve this from user-land but I think the spirit of this feature request is not fully addressed by this sort of capability. I'm anticipating a flood of "module XYZ is broken in node 22" style issues. I'm thinking that some standardized, repeatable way to dump the module graph would give maintainers and users alike the ability to communicate on exactly what the graph looks like at a point in time. Having this in user-land might solve that problem but then involves further mutating the graph to include this user-land loader. Another thing is that a user-land solution might not be able to have 100% confidence that what it has seen is the actual module graph in memory. |
I imagine if this gets implemented in core there are two possible ways:
1 would be the unlikely choice if universal module loader hooks are implemented (we probably wouldn't want to insert both the hook code and the graph stashing code everywhere again, it would be a lot cleaner to just insert the hook code everywhere, and use the hooks to stash the graph information, even internally). Note that internally currently there's no such a graph like this maintained in core (neither for ESM, nor for CJS). Node.js dones't even retain the original module specifier once it gets resolved (I don't know why, but the code has always been like this), and finding the parent/children can be difficult/hacky especially for CJS since people monkey-patch those references. If you look at the code you could also find some pretty silly re-computation of information everywhere because a lot of those aren't retained once computed (not sure why, probably to trade performance for memory), and the recomputed information probably isn't 100% consistent either (this is especially problematic for the monkey-patchable |
@nodejs/loaders |
I think this should already be possible with the module hooks that we have today. Define a custom Before anything new gets created, do you mind seeing how far you can get with what already exists? |
There has been no activity on this feature request for 5 months. To help maintain relevant open issues, please add the
never-stale
|
What is the problem this feature will solve?
With the #51977 having landed, we're moving to a world where the logical graph will cross freely to and from CJS and ESM. I anticipate that this will break some stuff in unpredictable and hard-to-debug ways given the long tail of wacky stuff that has made its way into NPM.
These hard-to-debug and hard-to-anticipate situations will be difficult to detect and inspect from user-land code. The module graph is private state held by the runtime and V8. Having a way to extract a read-only view of this will allow tools and users to inspect, share and fix these situations.
What is the feature you are proposing to solve the problem?
I propose an API that would allow the caller to obtain a read-only representation of the module graph across both CJS and ESM nodes.
As a straw man, let's imagine
require('node:module').getModuleGraph()
. It could provide data with providing the following:URL
).ESM
vsCommonJS
).require
,import_dynamic
,import
, .. anything else?It might look like this:
Side-note questions:
What alternatives have you considered?
N/A. This is uncharted territory! 🚀
The text was updated successfully, but these errors were encountered: