module: add module.detectSyntax #57731

Closed
41 changes: 41 additions & 0 deletions doc/api/module.md
@@ -66,6 +66,47 @@ const require = createRequire(import.meta.url);
const siblingModule = require('./sibling-module');
```

### `module.detectSyntax(code)`

<!-- YAML
added: REPLACEME
-->

> Stability: 1.0 - Early development

The `module.detectSyntax()` method attempts to determine the syntax of a given source code string.
Detection relies solely on parsing the source code: the code is not evaluated, and neither file extensions
nor `package.json` files are consulted.
Only erasable TypeScript syntax is supported; code that uses non-erasable TypeScript syntax returns `undefined`.

* `code` {string} The code to detect the syntax of.
* Returns: {string|undefined} The detected syntax of the code. Possible values are:
  * `'commonjs'` for CommonJS module code. This is also the result when the code
    contains neither module syntax nor TypeScript syntax.
  * `'module'` for ECMAScript module code.
  * `'commonjs-typescript'` for CommonJS TypeScript code.
  * `'module-typescript'` for ECMAScript module TypeScript code.
  * `undefined` if the syntax cannot be detected.

```mjs
import { detectSyntax } from 'node:module';
detectSyntax('export const a = 1;'); // 'module'
detectSyntax('const a = 1;'); // 'commonjs'
detectSyntax('export const a: number = 1;'); // 'module-typescript'
detectSyntax('const a: number = 1;'); // 'commonjs-typescript'
detectSyntax('const foo;'); // Invalid syntax, returns undefined
```
Review comment:

Maybe this should instead return two booleans where one indicates if the code contained module specific syntax and the other whether it contained typescript syntax? Stating `const a = 1;` is commonjs isn't exactly accurate because that can't be determined solely from the text, as alluded to in the description.
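
The two-boolean shape suggested in the comment above could look roughly like the sketch below. `toSyntaxFlags` and its property names are hypothetical and not part of this PR; the helper only re-expresses the string labels proposed in the documentation as two independent flags.

```javascript
// Hypothetical helper (not part of the PR): converts a label such as
// 'module-typescript' into two independent flags.
function toSyntaxFlags(label) {
  // `undefined` means the syntax could not be detected at all.
  if (label === undefined) return undefined;
  const [moduleKind, dialect] = label.split('-');
  return {
    hasModuleSyntax: moduleKind === 'module',
    hasTypeScriptSyntax: dialect === 'typescript',
  };
}

console.log(toSyntaxFlags('commonjs'));
console.log(toSyntaxFlags('module-typescript'));
```

With this shape, `const a = 1;` would be reported as "no module syntax, no TypeScript syntax" rather than as positively CommonJS, which is the distinction the comment is asking for.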


There are some limitations to the detection: in some cases the syntax is ambiguous and cannot be determined reliably:

```mjs
import { detectSyntax } from 'node:module';
// This could be CommonJS if we interpret < and > as the less-than and
// greater-than operators, or it could be TypeScript if we interpret them
// as delimiting type arguments.
// In this case, it will be interpreted as TypeScript.
detectSyntax('foo<Object>("bar")'); // 'commonjs-typescript'
```

Review thread on the line `// In this case, it will be interpreted as TypeScript.`:

Member:

this seems potentially problematic, since TS isn't a syntactic superset of JS. Perhaps in this case it could return an array of two choices?

@marco-ippolito (Member, Author), Apr 2, 2025:

it's an edge case, how do you detect it? 😄 I guess it has to be clear it's a heuristic, it trades some correctness for convenience

Member:

if it can be parsed as both, then it's the edge case.

since this function is intended to be used by libraries and not by most users, it seems like correctness is the better tradeoff to make here. Doing `result?.[0] ?? result` isn't that inconvenient, but allows the choice and visibility into the edge cases.

Member, Author:

the only way I see to disambiguate the syntax is to compileScript and see if it works, and attempt to strip types and if the output is different, it means it's an edge case.

Member:

Sure, that seems fine to me.
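
The disambiguation idea from the thread above (compile the source as JavaScript, strip the types, and treat "compiles and also changes when stripped" as the ambiguous case) can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `detectCandidates`, `parsesAsCommonJS`, `firstCandidate`, and `toyStripTypes` are hypothetical names, the type stripper is injected by the caller, and only the CommonJS side is covered.

```javascript
// Returns true when `source` parses as the body of a function. CommonJS
// modules are wrapped in a function before they run, so this approximates
// "compiles as CommonJS". The Function constructor only compiles the code;
// nothing executes because the resulting function is never called.
function parsesAsCommonJS(source) {
  try {
    new Function(source);
    return true;
  } catch {
    return false;
  }
}

// `stripTypes` is supplied by the caller (an assumption of this sketch): it
// should return the source with types erased and throw on input it rejects.
function detectCandidates(source, stripTypes) {
  let stripped;
  try {
    stripped = stripTypes(source);
  } catch {
    return undefined; // Invalid or unsupported syntax.
  }
  if (stripped === source) return 'commonjs';
  // Stripping changed the text. If the original also parses as JavaScript,
  // both readings are possible, so report both candidates.
  return parsesAsCommonJS(source) ?
    ['commonjs', 'commonjs-typescript'] :
    'commonjs-typescript';
}

// Picks a single answer when the caller does not care about the edge case.
function firstCandidate(result) {
  return Array.isArray(result) ? result[0] : result;
}

// Toy stand-in for a real type stripper, good enough for these inputs only:
// it blanks out `<Name>` and `: name` with spaces.
const toyStripTypes = (src) =>
  src.replace(/<\w+>|: \w+/g, (match) => ' '.repeat(match.length));

console.log(detectCandidates('foo<Object>("bar")', toyStripTypes));
console.log(detectCandidates('const a: number = 1;', toyStripTypes));
console.log(firstCandidate(detectCandidates('const a = 1;', toyStripTypes)));
```

One detail worth noting for consumers: indexing a string with `[0]` returns its first character, so an expression like `result?.[0] ?? result` yields `'c'` when `result` is the string `'commonjs'`. An explicit `Array.isArray()` check, as in `firstCandidate`, avoids that.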

### `module.findPackageJSON(specifier[, base])`

<!-- YAML
39 changes: 38 additions & 1 deletion lib/internal/modules/typescript.js
@@ -27,7 +27,9 @@ const {
  saveCompileCacheEntry,
  cachedCodeTypes: { kStrippedTypeScript, kTransformedTypeScript, kTransformedTypeScriptWithSourceMaps },
} = internalBinding('modules');

const {
  containsModuleSyntax,
} = internalBinding('contextify');
/**
 * The TypeScript parsing mode, either 'strip-only' or 'transform'.
 * @type {string}
@@ -205,6 +207,40 @@ function stripTypeScriptModuleTypes(source, filename, emitWarning = true) {
  return transpiled;
}

/**
 * Determine whether the given source contains CommonJS,
 * ES module, or TypeScript Syntax.
 * @param {string} source
 * @returns {string | undefined} The syntax type, or undefined if it can't be determined.
 */
function detectSyntax(source) {
  validateString(source, 'source');
  // If the source contains module syntax, it's an ES module.
  // But we don't know if it's TypeScript yet.
  const defaultModule = containsModuleSyntax(source, '') ? 'module' : 'commonjs';
  try {
    // Strip the types, if the output is the same
    // it means there was no TypeScript syntax.
    // This is almost always true, except for some edge cases.
    const stripped = stripTypeScriptTypes(source);
    return stripped === source ? defaultModule : `${defaultModule}-typescript`;
  } catch (error) {
    switch (error?.code) {
      // In case there is unsupported TypeScript syntax,
      // we know it's typescript but cannot determine the syntax.
      case 'ERR_UNSUPPORTED_TYPESCRIPT_SYNTAX':
        return;
      // In this case we can't determine the syntax
      // because the syntax is invalid.
      case 'ERR_INVALID_TYPESCRIPT_SYNTAX':
        return;
      // Probably SWC crashed.
      default:
        throw error;
    }
  }
}

/**
 *
 * @param {string} code The compiled code.
@@ -220,6 +256,7 @@ function addSourceMap(code, sourceMap) {
}

module.exports = {
  detectSyntax,
  stripTypeScriptModuleTypes,
  stripTypeScriptTypes,
};
3 changes: 2 additions & 1 deletion lib/module.js
@@ -19,10 +19,11 @@ const {
const {
  findPackageJSON,
} = require('internal/modules/package_json_reader');
const { stripTypeScriptTypes } = require('internal/modules/typescript');
const { stripTypeScriptTypes, detectSyntax } = require('internal/modules/typescript');

Module.register = register;
Module.constants = constants;
Module.detectSyntax = detectSyntax;
Module.enableCompileCache = enableCompileCache;
Module.findPackageJSON = findPackageJSON;
Module.flushCompileCache = flushCompileCache;
65 changes: 65 additions & 0 deletions test/es-module/test-typescript-detect-syntax.mjs
@@ -0,0 +1,65 @@
import { skip } from '../common/index.mjs';
import { test } from 'node:test';
import { detectSyntax } from 'node:module';
import { strictEqual, throws } from 'node:assert';

if (!process.config.variables.node_use_amaro) skip('Requires Amaro');

test('basic input validation', () => {
  const cases = [{}, undefined, 5, null, false, Symbol('foo'), [], () => { }];

  for (const value of cases) {
    throws(() => detectSyntax(value), {
      name: 'TypeError',
    });
  }
});

test('detectSyntax', () => {
  const cases = [
    { source: `const x = require('fs');`, expected: 'commonjs' },
    { source: `import fs from 'fs';`, expected: 'module' },
    { source: `const x: number = 5;`, expected: 'commonjs-typescript' },
    { source: `interface User { name: string; }`, expected: 'commonjs-typescript' },
    { source: `import fs from 'fs'; const x: number = 5;`, expected: 'module-typescript' },
    { source: `const x: number = 5;`, expected: 'commonjs-typescript' },
    { source: `function add(a: number, b: number) { return a + b; }`, expected: 'commonjs-typescript' },
    { source: `type X = unknown; function fail(): X { return 5; }`, expected: 'commonjs-typescript' },
    { source: `const x: never = 5;`, expected: 'commonjs-typescript' },
    { source: `import foo from "bar"; const foo: string = "bar";`, expected: 'module-typescript' },
    { source: `const foo: string = "bar";`, expected: 'commonjs-typescript' },
    { source: `import foo from "bar";`, expected: 'module' },
    { source: `const foo = "bar";`, expected: 'commonjs' },
    { source: `module.exports = {};`, expected: 'commonjs' },
    { source: `exports.foo = 42;`, expected: 'commonjs' },
    { source: `export default function () {};`, expected: 'module' },
    { source: `export const foo = 42;`, expected: 'module' },
    { source: `const x: number = 5;`, expected: 'commonjs-typescript' },
    { source: `interface User { name: string; }`, expected: 'commonjs-typescript' },
    { source: `import fs from 'fs'; const x: number = 5;`, expected: 'module-typescript' },
    { source: `type X = unknown; function fail(): X { return 5; }`, expected: 'commonjs-typescript' },
    { source: `foo<Object>("bar");`, expected: 'commonjs-typescript' },
    { source: `foo<T, U>(arg);`, expected: 'commonjs-typescript' },
    { source: `<div>hello</div>;`, expected: undefined },
    { source: `const el = <span>{foo}</span>;`, expected: undefined },
    { source: `/** @type {string} */ let foo;`, expected: 'commonjs' },
    { source: `/** @param {number} x */ function square(x) { return x * x; }`, expected: 'commonjs' },
    { source: `function foo(x: number): string { return x.toString(); }`, expected: 'commonjs-typescript' },
    // Decorators are ignored by the TypeScript parser
    { source: `@Component class MyComponent {}`, expected: 'commonjs' },
    { source: `const x: never = 5;`, expected: 'commonjs-typescript' },
    { source: `import type { Foo } from './types';`, expected: 'module-typescript' },
    { source: `import { type Foo } from './types';`, expected: 'module-typescript' },
    { source: '', expected: 'commonjs' },
    { source: ' ', expected: 'commonjs' },
    { source: '\n\n', expected: 'commonjs' },
    { source: `const foo;`, expected: undefined },
    // // This is an edge case where the parser detects syntax wrong
    { source: `fetch<Object>("boo")`, expected: 'commonjs-typescript' },
    { source: `import x = require('x')`, expected: undefined },
  ];

  for (const { source, expected } of cases) {
    strictEqual(detectSyntax(source), expected);
  }
});
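
As a rough illustration of how a library might consume the four labels exercised by the test above, the sketch below maps each label to a conventional file extension. `extensionFor` is a hypothetical helper and not part of this PR; the mapping to `.cjs`, `.mjs`, `.cts`, and `.mts` is an assumption about what a consumer could want, and `undefined` is passed through so the caller can decide how to handle undetectable input.

```javascript
// Hypothetical consumer-side helper (not part of the PR): picks a file
// extension for a label of the shape returned by the proposed detectSyntax().
const EXTENSIONS = new Map([
  ['commonjs', '.cjs'],
  ['module', '.mjs'],
  ['commonjs-typescript', '.cts'],
  ['module-typescript', '.mts'],
]);

function extensionFor(label) {
  // Unknown labels and `undefined` both yield `undefined`.
  return EXTENSIONS.get(label);
}

console.log(extensionFor('module-typescript'));
console.log(extensionFor(undefined));
```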