fix(extract): four TS/JS extractor gaps — generators, namespace containers, decorators, import-equals #1615
Closed
papinto wants to merge 4 commits into
Generator functions were invisible to the graph. The declaration form
`function* g()` parses as `generator_function_declaration`, which was
absent from the JS/TS `function_types`, so it produced no node; the
expression form `const h = function*(){}` parses as `generator_function`,
which was absent from the JS function-value types, so it was never captured
when assigned to a module-level const. Generator *methods* (`*gen()` in a
class) were already covered — they parse as `method_definition`.
Add `generator_function_declaration` to the JS and TS `function_types` (so
it emits a node and its body is walked) and to `function_boundary_types`
(so its calls are scoped to it, parity with `function_declaration`); add
`generator_function` to `_JS_FUNCTION_VALUE_TYPES` (so the const-assigned
expression form is captured like `function_expression`).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
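The effect of the two set additions can be sketched as follows. This is a minimal illustration, not the extractor's code: `Node` is a hypothetical stand-in for a parse-tree node (not the tree-sitter API), and the other members of the two sets are assumed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a parse-tree node; not the tree-sitter API."""
    type: str
    name: str = ""
    children: list = field(default_factory=list)


# Set names follow the commit message; members other than the two
# generator kinds are assumptions made for this sketch.
FUNCTION_TYPES = {"function_declaration", "generator_function_declaration"}
FUNCTION_VALUE_TYPES = {"arrow_function", "function_expression", "generator_function"}


def collect_function_names(root: Node) -> list:
    """Return names of module-level functions, including const-assigned ones."""
    names = []
    for child in root.children:
        if child.type in FUNCTION_TYPES:
            names.append(child.name)
        elif child.type == "lexical_declaration":
            for decl in child.children:
                is_function_value = any(
                    c.type in FUNCTION_VALUE_TYPES for c in decl.children
                )
                if decl.type == "variable_declarator" and is_function_value:
                    names.append(decl.name)
    return names


# Assumed tree shape for: function* g() {}  and  const h = function*() {}
program = Node("program", children=[
    Node("generator_function_declaration", "g"),
    Node("lexical_declaration", children=[
        Node("variable_declarator", "h", [Node("generator_function")]),
    ]),
])
print(collect_function_names(program))
```

With either generator kind removed from its set, the corresponding name drops out of the result, which is the gap the commit describes.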
`namespace Foo {}` parses as `internal_module` and `module Bar {}` (and
ambient `declare module "pkg" {}`) as a named `module` node. Neither kind
was in `class_types`/`function_types` nor handled by an extra-walk, so the
container produced no node — its members were still reached by the default
recurse, but the namespace/module itself was invisible to the graph and its
members lost their namespace context.
Add `_ts_extra_walk`, dispatched for TypeScript after `_js_extra_walk`,
mirroring `_csharp_extra_walk`: it emits a container node + a file→container
`contains` edge and recurses the body, leaving members file-contained as
before. `internal_module` exposes `name`/`body` fields; `module` exposes
none, so name (identifier / nested_identifier / quote-stripped string) and
body (`statement_block`) are found positionally. The `is_named` guard skips
the anonymous `module` keyword token, which shares the `module` type string.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
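A rough sketch of the container walk described above, under stated assumptions: `Node` is a hypothetical stand-in rather than the tree-sitter API, the tree shapes are simplified, and for brevity the name is found positionally for both node kinds (the commit uses the `name`/`body` fields for `internal_module`).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for a parse-tree node; not the tree-sitter API."""
    type: str
    text: str = ""
    is_named: bool = True
    children: list = field(default_factory=list)


CONTAINER_TYPES = {"internal_module", "module"}
NAME_TYPES = {"identifier", "nested_identifier", "string"}


def container_name(node: Node) -> Optional[str]:
    """Return the namespace/module name, or None if node is not a container."""
    # The is_named check skips the anonymous `module` keyword token,
    # which shares the `module` type string with the container node.
    if node.type not in CONTAINER_TYPES or not node.is_named:
        return None
    for child in node.children:
        if child.type in NAME_TYPES:
            return child.text.strip("\"'")  # quote-strip string names
    return None


def walk(node: Node, file_id: str, edges: list) -> None:
    """Emit a file -> container `contains` edge, then recurse into children."""
    name = container_name(node)
    if name is not None:
        edges.append((file_id, "contains", name))
    for child in node.children:
        walk(child, file_id, edges)


# Assumed shapes for: namespace Foo {}  and  declare module "pkg" {}
tree = Node("program", children=[
    Node("internal_module", children=[
        Node("namespace", "namespace", is_named=False),
        Node("identifier", "Foo"),
        Node("statement_block"),
    ]),
    Node("module", children=[
        Node("module", "module", is_named=False),
        Node("string", '"pkg"'),
        Node("statement_block"),
    ]),
])
edges: list = []
walk(tree, "a.ts", edges)
print(edges)
```

Without the `is_named` check, the keyword token inside the second container would itself be treated as a container candidate, since both carry the type `module`.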
`@Component`, `@Injectable`, `@Input`, `@Inject`, `@Entity`, … produced no
edge — the `decorator` node kind was never walked. This is framework-critical
(Angular, NestJS, Vue class components, TypeORM): the decorators are the
primary signal of what a class is and does.
Decorators occur only on classes, class members, and parameters, so one pass
over each class declaration covers them. `_ts_emit_decorator_edges` emits a
`references` edge (context="decorator") from the decorated entity to the
decorator symbol:
- class decorators -> the class. Handles both `@Deco class C` (decorator is
a child of the class) and `@Deco export class C` (decorator sits on the
wrapping export_statement), plus stacked decorators.
- method decorators -> the method node. They are siblings preceding the
`method_definition`; stacked decorators are skipped past to find it.
- field / accessor decorators -> the class (the field is not a graph node).
- parameter decorators (`@Inject(T)`) -> the enclosing method/constructor.
The symbol is the head identifier: `@Injectable`, the `function` of
`@Component({...})`, or the `property` of `@ns.Component()`. Targets go
through `ensure_named_node`, so a decorator defined outside the corpus
becomes a sourceless stub, consistent with type references — one per
referencing file, matching the cross-file stub disambiguation introduced
with full-path node IDs in 0.9.0.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
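The head-identifier rule can be illustrated with a small sketch. It is not the body of `_ts_emit_decorator_edges`: `Node` and its `fields` dict are hypothetical stand-ins for a parse-tree node, and the tree shapes are simplified assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for a parse-tree node; not the tree-sitter API."""
    type: str
    text: str = ""
    fields: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def decorator_symbol(decorator: Node) -> Optional[str]:
    """Return the head identifier of a decorator, or None if there is none."""
    # The expression is the first child that is not the `@` token.
    expr = next((c for c in decorator.children if c.type != "@"), None)
    # @Component({...}): unwrap the call to its `function`.
    if expr is not None and expr.type == "call_expression":
        expr = expr.fields.get("function")
    # @ns.Component(): unwrap the member access to its `property`.
    if expr is not None and expr.type == "member_expression":
        expr = expr.fields.get("property")
    if expr is not None and expr.type in {"identifier", "property_identifier"}:
        return expr.text
    return None


at = Node("@", "@")
plain = Node("decorator", children=[at, Node("identifier", "Injectable")])
called = Node("decorator", children=[at, Node("call_expression", fields={
    "function": Node("identifier", "Component"),
})])
namespaced = Node("decorator", children=[at, Node("call_expression", fields={
    "function": Node("member_expression", fields={
        "object": Node("identifier", "ns"),
        "property": Node("property_identifier", "Component"),
    }),
})])
print([decorator_symbol(d) for d in (plain, called, namespaced)])
```

Each resolved symbol would then become the target of a `references` edge with context="decorator", with the source chosen by the rules in the list above.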
`import x = require("./m")` produced no edge at all: tree-sitter parses it
as an `import_statement` whose module string sits inside an
`import_require_clause`, not as a direct child of the statement, so the
direct-child string scan in `_import_js` never found it. The file-level
dependency was silently dropped while the equivalent ESM form
(`import * as x from "./m"`) was captured — an invisible hole in the
import graph of TS codebases that interop with CommonJS modules.
Restructure the scan to first locate the module string — a direct `string`
child for ESM imports/re-exports, or the `string` nested inside an
`import_require_clause` for the import-equals form — then emit the
`imports_from` edge from the single shared path. Relative paths, tsconfig
aliases, and bare modules all resolve through the same
`_resolve_js_import_target` as ESM, giving the import-equals form exact
parity with a namespace import: one file-level `imports_from` edge.
Plain JS is unaffected (the grammar has no `import_require_clause`), and
the pure namespace alias form (`import A = B.C`) is out of scope — it has
no module string and models an intra-code alias, not an import.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
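The restructured scan amounts to looking for the module string in two places before taking the shared edge path. A minimal sketch, assuming a hypothetical `Node` stand-in (not the tree-sitter API) and simplified tree shapes; `find_module_string` is an illustrative name, not a function in `_import_js`:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for a parse-tree node; not the tree-sitter API."""
    type: str
    text: str = ""
    children: list = field(default_factory=list)


def find_module_string(stmt: Node) -> Optional[str]:
    """Locate the module specifier of an import statement, if it has one."""
    for child in stmt.children:
        # ESM imports and re-exports: the string is a direct child.
        if child.type == "string":
            return child.text.strip("\"'")
        # Import-equals form: the string is nested one level deeper.
        if child.type == "import_require_clause":
            for inner in child.children:
                if inner.type == "string":
                    return inner.text.strip("\"'")
    return None


# Assumed shape for: import * as x from "./m"
esm = Node("import_statement", children=[
    Node("import_clause"),
    Node("string", '"./m"'),
])
# Assumed shape for: import x = require("./m")
require_form = Node("import_statement", children=[
    Node("import_require_clause", children=[
        Node("identifier", "x"),
        Node("string", '"./m"'),
    ]),
])
# A statement with no module string at all, like the alias form.
no_string = Node("import_statement", children=[Node("identifier", "A")])

print(find_module_string(esm), find_module_string(require_form))
```

Because both forms yield the same specifier, whatever resolves it afterwards (in the commit, `_resolve_js_import_target`) sees no difference between them, and a statement without a module string produces no edge.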
safishamsi added a commit that referenced this pull request on Jul 2, 2026
Collaborator
|
Merged into |
Summary
Four independent gaps in the TS/JS extractor, found while running graphify against a large TypeScript front end. One commit per fix so they can be reviewed — or cherry-picked — individually. Each commit carries its own regression tests, and none of them changes existing behavior for code that was already extracted (the ESM import path is regression-guarded explicitly).
1. Generator functions were not nodes (0bb8895)
The declaration form `function* g()` parses as `generator_function_declaration`, which was absent from the JS/TS `function_types`, so it produced no node; the expression form `const h = function*(){}` parses as `generator_function`, which was absent from the JS function-value types. Generator methods (`*gen()` in a class) were already covered via `method_definition`.
The fix adds `generator_function_declaration` to the JS/TS `function_types` and `function_boundary_types` (so a generator is a node and its body's calls attribute to it, parity with `function_declaration`), and `generator_function` to the JS function-value types (so the const-assigned form is captured like `function_expression`).
2. TS namespace / module containers were not nodes (d4418fc)
`namespace Foo {}` (`internal_module`) and `module Bar {}` / ambient `declare module "pkg" {}` (`module`) were not in any type set and had no extra-walk, so the container produced no node: members were still reached by the default recurse, but the namespace itself was invisible and its members lost their namespace context.
The fix adds `_ts_extra_walk` (dispatched for TypeScript after `_js_extra_walk`, mirroring the C# extra-walk): it emits a container node plus a file→container `contains` edge and recurses the body, leaving members file-contained exactly as before. An `is_named` guard skips the anonymous `module` keyword token, which shares the `module` type string.
3. Decorators produced no edges (71632b2)
`@Component`, `@Injectable`, `@Input`, `@Inject`, `@Entity`, …: the `decorator` node kind was never walked, so decorators produced no edge at all. This is framework-critical signal (Angular, NestJS, Vue class components, TypeORM): decorators are often the primary statement of what a class is and does.
The fix emits a `references` edge (context="decorator") from the decorated entity to the decorator symbol: class decorators → the class (both `@Deco class C` and `@Deco export class C`, plus stacked decorators), method decorators → the method node, field/accessor decorators → the class (the field is not a graph node), parameter decorators (`@Inject(T)`) → the enclosing method/constructor. The symbol is the head identifier (`@Injectable`, the `function` of `@Component({...})`, or the `property` of `@ns.Component()`), routed through `ensure_named_node` so out-of-corpus decorators become sourceless stubs consistent with type references.
4. The TS import-equals form was invisible (31aed9a)
`import x = require("./m")` produced no edge at all: tree-sitter parses it as an `import_statement` whose module string sits inside an `import_require_clause`, not as a direct child of the statement, so the direct-child string scan in `_import_js` never found it, while the equivalent ESM `import * as x from "./m"` was captured.
The fix restructures the scan to locate the module string in either position and emits the `imports_from` edge from the single shared path. Relative paths, tsconfig aliases, and bare modules all resolve through the same `_resolve_js_import_target` as ESM, giving the import-equals form exact parity with a namespace import (verified by a require-vs-ESM parity test). Plain JS is unaffected (its grammar has no `import_require_clause`), and the pure alias form `import A = B.C` is deliberately out of scope: it has no module string and models an intra-code alias, not an import.
Testing
`test_ts_generators.py` (5), `test_ts_namespace.py` (6), `test_ts_decorators.py` (9), `test_ts_import_require.py` (5).
`v8@d89ec68`: `test_languages`, `test_js_import_resolution`, `test_symbol_resolution`, `test_language_resolvers`, `test_ts_inheritance`, `test_vue_extraction` + the four new files: 451 passed, 13 skipped.
🤖 Generated with Claude Code