feat: 14-language AST support with heritage, call resolution, and dead code improvements #78
Merged
RaghavChamadiya merged 10 commits into main on Apr 13, 2026
Replace FTS-only file retrieval with a 3-signal ranking system:
- Symbol name match (weight 2.0): most precise
- File path match (weight 1.5): catches path-based searches
- FTS on wiki content (weight 1.0): broadest, lowest priority

Files are ranked by signal score, then PageRank, and the top 3 are returned. Remove git signals (HOTSPOT, bus-factor, owner) from enrichment; that info belongs in get_risk, not every search. Remove Bash command interception (fragile regex on grep/rg commands). Keep: symbols (3), importers (3), dependencies (2) per file.
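The ranking described in this commit could look roughly like the sketch below. Only the three weights, the PageRank tie-break, and the top-3 cutoff come from the commit message; the `Candidate` type, the `rank_files` name, and the choice to sum the weights of all matching signals are assumptions made for illustration.

```python
from dataclasses import dataclass

# Weights come from the commit message; the names are hypothetical.
WEIGHTS = {"symbol": 2.0, "path": 1.5, "fts": 1.0}


@dataclass(frozen=True)
class Candidate:
    path: str
    pagerank: float
    signals: frozenset  # which of "symbol", "path", "fts" matched the query


def rank_files(candidates, limit=3):
    """Sort by summed signal weight, break ties by PageRank, keep `limit` files."""
    def sort_key(candidate):
        # Assumption: a file matching several signals gets the sum of their weights.
        score = sum(WEIGHTS[name] for name in candidate.signals)
        return (score, candidate.pagerank)

    return sorted(candidates, key=sort_key, reverse=True)[:limit]
```

With this shape, a file that matches on both symbol name and wiki content outranks one that matches on symbol name alone, and PageRank only matters between files with equal signal scores.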
Create a single LanguageRegistry with 42 LanguageSpec entries as the source of truth for all language identity data. Migrate 14 consumer files to derive their constants from the registry, eliminating widespread duplication. Delete stale packages/core/queries/ directory.
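As a rough illustration of the "single source of truth" idea, the sketch below derives an extension lookup from registry entries instead of maintaining it by hand. The class names `LanguageRegistry` and `LanguageSpec` come from the commit message; the fields, methods, and the two sample entries are hypothetical and much smaller than the real 42-entry registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageSpec:
    # Hypothetical subset of the identity data a spec might carry.
    name: str
    extensions: tuple = ()
    manifests: tuple = ()


class LanguageRegistry:
    def __init__(self, specs):
        self._specs = {spec.name: spec for spec in specs}

    def get(self, name):
        return self._specs[name]

    def extension_map(self):
        """Derive the extension -> language lookup from the specs."""
        return {
            ext: spec.name
            for spec in self._specs.values()
            for ext in spec.extensions
        }


REGISTRY = LanguageRegistry([
    LanguageSpec("python", (".py",), ("pyproject.toml",)),
    LanguageSpec("go", (".go",), ("go.mod",)),
])

# Consumer modules derive their constants rather than redefining them.
EXTENSION_TO_LANGUAGE = REGISTRY.extension_map()
```

A consumer that previously hard-coded its own extension table would import `EXTENSION_TO_LANGUAGE` (or call the registry directly), so adding a language means touching one spec.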
…py (Phase 2)

Extract per-language logic into dedicated packages:
- extractors/: visibility, signatures, docstrings, bindings, heritage
- resolvers/: Python, TS/JS, Go, Rust, C/C++, generic stem fallback
- framework_edges.py: Django, FastAPI, Flask, pytest conftest detection

parser.py drops from 1,806 to 796 lines (pure orchestration). graph.py drops from 1,286 to 646 lines. Delete dead parsers/ stubs.
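One way to picture the resolvers/ split is a dispatch table keyed by language, with the generic stem fallback used when no dedicated resolver exists. This is a minimal sketch under that assumption; the function names, the `RESOLVERS` table, and the matching rules are hypothetical and not taken from the repository.

```python
from pathlib import PurePosixPath


def resolve_python(module, files):
    """Hypothetical Python resolver: dotted module name -> file path."""
    base = module.replace(".", "/")
    for candidate in (f"{base}.py", f"{base}/__init__.py"):
        if candidate in files:
            return candidate
    return None


def resolve_by_stem(module, files):
    """Generic fallback: match the last import segment against file stems."""
    # Assumption: both "." and "/" separate segments in the import string.
    stem = module.replace(".", "/").rsplit("/", 1)[-1]
    matches = sorted(f for f in files if PurePosixPath(f).stem == stem)
    # Only resolve when the stem is unambiguous.
    return matches[0] if len(matches) == 1 else None


RESOLVERS = {"python": resolve_python}


def resolve_import(language, module, files):
    """Use the language's resolver if one is registered, else the stem fallback."""
    resolver = RESOLVERS.get(language, resolve_by_stem)
    return resolver(module, files)
```

The orchestration code then only needs `resolve_import`, which is the kind of separation that lets a file like graph.py shrink to orchestration.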
Update Adding a New Language guide to reflect modular architecture (extractors/, resolvers/ instead of inline in parser.py/graph.py). Add architecture section and updated roadmap. Create Phase 3 handoff doc covering remaining language work: hardening C++/C, wiring Kotlin/Ruby/C#, adding Swift/Scala/PHP.
Complete language pipeline for Kotlin, Ruby, C#, Swift, Scala, and PHP with tree-sitter grammars, .scm queries, LanguageConfig entries, per-language extractors (bindings, docstrings, visibility, heritage), and dedicated import resolvers. Harden C++ with binding extraction and Doxygen docstrings, add call captures to C. Brings total AST-supported languages to 14 (7 Full + 7 Good tier).

- Add 6 grammar dependencies (tree-sitter-kotlin/ruby/c-sharp/swift/scala/php)
- Create .scm query files for C#, Swift, Scala, PHP; extend Kotlin, Ruby, C
- Add LanguageConfig entries for all 8 languages in parser.py
- Add per-language visibility functions (kotlin, csharp, swift, scala, php)
- Add binding extractors for all 8 languages
- Add docstring extractors (KDoc, RDoc, XML doc, Swift doc, ScalaDoc, PHPDoc, Doxygen)
- Add heritage extractors for Swift, Scala, PHP
- Create dedicated resolvers for Kotlin, Ruby, C#, Swift, Scala, PHP
- Add 37 new parser tests with fixtures for all 6 languages
- Update registry specs with grammar_package and heritage_node_types
- Update README.md and LANGUAGE_SUPPORT.md documentation
…n interfaces

- Fix _detect_unused_exports to read symbol nodes via DEFINES edges instead of non-existent 'symbols' attribute on file nodes
- Add fallback PHP method_declaration pattern without visibility_modifier so methods defaulting to public are captured
- Add refine_kotlin_class_kind() to distinguish interface/enum from regular class in Kotlin class_declaration nodes
- Update test helper _build_graph to create proper symbol nodes
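The _detect_unused_exports fix can be sketched as walking file-to-symbol DEFINES edges rather than reading a 'symbols' attribute off the file node. The sketch below uses plain dicts and edge triples instead of the project's real graph type; the node attributes, the "CALLS" edge label, and the rule that any non-DEFINES incoming edge counts as a use are all assumptions.

```python
def detect_unused_exports(nodes, edges):
    """Return public symbols that have no incoming edge other than DEFINES.

    nodes: {node_id: {"kind": "file" | "symbol", "visibility": str}}
    edges: iterable of (source, target, edge_type) triples
    """
    edges = list(edges)
    # Assumption: any incoming non-DEFINES edge (call, import, ...) is a use.
    referenced = {dst for _, dst, kind in edges if kind != "DEFINES"}
    unused = []
    for src, dst, kind in edges:
        # Reach symbols through DEFINES edges, not a file-node attribute.
        if kind != "DEFINES" or nodes[src]["kind"] != "file":
            continue
        if nodes[dst].get("visibility") == "public" and dst not in referenced:
            unused.append(dst)
    return sorted(unused)
```

The same traversal explains the test-helper change in this commit: a graph fixture has to contain real symbol nodes and DEFINES edges, or the detector has nothing to iterate over.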
…, PHP traits

- Ruby: extract include/extend/prepend from class body as mixin relations
- Rust: extract #[derive(Trait)] from struct/enum attribute items
- Swift: add extension conformance capture (user_type pattern in .scm)
- PHP: extract use TraitName; from class declaration_list
- Add struct_item/enum_item to Rust heritage_node_types
- Add 'derive' to valid heritage kinds in integration tests
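To make the 'derive' heritage kind concrete, here is a small sketch that turns `#[derive(...)]` attributes into (type, trait, kind) relations. The PR works on tree-sitter AST nodes (struct_item/enum_item); this example swaps in a regular expression purely to stay self-contained, and the function name and tuple layout are hypothetical. It handles one derive attribute per item and would miss a second one.

```python
import re

DERIVE_RE = re.compile(
    r"#\[derive\(([^)]*)\)\]\s*"   # derive attribute and its trait list
    r"(?:#\[[^\]]*\]\s*)*"         # any further attributes on the item
    r"(?:pub(?:\([^)]*\))?\s+)?"   # optional visibility
    r"(?:struct|enum)\s+(\w+)"     # the item receiving the traits
)


def extract_derives(source):
    """Return (type_name, trait, "derive") triples found in Rust source text."""
    relations = []
    for match in DERIVE_RE.finditer(source):
        traits, type_name = match.group(1), match.group(2)
        for trait in traits.split(","):
            trait = trait.strip()
            if trait:
                relations.append((type_name, trait, "derive"))
    return relations
```

Each triple would become one heritage edge in the graph, alongside the extends/implements style relations the other extractors emit.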
…ternals, module-level calls

- Add PHP require/require_once/include/include_once as import captures
- Extend dynamic import detection to JS/TS/Java/Kotlin/Ruby/PHP/Go
- Implement _detect_unused_internals for private symbols with no callers
- Add synthetic __module__ symbol per file for module-level call resolution
- Update call_resolver to assign orphan calls to __module__ symbol
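The synthetic __module__ symbol gives calls made at file top level a caller to hang off. A minimal sketch of that assignment follows, assuming symbols and calls are known by line range within one file; the function name, the tuple shapes, and the "innermost symbol wins" rule are illustrative assumptions rather than the call_resolver's actual logic.

```python
MODULE_SYMBOL = "__module__"


def assign_callers(symbols, calls):
    """Attach each call to its enclosing symbol, or to __module__ if none.

    symbols: list of (name, start_line, end_line) for one file
    calls:   list of (callee_name, line) for the same file
    """
    edges = []
    for callee, line in calls:
        enclosing = [s for s in symbols if s[1] <= line <= s[2]]
        if enclosing:
            # Assumption: the innermost (shortest) enclosing symbol is the caller.
            caller = min(enclosing, key=lambda s: s[2] - s[1])[0]
        else:
            # Orphan call at module level goes to the synthetic symbol.
            caller = MODULE_SYMBOL
        edges.append((caller, callee))
    return edges
```

Without the synthetic symbol, a function invoked only from top-level script code would appear to have no callers, which matters once _detect_unused_internals starts flagging private symbols with zero incoming call edges.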
…bsolete planning docs

Update README, LANGUAGE_SUPPORT.md, ARCHITECTURE.md, and website docs to reflect 14 AST-supported languages (7 Full + 7 Good tier) with heritage extraction improvements. Remove obsolete planning and handoff docs.
swati510 approved these changes on Apr 13, 2026
Summary
Expands repowise from 7 to 14 languages with full AST support (Python, TypeScript, JavaScript, Java, Go, Rust, C++, C, Kotlin, Ruby, C#, Swift, Scala, PHP) through a 3-phase language pipeline overhaul, plus P0/P1 bug fixes.
Phase 1: Centralize language config
- LanguageRegistry with 42 LanguageSpec entries for identity data (extensions, entry points, manifests, builtins, heritage node types)
Phase 2: Modularize extractors and resolvers
- extractors/ (bindings, heritage, visibility, docstrings, signatures) and resolvers/ (one per language)
Phase 3: Add 6 new languages + harden C/C++
- .scm tree-sitter queries, LanguageConfig entries, extractors, resolvers, and test fixtures for Kotlin, Ruby, C#, Swift, Scala, PHP
- compile_commands.json resolution
P0 Bug fixes
- _detect_unused_exports: was reading symbols from file node attributes instead of graph successor iteration
- refine_kotlin_class_kind()
P1 Improvements
- … include/extend/prepend), Rust #[derive()], Swift extension conformance, PHP use TraitName
- require/include/require_once/include_once import captures
- _detect_unused_internals for private/internal symbols with zero incoming call edges
- __module__ symbol for module-level call resolution
Docs
Test plan
- derive kind
- __module__ nodes