A code intelligence infrastructure, built on a canonical multi-language IR and a living semantic graph. One graph, several daemons (LSP, MCP stdio + HTTP/SSE), all your tools plugged into it. ~100 tokens per agent query instead of 30k of grep + read. Local, derived from source code, open-source.
Built for what breaks at scale and turns unmanageable in 6 months, not for the 2-minute demo. Code understanding is a system, not a string of greps.
📖 English · Français
About · Quickstart · Roadmap · Comparison · FAQ · Support · Changelog
Standardoc indexes your code into a living semantic graph:
- Direct AST, multi-language (Rust, TypeScript & JavaScript with React/JSX/TSX, Vue, Svelte, Lua today)
- Unified canonical IR: node types + typed edges shared cross-language (`CALLS`, `IMPORTS`, `EXTENDS`, `IMPLEMENTS`, `REFERENCES`, `DEFINES`, `USES_TYPE`, `EXPOSES_API`), with structured attributes on some edges
- SQLite + FTS5, filesystem watcher, BLAKE3 invalidation, versioned schema
- Derived from the code (not another source to maintain on the side), reproducible on any machine in seconds
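To make the "typed edges shared cross-language" idea concrete, here is a minimal sketch in Rust. Only the eight edge kinds come from the list above; the `Node`, `Edge`, and `Graph` shapes, the `incoming` helper, and the sample symbol names are assumptions made for illustration, not Standardoc's actual schema.

```rust
use std::collections::HashMap;

/// The eight edge kinds named in this README.
#[allow(dead_code)] // not every kind is used in this small demo
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum EdgeKind {
    Calls,
    Imports,
    Extends,
    Implements,
    References,
    Defines,
    UsesType,
    ExposesApi,
}

/// Hypothetical node shape: one symbol, independent of its source language.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    fqn: String,      // qualified name, also used as the map key
    language: String, // language the symbol was parsed from
}

/// Hypothetical typed edge between two nodes, referenced by name.
#[derive(Debug, Clone, PartialEq)]
struct Edge {
    from: String,
    to: String,
    kind: EdgeKind,
}

#[derive(Default)]
struct Graph {
    nodes: HashMap<String, Node>,
    edges: Vec<Edge>,
}

impl Graph {
    fn add_node(&mut self, fqn: &str, language: &str) {
        let node = Node { fqn: fqn.to_string(), language: language.to_string() };
        self.nodes.insert(fqn.to_string(), node);
    }

    fn add_edge(&mut self, from: &str, to: &str, kind: EdgeKind) {
        self.edges.push(Edge { from: from.to_string(), to: to.to_string(), kind });
    }

    /// Names of all nodes with an edge of `kind` pointing at `target`, sorted.
    fn incoming(&self, target: &str, kind: EdgeKind) -> Vec<&str> {
        let mut out: Vec<&str> = self
            .edges
            .iter()
            .filter(|e| e.kind == kind && e.to == target)
            .map(|e| e.from.as_str())
            .collect();
        out.sort();
        out
    }
}

fn main() {
    let mut g = Graph::default();
    g.add_node("app::parse", "rust");
    g.add_node("ui/App.tsx::render", "typescript");
    g.add_node("core::tokenize", "rust");
    g.add_edge("app::parse", "core::tokenize", EdgeKind::Calls);
    g.add_edge("ui/App.tsx::render", "core::tokenize", EdgeKind::Calls);
    g.add_edge("app::parse", "core::tokenize", EdgeKind::Imports);

    let n = &g.nodes["core::tokenize"];
    println!("{} ({})", n.fqn, n.language);
    // Callers from two different languages come back from the same query.
    println!("{:?}", g.incoming("core::tokenize", EdgeKind::Calls));
}
```

The point of the sketch is that "who calls this?" is one filter over typed edges, whatever language each caller was written in.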
Several surfaces consume this state:
- LSP daemon (`standardoc lsp`, stdio, primary writer of the graph): the official VSCode extension embeds it; any LSP client can connect (IntelliJ, Neovim, Helix, Emacs eglot, …) by pointing at the binary
- MCP daemon (`standardoc mcp`, stdio or HTTP/SSE multi-client, readonly): 16 tools for Claude Code, Cursor, Continue, Cody, Aider, Goose, and any MCP client
- RAG layer (`.standardoc/rag.db`, linked to the graph by FQDN): prose chunks (`README.md`, `docs/`, `notes/`, ABOUT, etc.) re-ranked through a Candle/BGE-small embedder, reachable from both daemons (via the `fetch_chunks` MCP tool or the `chunk_refs` of `get_context`)
- Sessions DB (`.standardoc-sessions/sessions.db`, orthogonal to the graph): persistent agent memos across chats, accessed through the `session_*` MCP tools. Human content, not derived from code
- Coming: static docs generated from the graph (`@standardoc/react`, Nextra/Docusaurus/Astro adapters), visual navigation, language plugins via UST + Lua
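For readers who have not seen what an MCP client sends over stdio, here is a rough sketch of a JSON-RPC 2.0 `tools/call` request, the message shape the Model Context Protocol uses to invoke a tool. The tool name `fetch_chunks` comes from the list above; the `query` argument name and the helper functions are assumptions, since the tool's real argument schema is not shown in this README. In practice your MCP client builds this message for you.

```rust
/// Escape a string for embedding in a JSON string literal
/// (quotes, backslashes, and control characters only).
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a `tools/call` request as a single line of JSON.
/// `arg_name` is hypothetical: the real argument schema may differ.
fn tools_call_request(id: u64, tool: &str, arg_name: &str, arg_value: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"{}","arguments":{{"{}":"{}"}}}}}}"#,
        id,
        json_escape(tool),
        json_escape(arg_name),
        json_escape(arg_value)
    )
}

fn main() {
    // One request per line is how the stdio transport frames messages.
    let line = tools_call_request(1, "fetch_chunks", "query", "how does invalidation work");
    println!("{}", line);
}
```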
The result: your tools stop re-parsing your code each on their own side. The graph is the shared asset. ~100 tokens per agent query instead of 30k of grep + read.
Standardoc optimizes for the questions you ask after 6 months on a monorepo, not for the 2-minute demo:
- What stays stable despite the changes? → canonical IR (languages mutate, the IR doesn't)
- Which choices become irreversible? → open-source FSL-1.1-MIT that becomes plain MIT on April 26, 2028 (no SaaS lock-in, no retroactive change of terms possible)
- What creates cognitive debt? → a shared graph (N tools re-parsing your code = N points of desync)
- What breaks at scale? → direct AST (no regex or heuristics that rot fast)
- What becomes incomprehensible in 6 months? → MCP-first guardrail (an agent that greps 30k tokens on every task is neither comprehensible nor debuggable)
Code understanding is a system, not a string of greps. Standardoc is the infrastructure for that system.
→ Details in `.important/en/storytelling/`: philosophy, short/mid/long-term vision, dogfood observations, test feedback.
VSCode extension (recommended: embeds the daemon, generates the AI skill, writes `.mcp.json`):
Search for "Standardoc" in the Extensions panel, or grab the `.vsix` from the releases.
Standalone CLI (without VSCode):

    cargo install --git https://github.com/miralabs-tech/standardoc standardoc-cli
    standardoc --version

→ Full 5-minute walkthrough (QUICKSTART)
Standardoc is built for large, complex codebases — designed by dogfooding on Standardoc itself and calibrated for projects of the same caliber: compilers, programming languages, engines (game / runtime / db), heavy application monorepos, multi-team infra. Not for the weekend JS app — it'll still work, but that's not where the value is strongest.
The core problem it solves: keeping the collaboration with an AI agent stable, controlled, and drift-free on a codebase that keeps evolving. Nobody else tackles this head-on today: most tools stop at "give it the context of one session", not "hold coherence over 6 months".
AI agents drift: they forget context from one session to the next, re-grep what they could have queried, invent code that looks like yours without respecting your invariants, and forget the decision that was locked last week. On every task the archaeology starts over, and the bigger the project gets, the more that archaeology costs (tokens, patience, subtle bugs, human cognitive debt).
Standardoc addresses this from 3 complementary angles:
- The graph: the agent queries the real structure (FQDN, edges, body, RAG prose), it doesn't invent
- The discipline: the `MCP-first guardrail` stops the agent from shortcutting to `grep + read`; the PreToolUse hook forces it through the graph before anything else
- The memory: locked decisions from one session survive in the sessions DB (`session_save` / `session_get`); the agent recovers the context at the next session instead of rediscovering everything
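The "graph before anything else" rule can be pictured as a small state machine. The sketch below is an assumption about the policy's shape, not the actual hook shipped with Standardoc: the `Tool` categories, the `Guardrail` type, and the rule that a single graph query unlocks the other tools are all hypothetical.

```rust
/// Hypothetical categories of tools an agent may try to use.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tool {
    GraphQuery, // stands for any Standardoc MCP tool
    Grep,
    Read,
    Edit,
}

#[derive(Debug, PartialEq, Eq)]
enum Decision {
    Allow,
    Deny(&'static str),
}

/// Tracks whether the agent has consulted the graph during the current task.
#[derive(Default)]
struct Guardrail {
    graph_queried: bool,
}

impl Guardrail {
    /// Called before every tool use, in the spirit of a PreToolUse hook.
    fn pre_tool_use(&mut self, tool: Tool) -> Decision {
        match tool {
            Tool::GraphQuery => {
                self.graph_queried = true;
                Decision::Allow
            }
            // Assumed policy: nothing else runs until the graph was queried.
            _ if !self.graph_queried => Decision::Deny("query the graph first"),
            _ => Decision::Allow,
        }
    }
}

fn main() {
    let mut guard = Guardrail::default();
    for tool in [Tool::Grep, Tool::GraphQuery, Tool::Read, Tool::Edit] {
        match guard.pre_tool_use(tool) {
            Decision::Allow => println!("{:?}: allowed", tool),
            Decision::Deny(reason) => println!("{:?}: denied ({})", tool, reason),
        }
    }
}
```

In this sketch the first `Grep` is denied, and everything after the `GraphQuery` goes through.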
Standardoc is a tool for AI-dev co-work, not a substitute for the dev. An agent querying a stable semantic graph is powerful, but a dev who doesn't understand their own code will stay frustrated no matter which AI is behind it.
Standardoc is a StandarX project — an organization building open-source tools focused on code intelligence, infrastructure, and AI agents.
If Standardoc saves you time:
Star the repo · Support StandarX on OpenCollective · Other ways
FSL-1.1-MIT — Functional Source License v1.1 with automatic conversion to plain MIT. Each release becomes MIT 2 years after its release date; the first release converts on April 26, 2028. Free for any non-competing use today, fully MIT from those dates on.