
Nodal

Local, Rust-based mini vector database for source code with a built-in MCP server.

🚀 Quick start

cargo run --bin vdb -- index . --out .nodal-db/index.bin
cargo run --bin vdb -- search "embedding pipeline" --top-k 5

For incremental updates while editing:

cargo run --bin vdb -- watch . --index .nodal-db/index.bin

Tree-sitter chunking is enabled by default when the feature is available. To disable it:

cargo run --bin vdb -- index . --out .nodal-db/index.bin --use-tree-sitter false

⚙️ Configuration (Nodal.toml)

Nodal looks for Nodal.toml in the current working directory by default. You can override this with --config /path/to/Nodal.toml or by setting NODAL_CONFIG. CLI flags always take precedence over the config file. Relative paths in Nodal.toml are resolved from the file's directory.
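For instance, the lookup order can be exercised like this (the paths are illustrative):

```shell
# Point Nodal at a config file outside the working directory.
NODAL_CONFIG=/path/to/Nodal.toml cargo run --bin vdb -- search "embedding pipeline"

# The --config flag works the same way, and CLI flags still win:
# here --out overrides whatever [index].out says in the config file.
cargo run --bin vdb -- index . --config ./Nodal.toml --out .nodal-db/index.bin
```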

Minimal example:

[index]
root = "."
out = ".nodal-db/index.bin"

[embedder]
embedder = "token-hash"
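A fuller sketch is below; note that every key other than root, out, and embedder is an assumption, guessed from the parameter names of the MCP index tool described later in this README, and may not match the actual schema:

```toml
[index]
root = "."
out = ".nodal-db/index.bin"
# Assumed keys, mirroring the MCP `index` tool's parameters:
extensions = ["rs", "py", "ts"]
chunk_lines = 120
chunk_overlap = 20
min_chunk_chars = 40
use_tree_sitter = true

[embedder]
embedder = "token-hash"
```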

🧠 Embeddings

By default, Nodal uses a Rust-BERT embedder when the rust-bert feature is enabled; otherwise it falls back to the lightweight token-hash embedder (fast, no external model). You can swap in model-based backends via optional features:

Use the same embedder for both indexing and search; dimensions must match.

Fastembed:

cargo run --features fastembed --bin vdb -- index . --out .nodal-db/index.bin --embedder fastembed

Gllm:

cargo run --features gllm --bin vdb -- index . --out .nodal-db/index.bin --embedder gllm --embedding-model bge-small-en

Rust-bert (default remote model or local converted path):

cargo run --features rust-bert --bin vdb -- index . --out .nodal-db/index.bin --embedder rust-bert

TEI server (direct /embed or /v1/embeddings endpoint):

cargo run --features tei --bin vdb -- index . --out .nodal-db/index.bin --embedder tei --embedding-url http://localhost:8080/embed

To pick a specific model, add --embedding-model <MODEL> (fastembed/gllm) or point rust-bert at a local model path. For OpenAI-compatible TEI endpoints, pass --embedding-url http://localhost:8080/v1 and optionally set --embedding-model. If your TEI endpoint requires auth, add --embedding-api-key <TOKEN>.
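Putting the rule above into practice with fastembed, for example, index and search would use the same flags (this sketch assumes the search subcommand accepts the same --embedder selection as index):

```shell
# Build and query with the same backend so vector dimensions match.
cargo run --features fastembed --bin vdb -- index . --out .nodal-db/index.bin --embedder fastembed
cargo run --features fastembed --bin vdb -- search "embedding pipeline" --top-k 5 --embedder fastembed
```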

🧭 MCP server

Start the MCP server (stdio, JSON-RPC) and point your MCP client at it:

cargo run --bin mcp_server -- --index .nodal-db/index.bin

Keep the index up to date while serving:

cargo run --bin mcp_server -- --index .nodal-db/index.bin --watch --root .

Tools exposed:

  • search (query, top_k, min_score, include_text, max_snippet_chars)
  • index (root, out, extensions, max_file_bytes, chunk_lines, chunk_overlap, min_chunk_chars, use_tree_sitter)
  • stats
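As a rough sketch of the wire format, clients invoke these tools over JSON-RPC with a tools/call request. A real MCP client performs the initialize handshake first, so this one-shot pipe is illustrative only:

```shell
# Illustrative only: real MCP clients negotiate `initialize` before `tools/call`.
printf '%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"search","arguments":{"query":"embedding pipeline","top_k":3}}}' \
  | cargo run --bin mcp_server -- --index .nodal-db/index.bin
```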

🧩 Codex CLI integration

Codex reads MCP server config from ~/.codex/config.toml (shared between the CLI and IDE). You can either:

  1. Add via CLI:
codex mcp add nodal -- cargo run --quiet --bin mcp_server -- --index .nodal-db/index.bin
  2. Edit the config directly: see codex.mcp.example.toml and copy it into ~/.codex/config.toml (set cwd to the absolute path of this repo).

⚡ One-step setup script

Run:

./scripts/setup_codex_mcp.sh

This builds the index, then attempts codex mcp add. If Codex isn't installed (or the add fails), it appends the MCP server config to ~/.codex/config.toml.

🧑‍💻 Editor setup

VS Code settings

Workspace settings live in .vscode/settings.json. See the VS Code settings docs for scope, file locations, and JSON editing.

Zed MCP config

See the Zed MCP docs for server setup options. This repo includes a project-level Zed config in .zed/settings.json that starts the MCP server via cargo run. If you prefer a prebuilt binary, replace the command with ./target/debug/mcp_server after cargo build. Zed will only start MCP servers after the worktree is trusted.

📦 Dev container (Zed)

The repository includes .devcontainer/devcontainer.json. In Zed:

  1. Open the repo.
  2. Use the command palette to open the dev container.
  3. The container image installs Rust plus rg, fd, git, gh, and Node.js/npm.
  4. On first create, it runs ./scripts/devcontainer_setup.sh to install Codex CLI (if missing), enable web search, and build the project.

🛠️ Dev tools

Rust tooling is pinned with rust-toolchain.toml (includes rustfmt + clippy). Convenience aliases:

cargo fmt
cargo lint
cargo typecheck

Scripts in scripts/ (use TOOLCHAIN=1.89 to pin a toolchain if needed):

./scripts/fmt.sh
./scripts/lint.sh
./scripts/typecheck.sh
./scripts/check_all.sh

✅ Conventional commits

This repo includes a Rust-based commit-msg hook powered by the conventional_commits crate.

Enable it with:

./scripts/install_git_hooks.sh

To bypass once (e.g., for an emergency hotfix), run:

SKIP_CONVENTIONAL_COMMITS=1 git commit ...
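Messages the hook accepts follow the Conventional Commits grammar: a type, an optional scope in parentheses, a colon, and a description. The examples below use common types; the exact set this repo's hook permits is an assumption:

```shell
# Accepted (conventional):
git commit -m "feat(index): add tree-sitter chunking for Kotlin"
git commit -m "fix: handle empty files during indexing"

# Rejected by the hook (no type prefix):
# git commit -m "updated stuff"
```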

Enable Codex web search + network access (writes ~/.codex/config.toml):

./scripts/enable_codex_web_search.sh

🗒️ Notes

  • Without any model feature enabled, embeddings are produced with a lightweight hashing trick (no external model dependency); see the Embeddings section for model-based backends.
  • The index is stored as a single bincode file (default .nodal-db/index.bin), so it’s easy to move or back up.
  • By default, indexing targets common languages (Rust, Python, JS/TS, Go, Java, C/C++, C#, Kotlin, Swift, Ruby, PHP, Scala, SQL, HTML/CSS, Shell, Lua, Dart, R). Use --extensions to narrow or expand that list.
  • Tree-sitter chunking is enabled for Rust, Python, JS/TS, Go, Java, C/C++, C#, Kotlin, Ruby, and PHP; other languages fall back to line-based chunks.
  • Tree-sitter chunks are scope-aware: nearby small scopes are merged, and oversized scopes are split by lines.
