A deterministic, high-precision code intelligence layer exposed as a Model Context Protocol (MCP) server.
- Zero telemetry — your code never leaves your machine
- No API key required — runs entirely locally with sentence-transformers
- 1-minute setup — just `uvx code-memory` and you're ready
- ~50% token savings — precise code retrieval instead of dumping entire files
Please star code-memory if you like this project!
Finding the right context from a large codebase is expensive, inaccurate, and limited by context windows. Dumping files into prompts wastes tokens, and LLMs lose track of the actual task as context fills up.
Instead of manually hunting with grep/find or dumping raw file text, code-memory runs semantic searches against a locally indexed codebase. Inspired by claude-context, but designed from the ground up for large-scale local search.
Full AST Support (structural parsing with symbol extraction): Python, JavaScript/TypeScript, Java, Go, Rust, C/C++, Ruby, Kotlin
Fallback Support (whole-file indexing): C#, Swift, Scala, Lua, Shell, Config (yaml/toml/json), Web (html/css), SQL, Markdown
Files matching `.gitignore` patterns are automatically skipped.
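The skip logic can be sketched with the stdlib `fnmatch` module. This is only an illustration — the real implementation presumably uses a full gitignore parser, and the helper below handles just the common cases:

```python
from fnmatch import fnmatch

def is_ignored(path: str, patterns: list[str]) -> bool:
    """Minimal gitignore-style check: match the whole path or any segment.

    Real .gitignore semantics (negation, anchoring, **) are richer; this
    sketch only shows the idea of skipping files during indexing.
    """
    segments = path.split("/")
    for pattern in patterns:
        pattern = pattern.rstrip("/")  # "node_modules/" matches the directory
        if fnmatch(path, pattern) or any(fnmatch(s, pattern) for s in segments):
            return True
    return False

print(is_ignored("src/app.pyc", ["*.pyc", "node_modules/"]))  # True
print(is_ignored("src/app.py", ["*.pyc", "node_modules/"]))   # False
```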
Instead of a single monolithic search, code-memory routes queries through three purpose-built tools:
| Question Type | Tool | Data Source |
|---|---|---|
| "Where / What / How?" — find definitions, references, structure, semantic search | `search_code` | BM25 + dense vector (sqlite-vec) |
| "Architecture / Patterns?" — understand architecture, explain workflows | `search_docs` | Semantic / fuzzy |
| "Who / Why?" — debug regressions, understand intent | `search_history` | Git + BM25 + dense vector (sqlite-vec) |
| "Setup / Prepare" — index parsing & embedding generation | `index_codebase` | AST parser + sentence-transformers |
This forces the LLM to pick the right retrieval strategy before any data is fetched.
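The routing table above can be read as a simple dispatch. The sketch below is purely illustrative — in practice the LLM itself picks the MCP tool, and the keyword lists here are assumptions made for demonstration:

```python
# Illustrative sketch of the question-type routing as a keyword dispatch.
# The tool names come from the table above; the keywords are assumptions.
ROUTES = {
    "search_code": ("where", "what", "how", "definition", "reference"),
    "search_docs": ("architecture", "pattern", "workflow", "explain"),
    "search_history": ("who", "why", "regression", "blame"),
}

def pick_tool(question: str) -> str:
    q = question.lower()
    for tool, keywords in ROUTES.items():
        if any(keyword in q for keyword in keywords):
            return tool
    return "search_code"  # reasonable default for code questions

print(pick_tool("Where is the parser defined?"))       # search_code
print(pick_tool("Why did the login timeout change?"))  # search_history
```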
```bash
# Install with pip
pip install code-memory

# Or with uvx (for MCP hosts)
uvx code-memory
```

```bash
# Clone the repo
git clone https://github.com/kapillamba4/code-memory.git
cd code-memory

# Install dependencies
uv sync

# Run the MCP server (stdio transport)
uv run mcp run server.py
```

Download standalone executables from GitHub Releases — no Python installation required.
| Platform | Architecture | File |
|---|---|---|
| Linux | x86_64 | code-memory-linux-x86_64 |
| macOS | x86_64 (Intel) | code-memory-macos-x86_64 |
| macOS | ARM64 (Apple Silicon) | code-memory-macos-arm64 |
| Windows | x86_64 | code-memory-windows-x86_64.exe |
```bash
# Linux/macOS: Download and make executable
chmod +x code-memory-*
./code-memory-*
```

```bash
# Windows: Run directly
code-memory-windows-x86_64.exe
```

Note: The first run will download the embedding model (~600MB) to `~/.cache/huggingface/`. Subsequent runs use the cached model.
- Python ≥ 3.13
- `uv` package manager (recommended) or pip

Install uv if you don't have it:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

```bash
# Install from PyPI
pip install code-memory

# Or run directly with uvx
uvx code-memory
```

```bash
# Run with the MCP Inspector for interactive debugging
uv run mcp dev server.py

# Run tests
uv run pytest tests/ -v

# Lint and format
uv run ruff check .
uv run ruff format .

# Build package
uv build

# Build standalone binary (requires pyinstaller)
pip install pyinstaller
pyinstaller --clean code-memory.spec
# Binary output: dist/code-memory
```

You can use either uvx (requires Python) or the standalone binary (no dependencies).
Add to your MCP settings (e.g. `~/.gemini/settings.json`):

```json
{
  "mcpServers": {
    "code-memory": {
      "command": "uvx",
      "args": ["code-memory"]
    }
  }
}
```

Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows):
```json
{
  "mcpServers": {
    "code-memory": {
      "command": "uvx",
      "args": ["code-memory"]
    }
  }
}
```

Add to `.mcp.json` in your project root or `~/.mcp.json` for global access:
```json
{
  "mcpServers": {
    "code-memory": {
      "command": "uvx",
      "args": ["code-memory"]
    }
  }
}
```

Add to `.vscode/mcp.json` in your workspace:
```json
{
  "servers": {
    "code-memory": {
      "command": "uvx",
      "args": ["code-memory"]
    }
  }
}
```

Replace the path with the location of your downloaded binary:
```json
{
  "mcpServers": {
    "code-memory": {
      "command": "/path/to/code-memory-linux-x86_64"
    }
  }
}
```

For Windows:
```json
{
  "mcpServers": {
    "code-memory": {
      "command": "C:\\path\\to\\code-memory-windows-x86_64.exe"
    }
  }
}
```

| Variable | Description | Default |
|---|---|---|
| `CODE_MEMORY_LOG_LEVEL` | Logging verbosity (`DEBUG`, `INFO`, `WARNING`, `ERROR`) | `INFO` |
| `EMBEDDING_MODEL` | HuggingFace model ID for embeddings | `jinaai/jina-code-embeddings-0.5b` |

Example:

```bash
CODE_MEMORY_LOG_LEVEL=DEBUG uvx code-memory
```

You can use a different embedding model by setting the `EMBEDDING_MODEL` environment variable:

```bash
EMBEDDING_MODEL="BAAI/bge-small-en-v1.5" uvx code-memory
```

For MCP hosts, add the environment variable to your configuration:
```json
{
  "mcpServers": {
    "code-memory": {
      "command": "uvx",
      "args": ["code-memory"],
      "env": {
        "EMBEDDING_MODEL": "BAAI/bge-small-en-v1.5"
      }
    }
  }
}
```

Note: Changing the embedding model will invalidate existing indexes. You'll need to re-run `index_codebase` after switching models.
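Why the index becomes invalid: embeddings produced by different models are not comparable. One way to guard against a stale index is to record the model name alongside it and check on startup — the `meta` table below is an assumed schema, not code-memory's actual one:

```python
import os
import sqlite3

# Sketch (assumed schema): record which embedding model built the index,
# then flag the index as stale whenever the active model differs.
DEFAULT_MODEL = "jinaai/jina-code-embeddings-0.5b"

def current_model() -> str:
    return os.environ.get("EMBEDDING_MODEL", DEFAULT_MODEL)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE meta (key TEXT PRIMARY KEY, value TEXT)")
db.execute("INSERT INTO meta VALUES ('embedding_model', ?)", (current_model(),))

def index_is_stale(db: sqlite3.Connection) -> bool:
    row = db.execute(
        "SELECT value FROM meta WHERE key = 'embedding_model'"
    ).fetchone()
    return row is None or row[0] != current_model()

print(index_is_stale(db))  # False: index matches the active model
```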
Indexes or re-indexes source files and documentation in the given directory. Run this before using `search_code` or `search_docs` to ensure the database is up to date. Uses tree-sitter for language-agnostic structural extraction and generates dense vector embeddings with sentence-transformers (runs locally, in-process) for semantic search.
```
index_codebase(directory=".")
```
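The structural-extraction step can be illustrated in miniature. code-memory uses tree-sitter to do this across many languages; the stdlib `ast` module below sketches the same idea for Python only:

```python
import ast

# Sample source to index (illustrative).
SOURCE = '''
class Indexer:
    def run(self):
        pass

def parse_file(path):
    pass
'''

def extract_symbols(source: str) -> list[tuple[str, str, int]]:
    """Return (kind, name, line) for every class and function definition."""
    symbols = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            symbols.append(("function", node.name, node.lineno))
        elif isinstance(node, ast.ClassDef):
            symbols.append(("class", node.name, node.lineno))
    return symbols

print(sorted(extract_symbols(SOURCE)))
```

Each extracted symbol (with its location) becomes a searchable chunk, which is what lets `search_code` return a definition instead of a whole file.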
Perform semantic search and find structural code definitions, locate where functions/classes are defined, or map out dependency references (call graphs). Uses hybrid retrieval (BM25 + vector embeddings) to find exact matches and semantic similarities.
```
search_code(query="parse python files", search_type="definition")
search_code(query="how do we establish the database connection", search_type="references")
search_code(query="src/auth/", search_type="file_structure")
```
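One common way to merge the BM25 and vector result lists is reciprocal rank fusion (RRF). Whether code-memory fuses scores this way is an assumption, but the sketch shows how hybrid retrieval can combine exact-match and semantic hits into one ranking:

```python
# Reciprocal rank fusion: a document scores higher the nearer it sits to
# the top of each ranking. The result identifiers below are made up.
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["parser.py:parse", "db.py:connect", "queries.py:search"]
vector_hits = ["queries.py:search", "parser.py:parse", "server.py:main"]
print(rrf([bm25_hits, vector_hits]))  # "parser.py:parse" ranks first
```

RRF is popular for hybrid search because it needs no score normalization: BM25 scores and cosine similarities live on different scales, but ranks are always comparable.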
Understand the codebase conceptually — how things work, architectural patterns, SOPs. Searches markdown documentation, READMEs, and docstrings extracted from code.
```
search_docs(query="how does the authentication flow work?")
search_docs(query="installation instructions", top_k=5)
```
Debug regressions and understand developer intent through Git history.
```
search_history(query="fix login timeout", search_type="commits")
search_history(query="src/auth/login.py", search_type="file_history", target_file="src/auth/login.py")
search_history(query="server.py", search_type="blame", target_file="server.py", line_start=1, line_end=20)
```
```
code-memory/
├── server.py           # MCP server entry point (FastMCP)
├── db.py               # SQLite database layer with sqlite-vec
├── parser.py           # Tree-sitter-based code parser
├── doc_parser.py       # Markdown documentation parser
├── queries.py          # Hybrid retrieval query layer
├── git_search.py       # Git history search module
├── errors.py           # Custom exception hierarchy
├── validation.py       # Input validation functions
├── logging_config.py   # Structured logging configuration
├── tests/              # Test suite
├── pyproject.toml      # Project metadata & dependencies
└── prompts/            # Milestone prompt engineering files
```
Make sure you're running `search_history` from within a Git repository. The tool searches upward from the current directory to find `.git`.
Run `index_codebase(directory=".")` first to index your code and documentation. The index is stored locally in `code_memory.db`.
Indexing generates embeddings using a local sentence-transformers model. The first run downloads the model (~600MB for `jina-code-embeddings-0.5b`). Subsequent runs are faster.
Ensure you have enough disk space and memory. The `jina-code-embeddings-0.5b` model requires ~1GB of RAM when loaded.
Your code never leaves your machine. Unlike cloud-based code intelligence tools, code-memory runs entirely locally:
- Zero telemetry — no usage data, analytics, or tracking
- Zero external API calls — all processing happens in-process
- Zero cloud dependencies — works without internet (after initial setup)
- Your data stays local — indexes stored in local SQLite database
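The "local SQLite index" idea can be shown in a few lines. The real server uses the sqlite-vec extension for vector search; this stdlib-only sketch stores embeddings as JSON and scans them with cosine similarity to show the principle (table and column names are illustrative):

```python
import json
import math
import sqlite3

# Assumed schema: one row per indexed code chunk, embedding stored as JSON.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE chunks (path TEXT, body TEXT, embedding TEXT)")

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def store(path: str, body: str, embedding: list[float]) -> None:
    db.execute("INSERT INTO chunks VALUES (?, ?, ?)",
               (path, body, json.dumps(embedding)))

def nearest(query_embedding: list[float]) -> str:
    rows = db.execute("SELECT path, embedding FROM chunks").fetchall()
    best = max(rows, key=lambda r: cosine(query_embedding, json.loads(r[1])))
    return best[0]

store("db.py", "def connect(): ...", [1.0, 0.0])
store("parser.py", "def parse(): ...", [0.0, 1.0])
print(nearest([0.9, 0.1]))  # db.py
```

Everything — text, embeddings, and search — lives in one local database file, which is why no network access or external service is involved.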
This makes code-memory ideal for:
- Proprietary and confidential codebases
- Security-conscious organizations
- Air-gapped development environments
- Privacy-focused developers
See COMPARISON.md for a detailed comparison with cloud-based alternatives.
code-memory works in completely isolated environments:

1. On a connected machine, run code-memory once to cache the embedding model:

   ```bash
   uvx code-memory  # Model downloads to ~/.cache/huggingface/
   ```

2. Transfer to the air-gapped machine:
   - the standalone binary from GitHub Releases
   - the model cache directory (`~/.cache/huggingface/hub/models--*`)

3. Run on the air-gapped machine — no network required.

Alternatively, using pip:

- Download the wheel from PyPI on a connected machine
- Transfer and install: `pip install code-memory-*.whl`
- Pre-cache the model as above
- Run offline
- Milestone 1 — Project scaffolding & MCP protocol wiring
- Milestone 2 — Implement `search_code` with AST parsing + SQLite + `sqlite-vec`
- Milestone 3 — Implement `search_history` with Git integration
- Milestone 4 — Implement `search_docs` with semantic search
- Milestone 5 — Production hardening & packaging
See CONTRIBUTING.md for development setup and guidelines.
See CHANGELOG.md for version history.
MIT
