52 changes: 52 additions & 0 deletions BENCHMARKS.md
@@ -13,6 +13,7 @@ Measured on Apple M3 Max, 36GB RAM.
| Recall@5 | 77.5% | 78.5% | 78.5% |
| Recall@10 | 89.5% | 90.0% | 90.0% |
| MRR | **61.9%** | 60.8% | 60.8% |
| nDCG@5 | 58.7% | 59.9% | 59.9% |
| Recency@1 | **100%** | 14% | 14% |
| Consolidation | **99%** | 0% | 0% |
| Store p50 | 49ms | 696ms | 16ms |
@@ -25,6 +26,57 @@ Measured on Apple M3 Max, 36GB RAM.
- **Deduplication**: 99% consolidation rate — near-duplicates auto-merged. Others: 0%
- **Latency**: 14x faster store than ChromaDB (49ms vs 696ms). All operations local, no network

## Category breakdown

### Recall@5 by category

| Category | Sediment | ChromaDB | Mem0 |
|----------|----------|----------|------|
| `architecture` | **82.9%** | 71.4% | 71.4% |
| `code_patterns` | **88.6%** | **88.6%** | **88.6%** |
| `cross_project` | 65.6% | **68.8%** | **68.8%** |
| `project_facts` | 60.6% | **75.8%** | **75.8%** |
| `troubleshooting` | 78.1% | **81.2%** | **81.2%** |
| `user_preferences` | **87.9%** | 84.9% | 84.9% |

### MRR by category

| Category | Sediment | ChromaDB | Mem0 |
|----------|----------|----------|------|
| `architecture` | **66.8%** | 55.8% | 55.8% |
| `code_patterns` | 70.4% | **71.1%** | **71.1%** |
| `cross_project` | **50.7%** | 47.1% | 47.1% |
| `project_facts` | 51.6% | **59.2%** | **59.2%** |
| `troubleshooting` | **63.2%** | 62.8% | 62.8% |
| `user_preferences` | 67.6% | **67.9%** | **67.9%** |
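
As a reading aid for the two tables above, here is a minimal Rust sketch of how Recall@k and MRR are commonly computed. It assumes Recall@k is the fraction of queries with at least one relevant item in the top `k` (a hit rate); the benchmark harness may define it differently. The function names and the `(ranked, relevant)` tuple layout are hypothetical and not taken from the Sediment codebase.

```rust
/// 1-based rank of the first relevant item in `ranked`, if any.
fn first_relevant_rank(ranked: &[&str], relevant: &[&str]) -> Option<usize> {
    ranked
        .iter()
        .position(|id| relevant.contains(id))
        .map(|i| i + 1)
}

/// Fraction of queries whose top-`k` results contain at least one relevant item.
/// Each query is a pair of (ranked result ids, relevant ids).
fn recall_at_k(queries: &[(Vec<&str>, Vec<&str>)], k: usize) -> f64 {
    if queries.is_empty() {
        return 0.0;
    }
    let hits = queries
        .iter()
        .filter(|(ranked, relevant)| {
            matches!(first_relevant_rank(ranked, relevant), Some(r) if r <= k)
        })
        .count();
    hits as f64 / queries.len() as f64
}

/// Mean of 1/rank of the first relevant item; a query with no hit contributes 0.
fn mean_reciprocal_rank(queries: &[(Vec<&str>, Vec<&str>)]) -> f64 {
    if queries.is_empty() {
        return 0.0;
    }
    let sum: f64 = queries
        .iter()
        .map(|(ranked, relevant)| {
            first_relevant_rank(ranked, relevant).map_or(0.0, |r| 1.0 / r as f64)
        })
        .sum();
    sum / queries.len() as f64
}
```

Under these definitions a system can lead on MRR while trailing on Recall@5, as several rows show: MRR rewards placing the first hit near the top, while Recall@5 only asks whether a hit appears anywhere in the first five results.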

## Temporal correctness

| Metric | Sediment | ChromaDB | Mem0 |
|--------|----------|----------|------|
| Recency@1 | **100%** | 14% | 14% |
| Recency@3 | **100%** | 94% | 94% |
| MRR | **100%** | 48.8% | 48.8% |
| Mean Rank | **1.00** | 2.38 | 2.38 |

## Latency

### Store latency

| Metric | Sediment | ChromaDB | Mem0 |
|--------|----------|----------|------|
| p50 | 49ms | 696ms | **16ms** |
| p95 | 62ms | 726ms | **19ms** |
| p99 | 88ms | 729ms | **20ms** |

### Recall latency

| Metric | Sediment | ChromaDB | Mem0 |
|--------|----------|----------|------|
| p50 | 103ms | 694ms | **8ms** |
| p95 | 109ms | 728ms | **12ms** |
| p99 | 132ms | 746ms | **12ms** |
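
The p50/p95/p99 rows can be reproduced from raw timings with a percentile function. The sketch below assumes the nearest-rank method; the benchmark does not state which method it uses, and `percentile` is a hypothetical helper rather than part of the Sediment codebase.

```rust
/// Nearest-rank percentile of latency samples (in milliseconds).
/// Returns `None` for an empty slice or a `p` outside 0..=100.
fn percentile(samples: &[f64], p: f64) -> Option<f64> {
    if samples.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Nearest rank: smallest 1-based rank covering at least p% of the samples.
    let rank = (p * sorted.len() as f64 / 100.0).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}
```

With only a few hundred samples, p99 is determined by a handful of the slowest calls, so small differences between p95 and p99 should be read with that in mind.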

## Methodology

- **Dataset**: 1,000 memories across 6 categories (architecture, code patterns, project facts, troubleshooting, user preferences, cross-project)
13 changes: 10 additions & 3 deletions CLAUDE.md
@@ -39,13 +39,17 @@ Sediment is a semantic memory system for AI agents, running as an MCP (Model Con
### Core Components

- **`src/main.rs`** - CLI entry point with subcommands (init, stats, list) and MCP server startup
-- **`src/lib.rs`** - Library root exposing public API, project detection, and scope types
-- **`src/db.rs`** - LanceDB wrapper handling vector storage, search, and CRUD operations
+- **`src/lib.rs`** - Library root exposing public API, project detection, scope types, and project ID migration
+- **`src/db.rs`** - LanceDB wrapper handling vector storage, hybrid search (vector + FTS/BM25), and CRUD operations
- **`src/embedder.rs`** - Local embeddings using `all-MiniLM-L6-v2` via Candle (384-dim vectors)
- **`src/chunker.rs`** - Smart content chunking by type (markdown, code, JSON, YAML, text)
- **`src/document.rs`** - ContentType enum for routing content to the appropriate chunker
- **`src/item.rs`** - Unified Item, Chunk, SearchResult, StoreResult, and ConflictInfo types
- **`src/access.rs`** - SQLite-based access tracking, validation counting, and memory decay scoring
- **`src/graph.rs`** - SQLite graph store: relationship tracking (RELATED, SUPERSEDES, CO_ACCESSED, CLUSTER_SIBLING edges)
- **`src/consolidation.rs`** - Background consolidation: auto-merging near-duplicates, linking similar items
- **`src/error.rs`** - SedimentError enum with typed error variants (Database, Embedding, Arrow, etc.)
- **`src/retry.rs`** - Retry utilities with exponential backoff (3 attempts, 100ms–2s)
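
The retry policy described for `src/retry.rs` (3 attempts, delays between 100ms and 2s) could look roughly like the sketch below. It assumes the delay doubles on each retry and is capped at 2s; the actual growth factor, jitter, and which errors count as retryable are not stated in the diff. All names here are hypothetical, and the sleep function is injected so the sketch can be exercised without real waiting.

```rust
use std::time::Duration;

const MAX_ATTEMPTS: u32 = 3;
const BASE_DELAY: Duration = Duration::from_millis(100);
const MAX_DELAY: Duration = Duration::from_secs(2);

/// Delay before retry number `retry` (0-based): 100ms, 200ms, 400ms, ... capped at 2s.
fn backoff_delay(retry: u32) -> Duration {
    let factor = 2u32.saturating_pow(retry);
    BASE_DELAY.saturating_mul(factor).min(MAX_DELAY)
}

/// Runs `op` up to `MAX_ATTEMPTS` times, calling `sleep` between failed attempts.
/// Assumption: every error is treated as retryable.
fn retry_with_backoff<T, E>(
    mut op: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(e) if attempt + 1 >= MAX_ATTEMPTS => return Err(e),
            Err(_) => {
                sleep(backoff_delay(attempt));
                attempt += 1;
            }
        }
    }
}
```

In production the `sleep` argument would typically be `std::thread::sleep` (or an async timer), giving a worst case of about 300ms of added waiting across the three attempts.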

### MCP Server (`src/mcp/`)

@@ -67,7 +71,8 @@ Sediment is a semantic memory system for AI agents, running as an MCP (Model Con
- **Two-database hybrid**: LanceDB for vectors, SQLite for graph relationships + mutable counters
- **Single central database** at `~/.sediment/data/` stores all projects; graph + access at `~/.sediment/access.db`
- **Project scoping** via UUID stored in `.sediment/config` per project
-- **Similarity boosting**: Same-project items get 1.15x boost, different projects 0.95x penalty
+- **Similarity boosting**: Same-project items unchanged, different projects get 0.875x penalty (12.5% spread)
+- **Hybrid search**: Vector similarity combined with FTS/BM25 scoring. BM25 boost is additive (max 0.12, power-law gamma 2.0). FTS index rebuilt on each store
- **Conflict detection**: Items with >=0.85 similarity flagged on store and enqueued for consolidation
- **Fresh DB connection per tool call** with shared embedder for efficiency
- **Memory decay scoring**: Recall results re-ranked using freshness (hyperbolic decay, 0.5 at 30 days) and access frequency (log-scaled). Tracked in SQLite sidecar since LanceDB is append-oriented.
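
To make the scoring constants in the list above concrete, here is a hedged Rust sketch of the cross-project penalty, the additive BM25 boost, and the hyperbolic freshness term. The constants (0.875, 0.12, gamma 2.0, 0.5 at 30 days) come from the text; everything else is an assumption: that the BM25 score is normalized to [0, 1] before the power law is applied, that the penalty is applied before the boost, and all function names. How freshness and access frequency are folded into the final rank is not specified, so it is left out.

```rust
/// Multiplier for vector similarity when the item belongs to a different project.
const CROSS_PROJECT_PENALTY: f64 = 0.875;
/// Upper bound of the additive BM25 boost.
const BM25_MAX_BOOST: f64 = 0.12;
/// Power-law exponent applied to the normalized BM25 score.
const BM25_GAMMA: f64 = 2.0;
/// Age in days at which freshness drops to 0.5.
const HALF_DECAY_DAYS: f64 = 30.0;

/// Same-project items keep their similarity; others are scaled by 0.875.
fn project_adjusted(similarity: f64, same_project: bool) -> f64 {
    if same_project {
        similarity
    } else {
        similarity * CROSS_PROJECT_PENALTY
    }
}

/// Additive boost from a BM25 score assumed to be normalized to [0, 1].
/// The power law keeps weak keyword matches from contributing much.
fn bm25_boost(normalized_bm25: f64) -> f64 {
    BM25_MAX_BOOST * normalized_bm25.clamp(0.0, 1.0).powf(BM25_GAMMA)
}

/// Hypothetical combination: penalized vector similarity plus the BM25 boost.
fn hybrid_score(similarity: f64, same_project: bool, normalized_bm25: f64) -> f64 {
    project_adjusted(similarity, same_project) + bm25_boost(normalized_bm25)
}

/// Hyperbolic freshness: 1.0 for a new item, 0.5 at 30 days, 0.25 at 90 days.
fn freshness(age_days: f64) -> f64 {
    1.0 / (1.0 + age_days.max(0.0) / HALF_DECAY_DAYS)
}
```

With gamma 2.0, a half-strength keyword match earns a quarter of the maximum boost (0.03), so the lexical signal mostly breaks ties between items with similar vector scores rather than overriding them.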
@@ -126,6 +131,8 @@ CREATE TABLE graph_edges (
created_at INTEGER NOT NULL,
UNIQUE(from_id, to_id, edge_type)
);
CREATE INDEX idx_edges_from ON graph_edges(from_id, edge_type);
CREATE INDEX idx_edges_to ON graph_edges(to_id, edge_type);

-- Access tracking and decay scoring
CREATE TABLE access_log (
3 changes: 2 additions & 1 deletion README.md
@@ -160,7 +160,8 @@ All local, embedded, zero config:

- **Memory decay**: Results re-ranked by freshness (30-day half-life) and access frequency. Old memories rank lower but are never auto-deleted.
- **Trust-weighted scoring**: Validated and well-connected memories score higher.
-- **Project scoping**: Automatic context isolation between projects. Same-project items get a similarity boost.
+- **Hybrid search**: Vector similarity combined with FTS/BM25 scoring for better retrieval quality.
+- **Project scoping**: Automatic context isolation between projects. Different-project items receive a similarity penalty.
- **Relationship graph**: Items linked via RELATED, SUPERSEDES, and CO_ACCESSED edges. Recall expands results with 1-hop graph neighbors and co-access suggestions.
- **Background consolidation**: Near-duplicates (≥0.95 similarity) auto-merged; similar items (0.85–0.95) linked.
- **Type-aware chunking**: Intelligent splitting for markdown, code, JSON, YAML, and plain text.
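
The consolidation thresholds in the list above can be illustrated with a small Rust sketch. It assumes cosine similarity over the embedding vectors and treats 0.95 as belonging to the merge band, as the "≥0.95" wording suggests; the enum, constants, and function names are hypothetical and not taken from `src/consolidation.rs`.

```rust
/// What background consolidation would do with a pair of items.
#[derive(Debug, PartialEq)]
enum ConsolidationAction {
    /// Similarity >= 0.95: treat as near-duplicates and merge.
    Merge,
    /// Similarity in [0.85, 0.95): keep both and link them.
    Link,
    /// Below 0.85: leave the items unrelated.
    Ignore,
}

const MERGE_THRESHOLD: f32 = 0.95;
const LINK_THRESHOLD: f32 = 0.85;

/// Cosine similarity of two embeddings; `None` if lengths differ or a vector is all zeros.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Maps a similarity score onto the thresholds described in the list above.
fn classify(similarity: f32) -> ConsolidationAction {
    if similarity >= MERGE_THRESHOLD {
        ConsolidationAction::Merge
    } else if similarity >= LINK_THRESHOLD {
        ConsolidationAction::Link
    } else {
        ConsolidationAction::Ignore
    }
}
```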