BENCHMARKS.md (63 changes: 39 additions & 24 deletions)
@@ -1,30 +1,46 @@
# Benchmarks

-Retrieval quality and latency compared against ChromaDB and Mem0, using 1,000 developer memories and 200 search queries. Full benchmark suite: [sediment-benchmark](https://github.com/rendro/sediment-benchmark).
+Retrieval quality and latency compared against five MCP memory systems, using 1,000 developer memories and 200 search queries. Full benchmark suite: [sediment-benchmark](https://github.com/rendro/sediment-benchmark).

## Results

Measured on Apple M3 Max, 36GB RAM.

-| Metric | Sediment | ChromaDB | Mem0 |
-|--------|----------|----------|------|
-| Recall@1 | **50.0%** | 47.0% | 47.0% |
-| Recall@3 | 69.0% | 69.0% | 69.0% |
-| Recall@5 | 77.5% | 78.5% | 78.5% |
-| Recall@10 | 89.5% | 90.0% | 90.0% |
-| MRR | **61.9%** | 60.8% | 60.8% |
-| nDCG@5 | 58.7% | 59.9% | 59.9% |
-| Recency@1 | **100%** | 14% | 14% |
-| Consolidation | **99%** | 0% | 0% |
-| Store p50 | 49ms | 696ms | 16ms |
-| Recall p50 | 103ms | 694ms | 8ms |
+| Metric | Sediment | ChromaDB | Mem0 | MCP Memory Service | MCP Server Memory | Basic Memory |
+|--------|----------|----------|------|--------------------|--------------------|-------------|
+| Recall@1 | **50.0%** | 47.0% | 47.0% | 38.0% | 1.0% | 9.0% |
+| Recall@5 | 77.5% | **78.5%** | **78.5%** | 66.0% | 2.0% | 11.5% |
+| MRR | **61.9%** | 60.8% | 60.8% | 49.0% | 1.5% | 10.2% |
+| nDCG@5 | 58.7% | **59.9%** | **59.9%** | 47.8% | 1.6% | 10.4% |
+| Recency@1 | **100%** | 14% | 14% | 10% | 0% | 0% |
+| Consolidation | **99%** | 0% | 0% | 0% | 0% | 0% |
+| Store p50 | 50ms | 692ms | 14ms | **2ms** | **2ms** | 49ms |
+| Recall p50 | 103ms | 694ms | **8ms** | 10ms | **2ms** | 28ms |

+MCP Server Memory and Basic Memory use keyword/entity matching rather than vector search, which explains their low retrieval scores on semantic queries.

## Key takeaways

-- **Retrieval quality**: Best R@1 (50.0%) and MRR (61.9%) — top result is correct more often than alternatives
-- **Temporal correctness**: 100% Recency@1 — updated memories always rank first. Others: 14%
-- **Deduplication**: 99% consolidation rate — near-duplicates auto-merged. Others: 0%
-- **Latency**: 14x faster store than ChromaDB (49ms vs 696ms). All operations local, no network
+- **Retrieval quality**: Best R@1 (50.0%) and MRR (61.9%) among all systems — top result is correct more often than alternatives
+- **Temporal correctness**: 100% Recency@1 — updated memories always rank first. Best competitor: 14%
+- **Deduplication**: 99% consolidation rate — near-duplicates auto-merged. All others: 0%
+- **Latency**: 14x faster store than ChromaDB (50ms vs 692ms). All operations local, no network
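The Recency@1 and consolidation rows above measure ranking behavior rather than raw embedding quality. Below is a minimal sketch of the recency idea: similarity blended with an exponential time decay, so an updated memory outranks its stale predecessor. `src/db.rs` imports a `boost_similarity` helper, but its actual formula is not shown in this diff; the weights and half-life here are illustrative assumptions.

```rust
/// Hypothetical recency boost (illustrative constants, not Sediment's own):
/// blend cosine similarity with an exponential decay on memory age.
fn boosted_score(similarity: f32, age_days: f32) -> f32 {
    let half_life_days = 30.0;
    let recency = 0.5f32.powf(age_days / half_life_days);
    0.8 * similarity + 0.2 * recency
}

fn main() {
    let stale = boosted_score(0.92, 180.0); // old version matches slightly better
    let fresh = boosted_score(0.90, 1.0); // updated version is recent
    assert!(fresh > stale); // the update still takes the top slot
}
```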

+## Embedding model comparison
+
+Sediment supports multiple embedding models via the `SEDIMENT_EMBEDDING_MODEL` environment variable. All models achieve 100% temporal correctness and 99% dedup consolidation.
+
+| Metric | all-MiniLM-L6-v2 (default) | bge-base-en-v1.5 | bge-small-en-v1.5 | e5-small-v2 |
+|--------|---------------------------|-------------------|-------------------|-------------|
+| Dimensions | 384 | 768 | 384 | 384 |
+| Recall@1 | 50.0% | **50.5%** | 46.5% | 42.0% |
+| Recall@5 | **77.5%** | 76.0% | 75.0% | 68.0% |
+| MRR | 61.9% | **62.0%** | 58.4% | 53.5% |
+| nDCG@5 | **58.7%** | 57.6% | 55.5% | 51.1% |
+| Store p50 | **50ms** | 92ms | 64ms | 63ms |
+| Recall p50 | **103ms** | 149ms | 120ms | 118ms |
+
+all-MiniLM-L6-v2 remains the default: essentially tied with bge-base-en-v1.5 on quality but ~2x faster due to smaller dimensions (384 vs 768).
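As a hedged illustration of the configuration above: the variable name and model list come from the table, but the mapping below is an assumed sketch, not Sediment's actual selection code.

```rust
use std::env;

/// Hypothetical helper mapping SEDIMENT_EMBEDDING_MODEL to a dimension.
/// Model names match the comparison table; the function itself is illustrative.
fn embedding_dimension() -> usize {
    let model = env::var("SEDIMENT_EMBEDDING_MODEL")
        .unwrap_or_else(|_| "all-MiniLM-L6-v2".to_string());
    match model.as_str() {
        "bge-base-en-v1.5" => 768,
        // all-MiniLM-L6-v2, bge-small-en-v1.5, and e5-small-v2 are all 384-dim
        _ => 384,
    }
}
```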

## Category breakdown

@@ -65,23 +81,22 @@ Measured on Apple M3 Max, 36GB RAM.

| Metric | Sediment | ChromaDB | Mem0 |
|--------|----------|----------|------|
-| p50 | 49ms | 696ms | **16ms** |
-| p95 | 62ms | 726ms | **19ms** |
-| p99 | 88ms | 729ms | **20ms** |
+| p50 | 50ms | 692ms | **14ms** |
+| p95 | 63ms | 726ms | **19ms** |
+| p99 | 65ms | 729ms | **20ms** |

### Recall latency

| Metric | Sediment | ChromaDB | Mem0 |
|--------|----------|----------|------|
| p50 | 103ms | 694ms | **8ms** |
-| p95 | 109ms | 728ms | **12ms** |
-| p99 | 132ms | 746ms | **12ms** |
+| p95 | 110ms | 728ms | **12ms** |
+| p99 | 124ms | 746ms | **12ms** |

## Methodology

- **Dataset**: 1,000 memories across 6 categories (architecture, code patterns, project facts, troubleshooting, user preferences, cross-project)
- **Queries**: 200 queries with known ground-truth expected results
- **Temporal**: 50 sequences testing whether updated information ranks above stale versions
- **Dedup**: 50 pairs of near-duplicate content testing consolidation
-- **Baselines**: ChromaDB with default ONNX embeddings; Mem0 with local Qdrant + HuggingFace embeddings
-- **Sediment**: Hybrid vector + BM25 search, local Candle embeddings (all-MiniLM-L6-v2)
+- **Systems tested**: Sediment (hybrid vector + BM25, local Candle embeddings), ChromaDB (default ONNX embeddings), Mem0 (local Qdrant + HuggingFace), MCP Memory Service (ChromaDB-based), MCP Server Memory (entity graph), Basic Memory (markdown file-based)
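The methodology describes Sediment's retrieval as hybrid vector + BM25 but does not show how the two rankings are merged. The sketch below uses reciprocal rank fusion, a common choice; whether Sediment uses RRF or a weighted score blend is an assumption, not something this diff confirms.

```rust
use std::collections::HashMap;

/// Reciprocal rank fusion: each result list contributes 1 / (k + rank).
/// k = 60 is the conventional constant; Sediment's actual fusion strategy
/// and constants are not part of this diff.
fn rrf_merge(vector_hits: &[&str], bm25_hits: &[&str]) -> Vec<(String, f64)> {
    let k = 60.0;
    let mut scores: HashMap<String, f64> = HashMap::new();
    for hits in [vector_hits, bm25_hits] {
        for (rank, id) in hits.iter().enumerate() {
            // rank is 0-based, so add 1 before applying the RRF formula
            *scores.entry((*id).to_string()).or_default() += 1.0 / (k + rank as f64 + 1.0);
        }
    }
    let mut ranked: Vec<_> = scores.into_iter().collect();
    ranked.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    ranked // ids ranked high by either signal rise to the top
}
```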
Cargo.toml (2 changes: 1 addition & 1 deletion)
@@ -1,6 +1,6 @@
[package]
name = "sediment-mcp"
version = "0.4.3"
version = "0.4.4"
edition = "2024"
repository = "https://github.com/rendro/sediment"
homepage = "https://github.com/rendro/sediment"
src/db.rs (53 changes: 30 additions & 23 deletions)
@@ -48,7 +48,7 @@ use tracing::{debug, info};
use crate::boost_similarity;
use crate::chunker::{ChunkingConfig, chunk_content};
use crate::document::ContentType;
-use crate::embedder::{EMBEDDING_DIM, Embedder};
+use crate::embedder::Embedder;
use crate::error::{Result, SedimentError};
use crate::item::{Chunk, ConflictInfo, Item, ItemFilters, SearchResult, StoreResult};

@@ -109,7 +109,7 @@ pub struct DatabaseStats {
const SCHEMA_VERSION: i32 = 2;

// Arrow schema builders
-fn item_schema() -> Schema {
+fn item_schema(dim: usize) -> Schema {
Schema::new(vec![
Field::new("id", DataType::Utf8, false),
Field::new("content", DataType::Utf8, false),
@@ -120,14 +120,14 @@ fn item_schema() -> Schema {
"vector",
DataType::FixedSizeList(
Arc::new(Field::new("item", DataType::Float32, true)),
-EMBEDDING_DIM as i32,
+dim as i32,
),
false,
),
])
}

-fn chunk_schema() -> Schema {
+fn chunk_schema(dim: usize) -> Schema {
Schema::new(vec![
Field::new("id", DataType::Utf8, false),
Field::new("item_id", DataType::Utf8, false),
@@ -138,7 +138,7 @@ fn chunk_schema() -> Schema {
"vector",
DataType::FixedSizeList(
Arc::new(Field::new("item", DataType::Float32, true)),
-EMBEDDING_DIM as i32,
+dim as i32,
),
false,
),
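// Illustrative sketch, not part of this diff: threading `dim` through means
// the FixedSizeList width follows the active embedder at runtime instead of
// the old compile-time EMBEDDING_DIM constant, so one schema builder serves
// both 384-dim and 768-dim models from the benchmark tables.
#[cfg(test)]
mod dim_sketch {
    use super::*;

    #[test]
    fn vector_width_follows_embedder_dimension() {
        let minilm = item_schema(384); // all-MiniLM-L6-v2, bge-small, e5-small
        let bge = item_schema(768); // bge-base-en-v1.5
        assert_ne!(
            minilm.field_with_name("vector").unwrap().data_type(),
            bge.field_with_name("vector").unwrap().data_type()
        );
    }
}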
@@ -335,7 +335,8 @@ impl Database {
.await
.map_err(|e| SedimentError::Database(format!("Recovery collect failed: {}", e)))?;

-let schema = Arc::new(item_schema());
+let dim = self.embedder.dimension();
+let schema = Arc::new(item_schema(dim));
let new_table = self
.db
.create_empty_table("items", schema.clone())
@@ -448,7 +449,8 @@ impl Database {
}

// Step 5: Create staging table with migrated data
-let schema = Arc::new(item_schema());
+let dim = self.embedder.dimension();
+let schema = Arc::new(item_schema(dim));
let staging_table = self
.db
.create_empty_table("items_migrated", schema.clone())
@@ -530,7 +532,7 @@ impl Database {

/// Convert a batch from old schema to new schema
fn convert_batch_to_new_schema(&self, batch: &RecordBatch) -> Result<RecordBatch> {
-let schema = Arc::new(item_schema());
+let schema = Arc::new(item_schema(self.embedder.dimension()));

// Extract columns from old batch (handle missing columns gracefully)
let id_col = batch
@@ -645,7 +647,7 @@ impl Database {
/// Get or create the items table
async fn get_items_table(&mut self) -> Result<&Table> {
if self.items_table.is_none() {
-let schema = Arc::new(item_schema());
+let schema = Arc::new(item_schema(self.embedder.dimension()));
let table = self
.db
.create_empty_table("items", schema)
@@ -662,7 +664,7 @@ impl Database {
/// Get or create the chunks table
async fn get_chunks_table(&mut self) -> Result<&Table> {
if self.chunks_table.is_none() {
-let schema = Arc::new(chunk_schema());
+let schema = Arc::new(chunk_schema(self.embedder.dimension()));
let table = self
.db
.create_empty_table("chunks", schema)
@@ -695,13 +697,14 @@ impl Database {

// Generate item embedding
let embedding_text = item.embedding_text();
-let embedding = self.embedder.embed(&embedding_text)?;
+let embedding = self.embedder.embed_document(&embedding_text)?;
item.embedding = embedding;

// Store the item
let table = self.get_items_table().await?;
let batch = item_to_batch(&item)?;
-let batches = RecordBatchIterator::new(vec![Ok(batch)], Arc::new(item_schema()));
+let batches =
+RecordBatchIterator::new(vec![Ok(batch)], Arc::new(item_schema(item.embedding.len())));

table
.add(Box::new(batches))
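// Assumed sketch, not in this diff: db.rs now calls dimension(),
// embed_document(), embed_query(), and embed_document_batch() on Embedder,
// whose own changes live in src/embedder.rs and are not shown here. The
// surface implied by the call sites, guessed, would look roughly like:
//
//     pub trait EmbedderApi {
//         fn dimension(&self) -> usize; // 384 or 768 depending on model
//         fn embed_document(&self, text: &str) -> Result<Vec<f32>>;
//         fn embed_query(&self, text: &str) -> Result<Vec<f32>>;
//         fn embed_document_batch(&self, texts: &[String]) -> Result<Vec<Vec<f32>>>;
//     }
//
// The document/query split matters for models like e5-small-v2, which expect
// different prefixes for passages and queries; that rationale is inferred
// from the model list in BENCHMARKS.md, not stated in the PR.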
@@ -772,7 +775,8 @@ impl Database {
let mut all_embeddings = Vec::with_capacity(chunk_texts.len());
for batch_start in (0..chunk_texts.len()).step_by(EMBEDDING_BATCH_SIZE) {
let batch_end = (batch_start + EMBEDDING_BATCH_SIZE).min(chunk_texts.len());
-let batch_embeddings = embedder.embed_batch(&chunk_texts[batch_start..batch_end])?;
+let batch_embeddings =
+embedder.embed_document_batch(&chunk_texts[batch_start..batch_end])?;
all_embeddings.extend(batch_embeddings);
}

@@ -789,7 +793,7 @@

// Single LanceDB write for all chunks
if !all_chunk_batches.is_empty() {
-let schema = Arc::new(chunk_schema());
+let schema = Arc::new(chunk_schema(embedder.dimension()));
let batches = RecordBatchIterator::new(all_chunk_batches.into_iter().map(Ok), schema);
chunks_table
.add(Box::new(batches))
@@ -854,7 +858,7 @@ impl Database {
self.ensure_vector_index().await?;

// Generate query embedding
-let query_embedding = self.embedder.embed(query)?;
+let query_embedding = self.embedder.embed_query(query)?;
let min_similarity = filters.min_similarity.unwrap_or(0.3);

// We need to search both items and chunks, then merge results
@@ -1043,7 +1047,7 @@ impl Database {
min_similarity: f32,
limit: usize,
) -> Result<Vec<ConflictInfo>> {
-let embedding = self.embedder.embed(content)?;
+let embedding = self.embedder.embed_document(content)?;
self.find_similar_items_by_vector(&embedding, None, min_similarity, limit)
.await
}
@@ -1476,7 +1480,7 @@ fn detect_content_type(content: &str) -> ContentType {
// ==================== Arrow Conversion Helpers ====================

fn item_to_batch(item: &Item) -> Result<RecordBatch> {
-let schema = Arc::new(item_schema());
+let schema = Arc::new(item_schema(item.embedding.len()));

let id = StringArray::from(vec![item.id.as_str()]);
let content = StringArray::from(vec![item.content.as_str()]);
@@ -1577,7 +1581,7 @@ fn batch_to_items(batch: &RecordBatch) -> Result<Vec<Item>> {
}

fn chunk_to_batch(chunk: &Chunk) -> Result<RecordBatch> {
-let schema = Arc::new(chunk_schema());
+let schema = Arc::new(chunk_schema(chunk.embedding.len()));

let id = StringArray::from(vec![chunk.id.as_str()]);
let item_id = StringArray::from(vec![chunk.item_id.as_str()]);
@@ -1657,10 +1661,11 @@ fn batch_to_chunks(batch: &RecordBatch) -> Result<Vec<Chunk>> {
}

fn create_embedding_array(embedding: &[f32]) -> Result<FixedSizeListArray> {
+let dim = embedding.len();
let values = Float32Array::from(embedding.to_vec());
let field = Arc::new(Field::new("item", DataType::Float32, true));

-FixedSizeListArray::try_new(field, EMBEDDING_DIM as i32, Arc::new(values), None)
+FixedSizeListArray::try_new(field, dim as i32, Arc::new(values), None)
.map_err(|e| SedimentError::Database(format!("Failed to create vector: {}", e)))
}
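// Illustrative usage, not from this diff: the list width is now read from
// the slice itself, so 384-dim and 768-dim embeddings share one helper.
#[cfg(test)]
mod embedding_array_sketch {
    use super::*;

    #[test]
    fn width_is_inferred_from_input() {
        let embedding = vec![0.0f32; 768]; // e.g. bge-base-en-v1.5
        let arr = create_embedding_array(&embedding).unwrap();
        assert_eq!(arr.value_length(), 768);
    }
}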

@@ -1833,6 +1838,8 @@ mod tests {
assert!(version >= 2, "Schema version should be at least 2");
}

+use crate::embedder::EMBEDDING_DIM;
+
/// Build the old item schema (v1) that included a `tags` column.
fn old_item_schema() -> Schema {
Schema::new(vec![
@@ -1932,7 +1939,7 @@
.await
.unwrap();

-let schema = Arc::new(item_schema());
+let schema = Arc::new(item_schema(EMBEDDING_DIM));
db_conn
.create_empty_table("items", schema)
.execute()
@@ -2015,7 +2022,7 @@
.await
.unwrap();

-let schema = Arc::new(item_schema());
+let schema = Arc::new(item_schema(EMBEDDING_DIM));
let vector_values = Float32Array::from(vec![0.0f32; EMBEDDING_DIM]);
let vector_field = Arc::new(Field::new("item", DataType::Float32, true));
let vector = FixedSizeListArray::try_new(
@@ -2089,7 +2096,7 @@
.unwrap();

// items_migrated (leftover from failed migration)
-let new_schema = Arc::new(item_schema());
+let new_schema = Arc::new(item_schema(EMBEDDING_DIM));
db_conn
.create_empty_table("items_migrated", new_schema)
.execute()
@@ -2134,7 +2141,7 @@
.await
.unwrap();

-let new_schema = Arc::new(item_schema());
+let new_schema = Arc::new(item_schema(EMBEDDING_DIM));

// items with new schema
let vector_values = Float32Array::from(vec![0.0f32; EMBEDDING_DIM]);