Merged
18 changes: 9 additions & 9 deletions BENCHMARKS.md
@@ -8,22 +8,22 @@ Measured on Apple M3 Max, 36GB RAM.

| Metric | Sediment | ChromaDB | Mem0 |
|--------|----------|----------|------|
-| Recall@1 | 45.0% | 47.0% | 47.0% |
-| Recall@3 | **69.0%** | 69.0% | 69.0% |
-| Recall@5 | 78.0% | 78.5% | 78.5% |
-| Recall@10 | **90.5%** | 90.0% | 90.0% |
-| MRR | 59.0% | 60.8% | 60.8% |
+| Recall@1 | **50.0%** | 47.0% | 47.0% |
+| Recall@3 | 69.0% | 69.0% | 69.0% |
+| Recall@5 | 77.5% | 78.5% | 78.5% |
+| Recall@10 | 89.5% | 90.0% | 90.0% |
+| MRR | **61.9%** | 60.8% | 60.8% |
| Recency@1 | **100%** | 14% | 14% |
| Consolidation | **99%** | 0% | 0% |
-| Store p50 | 22ms | 696ms | 16ms |
-| Recall p50 | 26ms | 694ms | 8ms |
+| Store p50 | 49ms | 696ms | 16ms |
+| Recall p50 | 103ms | 694ms | 8ms |
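
For readers unfamiliar with the Recall@k and MRR rows, here is a minimal sketch of how such metrics are commonly computed. It assumes exactly one relevant memory per query, which may not match the benchmark's actual methodology; the function names and toy data are illustrative and are not taken from the Sediment codebase.

```python
from typing import Hashable, Sequence


def recall_at_k(ranked: Sequence[Sequence[Hashable]],
                relevant: Sequence[Hashable],
                k: int) -> float:
    """Fraction of queries whose relevant item appears in the top-k results."""
    hits = sum(1 for results, target in zip(ranked, relevant)
               if target in results[:k])
    return hits / len(relevant)


def mean_reciprocal_rank(ranked: Sequence[Sequence[Hashable]],
                         relevant: Sequence[Hashable]) -> float:
    """Average of 1/rank of the relevant item; a miss contributes 0."""
    total = 0.0
    for results, target in zip(ranked, relevant):
        if target in results:
            total += 1.0 / (list(results).index(target) + 1)
    return total / len(relevant)


# Toy example (hypothetical data): four queries, one relevant id each.
ranked = [["a", "b", "c"], ["x", "y", "z"], ["p", "q", "r"], ["m", "n", "o"]]
relevant = ["a", "y", "r", "missing"]

print(recall_at_k(ranked, relevant, 1))  # 1 of 4 queries hit at rank 1
print(recall_at_k(ranked, relevant, 3))  # 3 of 4 queries hit within top 3
print(mean_reciprocal_rank(ranked, relevant))
```

Under this definition, a higher Recall@1 and MRR together indicate that the correct item lands at or near the top more often, even when Recall@5 or Recall@10 is slightly lower.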

## Key takeaways

-- **Retrieval quality**: Within 0.5pp of ChromaDB on Recall@5 (78.0% vs 78.5%), matching on Recall@3
+- **Retrieval quality**: Best R@1 (50.0%) and MRR (61.9%); the top result is correct more often than with the alternatives
- **Temporal correctness**: 100% Recency@1 — updated memories always rank first. Others: 14%
- **Deduplication**: 99% consolidation rate — near-duplicates auto-merged. Others: 0%
-- **Latency**: 32x faster store than ChromaDB (22ms vs 696ms). All operations local, no network
+- **Latency**: 14x faster store than ChromaDB (49ms vs 696ms). All operations local, no network
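
The latency multiple in the takeaways is the ratio of two p50 values. As a quick check, the sketch below recomputes it from the updated Store p50 row; the names `store_p50_ms` and `speedup` are illustrative only.

```python
# Store p50 latencies in milliseconds, copied from the updated table row.
store_p50_ms = {"Sediment": 49, "ChromaDB": 696, "Mem0": 16}


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline (>1 means faster)."""
    return baseline_ms / candidate_ms


for name in ("ChromaDB", "Mem0"):
    ratio = speedup(store_p50_ms[name], store_p50_ms["Sediment"])
    print(f"Sediment vs {name}: {ratio:.1f}x")
```

A ratio below 1 means the baseline is the faster system, which is the case for Mem0 on this row (16ms vs 49ms).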

## Methodology
