Description
Use Case
Building a long-running agent system that accumulates millions to billions of memory vectors—episodic events, customer interactions, supply chain observations—over months or years.
Need a cost-effective way to store and query this memory without blowing through RAM budgets. Most in-memory indexes (HNSW, FAISS) become prohibitively expensive at scale.
Problem Statement
Hindsight's current vector backends are RAM-heavy. When memory grows past a few hundred million vectors, the infrastructure cost becomes a constraint—either you shard across many machines or you hit scaling limits.
DiskANN (from Microsoft Research) solves this by keeping most of the index on SSD and caching only hot data in RAM. It delivers 95%+ recall at a few milliseconds of latency with a fraction of the RAM footprint, and it's proven at billion scale in production (Bing, Azure Cognitive Search).
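The RAM/SSD split can be illustrated with a toy sketch. This is not the actual DiskANN algorithm (which beam-searches a Vamana graph and uses product quantization); it only shows the two-pass idea: cheap distance estimates over compressed vectors held in RAM, then exact re-ranking of the few candidates whose full vectors are fetched "from disk".

```python
import numpy as np

rng = np.random.default_rng(0)
# Full-precision vectors: in real DiskANN these live on SSD.
full = rng.standard_normal((10_000, 64)).astype(np.float32)
# Compressed copy held in RAM (float16 here stands in for product quantization).
compressed = full.astype(np.float16)

def search(query: np.ndarray, k: int = 10, candidates: int = 100) -> np.ndarray:
    # Pass 1 (RAM): approximate distances over the compressed vectors.
    approx = np.linalg.norm(compressed - query.astype(np.float16), axis=1)
    cand = np.argpartition(approx, candidates)[:candidates]
    # Pass 2 ("disk"): fetch only the candidate full vectors and re-rank exactly.
    exact = np.linalg.norm(full[cand] - query, axis=1)
    return cand[np.argsort(exact)[:k]]

query = rng.standard_normal(64).astype(np.float32)
ids = search(query)
```

Only `candidates` full vectors are ever touched per query, which is why the full-precision index can live on SSD without destroying latency.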
How This Feature Would Help
Add DiskANN as an optional vector store backend alongside existing options. This would let users:
- Run Hindsight on a single node with 100M to 1B+ vectors without massive RAM requirements
- Lower infrastructure costs for persistent, long-lived agent memory
- Keep the same Hindsight API—just swap the backend config
Implementation could be a thin adapter around Microsoft's DiskANN C++ library, or an integration with systems that already wrap it (e.g., certain cloud vector DBs).
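A minimal sketch of what "thin adapter, same API" could look like. Hindsight's real backend interface is not shown here, so `VectorStore`, `add`, and `search` are hypothetical names; the comments mark where calls into the DiskANN library (or a wrapper such as diskannpy) would go, with a brute-force fallback so the sketch runs standalone.

```python
from typing import Protocol, Sequence
import numpy as np

class VectorStore(Protocol):
    """Hypothetical backend interface; Hindsight's actual one may differ."""
    def add(self, ids: Sequence[str], vectors: np.ndarray) -> None: ...
    def search(self, query: np.ndarray, k: int) -> list[str]: ...

class DiskANNBackend:
    """Adapter skeleton: the commented lines are where DiskANN calls would go."""
    def __init__(self, index_dir: str):
        self.index_dir = index_dir        # real: load/mmap the SSD-resident index
        self._ids: list[str] = []
        self._vecs: list[np.ndarray] = []

    def add(self, ids: Sequence[str], vectors: np.ndarray) -> None:
        # real: insert into the DiskANN index (or stage for a background rebuild)
        self._ids.extend(ids)
        self._vecs.extend(np.asarray(vectors, dtype=np.float32))

    def search(self, query: np.ndarray, k: int) -> list[str]:
        # real: beam search over the on-SSD graph, re-rank from disk
        dists = [float(np.linalg.norm(v - query)) for v in self._vecs]
        return [self._ids[i] for i in np.argsort(dists)[:k]]

# Backend selection stays a config concern; callers only see VectorStore.
store: VectorStore = DiskANNBackend(index_dir="/tmp/diskann")
```

Because callers depend only on the interface, swapping an in-RAM backend for this one is a config change, which is the property the feature request is after.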
Proposed Solution
No response
Alternatives Considered
No response
Priority
Nice to have
Additional Context
No response
Checklist
- I would be willing to contribute this feature