A Go implementation of the HMLR (Hierarchical Memory Lookup & Routing) memory system, exposed as an MCP (Model Context Protocol) server for integration with Claude Code and other MCP clients.
HMLR replaces traditional vector-only RAG systems with a structured, state-aware memory architecture that achieves perfect (1.00) faithfulness and context-recall scores on adversarial RAGAS benchmarks.
Key Features:
- Temporal Resolution: Automatically handles conflicting facts across time
- Bridge Blocks: Organizes conversations by topic with smart routing
- 4 Routing Scenarios: Continuation, Resumption, New Topic, Topic Shift
- Semantic Recall: Retrieves relevant memories even with zero keyword overlap
- XDG Compliant: Stores data in `~/.local/share/remember/`
Prerequisites:

- Go 1.23+ installed
- OpenAI API key (for embeddings and LLM features)
- Clone the repository:

```bash
git clone <your-repo-url>
cd remember-standalone
```

- Create a `.env` file with your API keys:

```bash
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
```

Example `.env`:

```bash
OPENAI_API_KEY=sk-proj-...
ANTHROPIC_API_KEY=sk-ant-...     # Optional, for future features
MEMORY_OPENAI_MODEL=gpt-4o-mini  # Optional, defaults to gpt-4o-mini
```

- Build the server:

```bash
go build -o bin/hmlr-server ./cmd/server
```

- Run the server:

```bash
./bin/hmlr-server
```

The server will start on stdio and wait for MCP protocol messages.
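To poke at the stdio transport by hand, a throwaway Go client like the sketch below can spawn the binary and send one message. This assumes newline-delimited JSON-RPC framing, and the `initialize` parameters shown are illustrative rather than a complete MCP handshake:

```go
// smoke.go - a rough, throwaway stdio client for manual testing.
package main

import (
	"bufio"
	"fmt"
	"io"
	"log"
	"os/exec"
)

func main() {
	cmd := exec.Command("./bin/hmlr-server")
	stdin, err := cmd.StdinPipe()
	if err != nil {
		log.Fatal(err)
	}
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		log.Fatal(err)
	}
	if err := cmd.Start(); err != nil {
		log.Fatal(err)
	}

	// Send a single JSON-RPC 2.0 initialize request (params illustrative).
	req := `{"jsonrpc":"2.0","id":1,"method":"initialize","params":` +
		`{"protocolVersion":"2024-11-05","capabilities":{},` +
		`"clientInfo":{"name":"smoke","version":"0.0.1"}}}` + "\n"
	if _, err := io.WriteString(stdin, req); err != nil {
		log.Fatal(err)
	}

	// Read and print the server's first response line.
	line, err := bufio.NewReader(stdout).ReadString('\n')
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(line)
}
```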
Required:

- `OPENAI_API_KEY` - Your OpenAI API key for embeddings and LLM features

Optional:

- `MEMORY_OPENAI_MODEL` - Chat model to use (default: `gpt-4o-mini`)
  - Options: `gpt-4o-mini`, `gpt-4o`, `o1-mini`, etc.
  - Affects metadata extraction, fact extraction, and user profile learning
  - `gpt-4o-mini` is recommended for the best balance of speed and quality
Model Selection Guide:

- `gpt-4o-mini`: Recommended - good balance of speed, quality, and cost (~$0.15/1M input tokens)
- `gpt-4o`: Highest quality, slower, more expensive (~$2.50/1M input tokens)
- `o1-mini`: Advanced reasoning capabilities (~$3/1M input tokens)
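The default-model behavior described above amounts to a lookup like the following sketch (the actual config code in the server may differ):

```go
package main

import "os"

// chatModel returns the chat model configured via MEMORY_OPENAI_MODEL,
// falling back to the documented default when the variable is unset.
func chatModel() string {
	if m := os.Getenv("MEMORY_OPENAI_MODEL"); m != "" {
		return m
	}
	return "gpt-4o-mini" // documented default
}
```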
Add to your Claude Code MCP settings (`~/.config/claude-code/mcp_settings.json`):

```json
{
  "mcpServers": {
    "remember": {
      "command": "/path/to/remember-standalone/bin/hmlr-server",
      "args": [],
      "env": {
        "OPENAI_API_KEY": "your-key-here",
        "MEMORY_OPENAI_MODEL": "gpt-4o-mini"
      }
    }
  }
}
```

The server exposes 5 MCP tools:
Store a conversation turn in HMLR memory.
Input:

```json
{
  "message": "What's the capital of France?",
  "context": "Optional additional context"
}
```

Output:

```json
{
  "block_id": "block_20251206_143022",
  "turn_id": "turn_20251206_143022_abc123",
  "routing_scenario": "topic_continuation",
  "facts_extracted": 1
}
```

Search for relevant memories.
Input:

```json
{
  "query": "What did we discuss about France?",
  "max_results": 5
}
```

Output:

```json
{
  "memories": [
    {
      "block_id": "block_20251206_143022",
      "topic_label": "Geography",
      "relevance_score": 0.95,
      "summary": "Discussion about European capitals",
      "turns": [...]
    }
  ],
  "facts": [
    {
      "key": "capital_of_France",
      "value": "Paris",
      "confidence": 1.0
    }
  ]
}
```

List all active conversation topics.
Output:

```json
{
  "topics": [
    {
      "block_id": "block_20251206_143022",
      "topic_label": "Geography",
      "status": "ACTIVE",
      "turn_count": 3,
      "created_at": "2025-12-06T14:30:22Z"
    }
  ]
}
```

Get full conversation history for a topic.

Input:

```json
{
  "block_id": "block_20251206_143022"
}
```

Get the user profile summary.
Output:

```json
{
  "profile": {
    "preferences": {},
    "topics_of_interest": [],
    "last_updated": "2025-12-06T14:30:22Z"
  }
}
```

```
HMLR Memory System
├── Governor       # Smart routing (4 scenarios)
├── ChunkEngine    # Hierarchical chunking (turn → paragraph → sentence)
├── Storage        # XDG-compliant file + SQLite storage
├── Bridge Blocks  # Topic-based conversation organization
└── MCP Server     # Stdio transport for Claude Code integration
```
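For orientation, the Governor's four routing outcomes map naturally onto an enum like the sketch below. The Go names are illustrative; only the `topic_continuation` string appears verbatim in the store tool's output above, and the other string forms are assumptions:

```go
package core

// RoutingScenario enumerates the Governor's four routing outcomes.
type RoutingScenario int

const (
	TopicContinuation RoutingScenario = iota // keep appending to the active block
	TopicResumption                          // reopen a dormant block on the same topic
	NewTopic                                 // no related block exists; create one
	TopicShift                               // a related block exists, but the focus moved
)

// String returns the wire form reported in tool output; values after
// "topic_continuation" are assumed, not confirmed by the docs above.
func (s RoutingScenario) String() string {
	return [...]string{
		"topic_continuation",
		"topic_resumption",
		"new_topic",
		"topic_shift",
	}[s]
}
```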
```
~/.local/share/remember/
├── bridge_blocks/
│   └── 2025-12-06/
│       ├── block_*.json       # Conversation topics
│       └── day_metadata.json
├── facts.db                   # SQLite fact database
├── embeddings/
│   └── *.json                 # Vector embeddings
└── user_profile.json          # Long-term user profile
```
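The XDG-compliant root above can be resolved with a helper along these lines, honoring `$XDG_DATA_HOME` per the Base Directory spec (a sketch; the server's actual resolution logic may differ):

```go
package storage

import (
	"os"
	"path/filepath"
)

// dataDir resolves the store's root directory: $XDG_DATA_HOME/remember
// when the variable is set, otherwise ~/.local/share/remember.
func dataDir() (string, error) {
	if xdg := os.Getenv("XDG_DATA_HOME"); xdg != "" {
		return filepath.Join(xdg, "remember"), nil
	}
	home, err := os.UserHomeDir()
	if err != nil {
		return "", err
	}
	return filepath.Join(home, ".local", "share", "remember"), nil
}
```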
```bash
# Run all scenario tests (with REAL storage, no mocks!)
go test -v ./.scratch/

# Run specific scenario
go test -v ./.scratch/ -run TestScenario01
```

Project structure:

```
remember-standalone/
├── cmd/
│   └── server/          # Main entry point
├── internal/
│   ├── core/            # Governor, ChunkEngine
│   ├── storage/         # Storage implementation
│   ├── models/          # Data structures
│   └── mcp/             # MCP tools and handlers
├── .scratch/            # Scenario tests (not committed)
├── scenarios.jsonl      # Documented test scenarios
└── DESIGN.md            # Full architecture design
```
This project follows scenario-driven testing with zero mocks:
```
# All tests use REAL storage, REAL SQLite, REAL files
.scratch/scenario_01_store_retrieve_test.go   # Basic storage
.scratch/scenario_02_routing_test.go          # Governor routing
.scratch/scenario_03_chunking_test.go         # Hierarchical chunking
```

See scenarios.jsonl for documented test scenarios.
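A scenario test in this style touches the real filesystem rather than a mock. The sketch below round-trips a bridge block as JSON through a temp directory; the `BridgeBlock` fields are made up for the example, and the real model types live in `internal/models`:

```go
package scratch

import (
	"encoding/json"
	"os"
	"path/filepath"
	"testing"
)

// BridgeBlock is an illustrative stand-in for the project's real model.
type BridgeBlock struct {
	BlockID    string `json:"block_id"`
	TopicLabel string `json:"topic_label"`
}

// TestScenarioRoundTrip writes a block to a real directory and reads it
// back - no mocks, mirroring the scenario style described above.
func TestScenarioRoundTrip(t *testing.T) {
	dir := t.TempDir() // isolated, real on-disk storage root per test
	path := filepath.Join(dir, "block_test.json")

	want := BridgeBlock{BlockID: "block_test", TopicLabel: "Geography"}
	data, err := json.Marshal(want)
	if err != nil {
		t.Fatal(err)
	}
	if err := os.WriteFile(path, data, 0o644); err != nil {
		t.Fatal(err)
	}

	raw, err := os.ReadFile(path)
	if err != nil {
		t.Fatal(err)
	}
	var got BridgeBlock
	if err := json.Unmarshal(raw, &got); err != nil {
		t.Fatal(err)
	}
	if got != want {
		t.Fatalf("round-trip mismatch: got %+v, want %+v", got, want)
	}
}
```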
✅ Phase 1: Foundation
- XDG storage initialization
- Bridge Block JSON storage
- SQLite facts database
✅ Phase 2: Core Components
- Governor with 4 routing scenarios
- ChunkEngine for hierarchical chunking
- FactScrubber with LLM-based extraction
- ContextHydrator for intelligent prompt assembly
- LatticeCrawler for vector-based candidate retrieval
✅ Phase 3: Background Agents
- Scribe agent for async user profile learning
- Goroutine-based async processing
✅ Phase 4: MCP Server
- Stdio transport
- All 5 MCP tools implemented
- .env loading for API keys
✅ Phase 5: Advanced Features
- Vector embeddings with OpenAI (text-embedding-3-small)
- LLM-based fact extraction (GPT-4o-mini)
- Semantic memory search with cosine similarity (see the sketch after this list)
- Dynamic user profiles with intelligent merging
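The cosine-similarity scoring mentioned above is the standard dot-product-over-norms form; a minimal Go version for reference (not the project's actual implementation):

```go
package core

import "math"

// cosine returns the cosine similarity between two embedding vectors:
// dot(a, b) / (|a| * |b|). Returns 0 for mismatched or zero vectors.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}
```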
- Integrate OpenAI embeddings (text-embedding-3-small)
- Implement FactScrubber with LLM
- Add vector-based semantic search
- Implement Scribe agent for user profiles
- Add ContextHydrator for prompt assembly
- Implement LatticeCrawler for memory retrieval
- RAGAS benchmark tests (3/3 passing at 1.00)
- Performance optimizations
- Port remaining RAGAS tests (7C, 8, 9, 12)
This project uses:
- TDD - Write tests first, then implement
- Scenario Testing - Real dependencies, no mocks
- Subagent-Driven Development - Fresh context per task
See DESIGN.md for full architecture details.
MIT License - See LICENSE file