
HMLR Go MCP Server

A Go implementation of the HMLR (Hierarchical Memory Lookup & Routing) memory system, exposed as an MCP (Model Context Protocol) server for integration with Claude Code and other MCP clients.

What is HMLR?

HMLR replaces traditional vector-only RAG systems with a structured, state-aware memory architecture that scores a perfect 1.00 on both faithfulness and context recall across the project's adversarial RAGAS benchmarks.

Key Features:

  • 🧠 Temporal Resolution: Automatically handles conflicting facts across time
  • 📋 Bridge Blocks: Organizes conversations by topic with smart routing
  • 🎯 4 Routing Scenarios: Continuation, Resumption, New Topic, Topic Shift
  • 🔍 Semantic Recall: Retrieves relevant memories even with zero keyword overlap
  • 💾 XDG Compliant: Stores data in ~/.local/share/remember/

Quick Start

Prerequisites

  • Go 1.23+ installed
  • OpenAI API key (for embeddings and LLM features)

Installation

  1. Clone the repository:
git clone <your-repo-url>
cd remember-standalone
  2. Create a .env file with your API keys:
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY

Example .env:

OPENAI_API_KEY=sk-proj-...
ANTHROPIC_API_KEY=sk-ant-...  # Optional, for future features
MEMORY_OPENAI_MODEL=gpt-4o-mini  # Optional, defaults to gpt-4o-mini
  3. Build the server:
go build -o bin/hmlr-server ./cmd/server
  4. Run the server:
./bin/hmlr-server

The server will start on stdio and wait for MCP protocol messages.
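MCP's stdio transport is newline-delimited JSON-RPC, so you can smoke-test the binary by piping a single initialize request into it. The protocolVersion and clientInfo values below are placeholders, not pinned by this project:

echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke-test","version":"0.0.0"}}}' | ./bin/hmlr-server

A JSON-RPC result on stdout confirms the server builds, starts, and speaks the protocol.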

Configuration

Environment Variables

Required:

  • OPENAI_API_KEY - Your OpenAI API key for embeddings and LLM features

Optional:

  • MEMORY_OPENAI_MODEL - Chat model to use (default: gpt-4o-mini)
    • Options: gpt-4o-mini, gpt-4o, o1-mini, etc.
    • Affects metadata extraction, fact extraction, and user profile learning
    • gpt-4o-mini is recommended for best balance of speed and quality

Model Selection Guide:

  • gpt-4o-mini: Recommended - Good balance of speed, quality, and cost (~$0.15/1M input tokens)
  • gpt-4o: Highest quality, slower, more expensive (~$2.50/1M input tokens)
  • o1-mini: Advanced reasoning capabilities (~$3/1M input tokens)
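The fallback behavior amounts to a one-line environment lookup. A minimal Go sketch of how such a default can be resolved (illustrative only, not the project's actual code):

package main

import (
    "fmt"
    "os"
)

// chatModel returns the configured chat model, falling back to the
// documented default when MEMORY_OPENAI_MODEL is unset.
func chatModel() string {
    if m := os.Getenv("MEMORY_OPENAI_MODEL"); m != "" {
        return m
    }
    return "gpt-4o-mini"
}

func main() {
    fmt.Println("using chat model:", chatModel())
}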

Usage with Claude Code

Add to your Claude Code MCP settings (~/.config/claude-code/mcp_settings.json):

{
  "mcpServers": {
    "remember": {
      "command": "/path/to/remember-standalone/bin/hmlr-server",
      "args": [],
      "env": {
        "OPENAI_API_KEY": "your-key-here",
        "MEMORY_OPENAI_MODEL": "gpt-4o-mini"
      }
    }
  }
}

MCP Tools

The server exposes 5 MCP tools:

1. store_conversation

Store a conversation turn in HMLR memory.

Input:

{
  "message": "What's the capital of France?",
  "context": "Optional additional context"
}

Output:

{
  "block_id": "block_20251206_143022",
  "turn_id": "turn_20251206_143022_abc123",
  "routing_scenario": "topic_continuation",
  "facts_extracted": 1
}
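For Go clients, the result maps cleanly onto a small struct. A hypothetical mirror of the JSON above (the field names come from the example output; the type itself is not part of this project's published API):

type StoreConversationResult struct {
    BlockID         string `json:"block_id"`
    TurnID          string `json:"turn_id"`
    RoutingScenario string `json:"routing_scenario"`
    FactsExtracted  int    `json:"facts_extracted"`
}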

2. retrieve_memory

Search for relevant memories.

Input:

{
  "query": "What did we discuss about France?",
  "max_results": 5
}

Output:

{
  "memories": [
    {
      "block_id": "block_20251206_143022",
      "topic_label": "Geography",
      "relevance_score": 0.95,
      "summary": "Discussion about European capitals",
      "turns": [...]
    }
  ],
  "facts": [
    {
      "key": "capital_of_France",
      "value": "Paris",
      "confidence": 1.0
    }
  ]
}
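The relevance_score is plausibly the cosine similarity between the query's embedding and a stored block's embedding (Phase 5 below lists "Semantic memory search with cosine similarity"). A minimal, self-contained sketch of that metric, illustrative rather than the project's internal code:

package main

import (
    "fmt"
    "math"
)

// cosine returns the cosine similarity of two equal-length vectors:
// dot(a, b) / (|a| * |b|). Returns 0 for mismatched or zero vectors.
func cosine(a, b []float64) float64 {
    if len(a) != len(b) || len(a) == 0 {
        return 0
    }
    var dot, na, nb float64
    for i := range a {
        dot += a[i] * b[i]
        na += a[i] * a[i]
        nb += b[i] * b[i]
    }
    if na == 0 || nb == 0 {
        return 0
    }
    return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
    q := []float64{0.1, 0.8, 0.3}
    doc := []float64{0.2, 0.7, 0.4}
    fmt.Printf("relevance: %.2f\n", cosine(q, doc))
}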

3. list_active_topics

List all active conversation topics.

Output:

{
  "topics": [
    {
      "block_id": "block_20251206_143022",
      "topic_label": "Geography",
      "status": "ACTIVE",
      "turn_count": 3,
      "created_at": "2025-12-06T14:30:22Z"
    }
  ]
}

4. get_topic_history

Get full conversation history for a topic.

Input:

{
  "block_id": "block_20251206_143022"
}

5. get_user_profile

Get the user profile summary.

Output:

{
  "profile": {
    "preferences": {},
    "topics_of_interest": [],
    "last_updated": "2025-12-06T14:30:22Z"
  }
}

Architecture

HMLR Memory System
├── Governor         # Smart routing (4 scenarios)
├── ChunkEngine      # Hierarchical chunking (turn → paragraph → sentence)
├── Storage          # XDG-compliant file + SQLite storage
├── Bridge Blocks    # Topic-based conversation organization
└── MCP Server       # Stdio transport for Claude Code integration
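The Governor's four scenarios map naturally onto an enum plus a decision function. A sketch of what that choice might look like in Go (the names, thresholds, and logic here are hypothetical illustrations, not code from this repository):

package main

import "fmt"

type RoutingScenario int

const (
    TopicContinuation RoutingScenario = iota // same topic, active block
    TopicResumption                          // same topic, dormant block
    NewTopic                                 // no similar block exists
    TopicShift                               // active block exists, but message diverges
)

// route is a hypothetical decision: compare the new message's similarity
// to the currently active block and to the best dormant block.
func route(simToActive, simToDormant, threshold float64, hasActive bool) RoutingScenario {
    switch {
    case hasActive && simToActive >= threshold:
        return TopicContinuation
    case simToDormant >= threshold:
        return TopicResumption
    case hasActive:
        return TopicShift
    default:
        return NewTopic
    }
}

func main() {
    fmt.Println(route(0.91, 0.20, 0.75, true)) // prints 0 (TopicContinuation)
}

The useful property is that every incoming message lands in exactly one scenario, so downstream storage and retrieval can branch on a single value.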

Storage Layout

~/.local/share/remember/
├── bridge_blocks/
│   └── 2025-12-06/
│       ├── block_*.json         # Conversation topics
│       └── day_metadata.json
├── facts.db                      # SQLite fact database
├── embeddings/
│   └── *.json                    # Vector embeddings
└── user_profile.json             # Long-term user profile
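XDG compliance means honoring XDG_DATA_HOME when set and falling back to ~/.local/share otherwise. A sketch of the resolution (illustrative; the project's own initialization code may differ):

package main

import (
    "fmt"
    "os"
    "path/filepath"
)

// dataDir resolves the XDG data directory for the "remember" app:
// $XDG_DATA_HOME/remember if set, else ~/.local/share/remember.
func dataDir() (string, error) {
    if xdg := os.Getenv("XDG_DATA_HOME"); xdg != "" {
        return filepath.Join(xdg, "remember"), nil
    }
    home, err := os.UserHomeDir()
    if err != nil {
        return "", err
    }
    return filepath.Join(home, ".local", "share", "remember"), nil
}

func main() {
    dir, err := dataDir()
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    fmt.Println(dir)
}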

Development

Running Tests

# Run all scenario tests (with REAL storage, no mocks!)
go test -v ./.scratch/

# Run specific scenario
go test -v ./.scratch/ -run TestScenario01

Project Structure

remember-standalone/
├── cmd/
│   └── server/           # Main entry point
├── internal/
│   ├── core/            # Governor, ChunkEngine
│   ├── storage/         # Storage implementation
│   ├── models/          # Data structures
│   └── mcp/             # MCP tools and handlers
├── .scratch/            # Scenario tests (not committed)
├── scenarios.jsonl      # Documented test scenarios
└── DESIGN.md           # Full architecture design

Scenario Testing

This project follows scenario-driven testing with zero mocks:

# All tests use REAL storage, REAL SQLite, REAL files
.scratch/scenario_01_store_retrieve_test.go  # Basic storage
.scratch/scenario_02_routing_test.go         # Governor routing
.scratch/scenario_03_chunking_test.go        # Hierarchical chunking

See scenarios.jsonl for documented test scenarios.
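In practice, a zero-mock scenario test just points real storage at a throwaway directory. A stdlib-only skeleton of the pattern (a hypothetical test, not one of the files listed above):

package scenarios

import (
    "encoding/json"
    "os"
    "path/filepath"
    "testing"
)

// TestScenarioStoreAndReload writes a bridge-block JSON file to a real
// temp directory and reads it back: no mocks, real filesystem.
func TestScenarioStoreAndReload(t *testing.T) {
    dir := t.TempDir() // real directory, cleaned up automatically

    block := map[string]any{"block_id": "block_test", "topic_label": "Geography"}
    raw, err := json.Marshal(block)
    if err != nil {
        t.Fatal(err)
    }

    path := filepath.Join(dir, "block_test.json")
    if err := os.WriteFile(path, raw, 0o644); err != nil {
        t.Fatal(err)
    }

    got, err := os.ReadFile(path)
    if err != nil {
        t.Fatal(err)
    }
    var reloaded map[string]any
    if err := json.Unmarshal(got, &reloaded); err != nil {
        t.Fatal(err)
    }
    if reloaded["block_id"] != "block_test" {
        t.Fatalf("got %v, want block_test", reloaded["block_id"])
    }
}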

Implementation Status

✅ Phase 1: Foundation

  • XDG storage initialization
  • Bridge Block JSON storage
  • SQLite facts database

✅ Phase 2: Core Components

  • Governor with 4 routing scenarios
  • ChunkEngine for hierarchical chunking
  • FactScrubber with LLM-based extraction
  • ContextHydrator for intelligent prompt assembly
  • LatticeCrawler for vector-based candidate retrieval

✅ Phase 3: Background Agents

  • Scribe agent for async user profile learning
  • Goroutine-based async processing

✅ Phase 4: MCP Server

  • Stdio transport
  • All 5 MCP tools implemented
  • .env loading for API keys

✅ Phase 5: Advanced Features

  • Vector embeddings with OpenAI (text-embedding-3-small)
  • LLM-based fact extraction (GPT-4o-mini)
  • Semantic memory search with cosine similarity
  • Dynamic user profiles with intelligent merging
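
Fetching a text-embedding-3-small vector is a single POST to OpenAI's public /v1/embeddings endpoint. A stdlib-only sketch of that call (the endpoint and payload follow OpenAI's documented API; the project's actual client code may differ):

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
    "os"
)

type embeddingResponse struct {
    Data []struct {
        Embedding []float64 `json:"embedding"`
    } `json:"data"`
}

// embed requests a text-embedding-3-small vector for the given input.
func embed(input string) ([]float64, error) {
    body, _ := json.Marshal(map[string]string{
        "model": "text-embedding-3-small",
        "input": input,
    })
    req, err := http.NewRequest("POST", "https://api.openai.com/v1/embeddings", bytes.NewReader(body))
    if err != nil {
        return nil, err
    }
    req.Header.Set("Authorization", "Bearer "+os.Getenv("OPENAI_API_KEY"))
    req.Header.Set("Content-Type", "application/json")

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("embeddings request failed: %s", resp.Status)
    }

    var out embeddingResponse
    if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
        return nil, err
    }
    if len(out.Data) == 0 {
        return nil, fmt.Errorf("no embedding returned")
    }
    return out.Data[0].Embedding, nil
}

func main() {
    vec, err := embed("What is the capital of France?")
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    fmt.Println("dimensions:", len(vec))
}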

Roadmap

Completed:

  • Integrate OpenAI embeddings (text-embedding-3-small)
  • Implement FactScrubber with LLM
  • Add vector-based semantic search
  • Implement Scribe agent for user profiles
  • Add ContextHydrator for prompt assembly
  • Implement LatticeCrawler for memory retrieval
  • RAGAS benchmark tests (3/3 passing at 1.00)

Remaining:

  • Performance optimizations
  • Port remaining RAGAS tests (7C, 8, 9, 12)

Contributing

This project uses:

  • TDD - Write tests first, then implement
  • Scenario Testing - Real dependencies, no mocks
  • Subagent-Driven Development - Fresh context per task

See DESIGN.md for full architecture details.

License

MIT License - See LICENSE file
