
[Feature]: Compressed Memory Snippets for Token Efficiency #89

@zircote

Description

Problem Statement

When memories are included in LLM context, they often contain verbose content that consumes unnecessary tokens. Mem0 reports an 80% token cost reduction through intelligent compression while maintaining semantic meaning.

Currently, Subcog includes full memory content in recalls, leading to:

  • Higher token costs
  • Reduced context window availability
  • Potential truncation of important content

Proposed Solution

Implement intelligent memory compression:

  1. Snippet Generation: Create concise summaries of memories
  2. Adaptive Compression: Adjust verbosity based on available context
  3. Key Point Extraction: Identify and preserve critical information
  4. Expandable References: Link to full content when needed

Compression strategies:

  • LLM-based summarization (highest quality, highest cost)
  • Extractive summarization (fast, preserves original text)
  • Structured extraction (facts, decisions, actions)
  • Hybrid based on memory type
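As a rough illustration of the extractive option, here is a minimal sketch (the `extractive_snippet` helper is hypothetical, not existing Subcog code) that keeps the first `n` sentences of a memory verbatim:

```rust
// Hypothetical sketch of the extractive strategy: keep the first `n`
// sentences of a memory, preserving the original wording.
fn extractive_snippet(content: &str, n: usize) -> String {
    content
        .split_inclusive(". ") // naive sentence split on ". "
        .take(n)
        .collect::<String>()
        .trim_end()
        .to_string()
}

fn main() {
    let memory = "Use JWT for auth. Refresh tokens rotate. Expiry is 15 minutes.";
    // Prints: Use JWT for auth. Refresh tokens rotate.
    println!("{}", extractive_snippet(memory, 2));
}
```

A real implementation would want a proper sentence segmenter and sentence scoring (e.g., keyword overlap with the query), but the fast, text-preserving character of the strategy is the same.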

Proposed API:

pub struct CompressionConfig {
    strategy: CompressionStrategy,
    target_ratio: f32,          // e.g., 0.2 = 20% of original
    preserve_keywords: bool,    // keep query-matched keywords verbatim
    include_source_ref: bool,   // attach a subcog:// URI to the full content
}

pub enum CompressionStrategy {
    None,                                // return full content unchanged
    Extractive { sentences: usize },     // keep top N original sentences
    Abstractive { model: String },       // LLM-generated summary
    Structured { template: String },     // extract facts/decisions/actions
    Adaptive { token_budget: usize },    // pick a strategy to fit the budget
}
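One way the `Adaptive` variant could resolve to a concrete strategy is sketched below. The ~4-characters-per-token estimate and the one-sentence-per-30-tokens ratio are illustrative assumptions, not part of the proposal:

```rust
#[derive(Debug, PartialEq)]
enum CompressionStrategy {
    None,
    Extractive { sentences: usize },
    Adaptive { token_budget: usize },
}

// Sketch: resolve `Adaptive` to a concrete strategy using a rough
// ~4-characters-per-token estimate (an assumption, not a real tokenizer).
fn resolve(strategy: CompressionStrategy, content: &str) -> CompressionStrategy {
    match strategy {
        CompressionStrategy::Adaptive { token_budget } => {
            let est_tokens = content.len() / 4;
            if est_tokens <= token_budget {
                CompressionStrategy::None // already fits; skip compression
            } else {
                // Illustrative heuristic: one sentence per 30 budgeted tokens.
                CompressionStrategy::Extractive {
                    sentences: (token_budget / 30).max(1),
                }
            }
        }
        other => other, // concrete strategies pass through unchanged
    }
}

fn main() {
    let long_memory = "x".repeat(1000); // ~250 estimated tokens
    println!(
        "{:?}",
        resolve(CompressionStrategy::Adaptive { token_budget: 60 }, &long_memory)
    );
}
```

The appeal of resolving `Adaptive` up front is that the rest of the pipeline only ever sees concrete strategies.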

MCP tool parameters:

subcog_recall:
  query: "authentication patterns"
  compression: "adaptive"
  token_budget: 500  # Max tokens for all results
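Since `token_budget` caps all results together, the server would need to apportion it across recalled memories. A trivial even split with a minimum floor (a hypothetical helper, not an existing parameter) might look like:

```rust
// Sketch: divide a shared token budget evenly across `k` recalled memories,
// guaranteeing each at least `floor` tokens so no snippet degenerates.
fn per_memory_budget(total_budget: usize, k: usize, floor: usize) -> usize {
    if k == 0 {
        return total_budget;
    }
    (total_budget / k).max(floor)
}

fn main() {
    // A 500-token budget over 5 results gives 100 tokens each.
    println!("{}", per_memory_budget(500, 5, 25));
}
```

A smarter allocator could weight the split by relevance score instead of dividing evenly.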

Response format:

{
  "memories": [
    {
      "id": "abc123",
      "snippet": "Auth: Use JWT with refresh tokens, 15min expiry",
      "full_content_ref": "subcog://memory/abc123",
      "compression_ratio": 0.15
    }
  ]
}
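The `compression_ratio` field could be computed as snippet size over original size. A sketch follows, using character length as a stand-in for token counts (an assumption; a real implementation would count tokens):

```rust
// Sketch: compression ratio as snippet length over original length.
// Character counts stand in for token counts here (an assumption).
fn compression_ratio(original: &str, snippet: &str) -> f32 {
    if original.is_empty() {
        return 1.0; // nothing to compress
    }
    snippet.len() as f32 / original.len() as f32
}

fn main() {
    let original = "a".repeat(300);
    let snippet = "a".repeat(45);
    // 45 / 300 = 0.15, matching the example response above.
    println!("{:.2}", compression_ratio(&original, &snippet));
}
```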

Alternatives Considered

  • Full content always (current) - token inefficient
  • Truncation only - drops the tail of each memory
  • User-provided summaries - maintenance burden

Additional Context

  • Mem0 claims 80% token reduction
  • Extends existing detail levels (light/medium/everything)
  • Important for cost-conscious deployments

Breaking Change: No
Priority: Nice to have
Contribution: Yes, with guidance

Labels: enhancement (New feature or request)