Implement Reflection memory with LLM-based synthesis #43

@fogfish

Description

Problem: Need a working implementation of reflection-based memory that demonstrates the pattern without adding external dependencies. This provides both a useful memory strategy and a reference implementation for users.

Design Goals:

  1. Self-contained: No external dependencies beyond existing chatter LLM API
  2. Configurable: Customizable reflection triggers and synthesis prompts
  3. Practical: Works well with small observation counts (not just 1000s)
  4. Performant: Doesn't reflect on every commit
  5. Extensible: Easy to customize scoring and synthesis logic

Architecture (Reflection Memory):

┌─────────────────────────────────────────┐
│         Reflection Memory               │
├─────────────────────────────────────────┤
│  Recent: Stream (last N observations)   │
│  Insights: Synthesized high-level facts │
├─────────────────────────────────────────┤
│  Commit() → Store in recent             │
│             Auto-reflect if threshold   │
│                                         │
│  Reflect() → Ask LLM to synthesize      │
│              Extract key insights       │
│              Evict low-importance obs   │
│                                         │
│  Context() → Recent + Insights          │
└─────────────────────────────────────────┘
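
The following is a minimal lifecycle sketch of the three operations in the diagram, using the NewReflection API proposed below; llm, obs, and prompt are placeholders for values produced elsewhere:

// Lifecycle sketch for the reflection memory proposed in this issue.
mem := memory.NewReflection(llm, "You are a helpful assistant", memory.DefaultReflectionConfig)

// Commit stores the observation in the recent buffer; after every
// ReflectEvery commits an asynchronous reflection is scheduled automatically.
mem.Commit(obs)

// Reflect can also be called manually: the LLM synthesizes insights from the
// recent buffer, then low-importance observations are evicted.
if err := mem.Reflect(context.Background()); err != nil {
    // handle synthesis failure (e.g. log and continue)
}

// Context assembles system prompt + insights + recent history + the new prompt.
messages := mem.Context(prompt)
_ = messages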

Required Changes:

  1. Create Reflection memory (memory/reflection.go):
package memory

import (
    "context"
    "encoding/json"
    "fmt"
    "sort"
    "strings"
    "sync"

    "github.com/kshard/chatter"
    "github.com/kshard/thinker"
)

// ReflectionConfig controls reflection behavior
type ReflectionConfig struct {
    // RecentCapacity: number of recent observations to keep
    RecentCapacity int
    
    // InsightCapacity: maximum number of insights to retain
    InsightCapacity int
    
    // ReflectEvery: trigger reflection after N commits (0 = manual only)
    ReflectEvery int
    
    // ImportanceThreshold: minimum importance to avoid eviction (0.0-1.0)
    ImportanceThreshold float64
    
    // SynthesisPrompt: custom prompt for LLM synthesis (nil = default)
    SynthesisPrompt func([]*thinker.Observation) chatter.Message
}

// DefaultReflectionConfig provides sensible defaults
var DefaultReflectionConfig = ReflectionConfig{
    RecentCapacity:      20,
    InsightCapacity:     10,
    ReflectEvery:        15,  // Reflect before recent buffer fills
    ImportanceThreshold: 0.5,
    SynthesisPrompt:     nil,  // Use built-in
}

// Reflection implements reflection-based memory from "Generative Agents" paper.
// Combines recent observations with synthesized high-level insights.
type Reflection struct {
    mu      sync.Mutex
    llm     chatter.Chatter
    config  ReflectionConfig
    stratum chatter.Stratum
    
    // Recent observations
    recent    []*thinker.Observation
    commitCount int
    
    // Synthesized insights (high-level facts extracted by LLM)
    insights  []Insight
}

// Insight represents a high-level fact synthesized from observations
type Insight struct {
    Text       string  // The insight statement
    Importance float64 // Importance score (0.0-1.0)
    Evidence   []int   // Indices of observations that support this
}

var _ thinker.Memory = (*Reflection)(nil)

// NewReflection creates a new reflection-based memory
func NewReflection(llm chatter.Chatter, stratum chatter.Stratum, config ReflectionConfig) *Reflection {
    if config.RecentCapacity == 0 {
        config.RecentCapacity = DefaultReflectionConfig.RecentCapacity
    }
    if config.InsightCapacity == 0 {
        config.InsightCapacity = DefaultReflectionConfig.InsightCapacity
    }
    
    return &Reflection{
        llm:     llm,
        config:  config,
        stratum: stratum,
        recent:  make([]*thinker.Observation, 0, config.RecentCapacity),
        insights: make([]Insight, 0, config.InsightCapacity),
    }
}

func (r *Reflection) Purge() {
    r.mu.Lock()
    defer r.mu.Unlock()
    
    r.recent = make([]*thinker.Observation, 0, r.config.RecentCapacity)
    r.insights = make([]Insight, 0, r.config.InsightCapacity)
    r.commitCount = 0
}

func (r *Reflection) Commit(obs *thinker.Observation) {
    r.mu.Lock()
    defer r.mu.Unlock()
    
    // Score importance if not already set
    if obs.Reply.Importance == 0 {
        obs.Reply.Importance = r.scoreImportance(obs)
    }
    
    r.recent = append(r.recent, obs)
    r.commitCount++
    
    // FIFO eviction if full
    if len(r.recent) > r.config.RecentCapacity {
        r.recent = r.recent[1:]
    }
    
    // Auto-reflect if threshold reached
    if r.config.ReflectEvery > 0 && r.commitCount%r.config.ReflectEvery == 0 {
        // Spawn async reflection (don't block commit)
        go r.Reflect(context.Background())
    }
}

func (r *Reflection) Context(prompt chatter.Message) []chatter.Message {
    r.mu.Lock()
    defer r.mu.Unlock()
    
    ctx := make([]chatter.Message, 0)
    
    // System prompt
    if len(r.stratum) > 0 {
        ctx = append(ctx, r.stratum)
    }
    
    // Synthesized insights (as system knowledge)
    if len(r.insights) > 0 {
        var insightPrompt chatter.Prompt
        insightPrompt.With(chatter.Content{
            Note: "Previously learned insights:",
            Text: r.formatInsights(),
        })
        ctx = append(ctx, &insightPrompt)
    }
    
    // Recent observations (conversation history)
    for _, obs := range r.recent {
        ctx = append(ctx, obs.Query.Content, obs.Reply.Content)
    }
    
    ctx = append(ctx, prompt)
    return ctx
}

// Reflect triggers LLM-based synthesis of high-level insights
func (r *Reflection) Reflect(ctx context.Context) error {
    r.mu.Lock()
    defer r.mu.Unlock()
    
    if len(r.recent) == 0 {
        return nil  // Nothing to reflect on
    }
    
    // Build reflection prompt
    prompt := r.buildReflectionPrompt()
    
    // Ask LLM to synthesize insights
    reply, err := r.llm.Prompt(ctx, []chatter.Message{prompt})
    if err != nil {
        return fmt.Errorf("reflection synthesis failed: %w", err)
    }
    
    // Parse insights from LLM response
    newInsights := r.parseInsights(reply)
    
    // Merge with existing insights (keep the most important, up to capacity)
    r.mergeInsights(newInsights)
    
    // Evict low-importance observations (keep important ones)
    r.evictLowImportance()
    
    return nil
}

// scoreImportance calculates observation importance
func (r *Reflection) scoreImportance(obs *thinker.Observation) float64 {
    // Simple heuristic: recency + content length.
    // Override by setting Importance on the observation before Commit.
    
    age := obs.Created.Age().Hours()
    recencyScore := 1.0 / (1.0 + age/24.0)  // Decay over days
    
    contentLength := len(obs.Reply.Content.String())
    lengthScore := min(1.0, float64(contentLength)/500.0)
    
    return 0.6*recencyScore + 0.4*lengthScore
}

// buildReflectionPrompt creates prompt for LLM synthesis
func (r *Reflection) buildReflectionPrompt() chatter.Message {
    if r.config.SynthesisPrompt != nil {
        return r.config.SynthesisPrompt(r.recent)
    }
    
    // Default synthesis prompt
    var prompt chatter.Prompt
    
    prompt.WithTask(`Analyze the following conversation observations and extract 3-5 high-level insights.
Focus on:
- User preferences and patterns
- Key facts and decisions
- Important context for future interactions

Format as JSON array of objects with "insight" and "importance" (0.0-1.0):
[{"insight": "User prefers...", "importance": 0.8}, ...]
`)
    
    // Include recent observations
    observations := make([]string, len(r.recent))
    for i, obs := range r.recent {
        observations[i] = fmt.Sprintf("%d. Q: %s\n   A: %s",
            i+1,
            obs.Query.Content.String(),
            obs.Reply.Content.String(),
        )
    }
    
    prompt.WithBlob("Observations", observations...)
    
    return &prompt
}

// parseInsights extracts insights from LLM response
func (r *Reflection) parseInsights(reply *chatter.Reply) []Insight {
    // Try to parse JSON array
    var raw []struct {
        Insight    string  `json:"insight"`
        Importance float64 `json:"importance"`
    }
    
    // Simple JSON extraction (you can use jsonify helper)
    text := reply.String()
    if err := json.Unmarshal([]byte(text), &raw); err != nil {
        // Fallback: split by lines
        return r.parseInsightsPlainText(text)
    }
    
    insights := make([]Insight, 0, len(raw))
    for _, item := range raw {
        if item.Insight != "" {
            insights = append(insights, Insight{
                Text:       item.Insight,
                Importance: item.Importance,
            })
        }
    }
    
    return insights
}

// parseInsightsPlainText fallback parser
func (r *Reflection) parseInsightsPlainText(text string) []Insight {
    // Simple line-based parsing
    lines := strings.Split(text, "\n")
    insights := make([]Insight, 0)
    
    for _, line := range lines {
        line = strings.TrimSpace(line)
        if line == "" || len(line) < 10 {
            continue
        }
        // Each line is an insight
        insights = append(insights, Insight{
            Text:       line,
            Importance: 0.5,  // Default
        })
    }
    
    return insights
}

// mergeInsights combines new insights with existing ones
func (r *Reflection) mergeInsights(newInsights []Insight) {
    // Add new insights
    r.insights = append(r.insights, newInsights...)
    
    // Sort by importance
    sort.Slice(r.insights, func(i, j int) bool {
        return r.insights[i].Importance > r.insights[j].Importance
    })
    
    // Keep top N
    if len(r.insights) > r.config.InsightCapacity {
        r.insights = r.insights[:r.config.InsightCapacity]
    }
}

// evictLowImportance removes observations below threshold
func (r *Reflection) evictLowImportance() {
    filtered := make([]*thinker.Observation, 0)
    for _, obs := range r.recent {
        if obs.Reply.Importance >= r.config.ImportanceThreshold {
            filtered = append(filtered, obs)
        }
    }
    r.recent = filtered
}

// formatInsights converts insights to text for context
func (r *Reflection) formatInsights() []string {
    formatted := make([]string, len(r.insights))
    for i, insight := range r.insights {
        formatted[i] = fmt.Sprintf("• %s (importance: %.1f)", insight.Text, insight.Importance)
    }
    return formatted
}

func min(a, b float64) float64 {
    if a < b {
        return a
    }
    return b
}
  2. Add tests (memory/reflection_test.go); a sketch of the mock LLM helper follows the test file:
package memory_test

import (
    "context"
    "fmt"
    "testing"
    "time"

    "github.com/fogfish/it/v2"
    "github.com/kshard/thinker"
    "github.com/kshard/thinker/memory"
)

func TestReflectionMemory(t *testing.T) {
    llm := newMockLLM()  // Returns mock insights
    
    mem := memory.NewReflection(llm, "System", memory.DefaultReflectionConfig)
    
    // Commit observations
    for i := 0; i < 10; i++ {
        obs := createTestObservation(fmt.Sprintf("query %d", i))
        mem.Commit(obs)
    }
    
    // Trigger reflection
    err := mem.Reflect(context.Background())
    
    it.Then(t).Should(
        it.Nil(err),
    )
    
    // Context should include insights
    ctx := mem.Context(createTestPrompt("new query"))
    it.Then(t).Should(
        it.True(len(ctx) > 2),  // System + insights + observations + prompt
    )
}

func TestReflectionAutoTrigger(t *testing.T) {
    llm := newMockLLM()
    
    config := memory.ReflectionConfig{
        RecentCapacity: 20,
        ReflectEvery:   5,  // Reflect every 5 commits
    }
    
    mem := memory.NewReflection(llm, "", config)
    
    // Commit 5 observations - should trigger reflection
    for i := 0; i < 5; i++ {
        mem.Commit(createTestObservation(fmt.Sprintf("q%d", i)))
    }
    
    // Give async reflection time to run
    time.Sleep(100 * time.Millisecond)
    
    // Verify reflection occurred (check insights)
    ctx := mem.Context(createTestPrompt("test"))
    it.Then(t).Should(
        it.True(len(ctx) > 1), // insights should appear alongside the prompt
    )
}
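
The helpers newMockLLM, createTestObservation, and createTestPrompt referenced above are not defined in this issue. The stub below is a hypothetical sketch; it assumes chatter.Chatter is satisfied by the single Prompt method with the signature used in reflection.go and that chatter.Reply can be zero-constructed, both of which should be verified against the real chatter API:

// mockLLM is a hypothetical test stub (requires importing
// "github.com/kshard/chatter" in the test file). Assumption: chatter.Chatter
// only requires the Prompt method as called in reflection.go; if the real
// interface has more methods or options, extend this stub accordingly.
type mockLLM struct{}

func newMockLLM() *mockLLM { return &mockLLM{} }

func (m *mockLLM) Prompt(ctx context.Context, msgs []chatter.Message) (*chatter.Reply, error) {
    // Return a canned reply; populate it so reply.String() yields a JSON
    // array that parseInsights can decode, e.g.
    // [{"insight": "user prefers functional style", "importance": 0.8}]
    return &chatter.Reply{}, nil // assumption: Reply is zero-constructible
}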
  3. Add example (examples/12_reflective_memory/main.go):
package main

import (
    "context"
    "fmt"
    
    "github.com/kshard/chatter/provider/autoconfig"
    "github.com/kshard/thinker/agent"
    "github.com/kshard/thinker/codec"
    "github.com/kshard/thinker/memory"
)

func main() {
    llm, _ := autoconfig.FromNetRC("thinker")
    
    // Create reflection memory
    reflectMem := memory.NewReflection(
        llm,
        "You are a helpful assistant that learns from conversations",
        memory.ReflectionConfig{
            RecentCapacity:      15,
            InsightCapacity:     5,
            ReflectEvery:        10,  // Auto-reflect every 10 messages
            ImportanceThreshold: 0.4,
        },
    )
    
    // Create agent with reflective memory.
    // encode is the prompt codec used by the agent (definition omitted in this sketch).
    agent := agent.NewPrompter(llm, encode)
    agent.Automata.Memory = reflectMem // Swap in the reflection memory
    
    // Have conversation - memory learns patterns
    conversations := []string{
        "I'm learning Go programming",
        "I prefer functional style over OOP",
        "What's the best way to handle errors in Go?",
        "I work in fintech, so correctness is critical",
        "Show me an example of using channels",
        // ... more conversations
    }
    
    for i, input := range conversations {
        result, _ := agent.Prompt(context.Background(), input)
        fmt.Printf("Q: %s\nA: %s\n\n", input, result)
        
        // Manually trigger reflection after each batch of five prompts
        if i%5 == 4 {
            reflectMem.Reflect(context.Background())
            fmt.Printf("--- Memory reflected, insights updated ---\n\n")
        }
    }
    
    // Later conversations benefit from learned insights
    result, _ := agent.Prompt(context.Background(), 
        "What should I focus on learning next?")
    // Agent remembers: user learning Go, prefers functional, works in fintech
    fmt.Printf("Smart answer: %s\n", result)
}
  4. Document in README.md:
### Reflection Memory

Reflection memory learns patterns from conversations using LLM synthesis:

```go
import "github.com/kshard/thinker/memory"

reflectMem := memory.NewReflection(
    llm,
    systemPrompt,
    memory.ReflectionConfig{
        RecentCapacity:  15,   // Keep last 15 observations
        InsightCapacity: 5,    // Keep top 5 insights
        ReflectEvery:    10,   // Auto-reflect every 10 commits
    },
)

agent := agent.NewAutomata(llm, reflectMem, ...)

// Memory automatically synthesizes insights
// Combines recent observations + high-level patterns
```

Features:

  • ✅ LLM-based insight extraction
  • ✅ Importance-weighted retention
  • ✅ Automatic or manual reflection triggers
  • ✅ Customizable synthesis prompts
  • ✅ No external dependencies

See examples/12_reflective_memory for complete example.

Estimated Effort: 6 hours
Skills Required:

  • Memory system design
  • LLM prompt engineering
  • Go implementation
  • Testing

Breaking Changes: None (new optional memory type)

Benefits:

  • ✅ Working reference implementation
  • ✅ Demonstrates Observation.Importance usage (see the snippet after this list)
  • ✅ Practical for real applications
  • ✅ Customizable and extensible
  • ✅ Self-contained (no external deps)
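
A small sketch of the Observation.Importance point: callers can pre-score an observation before committing it, since Commit only applies the default heuristic when the importance is still zero, and Reflect later uses the value when evicting (buildObservation is a hypothetical helper):

// Pre-scored observation (field names as used in reflection.go above).
obs := buildObservation(query, reply) // however observations are produced
obs.Reply.Importance = 0.9            // e.g. the user stated a hard requirement
mem.Commit(obs)                       // heuristic scoring is skipped; 0.9 survives eviction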

Design Decisions:

  1. Hybrid approach: Recent observations + synthesized insights
  2. Configurable triggers: Auto or manual reflection
  3. Simple scoring: Default heuristics, easy to override (see the sketch after this list)
  4. Async reflection: Doesn't block commits
  5. JSON output: Structured insights from LLM
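
As a sketch of decisions 3 and 5, the synthesis step can be swapped via ReflectionConfig.SynthesisPrompt while preserving the JSON contract that parseInsights expects; only the chatter.Prompt helpers already used in reflection.go (WithTask, WithBlob) are assumed:

// Custom synthesis prompt: same JSON output contract, domain-specific task.
cfg := memory.DefaultReflectionConfig
cfg.SynthesisPrompt = func(recent []*thinker.Observation) chatter.Message {
    var p chatter.Prompt
    p.WithTask(`Summarize the observations into at most 3 fintech-relevant insights.
Return a JSON array: [{"insight": "...", "importance": 0.0-1.0}]`)

    // Format observations the same way the built-in prompt does.
    lines := make([]string, len(recent))
    for i, obs := range recent {
        lines[i] = fmt.Sprintf("%d. Q: %s | A: %s", i+1, obs.Query.Content.String(), obs.Reply.Content.String())
    }
    p.WithBlob("Observations", lines...)
    return &p
}
mem := memory.NewReflection(llm, stratum, cfg)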
