feat: entity-salience retrieval reflex — auto-prefetch claim candidates from prompt context #223

Description

@plind-junior

What you're trying to do

gbrain extracts entity candidates from the live conversation context and prefetches matching pages into the prompt window without the agent having to ask. Vouch's kb.context, by contrast, is caller-driven: the agent has to know to query. In a long-running session, that leads to context drift the agent can't self-correct.

Add a server-side reflex: scan the calling tool's recent params (the last N query / task / text strings), extract entity candidates by FTS-matching them against entities/, and attach the top-K matched claims to _meta.vouch_salience on the next read response. Zero LLM calls: pure substring matching plus FTS5.
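A minimal sketch of the matching step described above, assuming entities are mirrored into an SQLite FTS5 table. The table name `entity_fts`, its columns, and both function names are hypothetical and only illustrate the idea; how Vouch actually indexes entities/ is not specified here. It also assumes the local SQLite build ships with FTS5.

```python
import re
import sqlite3


def build_entity_index(entities):
    """Create an in-memory FTS5 index over (entity_id, name) pairs.

    Hypothetical schema: entity_id is stored but not searched.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE entity_fts USING fts5(entity_id UNINDEXED, name)"
    )
    conn.executemany(
        "INSERT INTO entity_fts (entity_id, name) VALUES (?, ?)", entities
    )
    return conn


def salient_entities(conn, recent_strings, top_k=3):
    """Rank entities by how many recent caller strings mention them."""
    hits = {}
    for text in recent_strings:
        tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
        if not tokens:
            continue
        # Quote each token so FTS5 reads it as a literal, not as query syntax.
        query = " OR ".join(f'"{t}"' for t in sorted(tokens))
        rows = conn.execute(
            "SELECT entity_id FROM entity_fts WHERE entity_fts MATCH ?", (query,)
        )
        for (entity_id,) in rows:
            hits[entity_id] = hits.get(entity_id, 0) + 1
    # Most-mentioned first; ties broken by entity_id for stable output.
    ranked = sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))
    return [entity_id for entity_id, _ in ranked[:top_k]]


conn = build_entity_index([("jwt", "JWT tokens"), ("oauth", "OAuth flow")])
print(salient_entities(conn, ["rotate jwt keys", "jwt expiry", "jwt vs oauth"]))
```

Counting one hit per caller string, rather than per token, keeps a single long query from dominating the ranking. That weighting is a design choice of this sketch, not something the proposal prescribes.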

Suggested shape

# Configurable in .vouch/config.yaml
retrieval:
  reflex:
    enabled: true
    window: 8         # last 8 caller queries
    top_k: 3          # surface up to 3 salient entities

  • A new module, src/vouch/salience.py, keeps a per-session ring buffer of recent caller strings (in-memory, never on disk).
  • Every read-side response from a session that has a buffer gets _meta.vouch_salience: [{entity_id, claim_count, top_claim_id}].
  • The buffer resets on kb.session_end or after 30 minutes of inactivity.
  • The reflex is disabled for stateless callers (no session_id) and can be turned off via config.
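The buffer behaviour in the list above could look roughly like the following. This is a sketch under assumptions: the class name `SalienceBuffers`, its method names, and the injectable `clock` parameter are invented for illustration, and the real module may be structured differently.

```python
import time
from collections import deque


class SalienceBuffers:
    """Per-session ring buffers of recent caller strings, held in memory only."""

    def __init__(self, window=8, idle_timeout_s=30 * 60, clock=time.monotonic):
        self.window = window
        self.idle_timeout_s = idle_timeout_s
        self._clock = clock          # injectable so tests can fake time
        self._buffers = {}           # session_id -> (deque, last_seen)

    def record(self, session_id, text):
        """Append a caller string; stateless callers (no session_id) are ignored."""
        if session_id is None:
            return
        now = self._clock()
        entry = self._buffers.get(session_id)
        if entry is None or now - entry[1] > self.idle_timeout_s:
            buf = deque(maxlen=self.window)   # fresh buffer after inactivity
        else:
            buf = entry[0]
        buf.append(text)                      # deque drops the oldest item itself
        self._buffers[session_id] = (buf, now)

    def recent(self, session_id):
        """Return buffered strings, oldest first; empty if expired or unknown."""
        entry = self._buffers.get(session_id)
        if entry is None:
            return []
        if self._clock() - entry[1] > self.idle_timeout_s:
            del self._buffers[session_id]
            return []
        return list(entry[0])

    def end_session(self, session_id):
        """Drop the buffer, e.g. when kb.session_end is handled."""
        self._buffers.pop(session_id, None)
```

Using `deque(maxlen=window)` gives ring-buffer semantics without manual index bookkeeping, and `time.monotonic` keeps the idle check unaffected by wall-clock changes.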

Acceptance

  • A session that searches "jwt" three times in a row gets _meta.vouch_salience highlighting the jwt entity on the next call.
  • Salience computation runs in <50ms p95 on a 1000-entity KB.
  • Disabling via config removes the field entirely.
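To make the first and third acceptance criteria concrete, here is a hedged sketch of the response-decoration step. The function name `attach_salience`, the dict-shaped response, and the sample IDs are assumptions made for illustration. Only the `_meta.vouch_salience` field and its three keys come from the proposal.

```python
def attach_salience(response, salience, enabled=True):
    """Return a copy of response with _meta.vouch_salience set or removed.

    salience is a list of dicts, or None when the session has no buffer.
    """
    out = dict(response)
    meta = dict(out.get("_meta", {}))
    # Always start clean so a disabled reflex leaves no trace of the field.
    meta.pop("vouch_salience", None)
    if enabled and salience is not None:
        meta["vouch_salience"] = [
            {
                "entity_id": s["entity_id"],
                "claim_count": s["claim_count"],
                "top_claim_id": s["top_claim_id"],
            }
            for s in salience
        ]
    if meta:
        out["_meta"] = meta
    else:
        out.pop("_meta", None)
    return out


# Hypothetical values, for illustration only.
salience = [{"entity_id": "jwt", "claim_count": 4, "top_claim_id": "c-17"}]
response = {"result": [], "_meta": {"took_ms": 3}}
print(attach_salience(response, salience, enabled=True))
print(attach_salience(response, salience, enabled=False))
```

Copying the response and its `_meta` dict avoids mutating an object another handler may still hold, which makes the "removes the field entirely" criterion easy to assert in a test.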

Out of scope

  • LLM-mediated entity extraction — substring + FTS only in v1.
  • Cross-session salience — per-session only.

Labels: enhancement
