feat: auto-extracted typed edges on every approved page #224

Description

@plind-junior

What you're trying to do

gbrain claims +31.4 points P@5 over vector-only baselines because every page write extracts entity refs and creates typed edges (attended, works_at, invested_in, founded) with zero LLM calls. Vouch's Relation model exists but the graph is hand-populated. Approved pages carry citation lists; the typed-edge layer above that is missing.

Build it as a write-time pass: when a page is approved, parse its body for [[entity-id]] wiki-links and its frontmatter entities: list, then file the implied relations as auto-proposed edges (status working, proposed_by: vouch-extractor). A reviewer can confirm or reject them in bulk.
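A minimal sketch of the parsing step, to make the proposal concrete. Everything here is an assumption rather than existing Vouch code: the function names, the `---` frontmatter fences, the block-style `entities:` list, and the choice to de-duplicate repeated links. It uses only the standard library; a real implementation would probably reuse whatever frontmatter parser Vouch already has.

```python
import re

# Hypothetical pattern: matches [[entity-id]] and the aliased form [[entity-id|label]].
WIKI_LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def split_frontmatter(text: str) -> tuple[list[str], str]:
    """Split a page into (frontmatter lines, body), assuming '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return [], text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return lines[1:i], "\n".join(lines[i + 1:])
    return [], text  # unterminated fence: treat everything as body


def frontmatter_list(fm_lines: list[str], key: str) -> list[str]:
    """Read a block-style list ('key:' followed by '- item' lines)."""
    items: list[str] = []
    in_block = False
    for line in fm_lines:
        stripped = line.strip()
        if not stripped:
            continue  # blank lines do not end the list
        if not line.startswith((" ", "\t", "-")):
            # Any top-level key either opens or closes the block we want.
            in_block = stripped == f"{key}:"
        elif in_block and stripped.startswith("- "):
            items.append(stripped[2:].strip().strip("'\""))
    return items


def extract_refs(text: str) -> dict[str, list[str]]:
    """Collect wiki-link targets from the body and entities from frontmatter."""
    fm, body = split_frontmatter(text)
    # dict.fromkeys de-duplicates while keeping first-seen order (an assumption).
    links = list(dict.fromkeys(m.strip() for m in WIKI_LINK.findall(body)))
    return {"wiki_links": links, "entities": frontmatter_list(fm, "entities")}


page = (
    "---\ntitle: Kickoff\nentities:\n  - acme\n  - bob\n---\n"
    "Met [[alice]] and [[acme|Acme Corp]]; [[alice]] again.\n"
)
refs = extract_refs(page)
```

Whether a link repeated in the same page should produce one edge or several is an open question; the sketch collapses repeats.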

Suggested shape

  • New src/vouch/extractors/edges.py runs after proposals.approve lands a page.
  • Edge types: mentions (wiki-link), derived_from (citation in frontmatter), relates_to (entity in frontmatter entities:).
  • Auto-proposed edges land in proposed/ like any other proposal; the reviewer sees them grouped under the originating page.
  • vouch crystallize <session> auto-includes the extracted edges.
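The edge-type mapping above could look roughly like this. It is a sketch under stated assumptions: `EdgeProposal`, `build_edge_proposals`, and the field names are hypothetical and not taken from Vouch's existing Relation model, and skipping self-references is my own choice.

```python
from dataclasses import dataclass

# The issue text uses both "auto-extractor" and "vouch-extractor";
# this constant is a placeholder until one is chosen.
PROPOSER = "vouch-extractor"


@dataclass(frozen=True)
class EdgeProposal:
    kind: str      # "mentions" | "derived_from" | "relates_to"
    source: str    # id of the approved page
    target: str    # id of the referenced entity or cited source
    status: str = "working"
    proposed_by: str = PROPOSER


def build_edge_proposals(
    page_id: str,
    wiki_links: list[str],
    citations: list[str],
    entities: list[str],
) -> list[EdgeProposal]:
    """Map each extracted reference to one typed edge proposal."""
    sources = (
        ("mentions", wiki_links),      # [[entity-id]] in the body
        ("derived_from", citations),   # citation in frontmatter
        ("relates_to", entities),      # frontmatter entities: list
    )
    seen: set[tuple[str, str]] = set()
    proposals: list[EdgeProposal] = []
    for kind, targets in sources:
        for target in targets:
            key = (kind, target)
            # Assumption: skip self-references and exact duplicates.
            if target == page_id or key in seen:
                continue
            seen.add(key)
            proposals.append(EdgeProposal(kind, page_id, target))
    return proposals


edges = build_edge_proposals("page-1", ["alice", "acme"], ["src-9"], ["acme"])
```

Note that the same target can appear under two edge types (here `acme` is both mentioned and listed in entities:); the sketch keeps both, since they record different evidence.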

Acceptance

  • Approving a page with three [[entity]] wiki-links files three relation.mentions proposals.
  • Auto-extracted edges carry proposed_by: vouch-extractor; the audit log distinguishes them.
  • Rejecting an auto-extracted edge is a single CLI call; mass-rejecting is one bulk op.
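The acceptance criteria could be exercised against an in-memory stand-in like the one below. `ReviewQueue`, `Proposal`, and their methods are hypothetical; the real proposals store and CLI are not shown in this issue, so treat this only as a sketch of the expected behaviour.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    id: int
    kind: str
    origin_page: str
    target: str
    proposed_by: str
    status: str = "working"


@dataclass
class ReviewQueue:
    """Hypothetical in-memory stand-in for the proposed/ directory."""
    items: list[Proposal] = field(default_factory=list)

    def file(self, kind: str, origin_page: str, target: str, proposed_by: str) -> Proposal:
        proposal = Proposal(len(self.items) + 1, kind, origin_page, target, proposed_by)
        self.items.append(proposal)
        return proposal

    def reject(self, proposal_id: int) -> bool:
        """Reject one proposal; returns False if it was not pending."""
        for p in self.items:
            if p.id == proposal_id and p.status == "working":
                p.status = "rejected"
                return True
        return False

    def reject_bulk(self, origin_page: str, proposed_by: str = "vouch-extractor") -> int:
        """Reject every pending auto-extracted edge from one page in a single call."""
        count = 0
        for p in self.items:
            if (p.origin_page, p.proposed_by, p.status) == (origin_page, proposed_by, "working"):
                p.status = "rejected"
                count += 1
        return count


queue = ReviewQueue()
for target in ("alice", "acme", "bob"):  # three [[entity]] wiki-links
    queue.file("mentions", "page-1", target, "vouch-extractor")
manual = queue.file("relates_to", "page-1", "carol", "human-reviewer")

auto = [p for p in queue.items if p.proposed_by == "vouch-extractor"]
single_ok = queue.reject(auto[0].id)    # single rejection
bulk_count = queue.reject_bulk("page-1")  # mass rejection of the rest
```

Filtering on `proposed_by` is what keeps the bulk operation from touching hand-filed relations, which is also what lets the audit log tell the two apart.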

Out of scope

  • LLM-mediated entity recognition beyond explicit wiki-links / frontmatter — substring + structured only.
  • Auto-merge of duplicate entities (separate issue: desktop #O1).

Labels: enhancement
