Benchmark: build realistic test corpus from agent memory patterns

## Part of #10

Build a realistic test corpus that mirrors how agents actually use memory in production.

## Requirements

- Anonymized markdown files covering all the patterns agents use:
  - `MEMORY.md` — curated long-term facts, preferences, decisions
  - `memory/YYYY-MM-DD.md` — daily notes with events, checks, conversations
  - `memory/tasks/*.md` — structured tasks with schema fields
  - `memory/topics/*.md` — research notes, competitive analysis
  - `memory/people/*.md` — person notes with relations
- Realistic size: roughly 30-50 files, 50-100 KB total
- Must exercise all query categories: exact facts, semantic, temporal, relational, cross-note, needle-in-haystack
- Include evolving facts (same topic updated across multiple daily notes)
- Include BM-specific features: observations `[key] value`, relations `relates_to [[Entity]]`, frontmatter schemas
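
To check that corpus files actually carry the BM-specific features listed above, a small parser along these lines could be used. This is a minimal sketch, not the project's parser: the function name `parse_note`, the assumption that observations and relations appear as `- ` list items, and the flat `key: value` frontmatter handling are all illustrative assumptions.

```python
import re

# Assumed line shapes (hypothetical, based on the patterns named in this issue):
#   - [key] value              -> observation
#   - relates_to [[Entity]]    -> relation
OBSERVATION = re.compile(r"^\s*-\s*\[(?P<key>[^\]]+)\]\s+(?P<value>.+)$")
RELATION = re.compile(r"^\s*-\s*(?P<type>\w+)\s+\[\[(?P<target>[^\]]+)\]\]\s*$")


def parse_note(text):
    """Extract frontmatter, observations, and relations from one note."""
    lines = text.splitlines()
    frontmatter = {}
    body_start = 0

    # Frontmatter is assumed to be flat "key: value" pairs between --- lines.
    if lines and lines[0].strip() == "---":
        for i, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                body_start = i + 1
                break
            key, sep, value = line.partition(":")
            if sep:
                frontmatter[key.strip()] = value.strip()

    observations, relations = [], []
    for line in lines[body_start:]:
        relation = RELATION.match(line)
        observation = OBSERVATION.match(line)
        if relation:
            relations.append((relation["type"], relation["target"]))
        elif observation:
            observations.append((observation["key"], observation["value"]))

    return {
        "frontmatter": frontmatter,
        "observations": observations,
        "relations": relations,
    }


# Made-up sample note, not taken from any real corpus.
SAMPLE = """---
type: person
title: Alice Example
---
# Alice Example

- [role] staff engineer
- [timezone] UTC+1
- relates_to [[Project Falcon]]
"""

parsed = parse_note(SAMPLE)
```

Running this over every file in the corpus and counting non-empty `observations`, `relations`, and `frontmatter` results would give a quick coverage report per feature.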

## Deliverable

- `benchmark/corpus/` with all files
- `benchmark/queries.json` with 50+ annotated queries
- Ground truth: which files/chunks contain the answer for each query
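
One possible shape for an annotated query, plus a sanity check over the whole set, is sketched below. The field names (`id`, `query`, `category`, `expected_files`), the category labels, and the `validate_queries` helper are assumptions for illustration; the issue does not fix a schema for `benchmark/queries.json`.

```python
# Category labels are assumed spellings of the six query categories in Requirements.
CATEGORIES = {
    "exact",
    "semantic",
    "temporal",
    "relational",
    "cross-note",
    "needle-in-haystack",
}


def validate_queries(queries, corpus_files, min_queries=50):
    """Return a list of human-readable problems; an empty list means the set passes."""
    problems = []
    if len(queries) < min_queries:
        problems.append(f"only {len(queries)} queries, expected at least {min_queries}")

    seen_categories = set()
    for q in queries:
        qid = q.get("id", "<missing id>")
        category = q.get("category")
        if category not in CATEGORIES:
            problems.append(f"{qid}: unknown category {category!r}")
        else:
            seen_categories.add(category)

        # Ground truth: every query must point at one or more existing corpus files.
        expected = q.get("expected_files", [])
        if not expected:
            problems.append(f"{qid}: no ground-truth files")
        for path in expected:
            if path not in corpus_files:
                problems.append(f"{qid}: ground-truth file not in corpus: {path}")

    for category in sorted(CATEGORIES - seen_categories):
        problems.append(f"category never exercised: {category}")
    return problems


# Hypothetical entries; the paths follow the layout described in Requirements.
example_queries = [
    {
        "id": "q001",
        "query": "Which timezone does Alice work in?",
        "category": "exact",
        "expected_files": ["memory/people/alice-example.md"],
    },
    {
        "id": "q002",
        "query": "How did the launch plan change over the week?",
        "category": "temporal",
        "expected_files": ["memory/2025-01-02.md", "memory/2025-01-05.md"],
    },
]
example_corpus = {"memory/people/alice-example.md", "memory/2025-01-02.md"}

issues = validate_queries(example_queries, example_corpus, min_queries=2)
```

In practice `queries` would come from `json.load` on `benchmark/queries.json` and `corpus_files` from a walk of `benchmark/corpus/`. Chunk-level ground truth would need an extra field that this sketch does not model.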

