Benchmark: OpenClaw builtin memory-core comparison #13

## Part of #10

Add the OpenClaw builtin memory search as a comparison provider in the benchmark suite.

## Requirements

- Index the same test corpus with OpenClaw's builtin memory-core provider
- Run the same queries through the builtin `memory_search` tool
- Score with the same metrics (Recall@K, MRR, Precision@K)
- Side-by-side comparison output: BM vs builtin per category
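
For reference, the three metrics could be computed as in the sketch below. It is only an illustration and assumes that each query yields a ranked list of unique document IDs and that the corpus supplies a set of relevant IDs per query. The function names are hypothetical and are not taken from the eval harness in #12.

```python
from typing import Iterable, Sequence, Set, Tuple


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant)


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top k slots filled by relevant documents.

    Divides by k even when fewer than k results come back, so a provider
    is not rewarded for returning a short list.
    """
    if k <= 0:
        return 0.0
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is found."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (ranked, relevant) pairs, one per query."""
    values = [reciprocal_rank(ranked, relevant) for ranked, relevant in runs]
    return sum(values) / len(values) if values else 0.0
```

Using the same functions for both providers keeps the scores directly comparable.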

## Implementation options

1. **Preferred:** Start an OpenClaw instance with builtin memory pointing at the benchmark corpus, query via the memory tools
2. **Alternative:** Replicate the builtin's chunking + embedding + hybrid search logic in a standalone script (more work, but no OpenClaw dependency for CI)
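
Either option could sit behind the same small provider interface, so the harness does not need to know which one is in use. The sketch below is a rough illustration of that idea. `SearchProvider`, `Query`, `KeywordStubProvider`, and `run_queries` are all hypothetical names, and the stub only does naive keyword overlap: it does not reproduce OpenClaw's chunking, embedding, or hybrid search, and it does not call any OpenClaw API.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class Query:
    id: str
    text: str
    category: str  # e.g. "semantic", "relational", "exact", "task"


class SearchProvider(Protocol):
    """Hypothetical interface that each benchmarked provider would implement."""

    name: str

    def index(self, corpus: Dict[str, str]) -> None: ...

    def search(self, query: str, k: int) -> List[str]: ...


class KeywordStubProvider:
    """Placeholder provider: ranks documents by shared lowercase tokens.

    Stands in for a real adapter (option 1 or option 2) so the harness
    wiring can be exercised without an OpenClaw instance.
    """

    name = "stub"

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def index(self, corpus: Dict[str, str]) -> None:
        self._docs = dict(corpus)

    def search(self, query: str, k: int) -> List[str]:
        terms = set(query.lower().split())
        scored = []
        for doc_id, text in self._docs.items():
            overlap = len(terms & set(text.lower().split()))
            if overlap:
                scored.append((overlap, doc_id))
        # Highest overlap first; ties broken by document ID for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [doc_id for _, doc_id in scored[:k]]


def run_queries(
    provider: SearchProvider,
    corpus: Dict[str, str],
    queries: List[Query],
    k: int = 5,
) -> Dict[str, List[str]]:
    """Index the corpus once, then return ranked document IDs per query ID."""
    provider.index(corpus)
    return {q.id: provider.search(q.text, k) for q in queries}
```

A real adapter for the builtin provider would replace `KeywordStubProvider` and map whatever `memory_search` returns onto corpus document IDs.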

## Key comparisons

- Semantic queries: both use vector search, so results should be similar
- Relational queries: BM has a knowledge graph, builtin has only text
- Exact fact queries: builtin uses BM25 hybrid search, BM uses FTS; compare the two
- Task queries: BM's composited search scans tasks, builtin does not
- Context efficiency: BM returns structured observations, builtin returns raw chunks
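
One possible shape for the per-category, side-by-side output is sketched below. It assumes the harness can hand over `(category, score)` pairs for each provider. The function names and the markdown table layout are illustrative choices, not an existing format.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Optional, Tuple


def mean_by_category(scores: List[Tuple[str, float]]) -> Dict[str, float]:
    """Average per-query scores within each query category."""
    buckets: Dict[str, List[float]] = defaultdict(list)
    for category, score in scores:
        buckets[category].append(score)
    return {category: mean(values) for category, values in buckets.items()}


def _fmt(value: Optional[float], signed: bool = False) -> str:
    """Format a score to three decimals, or 'n/a' when it is missing."""
    if value is None:
        return "n/a"
    return f"{value:+.3f}" if signed else f"{value:.3f}"


def side_by_side(
    bm: List[Tuple[str, float]],
    builtin: List[Tuple[str, float]],
    metric: str = "Recall@5",
) -> str:
    """Render a markdown table of BM vs builtin averages per category."""
    bm_avg = mean_by_category(bm)
    builtin_avg = mean_by_category(builtin)
    lines = [
        f"| Category | BM {metric} | builtin {metric} | Delta |",
        "| --- | --- | --- | --- |",
    ]
    for category in sorted(set(bm_avg) | set(builtin_avg)):
        a = bm_avg.get(category)
        b = builtin_avg.get(category)
        # Delta is BM minus builtin; undefined if either side has no data.
        delta = a - b if a is not None and b is not None else None
        lines.append(
            f"| {category} | {_fmt(a)} | {_fmt(b)} | {_fmt(delta, signed=True)} |"
        )
    return "\n".join(lines)
```

Calling `print(side_by_side(bm_scores, builtin_scores))` once per metric would give one table per metric, with a row for each query category listed above.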

## Depends on

- #11 (corpus)
- #12 (eval harness)

