Quality gate: CI regression checks with elf-eval trace compare #54

## Summary
Add CI regression gates using `elf-eval` trace compare.

## Problem
Without hard quality gates, retrieval regressions are discovered too late.

## In Scope
- Add CI stage running trace-compare against fixed trace IDs.
- Define thresholds for positional churn, set churn, and top-k retention.
- Fail CI when any threshold is violated (churn above its limit, or top-k retention below its floor).
- Publish machine-readable diff artifacts.
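To make the three metrics concrete, here is a minimal Python sketch. The definitions are assumptions for illustration, not the ones `elf-eval` necessarily implements: positional churn is the share of top-k ranks whose item changed, set churn is one minus the Jaccard overlap of the two top-k sets, and top-k retention is the share of baseline top-k items still present in the candidate top-k. The function names and the sample document IDs are hypothetical.

```python
from typing import Sequence


def top_k_retention(baseline: Sequence[str], candidate: Sequence[str], k: int) -> float:
    """Fraction of the baseline top-k that still appears in the candidate top-k."""
    base = list(baseline[:k])
    if not base:
        return 1.0  # nothing to retain, so nothing was lost
    cand = set(candidate[:k])
    return sum(1 for item in base if item in cand) / len(base)


def set_churn(baseline: Sequence[str], candidate: Sequence[str], k: int) -> float:
    """One minus the Jaccard overlap of the two top-k sets (order is ignored)."""
    a, b = set(baseline[:k]), set(candidate[:k])
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def positional_churn(baseline: Sequence[str], candidate: Sequence[str], k: int) -> float:
    """Fraction of top-k ranks whose item differs between the two runs."""
    base, cand = list(baseline[:k]), list(candidate[:k])
    n = max(len(base), len(cand))
    if n == 0:
        return 0.0
    changed = sum(
        1
        for i in range(n)
        # A rank present in only one list counts as changed.
        if i >= len(base) or i >= len(cand) or base[i] != cand[i]
    )
    return changed / n


# Hypothetical ranked results for one trace.
baseline_ids = ["a", "b", "c", "d"]
candidate_ids = ["a", "c", "b", "e"]

retention = top_k_retention(baseline_ids, candidate_ids, k=4)
s_churn = set_churn(baseline_ids, candidate_ids, k=4)
p_churn = positional_churn(baseline_ids, candidate_ids, k=4)
```

Note that the two churn metrics are upper-bounded by their thresholds while retention is lower-bounded, so the gate has to compare them in opposite directions.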

## Out of Scope
- Replacing existing integration tests.
- Full benchmark infrastructure expansion.

## Deliverables
- CI workflow updates.
- Baseline trace set.
- Gate configuration docs.

## Acceptance Criteria
- CI runs compare mode reproducibly.
- A failing gate reports which metric crossed its threshold, and for which trace.
- Local reproduction instructions are documented.
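One way these criteria could be met is sketched below: a gate evaluator that takes per-trace metrics, lists every violated threshold, and serializes the result deterministically so CI and local runs yield the same artifact for the same input. Everything here is an assumption for illustration: the `Thresholds` fields, the report shape, the threshold values, and the trace IDs are hypothetical and not taken from `elf-eval`.

```python
import json
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical gate limits; real values belong in the gate configuration docs."""
    max_positional_churn: float
    max_set_churn: float
    min_top_k_retention: float


def evaluate_gate(metrics_by_trace: Dict[str, Dict[str, float]], limits: Thresholds) -> dict:
    """Return a report listing every metric that violates its limit, per trace."""
    violations: List[dict] = []
    # Sort trace IDs so the report does not depend on input ordering.
    for trace_id in sorted(metrics_by_trace):
        m = metrics_by_trace[trace_id]
        checks = [
            ("positional_churn", m["positional_churn"], limits.max_positional_churn, "max"),
            ("set_churn", m["set_churn"], limits.max_set_churn, "max"),
            ("top_k_retention", m["top_k_retention"], limits.min_top_k_retention, "min"),
        ]
        for name, value, limit, kind in checks:
            failed = value > limit if kind == "max" else value < limit
            if failed:
                violations.append(
                    {"trace_id": trace_id, "metric": name, "value": value,
                     "limit": limit, "kind": kind}
                )
    return {"passed": not violations, "violations": violations}


def render_report(report: dict) -> str:
    """Serialize with sorted keys so identical inputs give byte-identical output."""
    return json.dumps(report, indent=2, sort_keys=True)


def exit_code(report: dict) -> int:
    """Non-zero exit status is what makes the CI job fail."""
    return 0 if report["passed"] else 1


# Hypothetical metrics for two traces.
limits = Thresholds(max_positional_churn=0.5, max_set_churn=0.3, min_top_k_retention=0.8)
metrics = {
    "trace-b": {"positional_churn": 0.75, "set_churn": 0.4, "top_k_retention": 0.75},
    "trace-a": {"positional_churn": 0.0, "set_churn": 0.0, "top_k_retention": 1.0},
}
report = evaluate_gate(metrics, limits)
artifact = render_report(report)
status = exit_code(report)
```

Because the report names the trace, the metric, the observed value, and the limit, a developer can rerun the comparison locally for just the failing trace.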

## Dependencies
- #50
- #51

## Implementation Checklist
- [ ] Baseline trace snapshot committed.
- [ ] CI job added.
- [ ] Threshold config documented.
- [ ] Failure artifacts exported.

## Done When
- Retrieval quality regressions block merges by default.

