Emit a structured learning trace #11

Description

@adaamko

Goal

Produce one ordered, human-readable record of what a learning run did: "synthesized N rules → iteration 1 patched X (accepted, +3 F1) → audit merged A,B → pruned C → final 22 rules." Today these facts are computed and printed in passing but never collected into a single reconstructable trace.

The pieces already exist (collect, don't recompute)

  • rulechef/learner.py::_log_patch_decision (L640) — per patch: candidate rules, accept/reject, metric delta (already written as trajectory records).
  • rulechef/pipeline.py::_apply_audit (L296) — applies merge/remove actions and has the before/after rules.
  • rulechef/coordinator.py::AuditResult (L26) and critique_rules — the audit actions and critic feedback objects.
  • rulechef/training_logger.py — the raw prompts/responses per step.

What to do

  1. Thread a run-scoped trace collector through learn_rules (the Pipeline in rulechef/pipeline.py is the natural owner) that appends one entry per step:
    • synthesis: initial rules (names + count)
    • iteration N: failure summary, patch added, accepted/rejected, F1 before→after
    • audit: actions taken (merged [a,b]->c, removed [d])
    • prune: rules dropped
    • final: surviving rules
  2. Write it to <storage>/<dataset>.trace.json and add a pretty-printer (reuse the style of print_ranking_report in rulechef/ranking.py).
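The two steps above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `TraceCollector`, its `add`/`write` methods, the entry keys (`rules`, `count`, `patch`, `f1_before`, and so on), and the `version` field are all hypothetical names chosen for the example. Only the `<storage>/<dataset>.trace.json` location comes from the issue.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any


@dataclass
class TraceCollector:
    """Run-scoped, append-only list of step entries (hypothetical helper)."""

    entries: list = field(default_factory=list)

    def add(self, step: str, **data: Any) -> None:
        # One small dict per step: names, ids, and metrics only.
        self.entries.append({"step": step, **data})

    def to_dict(self) -> dict:
        # "version" is an assumption, added so the format can evolve later.
        return {"version": 1, "entries": list(self.entries)}

    def write(self, storage: Path, dataset: str) -> Path:
        # Writes <storage>/<dataset>.trace.json and returns the path.
        path = Path(storage) / f"{dataset}.trace.json"
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(self.to_dict(), indent=2), encoding="utf-8")
        return path


# Illustrative values only; a real run would fill these from the pipeline.
trace = TraceCollector()
trace.add("synthesis", rules=["r1", "r2"], count=2)
trace.add("iteration", n=1, patch=["r3"], accepted=True,
          f1_before=0.71, f1_after=0.74)
trace.add("audit", merged=[{"from": ["r1", "r2"], "into": "r12"}], removed=[])
trace.add("prune", dropped=["r3"])
trace.add("final", rules=["r12"], count=1)
```

In this sketch the Pipeline would create one collector per `learn_rules` call and pass it down, so existing call sites such as `_log_patch_decision` and `_apply_audit` only need one extra `trace.add(...)` line each.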

Gotchas

  • Keep it opt-in or cheap — don't store full doc text per step, just rule names/ids and metrics (the prompts are already in training_logger).
  • The agentic vs simple coordinator paths differ; make sure both feed the collector.
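One way to address both gotchas, sketched under assumptions: tracing is off when the collector is `None`, entries carry only rule names, and both coordinator paths call the same small hook. Here a plain list stands in for the collector, and `record`, `rule_names`, and `record_audit` are hypothetical helpers. The assumption that rule objects expose a `name` attribute is not confirmed by the issue.

```python
from typing import Any, Iterable, Optional

Trace = Optional[list]  # None means tracing is disabled (opt-in)


def record(trace: Trace, step: str, **data: Any) -> None:
    """Append one compact entry, or do nothing when tracing is off."""
    if trace is not None:
        trace.append({"step": step, **data})


def rule_names(rules: Iterable[Any]) -> list:
    """Reduce rules to names so no document text or prompts are stored.

    Assumes rule objects have a `name` attribute; falls back to str().
    """
    return [getattr(r, "name", str(r)) for r in rules]


def record_audit(trace: Trace, coordinator: str,
                 before: Iterable[Any], after: Iterable[Any]) -> None:
    """Shared hook that the simple and agentic paths could both call."""
    b, a = rule_names(before), rule_names(after)
    record(
        trace,
        "audit",
        coordinator=coordinator,  # "simple" or "agentic"
        removed=[n for n in b if n not in a],
        added=[n for n in a if n not in b],
    )


trace = []
record_audit(trace, "simple", ["a", "b", "d"], ["c"])
record_audit(trace, "agentic", ["c", "e"], ["c"])
record_audit(None, "simple", ["a"], [])  # tracing off: no entry, no error
```

Routing both paths through one hook keeps the entry shape identical regardless of which coordinator produced it.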

Acceptance

A learning run writes <storage>/<dataset>.trace.json, which reconstructs the step-by-step story, and print_trace(trace) renders it. Test on a tiny in-memory dataset (no LLM needed if you stub synthesis), asserting that the trace has synthesis → … → final entries in order.
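A rough sketch of what the renderer and the ordering assertion might look like. The issue names `print_trace` but does not specify its output or the entry schema, so the flat entry keys and the one-line format below (modelled on the example in the Goal section) are assumptions. `format_trace` and `check_order` are hypothetical helpers, and the style of `print_ranking_report` is not reproduced here.

```python
def format_trace(trace: dict) -> str:
    """Render a trace as a one-line story (format is an assumption)."""
    parts = []
    for e in trace["entries"]:
        step = e["step"]
        if step == "synthesis":
            parts.append(f"synthesized {e['count']} rules")
        elif step == "iteration":
            verdict = "accepted" if e["accepted"] else "rejected"
            # F1 delta in points, rounded to the nearest integer.
            delta = round((e["f1_after"] - e["f1_before"]) * 100)
            patch = ",".join(e["patch"])
            parts.append(f"iteration {e['n']} patched {patch} ({verdict}, {delta:+d} F1)")
        elif step == "audit":
            parts.append(f"audit merged {','.join(e['merged'])}")
        elif step == "prune":
            parts.append(f"pruned {','.join(e['dropped'])}")
        elif step == "final":
            parts.append(f"final {e['count']} rules")
        else:
            parts.append(step)  # unknown steps are shown by name only
    return " → ".join(parts)


def print_trace(trace: dict) -> None:
    print(format_trace(trace))


def check_order(trace: dict) -> None:
    """Assert the trace starts with synthesis, ends with final, and
    lists iterations in increasing order."""
    steps = [e["step"] for e in trace["entries"]]
    assert steps and steps[0] == "synthesis", "trace must start with synthesis"
    assert steps[-1] == "final", "trace must end with final"
    numbers = [e["n"] for e in trace["entries"] if e["step"] == "iteration"]
    assert numbers == sorted(numbers), "iterations must appear in order"


# Hand-built stand-in for the trace a stubbed run would produce.
sample = {"entries": [
    {"step": "synthesis", "count": 2},
    {"step": "iteration", "n": 1, "patch": ["X"], "accepted": True,
     "f1_before": 0.71, "f1_after": 0.74},
    {"step": "audit", "merged": ["A", "B"]},
    {"step": "prune", "dropped": ["C"]},
    {"step": "final", "count": 1},
]}
check_order(sample)
print_trace(sample)
```

Separating `format_trace` from `print_trace` lets the test compare strings without capturing stdout.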

Pointers

rulechef/pipeline.py (run, _apply_audit L296), rulechef/learner.py (_log_patch_decision L640), rulechef/coordinator.py (AuditResult L26), rulechef/training_logger.py.

Metadata

    Labels

    enhancement (New feature or request)
