Mission replay, fork, and template system #86

@harrymunro

Summary

Implement a mission replay system that can re-execute a past mission from any checkpoint, fork a mission to explore alternative execution paths, or extract a mission as a reusable template.

Motivation

Nelson's structured data layer captures comprehensive mission history (mission-log.json, battle-plan.json, stand-down.json, fleet-status.json), but this data is write-only. There is no way to learn from past missions through experimentation. Every new mission starts from scratch, even when the team has solved similar problems before.

The AgentRR paper demonstrates that record-replay mechanisms convert execution traces into reusable experiences. Temporal's worker versioning enables workflow forking. Combined, these patterns make missions first-class reproducible artifacts.

Detailed Design

Mission Replay

python3 nelson-data.py replay \
  --source .nelson/missions/2026-04-08_201214_169967B8 \
  --from-checkpoint 2 \
  --modifications battle-plan-v2.json

Replay takes:

  • A source mission directory
  • An optional starting checkpoint (default: beginning)
  • Optional modifications (revised battle plan, different ship assignments, different mode)

The replay creates a new mission directory with a replayed_from field in sailing-orders.json, preserving lineage.
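The lineage step can be sketched as follows. This is a minimal illustration, not the proposed implementation: the function name `create_replay_dir`, the `new_id` argument, and the shape of the `replayed_from` value (a mission name plus checkpoint number) are assumptions, since the issue only names the field. It does not re-execute anything; it only shows how a new directory could record its origin.

```python
import json
import shutil
from pathlib import Path


def create_replay_dir(source: Path, missions_root: Path, new_id: str,
                      from_checkpoint: int = 0) -> Path:
    """Copy a mission directory and record which mission it replays."""
    target = missions_root / new_id
    # copytree fails if the target already exists, so an existing
    # mission is never overwritten by accident.
    shutil.copytree(source, target)

    orders_path = target / "sailing-orders.json"
    orders = json.loads(orders_path.read_text()) if orders_path.exists() else {}
    # Assumed shape for the lineage field; only the key name comes from the issue.
    orders["replayed_from"] = {
        "mission": source.name,
        "checkpoint": from_checkpoint,
    }
    orders_path.write_text(json.dumps(orders, indent=2))
    return target
```

The source directory is left untouched, so the original mission remains a stable reference for later comparison.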

Mission Fork

python3 nelson-data.py fork \
  --source .nelson/missions/2026-04-08_201214_169967B8 \
  --at-checkpoint 3 \
  --name "auth-refactor-alt-approach"

Fork creates a new mission branching from a specific checkpoint, with:

  • All state up to the fork point preserved
  • New mission directory with independent state going forward
  • forked_from metadata for lineage tracking
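The three properties above might look like this in code. This sketch assumes the mission log is a list of event dicts, each carrying an integer `checkpoint` key. The issue does not specify the layout of mission-log.json, so the event shape and the function name `fork_state` are hypothetical.

```python
import copy
from typing import Any


def fork_state(mission_id: str, events: list[dict[str, Any]],
               at_checkpoint: int, name: str) -> dict[str, Any]:
    """Build the initial state of a fork from a source mission's events."""
    # Keep only events recorded at or before the fork point (assumed
    # "checkpoint" key). Deep-copy them so later changes in the fork
    # never leak back into the source mission's history.
    kept = [copy.deepcopy(e) for e in events
            if e.get("checkpoint", 0) <= at_checkpoint]
    return {
        "name": name,
        "forked_from": {"mission": mission_id, "checkpoint": at_checkpoint},
        "events": kept,
    }
```

Copying rather than referencing the preserved events is what gives the fork independent state going forward.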

Mission Templates

python3 nelson-data.py template \
  --source .nelson/missions/2026-04-08_201214_169967B8 \
  --name "research-mission-4-ship" \
  --description "4 parallel research ships + 1 synthesis captain"

Templates extract:

  • Sailing orders structure (parameterized, not literal)
  • Battle plan topology (task count, dependency graph shape, ship classes)
  • Standing orders that fired (as warnings for future users)
  • Recommended mode and crew configuration
  • Historical performance metrics (avg budget, avg duration, success rate)

A new mission can then be instantiated from a stored template:

python3 nelson-data.py from-template \
  --template research-mission-4-ship \
  --outcome "Research competitive landscape for product X" \
  --metric "Deliver prioritized feature comparison"
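One way to read "parameterized, not literal" is that a template stores placeholders where the source mission had concrete values, and `from-template` fills them in. The sketch below assumes `$name`-style placeholders and uses Python's standard `string.Template`. The template layout and the `fill` helper are illustrative assumptions, not the proposed file format.

```python
from string import Template
from typing import Any


def fill(value: Any, params: dict[str, str]) -> Any:
    """Recursively substitute $placeholders in every string of a template."""
    if isinstance(value, str):
        # substitute() raises KeyError when a placeholder has no value,
        # so an incomplete from-template call fails loudly.
        return Template(value).substitute(params)
    if isinstance(value, dict):
        return {k: fill(v, params) for k, v in value.items()}
    if isinstance(value, list):
        return [fill(v, params) for v in value]
    return value  # numbers, booleans and None pass through unchanged


# Hypothetical template content; the field names are assumptions.
template = {
    "sailing_orders": {"outcome": "$outcome", "metric": "$metric"},
    "battle_plan": {"task_count": 5, "ship_classes": ["research", "synthesis"]},
}

orders = fill(template, {
    "outcome": "Research competitive landscape for product X",
    "metric": "Deliver prioritized feature comparison",
})
```

Structural fields such as the task count survive untouched, while only the mission-specific text is replaced.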

Template Library

Templates stored in .nelson/templates/:

.nelson/templates/
  research-mission-4-ship.json
  feature-branch-delivery.json
  security-audit-3-ship.json
  database-migration.json
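Given that layout, resolving a template name to a file could be as simple as the following sketch. The `load_template` helper is hypothetical, and it assumes one JSON file per template, named after the template.

```python
import json
from pathlib import Path


def load_template(templates_dir: Path, name: str) -> dict:
    """Load <templates_dir>/<name>.json, listing alternatives if it is missing."""
    path = templates_dir / f"{name}.json"
    if not path.is_file():
        available = sorted(p.stem for p in templates_dir.glob("*.json"))
        raise FileNotFoundError(
            f"no template {name!r}; available: {available}"
        )
    return json.loads(path.read_text())
```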

Cross-Mission Comparison

python3 nelson-data.py compare \
  --mission-a .nelson/missions/2026-04-08_201214 \
  --mission-b .nelson/missions/2026-04-09_101500 \
  --metrics "budget,duration,parallelism,outcome"

Output: structured comparison of two missions on selected metrics, enabling A/B evaluation of different approaches.
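A structured comparison could take roughly the following shape. The sketch assumes each mission has already been reduced to a flat dict of metric values (for example from stand-down.json). That reduction, the function name `compare_missions`, and the output layout are all assumptions.

```python
from typing import Any


def compare_missions(a: dict[str, Any], b: dict[str, Any],
                     metrics: list[str]) -> dict[str, dict[str, Any]]:
    """Return one row per metric with both values and, if numeric, the delta."""
    report: dict[str, dict[str, Any]] = {}
    for metric in metrics:
        va, vb = a.get(metric), b.get(metric)
        row: dict[str, Any] = {"a": va, "b": vb}
        numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in (va, vb)
        )
        if numeric:
            # Positive delta means mission B used or scored more than A.
            row["delta"] = vb - va
        report[metric] = row
    return report
```

Non-numeric metrics such as outcome are reported side by side without a delta, and a metric missing from either mission shows up as `None` rather than raising an error.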

Rationale

  • HMS Audacious rated mission replay/fork as High impact / High effort / explicitly transformative
  • HMS Diamond identified mission replay as a Tier 4 next-gen capability
  • HMS Astute rated mission templating as a top-10 gap
  • HMS Daring identified carryover summaries (from AG2) as a related pattern
  • Cross-mission memory store (issue #8) is the prerequisite data layer

Effort Estimate

XL

Impact

Very High — makes missions first-class reproducible artifacts; category-defining capability

Dependencies

Requires: Cross-mission memory store (#8)
