Skip to content

Conversation

@Soph
Copy link
Collaborator

@Soph Soph commented Jan 29, 2026

This adds the implementation of the new suffix based shadow branch logic that tries to keep a linear history in a shadow branch and if not will start a new suffix.

This also inverts the logic around getting warned about multi sessions to be an opt-in instead of an opt-out now. The assumption is that the default is good enough in most cases.


Shadow Branch Isolation via Suffixes

Problem

The current shadow branch architecture stores both file content and metadata on a single branch per base commit (entire/<hash>). This causes corruption when sessions interleave:

  1. Ghost files - When a user dismisses changes (git restore .) and starts new work, the old session's files remain in the shadow tree. The new checkpoint only processes files present in the worktree, leaving dismissed files as orphaned entries.

  2. Content corruption - New checkpoints overwrite file entries from previous sessions. If session A modified line 30 and session B (after dismiss) modifies line 5 of the same file, the shadow tree now contains only line 5's change. The git history shows "removed line 30, added line 5" even though session B never touched line 30.

  3. Attribution corruption - Attribution calculations compare the shadow tree against base and HEAD. Mixed/stale data from multiple sessions produces incorrect agent contribution metrics.

  4. No dismiss detection - Git provides no hooks for restore, reset, or checkout, so we cannot detect when work is discarded.

Why Not Just Reset the Shadow Branch?

When a new session starts and the worktree doesn't match the shadow branch, it might seem safe to reset or delete the old shadow data. However, this would break legitimate workflows involving git stash:

  1. Session A does work, creates checkpoints on -1
  2. User runs git stash (worktree is now clean)
  3. Session B starts—perhaps just to ask a question, no code changes
  4. User runs git stash apply (Session A's work returns to worktree)
  5. User commits

At step 5, we need Session A's checkpoint data for condensation. If we had reset the shadow branch at step 3 (seeing a clean worktree), that data would be lost.

With the suffix approach, Session B would only create -2 if it actually makes code changes that trigger a checkpoint. If Session B just answers questions without modifying files, no checkpoint is created, -1 remains untouched, and condensation works correctly.

The harder scenario:

  1. Session A creates checkpoints on -1
  2. User runs git stash
  3. Session B makes code changes → creates -2
  4. User dismisses Session B's changes (git restore .)
  5. User runs git stash apply (Session A's work returns)
  6. User commits

Now we have both -1 and -2, and need to condense from -1. Detecting that we should "go back" to -1 (rather than creating -3) would require recognizing that the worktree now matches -1's content—essentially running the continuation heuristic against all existing suffixes, not just the highest one. This adds complexity and may not be worth implementing initially. The data on -1 is at least preserved and could be condensed if we match it correctly at commit time.

Solution

Shadow branches use numeric suffixes to isolate independent streams of work:

entire/af5a770-1  ← First stream of work
entire/af5a770-2  ← Second stream (after first was dismissed)
entire/af5a770-3  ← Third stream (after second was dismissed)

When a session creates a checkpoint, it either continues on an existing suffix (if previous work is still present in the worktree) or creates a new suffix (if previous work was dismissed). This decision runs every time a checkpoint is needed—not just when a session starts.

This handles scenarios like:

  1. User works with Claude, creates checkpoints on -1
  2. Closes Claude (session state persists)
  3. Runs git restore . to dismiss changes
  4. Resumes with a new prompt
  5. On checkpoint, we detect the dismiss → session now continues on -2

The session ID can stay the same, but the suffix changes because the previous work stream was abandoned. Each suffix maintains its own isolated file tree, preventing ghost files and content corruption.

Suffix Decision Heuristic

When determining whether to continue on an existing suffix or create a new one:

Step 1: Check for clean worktree
If the worktree has no modifications, previous work was dismissed. Create a new suffix.

Step 2: Check for file overlap
Compare files touched by the existing shadow branch against files modified in the worktree. If there's no overlap (user is working on completely different files), create a new suffix.

Step 3: Check if agent's changes are preserved
For overlapping files, determine whether any of the agent's additions still exist in the worktree. This requires comparing three versions: base, shadow tip, and worktree.

Two approaches for step 3:

Approach Method Pros Cons
Exact line matching Line exists identically in both shadow and worktree (and not in base) Precise, no false positives Fails if user changed indentation or made minor edits
Fuzzy line matching Compare lines with whitespace normalized, or check if shadow's additions are contained within worktree's additions Tolerates minor edits May have false positives on common patterns

Both approaches use go-git's diff utilities to extract added lines between base→shadow, then check for their presence in the worktree content.

Decision matrix:

Worktree state Shadow touched files Agent's lines present Decision
Clean - - NEW suffix
Modified No overlap - NEW suffix
Modified Overlap exists Yes SAME suffix
Modified Overlap exists No NEW suffix

Condensation and Cleanup

Each session stores its suffix number in session state. On commit:

  1. Post-commit hook identifies sessions by their stored suffix
  2. Each session condenses only from its own suffixed branch
  3. All suffixes for that base commit are deleted (both condensed and abandoned)

Deleting abandoned suffixes is safe—if the user committed without that work, they chose not to keep it.

Session State Changes

Session state gains a ShadowBranchSuffix field tracking which suffix the session uses. This value may change during a session's lifetime if the user dismisses work and continues with new prompts.

Benefits

  • No ghost files - Each suffix starts fresh or continues cleanly
  • No content corruption - Sessions cannot overwrite each other's trees
  • Accurate attribution - Each suffix has coherent single-stream history
  • No dismiss detection needed - We detect "should I continue or start fresh" based on observable file state

Copilot AI review requested due to automatic review settings January 29, 2026 17:08
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR proposes a comprehensive redesign of the shadow branch architecture to prevent data corruption in multi-session workflows. The document introduces a suffix-based approach where shadow branches use numeric suffixes (e.g., entire/af5a770-1, entire/af5a770-2) to isolate independent streams of work, addressing ghost files, content corruption, and attribution issues.

Changes:

  • Adds detailed architecture documentation explaining the shadow branch suffix approach
  • Proposes implementation changes across checkpoint storage, session state management, and condensation logic
  • Includes heuristic algorithms for deciding when to create new suffixes vs. continuing existing ones
  • Documents migration strategy for existing unsuffixed shadow branches

return SuffixDecision{}, err
}

// Step 1: Clean worktree = new suffix
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function getModifiedFilesFromStatus is referenced but doesn't exist in the codebase. This is pseudocode and should be documented as such or replaced with actual implementation details showing how to extract modified files from git status.

Suggested change
// Step 1: Clean worktree = new suffix
// Step 1: Clean worktree = new suffix
// NOTE: getModifiedFilesFromStatus is pseudocode illustrating how you might
// derive a list of modified files from go-git's worktree status.

Copilot uses AI. Check for mistakes.
}

// Step 2: No overlap = new suffix
overlap := intersect(shadowTouched, worktreeModified)
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function intersect is referenced but doesn't exist in the codebase. This appears to be pseudocode representing set intersection logic. The document should clarify this is pseudocode or provide the actual implementation approach.

Copilot uses AI. Check for mistakes.
state *SessionState,
) (*CondenseResult, error) {
// Get shadow branch WITH SUFFIX
shadowBranchName := getShadowBranchNameForCommitWithSuffix(
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function name getShadowBranchNameForCommitWithSuffix is inconsistent with the previously defined function name ShadowBranchNameForCommitWithSuffix (line 147). In the codebase, exported functions use pascal case without prefixes, while internal helper functions use camelCase with prefixes. Since this is used across packages (from checkpoint package), it should be ShadowBranchNameForCommitWithSuffix (exported, pascal case) to match the existing ShadowBranchNameForCommit function.

Suggested change
shadowBranchName := getShadowBranchNameForCommitWithSuffix(
shadowBranchName := ShadowBranchNameForCommitWithSuffix(

Copilot uses AI. Check for mistakes.
Comment on lines +524 to +530
iter.ForEach(func(ref *plumbing.Reference) error {
if strings.HasPrefix(ref.Name().Short(), prefix) {
toDelete = append(toDelete, ref.Name())
}
return nil
})

Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code uses iter.ForEach which continues iteration even if errors occur, but errors returned from the callback are silently ignored. The function should handle errors properly, either by accumulating them or by breaking iteration when an error occurs. Additionally, calling iter.Close() and checking its return value is the idiomatic pattern for go-git iterators.

Suggested change
iter.ForEach(func(ref *plumbing.Reference) error {
if strings.HasPrefix(ref.Name().Short(), prefix) {
toDelete = append(toDelete, ref.Name())
}
return nil
})
if err := iter.ForEach(func(ref *plumbing.Reference) error {
if strings.HasPrefix(ref.Name().Short(), prefix) {
toDelete = append(toDelete, ref.Name())
}
return nil
}); err != nil {
return err
}
if err := iter.Close(); err != nil {
return err
}

Copilot uses AI. Check for mistakes.
}, nil
}

// Get files touched by highest suffix
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function getFilesTouchedBySuffix is called but never defined in the document. This is a critical helper function needed for the suffix decision logic. The implementation should be included or at least documented with a comment explaining what it should do (e.g., comparing the shadow branch tree against its base to extract modified file paths).

Suggested change
// Get files touched by highest suffix
// Get files touched by highest suffix
// getFilesTouchedBySuffix should:
// - Read the shadow branch corresponding to (baseCommit, suffix).
// - Diff that shadow branch tree against baseCommit’s tree.
// - Return a de-duplicated list of file paths that were added, modified,
// or deleted in that suffix, relative to the repo root.
// These paths are then intersected with worktreeModified to decide whether
// it is safe to reuse the existing suffix or we must create a new one.

Copilot uses AI. Check for mistakes.
}

// DecideSuffix determines which suffix to use for checkpointing.
// Returns the suffix number and whether it's a new suffix.
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function signature shows return type (SuffixDecision, error) but the comment says "Returns the suffix number and whether it's a new suffix." The comment should be updated to reflect that the function returns a SuffixDecision struct that contains both the suffix number and the new/existing status, not just those values directly.

Suggested change
// Returns the suffix number and whether it's a new suffix.
// It returns a SuffixDecision containing the suffix number, whether it's new, and the reason.

Copilot uses AI. Check for mistakes.
### Suffix Decision Heuristic

```go
// In strategy/manual_commit_suffix.go (new file)
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The document references manual_commit_suffix.go as a new file to be created, but the implementation logic shown could alternatively fit in an existing file like manual_commit_session.go (which already handles session state and shadow branches) or manual_commit.go. Consider whether creating a new file is necessary or if the logic should be integrated into existing files to maintain cohesion. If a new file is warranted, document the rationale for the separation.

Suggested change
// In strategy/manual_commit_suffix.go (new file)
// In strategy/manual_commit_session.go

Copilot uses AI. Check for mistakes.
func (s *ManualCommitStrategy) PostCommit() error {
// ... existing condensation logic ...

// Clean up ALL suffixes for this base commit
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable name logCtx is used but never defined in this code snippet. Based on the codebase conventions shown in the custom guidelines, logging context should be created from the existing context. Consider adding a line like logCtx := ctx.Context or clarifying where logCtx comes from.

Suggested change
// Clean up ALL suffixes for this base commit
// Clean up ALL suffixes for this base commit
logCtx := context.Background()

Copilot uses AI. Check for mistakes.
// Update WriteTemporaryOptions
type WriteTemporaryOptions struct {
BaseCommit string
Suffix int // NEW: which suffix to use
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing WriteTemporaryOptions struct doesn't have a Suffix field (as shown in checkpoint/checkpoint.go:104-139). This proposed addition should be clearly marked as a new field that needs to be added to the existing struct, not presented as if it already exists. Consider using a comment like "// NEW: Add this field to WriteTemporaryOptions" to make the proposed change explicit.

Suggested change
Suffix int // NEW: which suffix to use
Suffix int // NEW: Add this field to WriteTemporaryOptions

Copilot uses AI. Check for mistakes.
Comment on lines +512 to +516
hash := baseCommit
if len(hash) > checkpoint.ShadowBranchHashLength {
hash = hash[:checkpoint.ShadowBranchHashLength]
}
prefix := checkpoint.ShadowBranchPrefix + hash + "-"
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constant shadowBranchPrefix is referenced but doesn't exist in the codebase. The existing codebase uses checkpoint.ShadowBranchPrefix (exported from the checkpoint package) rather than a local constant. This function should use checkpoint.ShadowBranchPrefix to be consistent with the existing getShadowBranchNameForCommit function in manual_commit_session.go:243.

Copilot uses AI. Check for mistakes.
Entire-Checkpoint: 54bb2ec87cc8
@Soph Soph force-pushed the soph/improved-shadow branch from 11becec to fa0fe9d Compare January 30, 2026 07:50
@Soph Soph mentioned this pull request Jan 31, 2026
@Soph Soph closed this Feb 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant