-
Notifications
You must be signed in to change notification settings - Fork 140
revisited proposal for shadow branch handling #125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR proposes a comprehensive redesign of the shadow branch architecture to prevent data corruption in multi-session workflows. The document introduces a suffix-based approach where shadow branches use numeric suffixes (e.g., entire/af5a770-1, entire/af5a770-2) to isolate independent streams of work, addressing ghost files, content corruption, and attribution issues.
Changes:
- Adds detailed architecture documentation explaining the shadow branch suffix approach
- Proposes implementation changes across checkpoint storage, session state management, and condensation logic
- Includes heuristic algorithms for deciding when to create new suffixes vs. continuing existing ones
- Documents migration strategy for existing unsuffixed shadow branches
| return SuffixDecision{}, err | ||
| } | ||
|
|
||
| // Step 1: Clean worktree = new suffix |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function getModifiedFilesFromStatus is referenced but doesn't exist in the codebase. This is pseudocode and should be documented as such or replaced with actual implementation details showing how to extract modified files from git status.
| // Step 1: Clean worktree = new suffix | |
| // Step 1: Clean worktree = new suffix | |
| // NOTE: getModifiedFilesFromStatus is pseudocode illustrating how you might | |
| // derive a list of modified files from go-git's worktree status. |
| } | ||
|
|
||
| // Step 2: No overlap = new suffix | ||
| overlap := intersect(shadowTouched, worktreeModified) |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function intersect is referenced but doesn't exist in the codebase. This appears to be pseudocode representing set intersection logic. The document should clarify this is pseudocode or provide the actual implementation approach.
| state *SessionState, | ||
| ) (*CondenseResult, error) { | ||
| // Get shadow branch WITH SUFFIX | ||
| shadowBranchName := getShadowBranchNameForCommitWithSuffix( |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function name getShadowBranchNameForCommitWithSuffix is inconsistent with the previously defined function name ShadowBranchNameForCommitWithSuffix (line 147). In the codebase, exported functions use pascal case without prefixes, while internal helper functions use camelCase with prefixes. Since this is used across packages (from checkpoint package), it should be ShadowBranchNameForCommitWithSuffix (exported, pascal case) to match the existing ShadowBranchNameForCommit function.
| shadowBranchName := getShadowBranchNameForCommitWithSuffix( | |
| shadowBranchName := ShadowBranchNameForCommitWithSuffix( |
| iter.ForEach(func(ref *plumbing.Reference) error { | ||
| if strings.HasPrefix(ref.Name().Short(), prefix) { | ||
| toDelete = append(toDelete, ref.Name()) | ||
| } | ||
| return nil | ||
| }) | ||
|
|
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code uses iter.ForEach which continues iteration even if errors occur, but errors returned from the callback are silently ignored. The function should handle errors properly, either by accumulating them or by breaking iteration when an error occurs. Additionally, calling iter.Close() and checking its return value is the idiomatic pattern for go-git iterators.
| iter.ForEach(func(ref *plumbing.Reference) error { | |
| if strings.HasPrefix(ref.Name().Short(), prefix) { | |
| toDelete = append(toDelete, ref.Name()) | |
| } | |
| return nil | |
| }) | |
| if err := iter.ForEach(func(ref *plumbing.Reference) error { | |
| if strings.HasPrefix(ref.Name().Short(), prefix) { | |
| toDelete = append(toDelete, ref.Name()) | |
| } | |
| return nil | |
| }); err != nil { | |
| return err | |
| } | |
| if err := iter.Close(); err != nil { | |
| return err | |
| } |
| }, nil | ||
| } | ||
|
|
||
| // Get files touched by highest suffix |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function getFilesTouchedBySuffix is called but never defined in the document. This is a critical helper function needed for the suffix decision logic. The implementation should be included or at least documented with a comment explaining what it should do (e.g., comparing the shadow branch tree against its base to extract modified file paths).
| // Get files touched by highest suffix | |
| // Get files touched by highest suffix | |
| // getFilesTouchedBySuffix should: | |
| // - Read the shadow branch corresponding to (baseCommit, suffix). | |
| // - Diff that shadow branch tree against baseCommit’s tree. | |
| // - Return a de-duplicated list of file paths that were added, modified, | |
| // or deleted in that suffix, relative to the repo root. | |
| // These paths are then intersected with worktreeModified to decide whether | |
| // it is safe to reuse the existing suffix or we must create a new one. |
| } | ||
|
|
||
| // DecideSuffix determines which suffix to use for checkpointing. | ||
| // Returns the suffix number and whether it's a new suffix. |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function signature shows return type (SuffixDecision, error) but the comment says "Returns the suffix number and whether it's a new suffix." The comment should be updated to reflect that the function returns a SuffixDecision struct that contains both the suffix number and the new/existing status, not just those values directly.
| // Returns the suffix number and whether it's a new suffix. | |
| // It returns a SuffixDecision containing the suffix number, whether it's new, and the reason. |
| ### Suffix Decision Heuristic | ||
|
|
||
| ```go | ||
| // In strategy/manual_commit_suffix.go (new file) |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The document references manual_commit_suffix.go as a new file to be created, but the implementation logic shown could alternatively fit in an existing file like manual_commit_session.go (which already handles session state and shadow branches) or manual_commit.go. Consider whether creating a new file is necessary or if the logic should be integrated into existing files to maintain cohesion. If a new file is warranted, document the rationale for the separation.
| // In strategy/manual_commit_suffix.go (new file) | |
| // In strategy/manual_commit_session.go |
| func (s *ManualCommitStrategy) PostCommit() error { | ||
| // ... existing condensation logic ... | ||
|
|
||
| // Clean up ALL suffixes for this base commit |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable name logCtx is used but never defined in this code snippet. Based on the codebase conventions shown in the custom guidelines, logging context should be created from the existing context. Consider adding a line like logCtx := ctx.Context or clarifying where logCtx comes from.
| // Clean up ALL suffixes for this base commit | |
| // Clean up ALL suffixes for this base commit | |
| logCtx := context.Background() |
| // Update WriteTemporaryOptions | ||
| type WriteTemporaryOptions struct { | ||
| BaseCommit string | ||
| Suffix int // NEW: which suffix to use |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing WriteTemporaryOptions struct doesn't have a Suffix field (as shown in checkpoint/checkpoint.go:104-139). This proposed addition should be clearly marked as a new field that needs to be added to the existing struct, not presented as if it already exists. Consider using a comment like "// NEW: Add this field to WriteTemporaryOptions" to make the proposed change explicit.
| Suffix int // NEW: which suffix to use | |
| Suffix int // NEW: Add this field to WriteTemporaryOptions |
| hash := baseCommit | ||
| if len(hash) > checkpoint.ShadowBranchHashLength { | ||
| hash = hash[:checkpoint.ShadowBranchHashLength] | ||
| } | ||
| prefix := checkpoint.ShadowBranchPrefix + hash + "-" |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The constant shadowBranchPrefix is referenced but doesn't exist in the codebase. The existing codebase uses checkpoint.ShadowBranchPrefix (exported from the checkpoint package) rather than a local constant. This function should use checkpoint.ShadowBranchPrefix to be consistent with the existing getShadowBranchNameForCommit function in manual_commit_session.go:243.
Entire-Checkpoint: 54bb2ec87cc8
11becec to
fa0fe9d
Compare
Entire-Checkpoint: 54bb2ec87cc8
Entire-Checkpoint: 9878b713483d
Entire-Checkpoint: adff0201aecb
Entire-Checkpoint: 5d107d133886
This adds the implementation of the new suffix based shadow branch logic that tries to keep a linear history in a shadow branch and if not will start a new suffix.
This also inverts the logic around getting warned about multi sessions to be an opt-in instead of an opt-out now. The assumption is that the default is good enough in most cases.
Shadow Branch Isolation via Suffixes
Problem
The current shadow branch architecture stores both file content and metadata on a single branch per base commit (
entire/<hash>). This causes corruption when sessions interleave:Ghost files - When a user dismisses changes (
git restore .) and starts new work, the old session's files remain in the shadow tree. The new checkpoint only processes files present in the worktree, leaving dismissed files as orphaned entries.Content corruption - New checkpoints overwrite file entries from previous sessions. If session A modified line 30 and session B (after dismiss) modifies line 5 of the same file, the shadow tree now contains only line 5's change. The git history shows "removed line 30, added line 5" even though session B never touched line 30.
Attribution corruption - Attribution calculations compare the shadow tree against base and HEAD. Mixed/stale data from multiple sessions produces incorrect agent contribution metrics.
No dismiss detection - Git provides no hooks for
restore,reset, orcheckout, so we cannot detect when work is discarded.Why Not Just Reset the Shadow Branch?
When a new session starts and the worktree doesn't match the shadow branch, it might seem safe to reset or delete the old shadow data. However, this would break legitimate workflows involving
git stash:-1git stash(worktree is now clean)git stash apply(Session A's work returns to worktree)At step 5, we need Session A's checkpoint data for condensation. If we had reset the shadow branch at step 3 (seeing a clean worktree), that data would be lost.
With the suffix approach, Session B would only create
-2if it actually makes code changes that trigger a checkpoint. If Session B just answers questions without modifying files, no checkpoint is created,-1remains untouched, and condensation works correctly.The harder scenario:
-1git stash-2git restore .)git stash apply(Session A's work returns)Now we have both
-1and-2, and need to condense from-1. Detecting that we should "go back" to-1(rather than creating-3) would require recognizing that the worktree now matches-1's content—essentially running the continuation heuristic against all existing suffixes, not just the highest one. This adds complexity and may not be worth implementing initially. The data on-1is at least preserved and could be condensed if we match it correctly at commit time.Solution
Shadow branches use numeric suffixes to isolate independent streams of work:
When a session creates a checkpoint, it either continues on an existing suffix (if previous work is still present in the worktree) or creates a new suffix (if previous work was dismissed). This decision runs every time a checkpoint is needed—not just when a session starts.
This handles scenarios like:
-1git restore .to dismiss changes-2The session ID can stay the same, but the suffix changes because the previous work stream was abandoned. Each suffix maintains its own isolated file tree, preventing ghost files and content corruption.
Suffix Decision Heuristic
When determining whether to continue on an existing suffix or create a new one:
Step 1: Check for clean worktree
If the worktree has no modifications, previous work was dismissed. Create a new suffix.
Step 2: Check for file overlap
Compare files touched by the existing shadow branch against files modified in the worktree. If there's no overlap (user is working on completely different files), create a new suffix.
Step 3: Check if agent's changes are preserved
For overlapping files, determine whether any of the agent's additions still exist in the worktree. This requires comparing three versions: base, shadow tip, and worktree.
Two approaches for step 3:
Both approaches use go-git's diff utilities to extract added lines between base→shadow, then check for their presence in the worktree content.
Decision matrix:
Condensation and Cleanup
Each session stores its suffix number in session state. On commit:
Deleting abandoned suffixes is safe—if the user committed without that work, they chose not to keep it.
Session State Changes
Session state gains a
ShadowBranchSuffixfield tracking which suffix the session uses. This value may change during a session's lifetime if the user dismisses work and continues with new prompts.Benefits