Skip to content

Commit f4643ff

Browse files
PR 5: cd-attribution + subshell isolation + bash -c recursion (#8)
Completes the SPEC §9 + §10 behaviors deferred from earlier PRs. Locked interpretations #4 (bash -c overflow → outer ParsedCommand.IsUnparseable), #5 (only cd/chdir propagate; pushd/popd parse but don't), and #6 (cd $VAR → synthetic DynamicSkip attribution arg) now fully materialized. What's wired: - Internal/Bash/Parsing/CdAttributionContext.cs (parser-internal, mutable): * SubshellStack of monotonic IDs handles sibling subshells cleanly (depth alone doesn't distinguish (a) && (b)) * SetLiteralAttribution / SetDynamicAttribution * HasAttribution flag - BashCommandParser updates: * Only cd/chdir update attribution context (interp #5) * Subsequent clauses get synthetic IsCwdAttribution arg: - Literal cd target → Kind=Literal, IsPath=true, Resolved=<cwd> - DynamicSkip cd target → Kind=DynamicSkip, IsPath=false, Raw="<dynamic-cwd>" (interp #6) * Subshell ( ... ) entry pushes attribution stack, exit pops; clauses inside get IsSubshell=true; subshell mutations don't leak out * bash -c / sh -c real recursion (was single-clause framework in PR 3): inner ParsedCommand surfaced with IsBashCWrapped=true on each clause; outer bash -c clause consumed (not emitted) * Recursion depth cap at 5 → outer ParsedCommand.IsUnparseable=true with reason "bash -c recursion depth exceeded (>5)" (interp #4) * Outer cd attribution does NOT leak into bash -c inner clauses (v0.1 decision; tracked for v0.1.x revisit) - BashResolver internal overload Resolve(raw, treatAsPath, options, workingDirectoryUnknown): lets the propagator force DynamicSkip on relative-path args under dynamic cd (interp #6) without polluting the public BashParserOptions surface. Tests: - 8 new parser tests: attribution propagation, subshell isolation, sequential cd, pushd non-propagation, dynamic cd, sibling subshells, bash -c depth-2 success, bash -c depth-6 overflow. - CorpusRunnerTests: added IsCwdAttribution field on ExpectedArg. - 30 new corpus entries (71-100): 10 cd-in-compound + 10 subshell + 10 bash -c. 5 PR 3-4 corpus entries refreshed (25, 28, 34, 35, 52). - Total: 337/337 passing (was 296 at PR 4 baseline). SPEC.md updated: - §9 rule 3 explicit on which CwdVerbs propagate (only cd/chdir); added "Dynamic-cd attribution" subsection per interp #6 - §10 subshell flattening rephrased; bash -c recursion-limit replaced with "set ParsedCommand.IsUnparseable=true" per interp #4 Public API surface unchanged. PublicApiSnapshotTests still 18/18 green.
1 parent ddce4c7 commit f4643ff

43 files changed

Lines changed: 1908 additions & 105 deletions

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

IMPLEMENTATION_PLAN.md

Lines changed: 20 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -114,20 +114,26 @@ bulldoze priorities.
114114
- [x] Flag-value path classification table (`git -C` is path; `curl -d`
115115
is body data; `docker -v` is single literal IsPath=false per #8)
116116

117-
### 8. cd-in-compound propagation (SPEC §9)
118-
119-
- [ ] First-clause `cd` sets attributed cwd for the compound
120-
- [ ] Subsequent clauses receive synthetic `Arg` with
121-
`IsCwdAttribution=true`
122-
- [ ] Subsequent `cd` in the same compound replaces attribution
123-
- [ ] Subshell boundaries reset attribution
124-
125-
### 9. Subshell + bash -c surfacing (SPEC §10)
126-
127-
- [ ] Flatten subshell clauses into parent's `Clauses` with
128-
`IsSubshell=true`
129-
- [ ] Surface `bash -c` inner clauses inline with `IsBashCWrapped=true`
130-
- [ ] Recursion-depth cap
117+
### 8. cd-in-compound propagation (SPEC §9) — PR 5, complete
118+
119+
- [x] First-clause `cd`/`chdir` sets attributed cwd; only those two verbs
120+
propagate (interp #5)
121+
- [x] Subsequent clauses receive synthetic `Arg` with
122+
`IsCwdAttribution=true`; `Kind=Literal, IsPath=true` when cd
123+
target resolved, `Kind=DynamicSkip, IsPath=false` when dynamic
124+
(interp #6)
125+
- [x] Subsequent `cd` in the same compound replaces attribution
126+
- [x] Subshell boundaries isolate attribution via push/pop stack
127+
128+
### 9. Subshell + bash -c surfacing (SPEC §10) — PR 5, complete
129+
130+
- [x] Flatten subshell clauses into parent's `Clauses` with
131+
`IsSubshell=true`; sibling subshells handled via SubshellStack IDs
132+
- [x] Surface `bash -c "..."` / `sh -c "..."` inner clauses inline with
133+
`IsBashCWrapped=true`; outer cd attribution doesn't leak into
134+
inner shell (v0.1 decision)
135+
- [x] Recursion depth cap at 5 → outer
136+
`ParsedCommand.IsUnparseable=true` (interp #4)
131137

132138
### 10. Hand-authored corpus (SPEC §13 — minimum 105 entries)
133139

SPEC.md

Lines changed: 43 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -680,8 +680,15 @@ The agent's natural idiom is `cd /target && cmd1 && cmd2`. Bash semantics:
680680

681681
### Rules
682682

683-
1. **First clause is a `cd`-family verb** (per §6.2): the cd target becomes
684-
the **attributed cwd** for subsequent clauses in the same compound.
683+
1. **First clause is a `cd` or `chdir` verb**: the cd target becomes the
684+
**attributed cwd** for subsequent clauses in the same compound. **Only
685+
`cd` and `chdir`** propagate attribution per locked interpretation #5.
686+
`pushd`, `popd`, `push-location`, and `set-location` are still listed
687+
in `CwdVerbs` so their first non-flag positional is path-classified
688+
(the target shows up as `IsPath=true`), but they do **not** add a
689+
synthetic attribution arg to subsequent clauses. A future v0.1.x or
690+
v0.2 with PowerShell support may model `pushd`/`popd` as a proper
691+
directory stack.
685692
2. **Subsequent clauses inherit the attributed cwd** as if it were
686693
prepended with `-C` semantics. Specifically: a synthetic `Arg` with
687694
`IsPath=true`, `Resolved=<cd target>`, and `Kind=Literal` is added to
@@ -693,17 +700,40 @@ The agent's natural idiom is `cd /target && cmd1 && cmd2`. Bash semantics:
693700

694701
3. **A subsequent `cd`** in the same compound **replaces** the attributed
695702
cwd for clauses after it. (`cd /a && cmd1 && cd /b && cmd2` → cmd1
696-
inherits `/a`, cmd2 inherits `/b`.)
703+
inherits `/a`, cmd2 inherits `/b`.) The replacing `cd /b` itself still
704+
*receives* `/a` as a synthetic attribution arg (rule 2) before becoming
705+
the new source — additive semantics per rule 5.
697706

698707
4. **Subshell boundaries reset attribution.** `cd /a && (cd /b && cmd1) && cmd2`:
699708
cmd1 (inside subshell) inherits `/b`; cmd2 (outside subshell) inherits
700-
`/a` (the subshell's `cd /b` does not leak out).
709+
`/a` (the subshell's `cd /b` does not leak out). A subshell *inherits*
710+
outer attribution on entry (so `cd /a && (cmd)` still attributes cmd
711+
to /a) but its own cd changes stay isolated.
701712

702713
5. **Attribution does not change the clause's verb or original args.**
703714
The attribution is purely additive — the `cd` clause itself is still
704715
parsed normally, and subsequent clauses retain everything the user
705716
typed, plus the synthetic Arg.
706717

718+
### Dynamic-cd attribution (locked interpretation #6)
719+
720+
When the cd target itself is `Kind=DynamicSkip` (e.g. `cd $REPO`), we
721+
statically don't know the resolved cwd. To preserve the cwd-uncertainty
722+
signal for subsequent clauses:
723+
724+
- A synthetic `Arg { Raw="<dynamic-cwd>", Resolved=null, Kind=DynamicSkip,
725+
IsPath=false, IsCwdAttribution=true }` is appended to each subsequent
726+
clause (instead of the literal-cd flavor).
727+
- Relative path args in subsequent clauses are not re-resolved against a
728+
fall-back cwd; they surface as `Kind=DynamicSkip, IsPath=false,
729+
Resolved=null` so consumers route to safe-fail rather than trust a
730+
guessed working directory.
731+
732+
Consumers that iterate `IsPath=true` args won't see the synthetic
733+
attribution arg; consumers that specifically check `IsCwdAttribution`
734+
can detect "this clause's cwd context is unknown" and elevate to
735+
user-prompt instead of treating it like a default-cwd command.
736+
707737
### Example
708738

709739
Input: `cd /target && git -C /other log && cat file.txt`
@@ -739,15 +769,11 @@ already see the resolved path in another arg.
739769
### Subshells
740770

741771
Subshells are clauses wrapped in parens: `(cd /a && cmd)`. The parser
742-
recognizes the parens, parses the inner command, and emits a single
743-
clause with:
744-
745-
- `Verb` = empty (a subshell has no verb of its own)
746-
- `Args` = empty
747-
- `IsSubshell = true`
748-
- A nested `ParsedCommand` field — **but** since the AST should be flat
749-
for consumer convenience, instead we **flatten** the subshell into the
750-
parent's `Clauses` list with each inner clause's `IsSubshell=true`.
772+
recognizes the parens and **flattens** the subshell's inner clauses into
773+
the parent's `Clauses` list, marking each with `IsSubshell=true` so
774+
consumers can distinguish them from outer-compound clauses. A subshell
775+
*inherits* the outer compound's cd attribution on entry but its own cd
776+
changes stay isolated to the subshell (rule 4 above).
751777

752778
Specifically: `(cd /b && cmd) && cmd2` produces three clauses:
753779

@@ -779,7 +805,10 @@ by the recursion. Consumers that care that this came from a wrapper can
779805
inspect `IsBashCWrapped` on the surfaced clauses.
780806

781807
**Recursion limit:** parse `bash -c "bash -c ..."` chains up to depth 5.
782-
Deeper nesting → mark the deepest clause as `IsUnparseable = true`.
808+
Deeper nesting → set the outer `ParsedCommand.IsUnparseable = true` with
809+
reason `"bash -c recursion depth exceeded (>5)"` per locked interpretation
810+
#4. (`Clause` has no `IsUnparseable` field; we surface the overflow on the
811+
top-level ParsedCommand so consumers safe-fail per §11.)
783812

784813
---
785814

openspec/changes/v0.1-locked-interpretations/tasks.md

Lines changed: 55 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -161,23 +161,61 @@ the SPEC.md sections that get updated alongside the implementation.
161161

162162
## 5. PR 5 — cd attribution + subshells + bash -c (interpretations #4, #5, #6)
163163

164-
- [ ] 5.1 `Internal/Bash/Parsing/CdAttributionContext.cs` (parser-internal
165-
mutable; output AST stays immutable)
166-
- [ ] 5.2 Only `cd`/`chdir` propagate (interpretation #5); pushd/popd
167-
parse but don't propagate
168-
- [ ] 5.3 Synthetic `Arg{ IsCwdAttribution=true, … }` appended to
169-
subsequent clauses; `Kind=DynamicSkip` when cd target is dynamic
170-
(interpretation #6)
171-
- [ ] 5.4 Subshell `(...)` parsing with attribution stack push/pop
172-
- [ ] 5.5 `bash -c "..."` / `sh -c "..."` recursion (cap at 5)
173-
- [ ] 5.6 Recursion overflow → outer `ParsedCommand.IsUnparseable=true`
174-
(interpretation #4)
175-
- [ ] 5.7 Update `SPEC.md` §9 (clarify which CwdVerbs propagate; add
176-
dynamic-cd attribution subsection); §10 (replace "mark the deepest
177-
clause" wording with "set ParsedCommand.IsUnparseable=true")
178-
- [ ] 5.8 File GitHub issue: "Model pushd/popd directory stack semantics"
179-
- [ ] 5.9 Open OpenSpec change `cd-attribution-clarifications` for the
180-
§9/§10 deltas
164+
- [x] 5.1 `Internal/Bash/Parsing/CdAttributionContext.cs` — parser-internal
165+
mutable; SubshellStack of monotonic IDs (handles sibling subshells
166+
cleanly); SetLiteralAttribution / SetDynamicAttribution; HasAttribution
167+
- [x] 5.2 Only `cd`/`chdir` propagate (interpretation #5); pushd/popd
168+
parse as CwdVerbs but don't propagate. Test
169+
`Pushd_does_not_propagate_attribution` pins this.
170+
- [x] 5.3 Synthetic `Arg{ IsCwdAttribution=true, … }` appended to
171+
subsequent clauses. `Kind=Literal, IsPath=true, Resolved=<cwd>`
172+
when cd target resolved; `Kind=DynamicSkip, IsPath=false,
173+
Resolved=null, Raw="<dynamic-cwd>"` when cd target was DynamicSkip
174+
per interpretation #6.
175+
- [x] 5.4 Subshell `(...)` parsing with attribution stack push/pop.
176+
`cd /a && (cd /b && cmd1) && cmd2` — cmd1 sees /b attribution;
177+
cmd2 sees /a (subshell's /b doesn't leak out). `IsSubshell=true`
178+
set on every clause inside a subshell.
179+
- [x] 5.5 `bash -c "..."` / `sh -c "..."` recursion replaces PR 3's
180+
single-clause framework. Outer `bash -c` consumed; inner clauses
181+
surfaced inline with `IsBashCWrapped=true`. Outer cd attribution
182+
does NOT leak into inner shell (v0.1 decision).
183+
- [x] 5.6 Recursion depth cap at 5 → outer
184+
`ParsedCommand.IsUnparseable=true` with reason `"bash -c recursion
185+
depth exceeded (>5)"` per interpretation #4.
186+
- [x] 5.7 SPEC §9 updated (rule 3 explicit on which verbs propagate;
187+
dynamic-cd attribution subsection added per interp #6); §10
188+
updated (subshell flattening rephrased; bash -c recursion-limit
189+
replaced with "set ParsedCommand.IsUnparseable" per interp #4).
190+
- [x] 5.8 BashResolver internal overload `Resolve(raw, treatAsPath,
191+
options, workingDirectoryUnknown)` lets the propagator force
192+
DynamicSkip on relative-path args under dynamic-cd (interp #6)
193+
without polluting the public `BashParserOptions` surface.
194+
- [x] 5.9 30 new corpus entries (71-100): 10 cd-in-compound + 10
195+
subshell + 10 bash -c. 5 PR 3-4 corpus entries refreshed
196+
(25, 28, 34, 35, 52).
197+
- [x] 5.10 8 new parser tests covering attribution propagation, subshell
198+
isolation, sequential cd, pushd non-propagation, dynamic cd,
199+
sibling subshells, bash -c depth-2 success, depth-6 overflow.
200+
- [x] 5.11 **337/337 tests passing** (was 296 at PR 4 baseline). Public
201+
API surface unchanged.
202+
- [ ] 5.12 File GitHub issue: "Model pushd/popd directory stack semantics"
203+
(interp #5 option-C upgrade)
204+
205+
### PR 5 follow-ups (tracked for PR 6)
206+
207+
- Possible SPEC clarification: pin the sentinel `Raw` value for
208+
dynamic-cd synthetic attribution args (current implementation uses
209+
`"<dynamic-cwd>"` but SPEC is silent).
210+
- Outer cd attribution into bash -c inner clauses (v0.1: doesn't
211+
propagate; v0.1.x or v0.2 may revisit).
212+
- `cd -` (jump-to-previous-dir): currently treated as "no attribution
213+
change" because `-` is detected as a flag (no non-flag positional).
214+
v0.1.x may want explicit handling.
215+
- Existing PR 3 corpus entries 21-24, 26-27, 29-33 (the compound
216+
entries that had no `cd` prefix) likely don't need updates; but PR 6
217+
should audit all 100 corpus entries for AST drift now that PRs 1-5
218+
have all landed.
181219

182220
## 6. PR 6 — Corpus completeness + PII audit (interpretation #7)
183221

0 commit comments

Comments
 (0)