fix: redirect should consume exactly one destination #331
Open
fprochazka wants to merge 1 commit into
In bash, each redirect operator (>, <, >>, etc.) takes exactly one destination. The file_redirect rule previously used repeat1($._literal), which greedily consumed all subsequent tokens as redirect destinations. This caused commands like `grep 2>/dev/null -q "pattern" /etc/shells` to incorrectly parse -q, "pattern", and /etc/shells as redirect destinations instead of command arguments.

Changes:
- file_redirect: change repeat1($._literal) to $._literal
- redirected_statement: allow argument fields after redirects so that tokens following a mid-command redirect are correctly preserved
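To make the difference concrete, here is a toy model of the two behaviours. It is only a sketch: it works on a pre-split token array rather than on tree-sitter nodes, and `REDIRECT_OPS`, `parseCommand`, and the `greedy` flag are hypothetical names made up for illustration, not part of the grammar.

```javascript
// Toy model only: plain token arrays stand in for tree-sitter nodes.
// Hypothetical, deliberately small operator set.
const REDIRECT_OPS = new Set([">", ">>", "<", "2>"]);

// Parse a flat token list into { name, args, redirects }.
// greedy = true  mimics repeat1($._literal): a redirect swallows every
//                following non-operator token.
// greedy = false mimics a single $._literal: one destination per redirect.
function parseCommand(tokens, greedy) {
  const result = { name: null, args: [], redirects: [] };
  let i = 0;
  while (i < tokens.length) {
    const tok = tokens[i];
    if (REDIRECT_OPS.has(tok)) {
      const destinations = [];
      i += 1;
      // Take one destination; keep consuming only in greedy mode.
      while (i < tokens.length && !REDIRECT_OPS.has(tokens[i])) {
        destinations.push(tokens[i]);
        i += 1;
        if (!greedy) break;
      }
      result.redirects.push({ op: tok, destinations });
    } else if (result.name === null) {
      result.name = tok; // first plain word is the command name
      i += 1;
    } else {
      result.args.push(tok); // later plain words are arguments
      i += 1;
    }
  }
  return result;
}

const tokens = ["grep", "2>", "/dev/null", "-q", '"pattern"', "/etc/shells"];
const before = parseCommand(tokens, true);  // greedy: arguments are lost
const after = parseCommand(tokens, false);  // single destination: arguments kept

console.log("before:", before.args, before.redirects[0].destinations);
console.log("after: ", after.args, after.redirects[0].destinations);
```

In greedy mode the redirect ends up with four destinations and the command with no arguments. In single-destination mode `/dev/null` is the only destination and the remaining three tokens stay arguments.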
AMZN-hgoffin added a commit to AMZN-hgoffin/tree-sitter-bash that referenced this pull request Apr 7, 2026

Redirects between arguments (e.g. grep 2>/dev/null -q pattern file) previously caused the parser to misattach subsequent arguments as redirect destinations, because file_redirect uses repeat1 for its destination field.

This patch uses the external scanner to peek ahead past redirect operators and their destinations. If more non-redirect words follow, the scanner emits a mid-command token that keeps the redirect inside command's repeat. If no words follow, the redirect is trailing and uses redirected_statement as before.

Covers all redirect operators: > >> < 2> 2>&1 >& <& >| &> &>> >&- <&-. Handles chained redirects, process substitution destinations, herestrings, and close-fd operators.

Also adds redirect support inside [ ] test brackets (e.g. [ -f file 2>/dev/null ]).

The mid-command redirect rules are aliased to file_redirect, so node-types.json is unchanged. Consumers that already handle redirect fields on command nodes (which is required for pre-command redirects like '2>&1 cmd') need no changes.

Related: tree-sitter#233, tree-sitter#331

PR tree-sitter#331 takes a different approach: changing file_redirect to accept a single destination and adding argument fields to redirected_statement. That is a simpler grammar change but a breaking change to node-types.json that requires consumers to handle the new argument field. This patch preserves the existing tree structure for trailing redirects and only changes behavior for inputs that previously produced incorrect trees.
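The peek-ahead decision described in this commit message can be sketched at the token level. This is a rough illustration under strong assumptions, not the actual scanner, which is written in C and works on characters. `OPERATORS` and `classifyRedirect` are hypothetical names, every operator in the set is assumed to take exactly one destination token, and forms such as 2>&1, close-fd operators, herestrings, process substitution, and backticks are not modelled.

```javascript
// Hypothetical operator set; each operator here takes one destination token.
const OPERATORS = new Set([">", ">>", "<", "2>", "&>"]);

// Classify the redirect that starts at tokens[start].
// Returns "mid-command" if a plain word follows the redirect chain,
// otherwise "trailing".
function classifyRedirect(tokens, start) {
  let i = start;
  // Skip chained "operator destination" pairs.
  while (i < tokens.length && OPERATORS.has(tokens[i])) {
    i += 2;
  }
  // Any token left at this point is a plain word after the redirects.
  return i < tokens.length ? "mid-command" : "trailing";
}

const midCommand = classifyRedirect(
  ["grep", "2>", "/dev/null", "-q", "pattern", "file"], 1);
const trailing = classifyRedirect(
  ["grep", "-q", "pattern", "file", "2>", "/dev/null"], 4);
const chainedTrailing = classifyRedirect(
  ["cmd", ">", "out", "2>", "err"], 1);
const chainedMid = classifyRedirect(
  ["cmd", ">", "out", "2>", "err", "arg"], 1);

console.log(midCommand, trailing, chainedTrailing, chainedMid);
```

In this model only the "mid-command" case needs a new token. A "trailing" result corresponds to the existing redirected_statement path, which is why trailing redirects keep their tree shape.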
AMZN-hgoffin added a commit to AMZN-hgoffin/tree-sitter-bash that referenced this pull request Apr 7, 2026

Redirects between arguments (e.g. grep 2>/dev/null -q pattern file) previously caused the parser to misattach subsequent arguments as redirect destinations, because file_redirect uses repeat1 for its destination field.

This patch uses the external scanner to peek ahead past redirect operators and their destinations. If more non-redirect words follow, the scanner emits a mid-command token that keeps the redirect inside command's repeat. If no words follow, the redirect is trailing and uses redirected_statement as before.

Covers all redirect operators: > >> < 2> 2>&1 >& <& >| &> &>> >&- <&-. Handles chained redirects, process substitution destinations, herestrings, and close-fd operators.

Also adds redirect support inside [ ] test brackets (e.g. [ -f file 2>/dev/null ]).

The mid-command redirect rules are aliased to file_redirect, so node-types.json is unchanged. Trailing redirects still produce redirected_statement exactly as before.

The one known tree shape change is for trailing redirects inside backtick command substitutions (e.g. echo `cmd >file` arg), where the redirect moves from redirected_statement to a direct child of command. Backticks use the same symbol to open and close, making it impractical for the scanner's lookahead to determine whether a backtick terminates the current context. This is safe for consumers because command already accepts redirect children for the pre-name case (e.g. 2>&1 cmd), so all consumers must already handle redirects on command nodes.

Related: tree-sitter#233, tree-sitter#331

PR tree-sitter#331 takes a different approach: changing file_redirect to accept a single destination and adding argument fields to redirected_statement. That is a simpler grammar change but a breaking change to node-types.json that requires consumers to handle the new argument field. This patch preserves the existing tree structure for trailing redirects and only changes behavior for inputs that previously produced incorrect trees.
Summary

Fixes #233.

In bash, each redirect operator (`>`, `<`, `>>`, etc.) takes exactly one destination. The `file_redirect` rule previously used `repeat1($._literal)`, which greedily consumed all subsequent tokens as redirect destinations.

This caused commands like `grep 2>/dev/null -q "pattern" /etc/shells` to incorrectly parse `-q`, `"pattern"`, and `/etc/shells` as redirect destinations instead of command arguments.

Changes

- `file_redirect`: change `repeat1($._literal)` to `$._literal` (single destination)
- `redirected_statement`: allow `argument` fields after redirects so that tokens following a mid-command redirect are correctly preserved as arguments

Breaking changes

- `file_redirect.destination` is no longer `multiple: true` in `node-types.json`; it is now always a single node
- `redirected_statement` gains a new optional `argument` field for tokens that follow a mid-command redirect

Examples
Before (broken):
After (fixed):
Test plan
find /path -name "*.sql" 2>/dev/null -exec grep -l "pattern" {} \;[...]), pipes, compound statements, function definitions