Support one-to-many piping in the pipeline syntax #500

Open
@xiaq

This issue is spun off from #485, which asks for a way to pipe the stdout and the stderr of a command to different commands. That issue was closed because the functionality is now possible with the low-level run-parallel and pipe builtins, but no new syntax was introduced.

This issue discusses the possibility of extending the pipeline syntax to support such a pipeline configuration. Citing @mqudsi's comment, it is not easy to come up with an unambiguous syntax for this:

The primary issue with working on both is parsing intent. Presuming stream redirection operators like 1>| and 2>|, how the command

./foo 1>| bar 2>| bar2

is parsed is ambiguous. Is the stderr redirection meant to apply to ./foo or to the output of bar? This can only be solved by using brackets:

./foo 1>| bar 2>| bar2
./foo 1>| { bar 2>| bar2 }

A comment about this: I am not sure whether @mqudsi proposes that ./foo 1>| bar 2>| bar2 should mean "pipe stdout of foo to bar, and stderr of foo to bar2", but if that is the case, it is quite counter-intuitive. Traditional pipelines always work in a linear fashion, so it is tempting to interpret this as "pipe stdout of foo to bar, and stderr of bar to bar2".

The syntax for the pipeline should prioritize linear pipelines and make non-linear pipelines more explicit.


Traditionally, this functionality is implemented with process substitution:

foo > >(bar) 2> >(bar2)

However, process substitution relies on support for either the /dev/fd filesystem or named FIFOs. This requirement is unnecessary for redirections: named FIFOs or /dev/fd are indeed needed when a process substitution is used as a command argument, but when it is used in a redirection, the same functionality is entirely implementable with plain, unnamed pipes.

In fact, in Elvish it is already possible to do this, except that you have to manage the lifecycle of pipes manually:

pout = (pipe)
perr = (pipe)
run-parallel {
  foo > $pout 2> $perr
  pwclose $pout
  pwclose $perr
} {
  bar < $pout
  prclose $pout
} {
  bar2 < $perr
  prclose $perr
}

Note that in the first function passed to run-parallel, foo > $pout 2> $perr resembles the process substitution version. This is expected.
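
To illustrate that unnamed pipes are enough, here is a minimal sketch of the same wiring outside Elvish, in Python. It is not how Elvish implements anything; the function name fan_out and the three stand-in commands are made up for this example. The producer's stdout and stderr each get an anonymous pipe, and each consumer reads from one of them, with no /dev/fd paths or named FIFOs involved.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for foo, bar and bar2 from the discussion above.
PRODUCER = [sys.executable, "-c",
            "import sys; print('out'); print('err', file=sys.stderr)"]
BAR = [sys.executable, "-c",
       "import sys; sys.stdout.write('bar got: ' + sys.stdin.read())"]
BAR2 = [sys.executable, "-c",
        "import sys; sys.stdout.write('bar2 got: ' + sys.stdin.read())"]


def fan_out(producer, out_consumer, err_consumer):
    """Pipe producer's stdout to out_consumer and its stderr to err_consumer.

    Only anonymous pipes are used. Returns the bytes each consumer wrote
    to its own stdout.
    """
    foo = subprocess.Popen(producer,
                           stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    bar = subprocess.Popen(out_consumer,
                           stdin=foo.stdout, stdout=subprocess.PIPE)
    bar2 = subprocess.Popen(err_consumer,
                            stdin=foo.stderr, stdout=subprocess.PIPE)
    # The consumers now hold their own copies of the read ends, so the
    # parent closes its copies. This is the manual lifecycle management
    # that a dedicated syntax would hide.
    foo.stdout.close()
    foo.stderr.close()
    # Drain both consumers concurrently so neither can block the other.
    with ThreadPoolExecutor(max_workers=2) as pool:
        out_future = pool.submit(bar.communicate)
        err_future = pool.submit(bar2.communicate)
        out = out_future.result()[0]
        err = err_future.result()[0]
    foo.wait()
    return out, err


if __name__ == "__main__":
    out, err = fan_out(PRODUCER, BAR, BAR2)
    print(out.decode(), end="")
    print(err.decode(), end="")
```

As in the Elvish version, the fiddly part is closing the right pipe ends at the right time.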


Now for brainstorming a new syntax!

I think this is a bad idea, but a very intuitive syntax can look like this:

foo | bar
   2| bar2

I have shortened the 2>| proposed by @mqudsi to 2| for terseness. As with 2>, there must not be any space between 2 and |.

When you have longer pipelines, you will need to line them up:

foo | bar | quux
   2| bar2 # applies to stderr of foo
         2| quux2 # applies to stderr of bar

This syntax really takes whitespace-dependent syntax to the extreme. Again I don't think it's a good idea.
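
To make the whitespace dependence concrete, here is a rough Python sketch of how such input might be resolved. It assumes one possible rule, inferred from the example above and not specified anywhere: the | of each 2| must sit in the same column as the | that follows the command whose stderr is being piped. The function name parse_aligned and the edge tuples are hypothetical, and the splitting on | ignores quoting entirely.

```python
EXAMPLE = (
    "foo | bar | quux\n"
    "   2| bar2 # applies to stderr of foo\n"
    "         2| quux2 # applies to stderr of bar\n"
)


def parse_aligned(text):
    """Return (source, stream, target) edges for the column-aligned syntax.

    Assumed rule: the `|` of a `2|` line must be in the same column as
    the `|` that follows the source command on the first line.
    """
    lines = text.splitlines()
    first = lines[0]
    commands = [part.strip() for part in first.split("|")]
    # Column of each `|` on the first line, mapped to the command before it.
    bar_columns = [i for i, ch in enumerate(first) if ch == "|"]
    owner = {col: commands[n] for n, col in enumerate(bar_columns)}
    # The ordinary linear part of the pipeline.
    edges = [(commands[n], "stdout", commands[n + 1])
             for n in range(len(commands) - 1)]
    for line in lines[1:]:
        code = line.split("#", 1)[0].rstrip()  # drop trailing comment
        if not code.strip():
            continue
        marker = code.index("2|")  # raises ValueError if `2|` is absent
        column = marker + 1        # column of the `|` itself
        if code[:marker].strip() or column not in owner:
            raise ValueError("misaligned stderr pipe: %r" % line)
        edges.append((owner[column], "stderr", code[marker + 2:].strip()))
    return edges


if __name__ == "__main__":
    for edge in parse_aligned(EXAMPLE):
        print(edge)
```

Under this rule, moving a 2| by a single column either attaches it to nothing, which the sketch reports as an error, or to a different command, which is the fragility described above.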

Another idea is to support placing markers on commands in a pipeline, so that they can be referred to later on. Here I use ^name both as the marker and as the reference, but it's likely we will need separate syntax for them:

foo ^f | bar ^b | quux
   ^f 2| bar2
            ^b 2| quux2

The parser can work by looking beyond the pipeline on the first line: as long as a subsequent line starts with a marker, it attaches that line to the marked part of the pipeline.
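
A rough Python sketch of that lookahead, for illustration only: the function name parse_marked, the regular expressions, and the edge tuples are all hypothetical, and ^name is taken to be a trailing word on a command, as in the example above. Quoting and real command parsing are ignored.

```python
import re

EXAMPLE = (
    "foo ^f | bar ^b | quux\n"
    "   ^f 2| bar2\n"
    "            ^b 2| quux2\n"
)

# A trailing marker on a command in the first line, e.g. "foo ^f".
TRAILING_MARKER = re.compile(r"\s\^(\w+)$")
# A continuation line: optional indent, a reference, `2|`, then a command.
CONTINUATION = re.compile(r"^\s*\^(\w+)\s+2\|\s*(\S.*)$")


def parse_marked(text):
    """Return (source, stream, target) edges for the marker syntax."""
    lines = text.splitlines()
    commands, marked = [], {}
    for part in lines[0].split("|"):
        part = part.strip()
        match = TRAILING_MARKER.search(part)
        if match:
            part = part[:match.start()]
            marked[match.group(1)] = part
        commands.append(part)
    # The ordinary linear part of the pipeline.
    edges = [(commands[n], "stdout", commands[n + 1])
             for n in range(len(commands) - 1)]
    # Keep consuming lines only while they start with a marker reference.
    for line in lines[1:]:
        match = CONTINUATION.match(line)
        if match is None:
            break
        name, target = match.groups()
        if name not in marked:
            raise ValueError("unknown marker ^%s" % name)
        edges.append((marked[name], "stderr", target.strip()))
    return edges


if __name__ == "__main__":
    for edge in parse_marked(EXAMPLE):
        print(edge)
```

Because the reference names the source command explicitly, the indentation of the continuation lines carries no meaning in this sketch; stripping it yields the same edges.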
