
7. Shell

Claude Roux edited this page Feb 27, 2022 · 7 revisions

Shell


First of all, let's clarify what we mean by "shell". The shell is the basic command interpreter of an operating system. When you use cd, ls, or rm, you are using shell instructions. It's as simple as that...

Pipe

Those who developed the Unix shell did a remarkable job, even if one can sometimes question the choice of some command names, whose cryptic form is often hard for a neophyte to interpret. However, the main concept at the heart of Unix, the pipe, was a revolution. At a time when machines had very little memory, the ability to simply chain line-by-line processing steps suddenly made it possible to run, at scale, tasks that until then had required large, dedicated programs. Thus, as Brian Kernighan, who created the awk interpreter with Aho and Weinberger, recounts, one could now count the occurrences of particular words in a document too big to hold in memory, in a single line of code.
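As a minimal sketch of that kind of one-liner (the helper name count_word and the sample text are made up for illustration), each stage below only ever sees one line at a time:

```shell
# Count how many times a word occurs in the text read from stdin.
# tr puts every word on its own line; grep -c counts the lines that
# match the word exactly (-x). Neither stage loads the whole document.
count_word() {
    tr -cs '[:alnum:]' '\n' | grep -cx -- "$1"
}

# Illustrative input: "the" appears three times.
printf 'the cat and the hat\nthe end\n' | count_word the
```

The same pipeline works unchanged on a file of any size, for example with `count_word the < big.txt`, where big.txt stands for any text file.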

Awk

It is no coincidence that the names of Aho, Weinberger and Kernighan appear here (awk is the combination of their initials). Indeed, Unix not only introduced the notion of the pipe, it also generalized the use of regular expressions. And awk was designed to get the best out of these expressions.

Thanks to awk, users were finally able to act on the data returned by the operating system. awk introduced some fairly revolutionary concepts, such as dictionaries and fields extracted from the input line. However, awk has a rather unorthodox syntax, and rereading an old script can sometimes be a challenge. I don't want to get into any controversy about this: I have been an enthusiastic user of gawk for a long time, in pain, but happy to have had such a useful tool at hand.
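To make these two concepts concrete, here is a small awk sketch. The helper name count_by_owner and the three listing lines are invented for the example; $3 is the third field of the line and n is a dictionary keyed by that field.

```shell
# Count input lines per owner (third field of a long listing).
# n is an awk associative array: n[$3] is created on first use.
count_by_owner() {
    awk 'NF >= 3 { n[$3]++ } END { for (o in n) print o, n[o] }' | sort
}

# Made-up sample lines in the style of `ls -l` output.
printf '%s\n' \
    '-rw-r--r-- 1 roux staff 10244 Sep 26 16:42 .DS_Store' \
    '-rwxr-xr-x 1 roux staff 1715 Sep 30 09:18 Makefile' \
    '-rw-r--r-- 1 root wheel 120 Sep 23 14:02 notes.txt' |
    count_by_owner
```

The final sort is only there because awk does not guarantee the order in which `for (o in n)` visits the keys.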

And for a long time, I wondered if there could be an alternative to awk...

LispE

LispE is a home-made Lisp interpreter, written with the hope of implementing C++ code that is as elegant and readable as possible. It's hard for me to objectively quantify whether I achieved the desired result, but I think I produced some of the cleanest code of my entire career. If I had to pick only one measure, it would be a comparison with my earlier work, most of it being quite crude in terms of both implementation and design.

The advantage of producing your own interpreter is that you can do what you want with it... Really what you want...

For example, to use it as an alternative to awk.

Uh... Lisp?

I can already hear some amused comments: Really, you were complaining about the exotic syntax of awk and you want to use Lisp instead?

To that, I would answer only one thing... Ok! That's true... Let's say that Lisp is a bit saturated in parentheses.

But, the language is powerful, very flexible with a syntax that is both very simple and functional (in both senses of the word).

It takes almost nothing to turn it into a nice shell language.

Lisp as a shell language

LispE offers an environment where it is possible to edit and execute Lisp instructions directly. In particular, LispE offers an interactive interpreter in which you can, for example, test instructions or check the value of global variables at the end of an execution.

Moreover, this interactive interpreter allows the execution of instructions from the shell. To do so, all you need is to precede the instruction with a '!'.

◀▶1> !ls

It is even possible to store the result of this instruction in a Lisp variable. All you have to do is write: !var=instruction so that var receives the content of the execution of instruction:

◀▶1> !v=ls

In fact, this instruction is translated into an underlying Lisp instruction:

(setq v (command "ls"))

If you browse the history, you will certainly find this line. command is an instruction exposed by LispE which allows the execution of a shell command and returns the result of this command in the form of a list.

Command line

But LispE goes further. We have also added command line options that make it possible to integrate a call to LispE within a pipe sequence.

The '-a' option

This is the simplest option. It is used on the command line at the receiving end of a pipe.

ls -al | lispe -a

LispE reads what the pipe returns and initializes a list: _args, where each element corresponds to a line read on stdin.

LispE is then run as an interactive interpreter, which allows you to build the necessary code step by step, at your own pace.

ls -al | lispe -a

_args therefore contains:


("." ".." ".DS_Store" ".git" "Lispe" "Makefile" "Makefile.in" "README.md" "check" \
"checkrgx.py" "examples" "include" "macCopie.sh" "src" "template" "versionne.lisp" "")  

It is also possible to build a program that can examine the contents of this list and process it:

ls -al | lispe program.lisp -a

Note that if we want to edit the program in LispE's internal editor, the edit command must be placed afterwards:

ls -al | lispe -a -e program.lisp

The -p option (-pb/-pe)

This option, on the other hand, lets you truly place the interpreter within a pipe sequence.

Its syntax is very simple:

ls -al | lispe -p 'instructions'

The instructions are executed for each new line coming from the pipe, unlike -a which reads all lines in advance.
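The difference between the two reading strategies can be sketched with plain awk. This is an analogy under my own assumptions, not LispE code, and the helper names are hypothetical: the first function handles each line as it arrives, the way -p does, while the second stores every line first and only then works on the whole list, the way -a fills _args.

```shell
# Streaming: each line is processed as soon as it is read.
number_lines() {
    awk '{ print NR ": " $0 }'
}

# Slurping: keep every line, then process the complete list at the end.
# Reversing the input is only possible once all lines are known.
reverse_lines() {
    awk '{ line[NR] = $0 } END { for (i = NR; i >= 1; i--) print line[i] }'
}

printf 'a\nb\nc\n' | number_lines
printf 'a\nb\nc\n' | reverse_lines
```

Streaming keeps memory use constant, whereas slurping is needed whenever a result depends on lines that have not been read yet.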

What makes this option interesting is that it gives access to a set of special variables:

  • accu1, accu2, ..., accu9: nine predefined accumulators, initialized to 0 at startup
  • l0: the complete current line
  • l1, l2, l3, ...: variables that each correspond to one field of l0
  • ln: the number of fields found in l0
  • ll: a list composed of all the fields

The fields come from splitting the full line along spaces.
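For readers coming from awk, these variables have rough counterparts: l0 plays the role of $0, l1, l2, ... the role of $1, $2, ..., and ln the role of NF. The awk sketch below shows that analogy only (the helper name show_fields is hypothetical); whether LispE treats runs of consecutive spaces exactly as awk does is not stated here, so take the comparison as approximate.

```shell
# For each input line, print the number of fields (NF),
# the first field ($1) and the last field ($NF).
show_fields() {
    awk '{ print NF ": first=" $1 " last=" $NF }'
}

printf 'alpha beta gamma\none two\n' | show_fields
```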

ls -al | lispe -p 'l10'

# yields:

# .
# ..
# .DS_Store
# .git
# Lispe
# Makefile
# Makefile.in
# README.md
# check
# checkrgx.py
# ...

# Compute the cumulative size of the files in the directory
ls -al | lispe -p '(+= accu1 l6)'

544
800
11044
...
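The values displayed are running totals: each line shows the accumulator after the current size has been added. The same behaviour can be sketched in awk, as a comparison only. The helper name running_total is hypothetical, and the three sizes are sample values chosen so that they add up to the totals listed above.

```shell
# Add the first field of each line to an accumulator
# and print the accumulator after every line.
running_total() {
    awk '{ acc += $1; print acc }'
}

printf '544\n256\n10244\n' | running_total
```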

Note that fields containing only numbers become numeric values:

ls -al | lispe -p '(print (type l2) " " (type l3))'

# integer_ string_
# integer_ string_
# integer_ string_
# integer_ string_
# integer_ string_
...

The instructions are executed in sequence. Only the result of the last instruction is displayed:

ls -al | lispe -p '(type l2) (type l3)'

# Here only (type l3) is displayed, even though both instructions have been executed

# string_
# string_
# string_
...

You can use ll to iterate across all fields:

ls -al | lispe -p '(loop v ll (print (type v) " "))'

# string_ integer_ 
# string_ integer_ string_ string_ integer_ string_ string_ integer_ string_ string_ string_ string_ string_ integer_ string_ string_ string_ string_ integer_ string_ string_ string_ string_ integer_ string 
# string_ integer_ string_ string_ integer_ string_ string_ integer_ string_ string_ string_ string_ string_ integer_ string_ string_ string_ string_ integer_ string_ string_ string_ string_ integer_ string 
# string_ integer_ string_ string_ integer_ string_ string_ integer_ string_ string_ string_ string_ string_ integer_ string_ string_ string_ string_ integer_ string_ string_ string_ string_ integer_ string 
# string_ integer_ string_ string_ integer_ string_ string_ integer_ string_ string_ string_ string_ string_ integer_ string_ string_ string_ string_ integer_ string_ string_ string_ string_ integer_ string 
# string_ integer_ string_ string_ integer_ string_ string_ integer_ string_ string_ string_ string_ string_ integer_ string_ string_ string_ string_ integer_ string_ string_ string_ string_ integer_ string 

-pe/-pb

These options must be placed before -p.

  • -pb: executes code before -p is run.
  • -pe: executes code once -p has finished.

ls -al | lispe -pb '(setq s 0)' -pe '(println "Sum=" s)' -p '(+= s l5)'

This computes the total size of the files present in a directory. Here the size is read from the fifth field, l5, which is where a typical ls -al listing puts it.
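awk users will recognise this pattern: code that runs before the first line, code that runs for every line, and code that runs after the last line. As a hedged comparison (awk, not LispE; the helper name sum_sizes and the sample listing lines are invented), the same sum can be written with BEGIN and END blocks:

```shell
# BEGIN runs before any input, the middle block runs once per line,
# END runs after the last line has been read.
sum_sizes() {
    awk 'BEGIN { s = 0 } { s += $5 } END { print "Sum=" s }'
}

# Made-up listing lines whose fifth field is the file size.
printf '%s\n' \
    '-rw-r--r-- 1 roux staff 100 Sep 26 16:42 a.txt' \
    '-rw-r--r-- 1 roux staff 250 Sep 26 16:43 b.txt' |
    sum_sizes
```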

The -P option

This option works on the same principle as -p but instead of taking code, it uses a file containing a program.

IMPORTANT: this file must contain a runpipe function that serves as an entry point to the program.

ls -al | lispe -P program.lisp

and program.lisp must contain at least the runpipe function:

(defun runpipe()
    (println ll)
)

This function will be called for each new line coming from stdin.

Regular expression conditions: -r/-R

To complete the transformation into a shell language, we add the -r/-R options, which are used together with -p/-P.

These options check whether a regular expression matches the input line; the code given with -p/-P is executed only for the lines that match.

The difference between -r and -R lies in the kind of regular expression they expect: -r takes a regular expression in the usual notation, while -R takes one in LispE's internal notation:

# The following form filters out lines that do not contain at least 3 digits in sequence
ls -al | lispe -r "\d\d\d" -p "l0"

# which gives:

# -rw-r--r-- 1 roux NLE\Domain Users 10244 26 Sep 16:42 .DS_Store
# -rwxr-xr-x@ 1 roux NLE\Domain Users 1715 30 Sep 09:18 Makefile
# -rwxr-xr-x@ 1 roux NLE\Domain Users 3297 23 Sep 14:02 checkrgx.py

# Same thing but with an internal regular expression (see documentation)
ls -al | lispe -R "%d%d%d" -p "l0"
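This "condition, then action" pairing mirrors the pattern/action rules of awk. A minimal awk sketch of the same filter follows, for comparison only; the helper name three_digits and the sample lines are made up.

```shell
# Print only the lines that contain at least three digits in a row.
# The /.../ part is the condition, the { ... } part is the action.
three_digits() {
    awk '/[0-9][0-9][0-9]/ { print $0 }'
}

printf 'size 1715 Makefile\nsize 42 notes\nid 7 x\n' | three_digits
```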