An interpreter for the Piet programming language, written entirely in Rust.
Piet is a programming language in which programs look like abstract paintings. The language is named after Piet Mondrian, who pioneered the field of geometric abstract art.
A famous Hello, world! program by Thomas Schoch (link)
$ git clone 'https://github.com/your-diary/piet_programming_language'
$ cd piet_programming_language/
$ cargo run -- --help
$ cargo run -- <image file>
Or if you prefer Docker:
$ git clone 'https://github.com/your-diary/piet_programming_language'
$ cd piet_programming_language/
$ docker build -t piet_programming_language .
$ docker run --rm piet_programming_language --help
$ docker run --rm -it piet_programming_language --base64 "$(base64 ./path/to/image.png)"
MSRV (Minimum Supported Rust Version): Rust 1.87.0
$ cargo install --locked --git 'https://github.com/your-diary/piet_programming_language'
$ git clone 'https://github.com/your-diary/piet_programming_language'
$ cd piet_programming_language/
$ cargo install --locked --path .
$ piet_programming_language <image file>
$ piet_programming_language --help
Interpreter for Piet Programming Language
Usage: piet_programming_language [OPTIONS] <IMAGE_FILE>
Arguments:
  <IMAGE_FILE>
Options:
  -c, --codel-size <CODEL_SIZE>  Specifies the codel size (default: auto detect)
      --fall-back-to-white       Treats unknown colors as white instead of error
      --fall-back-to-black       Treats unknown colors as black instead of error
      --max-iter <MAX_ITER>      Terminates the program after this number of iterations
      --base64                   Interprets <IMAGE_FILE> as a base64-encoded image instead of a file path
  -v, --verbose                  Enables debug output (path trace etc.)
  -h, --help                     Print help
  -V, --version                  Print version
The official specification doesn't define Piet Programming Language very strictly: some behaviors are implementation-defined.
In this section, we explain how such behaviors are implemented.
Additional colours (such as orange, brown) may be used, though their effect is implementation-dependent. In the simplest case, non-standard colours are treated by the language interpreter as the same as white, so may be used freely wherever white is used. (Another possibility is that they are treated the same as black.)
By default, our implementation treats any unknown color as an error and terminates the interpreter before your program starts.
You can change this behavior with the --fall-back-to-white or --fall-back-to-black option. The former treats unknown colors as white, and the latter treats them as black.
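The three behaviors can be illustrated with a minimal sketch. The names Codel, Fallback, HUES and classify are hypothetical and are not taken from this repository; the table lists the 18 standard Piet colors plus white and black, and pixels are assumed to be packed as 0xRRGGBB.

```rust
/// Hypothetical representation of a classified codel (not the project's actual type).
#[derive(Debug, PartialEq, Clone, Copy)]
enum Codel {
    White,
    Black,
    /// hue: 0 = red .. 5 = magenta; lightness: 0 = light, 1 = normal, 2 = dark.
    Colored { hue: u8, lightness: u8 },
}

/// Hypothetical policy mirroring the default, --fall-back-to-white and --fall-back-to-black.
#[derive(Clone, Copy)]
enum Fallback {
    Error,
    White,
    Black,
}

/// The 18 standard Piet colors as 0xRRGGBB, one row per hue (light, normal, dark).
const HUES: [[u32; 3]; 6] = [
    [0xFFC0C0, 0xFF0000, 0xC00000], // red
    [0xFFFFC0, 0xFFFF00, 0xC0C000], // yellow
    [0xC0FFC0, 0x00FF00, 0x00C000], // green
    [0xC0FFFF, 0x00FFFF, 0x00C0C0], // cyan
    [0xC0C0FF, 0x0000FF, 0x0000C0], // blue
    [0xFFC0FF, 0xFF00FF, 0xC000C0], // magenta
];

/// Classifies one pixel; unknown colors are handled according to `fallback`.
fn classify(rgb: u32, fallback: Fallback) -> Result<Codel, String> {
    match rgb {
        0xFFFFFF => return Ok(Codel::White),
        0x000000 => return Ok(Codel::Black),
        _ => {}
    }
    for (hue, shades) in HUES.iter().enumerate() {
        if let Some(lightness) = shades.iter().position(|&c| c == rgb) {
            return Ok(Codel::Colored { hue: hue as u8, lightness: lightness as u8 });
        }
    }
    // Not a standard color: apply the chosen policy.
    match fallback {
        Fallback::Error => Err(format!("unknown color: #{rgb:06X}")),
        Fallback::White => Ok(Codel::White),
        Fallback::Black => Ok(Codel::Black),
    }
}
```

With this sketch, an orange pixel such as 0xFFA500 yields an error under Fallback::Error and a white or black codel under the other two policies.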
Individual pixels of colour are significant in the language, so it is common for programs to be enlarged for viewing so that the details are easily visible. In such enlarged programs, the term "codel" is used to mean a block of colour equivalent to a single pixel of code, to avoid confusion with the actual pixels of the enlarged graphic, of which many may make up one codel.
The codel size is detected automatically or can be specified via the --codel-size option. Note that, generally speaking, the codel size cannot be uniquely determined: if a positive integer n is valid as a codel size, then any divisor of n is also valid. There are even programs whose behavior changes with the codel size (see Multi-Codel Size). When automatic detection is performed, the maximum valid n is used.
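As an illustration of what "the maximum valid n" means, here is a brute-force sketch. It assumes the image is given as a rectangular grid of packed color values; the function names are hypothetical and the actual detection code in this repository may work differently.

```rust
/// Returns true if `n` is a valid codel size: both dimensions are divisible
/// by `n` and every n x n block consists of a single color.
/// Assumes all rows of `pixels` have the same length.
fn is_valid_codel_size(pixels: &[Vec<u32>], n: usize) -> bool {
    let height = pixels.len();
    let width = pixels.first().map_or(0, |row| row.len());
    if n == 0 || height == 0 || width == 0 || height % n != 0 || width % n != 0 {
        return false;
    }
    (0..height).step_by(n).all(|top| {
        (0..width).step_by(n).all(|left| {
            let color = pixels[top][left];
            (top..top + n).all(|y| (left..left + n).all(|x| pixels[y][x] == color))
        })
    })
}

/// Returns the largest valid codel size, or None for an empty image.
fn detect_codel_size(pixels: &[Vec<u32>]) -> Option<usize> {
    let height = pixels.len();
    let width = pixels.first().map_or(0, |row| row.len());
    // Try candidates from largest to smallest; 1 is always valid for a non-empty image.
    (1..=height.min(width)).rev().find(|&n| is_valid_codel_size(pixels, n))
}
```

For a 4x4 image made of four uniform 2x2 blocks, both 1 and 2 are valid codel sizes, and the sketch picks 2.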
The stack is notionally infinitely deep, but implementations may elect to provide a finite maximum stack size. If a finite stack overflows, it should be treated as a runtime error, and handling this will be implementation dependent.
Our implementation uses Rust's Vec type as the stack and doesn't set an explicit limit on its size. The actual limit depends on your machine (e.g. the amount of RAM). What happens when the stack overflows is undefined.
The maximum size of integers is notionally infinite, though implementations may implement a finite maximum integer size. An integer overflow is a runtime error, and handling this will be implementation dependent.
We use Rust's isize type to handle integers. An isize is pointer-sized, i.e. normally 32 or 64 bits wide. What happens when an overflow occurs is undefined (we naively use Rust's built-in integer operations).
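For context, Rust's built-in arithmetic operators on isize panic on overflow in debug builds and wrap around in release builds by default. The sketch below shows how an interpreter could detect overflow explicitly with checked_add and ignore the command; this is only an illustration under that assumption and is not what this implementation does. The function name add_checked is hypothetical.

```rust
/// Hypothetical `add` command that is ignored (stack left intact) when
/// the sum of the top two entries would overflow an isize.
fn add_checked(stack: &mut Vec<isize>) {
    let len = stack.len();
    if len < 2 {
        return; // not enough operands: ignore the command
    }
    // checked_add returns None instead of panicking or wrapping on overflow.
    if let Some(sum) = stack[len - 2].checked_add(stack[len - 1]) {
        stack.truncate(len - 2);
        stack.push(sum);
    }
}
```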
If a divide by zero occurs, it is handled as an implementation-dependent error, though simply ignoring the command is recommended.
We follow the recommendation.
If the top value is zero, this is a divide by zero error, which is handled as an implementation-dependent error, though simply ignoring the command is recommended.
We follow the recommendation.
If a roll is greater than an implementation-dependent maximum stack depth, it is handled as an implementation-dependent error, though simply ignoring the command is recommended.
As noted earlier, there is no explicit implementation-dependent maximum stack depth.
In practice, when the depth (i.e. the second-from-top entry of the stack) is larger than len(stack) - 2 (i.e. the size of the stack minus the two popped entries), the command is simply ignored, in accordance with the following rule from the spec:
Any operations which cannot be performed (such as popping values when not enough are on the stack) are simply ignored, and processing continues with the next command.
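The depth check can be sketched as follows. This is a minimal illustration, not the code from this repository: the function name roll and the use of rotate_right are assumptions. Negative depths are also ignored here, which the spec treats as an error. Reducing the number of rolls modulo the depth keeps the cost proportional to the depth, however large the roll count is.

```rust
/// Hypothetical `roll` command. The top entry is the number of rolls and the
/// second-from-top entry is the depth. Invalid input leaves the stack intact.
fn roll(stack: &mut Vec<isize>) {
    let len = stack.len();
    if len < 2 {
        return; // not enough operands: ignore the command
    }
    let rolls = stack[len - 1];
    let depth = stack[len - 2];
    // Validate before popping so that an ignored command changes nothing.
    if depth < 0 || depth as usize > len - 2 {
        return;
    }
    stack.truncate(len - 2);
    let depth = depth as usize;
    if depth == 0 {
        return; // nothing to rotate
    }
    // rem_euclid maps negative roll counts to the equivalent forward rotation.
    let shift = rolls.rem_euclid(depth as isize) as usize;
    let start = stack.len() - depth;
    // One roll buries the top entry `depth` deep, i.e. rotates the slice right by one.
    stack[start..].rotate_right(shift);
}
```

For example, the stack [1, 2, 3] followed by depth 3 and roll count 1 becomes [3, 1, 2], while a depth of 5 on the same stack is ignored and nothing is popped.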
The spec is vague about how to determine number/character boundaries; it only states:
Reads a value from STDIN as either a number or character
Data values exist only as integers, though they may be read in or printed as Unicode character values with appropriate commands.
In practice, we assume that users who want to input a series of numbers separate them with whitespace (e.g. 1 -2 -3, 1\n-2\n-3, etc.), though writing numbers without whitespace between them is also theoretically possible (e.g. 123-5-6 may be interpreted as [123, -5, -6], which is how std::cin works in C++).
On the other hand, users who want to input a string typically do not want whitespace to be ignored (e.g. Hello, world!\n should be read as is), though skipping whitespace is also theoretically possible (e.g. a b c may be interpreted as ['a', 'b', 'c'], which is how std::cin works in C++).
To support both use-cases, we adopt the following implementation:
- in(char) reads the next Unicode character literally, including whitespace.
- in(number) reads the longest match of the (pseudo) regex [ \t\n]*<non_blank_word>[ \t]*\n?, where <non_blank_word> is defined as a sequence of non-whitespace characters.
  - Trailing whitespace is also consumed so that in(number) followed by in(char) reads 100 and h respectively from the stdin "100 hello".
  - The consumption stops at the first newline for two reasons:
    - Users may want to read integers on one line and then read the next line as is, as a string (including whitespace).
    - If we didn't stop at the first newline, the command would wait indefinitely until it reached EOF or a non-whitespace character, which is especially problematic when stdin is in canonical mode.
  - For example,
    - if stdin contains "\n\n 123 hello", then in(number) consumes "\n\n 123 " (including the trailing space) and leaves "hello";
    - if stdin contains "-5 \n hello", then in(number) consumes "-5 \n" and leaves " hello".
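The pseudo regex for in(number) can be hand-rolled as below. This is a sketch under assumptions: it operates on an in-memory string rather than on a real stdin stream, it treats only space, tab and newline as whitespace, and the names split_number_input and parse_number are hypothetical.

```rust
/// Whitespace as used by the pseudo regex: space, tab or newline.
fn is_blank(b: u8) -> bool {
    matches!(b, b' ' | b'\t' | b'\n')
}

/// Splits `input` into (consumed, rest) following
/// `[ \t\n]*<non_blank_word>[ \t]*\n?`.
fn split_number_input(input: &str) -> (&str, &str) {
    let bytes = input.as_bytes();
    let mut i = 0;
    // Leading whitespace, including newlines.
    while i < bytes.len() && is_blank(bytes[i]) {
        i += 1;
    }
    // The word itself: a run of non-whitespace bytes.
    while i < bytes.len() && !is_blank(bytes[i]) {
        i += 1;
    }
    // Trailing spaces and tabs.
    while i < bytes.len() && matches!(bytes[i], b' ' | b'\t') {
        i += 1;
    }
    // At most one newline.
    if i < bytes.len() && bytes[i] == b'\n' {
        i += 1;
    }
    // `i` always lands next to an ASCII byte or at the end, so it is a char boundary.
    input.split_at(i)
}

/// Parses the consumed text as an integer; None if the word is not a number.
fn parse_number(consumed: &str) -> Option<isize> {
    consumed.trim().parse().ok()
}
```

Note that the input is consumed even when parse_number returns None, which matches the behavior described for invalid input to in(number).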
When out(char) is executed and the top entry of the stack is outside the range [0, char::MAX] (i.e. when it isn't a valid Unicode character), the command is simply ignored, in accordance with the following rule from the spec:
Any operations which cannot be performed (such as popping values when not enough are on the stack) are simply ignored, and processing continues with the next command.
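A minimal sketch of this check using the standard library follows. The function names are hypothetical. Note that char::from_u32 also rejects surrogate code points (0xD800 to 0xDFFF); whether this implementation treats them the same way is an assumption.

```rust
/// Converts a stack value to a char, or None if it is not a valid Unicode scalar value.
fn to_output_char(value: isize) -> Option<char> {
    // Negative values fail the u32 conversion; out-of-range values fail from_u32.
    u32::try_from(value).ok().and_then(char::from_u32)
}

/// Hypothetical out(char): appends the character to `out`, or does nothing
/// (leaving the stack intact) when the top entry is not a valid character.
fn out_char(stack: &mut Vec<isize>, out: &mut String) {
    // Peek first so that an ignored command does not pop anything.
    let Some(&top) = stack.last() else {
        return;
    };
    if let Some(c) = to_output_char(top) {
        stack.pop();
        out.push(c);
    }
}
```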
Some important implementation details:
- The complexity of the roll command depends only on the depth (i.e. O(depth)). Even if the number of rolls (i.e. the top entry of the stack) is large, the command runs quickly. The same applies to the pointer and switch commands, both of which take O(1).
- All of the commands are implemented as atomic operations; when a command is ignored according to the rule "Any operations which cannot be performed (such as popping values when not enough are on the stack) are simply ignored, and processing continues with the next command.", the stack is kept intact.
  Examples:
  - When the divide command tries to pop two entries from the stack but the size of the stack is one, no entry is popped.
  - When the top entry of the stack is 0, the divide command is ignored as a division by zero and no entry is popped.
  One exception is that the in command consumes stdin even if the read value was invalid (e.g. an invalid string for the in(number) command).
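The atomic behavior of divide can be sketched as follows. This is an illustration, not the code from this repository: the function name is hypothetical, the quotient is truncated toward zero as Rust's integer division does, and an overflowing division (isize::MIN / -1) is also ignored here, which is an assumption of this sketch.

```rust
/// Hypothetical `divide` command: replaces the top two entries with
/// second / top, or leaves the stack intact if that is not possible.
fn divide(stack: &mut Vec<isize>) {
    let len = stack.len();
    if len < 2 {
        return; // not enough operands: nothing is popped
    }
    let top = stack[len - 1];
    let second = stack[len - 2];
    // checked_div returns None for division by zero (and for overflow),
    // so validation happens before the stack is modified.
    let Some(quotient) = second.checked_div(top) else {
        return;
    };
    stack.truncate(len - 2);
    stack.push(quotient);
}
```

All checks run on peeked values, so the stack is only modified once the command is known to succeed.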
Many unit tests are written. The coverage is around 80%.
As integration tests, almost all of the samples exhibited in Piet Program Gallery are tested with some caveats:
- While the Rust code for the tests is included in this repository, the tested images are NOT included for copyright reasons.
- Some tests are marked #[ignore] (i.e. skipped) because they fail. As far as we have investigated, we suspect the cause is not that our implementation is incorrect but that some samples are no longer standard-compliant. In particular, the handling of white blocks was not specified in the first version of the spec and was clarified later, as seen in the latest spec. Our implementation conforms to the spec as of 2024/10/20.
This project follows Semantic Versioning.
$ cargo build --release # This is required because the integration tests use the release binary.
$ cargo test
$ cargo llvm-cov --open
$ cargo doc --open
$ cargo doc && fd | entr cargo doc