© 2019-2020 Arne Bachmann.
This project implements an interpreter for the stack-based programming language AWFUL, written in Python. AWFUL takes inspiration from many programming language designs and concepts, but is not really informed by them and combines them in the most awful way (e.g. use of terms may differ from existing languages).
The result is a fun little programming language and an usable interpreter that allows to write stack-based programs with a reduced set of keywords and language concepts. Be prepared for many wasted hours getting even simple things right. It's kind of awful.
The first example:
include console
"Hello world" println
Here is an example using extensive stack operations (considered actually bad practice in Awful):
# Fibonacci implementation
include stack
## Compute the Fibonacci numbers iteratively
def fib 1
dup is-neg if (
error "fib requires a positive interger"
)
dup 1 le if break # for 0 and 1, just return this number
def fib' 3 # q p n
dup 1 le # q p n n<=1
if break pop2 # q
3tor over # n q p q
+ # n q p+q
swapover # p+q q n
1 - # p+q q n-1
fib' # tail recursion
end 1
1 0 # n 1 0
rot3 # 1 0 n
fib' # start the loop
end 1
## Test cases
assert -1 "ERROR" from -1 fib end
assert 0 from 0 fib end
assert 1 from 1 fib end
assert 8 from 6 fib end
assert 6765 from 20 fib end
assert 222232244629420445529739893461909967206666939096499764990979600 from 300 fib end
AWFUL's features include:
- bootstrap language definition - parts of the language (i.e.
for
,gt
,rot
) are built up from a few primitives likenib
) - tail recursion optimization (no callstack limitation due to recursion)
- hierarchical unified type system (internal data type representations are built on top of each other)
- nested variable structures (
a
,a.b
) - namespaces defined by function invocation, not per source definition
- built-in testing and invariants via
assert
statement - stack safety checks
- function references via
&
and variables - extensive internal machine logging via
--debug
,--warn
,--calls
,--names
,--tokens
, and keywordsdebug on/off/this
- nested
include
- source files are valid Markdown documents!
- general appearance and terminology mostly inspired by Python
- there are no lexical scopes - all included code lines are interpreted as if they were one big file :-/
Either install via
pip[3] install awful
, or clone from Github and install viapython[3] setup.py build && pip[3] install -e .
Run via
awful arguments-and-or-options
Runningawful --help
will display the usage information:
AWFUL V0.5 (C) 2019-2020 Arne Bachmann
python[3] [-O] awfl.py [file1.awfl [file2.awfl [...]]] [<options>]
-O Remove Python assert statements from runtime
--optimize Remove Awful comments and asserts
--decimals <n> Set decimal computation precision to n digits (default: 1000)
--run "token s" Run given AWFUL commands and quit
--repl Start interactive AWFUL shell after processing given files
--help Show this interpreter options
Debugging:
--warn Show runtime warnings to find code problems
--debug Show parser info and enable live debugger
--calls Display call hierarchy
--names Display all words in all namespaces
--tokens Display tokens at parsetime
--fifo Display last 15 tokens and stacks in case of error
--stats Show interpreter run statistics before interpreter shutdown
--interactive In case of test case failure, drop into a python debug shell
--test Run test cases
Keywords and symbols:
Values: nil False True "
Grouping: ( ) [ ![ ]
Maths: + - * //
Stack: nib bin pop dup
Functions: def dyndef end & alias apply
Logic: not if eq le ge as up rm
Structures: . pull push pushi
Files: open create close read write
Control: break error
Testing: assert from debug ignore on off
- token: a string of characters extracted from the source file, separated by whitespace from adjacent tokens (exceptions: quoted string parsing and comments, which after parsing constitute one token even with spaces included)
- data structure: the internal representation of any data, implemented solely as Python dictionaries
- symbol: a token that is used to specify a data structure key, either an integer number or a lower-case string.
Symbols are usually represented starting with a colon, e.g.
:num
- variable: a data structure stored in a namespace under a certain key. Invoking the key puts (a copy of) the variable's contents on the stack; function variables put a function reference on the stack
- TOS: top of stack (top-most element)
- NOS: next of stack (second element from top)
- token parsing errors (syntax, grammar) and runtime violations (function stack invariants and type checks) are issued by two separate exceptions
- the internal data representation for all AWFUL data types are, for symmetry's sake, always dictionaries with keys as static Python strings (symbols from the token stream, or stringified integers), and mapped values as integers.
For built-in types and standard library types, the 'type' points to a numeric type identifier
nil
is a singleton data type with no other map keys excepttype
being set to0
- lists are implemented as Python
dict
s with the key being the list index 0..n-1- strings are implemented as lists of charset codepoints
- numbers are implemented as serialized strings from Python data types
int
<fraction
<decimal
- Boolean values with the symbols
True
andFalse
are implemented having a numeric value of -1 and 0, respecitvely
- Boolean values with the symbols
- numbers are implemented as serialized strings from Python data types
- strings are implemented as lists of charset codepoints
- since many interpreter operations on the stack's data structure are in-place, the runtime usually makes deep copies of data to avoid reference problems
- each namespace (or vocabulary in Forth lingo) constitutes a set of local words that can be looked up (a word points to either a function or a data structure)
- nested namespaces are implemented as a stack of Python maps that can be accessed from top (local) to bottom (global)
- each function call creates a fresh local namespace at runtime; the outer namespaces can explicitly be referenced by dot prefixes (one dot per parent) - this usually requires the user to know how many parent hops are needed to modify a variable in the outer scopes and makes composition hard
- when not using the dot notation, symbols are automatically looked up in parent namespaces up to the global one, otherwise a runtime exception interrupts the program
- redefine a function by using a
dyndef
in a deeper namespace
- there is one global stack that facilitates data transfer between all consecutive operations
- basic stack operations supported in AWFUL:
dup
: duplicate (deep-copy) TOSpop
: remove TOSn
nib
: remove n-th topmost element and push it onto TOS (0 = keep TOS, 1 = swap with TOS, 2 = rotate 3 elements)n
bin
: remove TOS and insert it down n positions (0 = keep at TOS, 1 = swap with NOS, 2 = reverse rotate 3 elements)
- variable names are stored as words in the current namespace and may optionally contain nested sub-entries via dot notation
- variable definition:
value
as name[.sub1[.sub2[...]]]>
store value undername
in current namespace - variable update:
value
up name
update value ofname
in current namespace - variable destruction:
rm name
. Be careful: removes variable from the current runtime namespace, not necessarily the one the variable was defined in!
- to manipulate data in more complex data types, the following operations must be used:
data :symbol
pull
extract the substructuresymbol
(or an integer index) from the datum and pushes on top of the stackdata substructure :symbol
push
add or update the substructure in the datum, consuming it from the stackdata substructure :symbol
pushi
add or update a direct integer value under the symbol key, consuming it from the stack (this is used to manipulate code points or internal metadata like:type
and:num
directly)
-
function invocation:
arguments
name
call a function that consumes zero or more input arguments from the stack and puts zero or more output values onto the stack -
function reference creation:
&name
puts a reference to the function on the stack -
function reference invocation:
arguments reference
apply
call a function from a function reference in TOS -
global function definition:
def name n-input-args body end m-output-args
inputs
andoutputs
are integers that define the expected number of elements on the stack before and after the function call- the function body will be evaluated on each call, including potentially contained function definitions and asserts
-
local function definition
dyndef name n-input-args body end m-output-args
, similar to a global function definition, but is precompiled and only available in the inner scope. Allows redefining already occupied function names
-
Originally nested variables, including numeric indexes, could be accessed directly via dot notation. This, however, required the AWFUL code to compute the variable sub-path keys via concatentation. Since concatenation was to be programmed in AWFUL itself, there was a deadlock situation between interpreter and language bootstrapping.
The solution to that problem was the introduction of the
push[i]
andpull
commands that combine or extract sub-structures. This way we can still implement the list and string concatenation functions in AWFUL and retain the sub-structures feature in the language with very little specialized code.
- When mixing stack with reference variables, prefer variables retrieval over stack duplication for readability
- Loop functions usually have the form of
- variable processing (from stack)
- step function definition
- loop invocation
- postprocessing
- Naming recommendations
- Name variables using the function name as a prefix (e.g.
for
implememtation has an internal variablefor-step
). This makes it easier to detect and avoid naming clashes - Use the single quote suffix for single naming clash avoidance
- Name variables using the function name as a prefix (e.g.
- standard
- selftests
- basicmaths
- basics
- lists
- loops
- maps
- sets
- stack
- types
- maths
- basicmaths
- lists
- stack
- strings
- types
- io
- console
- files
- streams
- strings
- system
Please note that many library implementations favor the use of other existing library functions over careful algorithmic design, often leading to very disadvantageous runtime behavior, which is just awful.
- Insert print statements like
stdout "Bla" printfn
- Insert
debug this
for stack prints- Or put
:X debug this pop
into the source to mark the current code location withX
- Or put
- Use
debug on
anddebug off
to enable detailed output for a passage of code - Enable loggers via
--warn
and--debug
- Use
--calls
,--names
,--tokens
- Use
--fifo
for a number of last tokens and stacks - Use
--interactive
to drop into the Python debugger
- allow group opening
(
on different line than precedingif
- allow empty function bodies
- document all keywords and the language basics
- option to run assert statements only once (and not on every call)
- remove assert from token string after execution?
- add test vs. assert
- add exception catching: e.g. try block catch block end
- optimization: refactor literal lists and functions into one big static table
- function invariant checking should be done in the interpreter, not in Python functions
- add evaluated lists allowing top use variables as content (currently only literals allowed)
- leave file and line number in items for better debugging
- add polymorphism
- remove all EOLs from item string
- allow more special character escaping (\n\r\t \0\xXX\uXXXX\UXXXXXXXX) in literal strings
- use colored syntax in REPL
- implement full Awful grammar using some kind of parser library
- add multi threading capabilities
- interpret
alias
only once - allow to implement commands using python
eval
for certain tasks
- -2: open file handle
- -1: codepoint
- 0:
nil
- 1: bool - element
0
is -1 (true) or 0 (false) - 2: number - string representation of a Python number (int, float, decimal, fraction)
- 3: string -
:num
marks number of codepoints,0
..num-1
are integer codepoints - 4: list
- 5: function (reference) - string representation of a function name
- 6: maps - cf. [libs/maps.awfl]
- 7: sets - cf. [libs/sets.awfl]
The interpreter's performance is - as expected - not just horrible, but really awful. For the Fibonacci code, a slowdown factor of roughly 1000 compared to Python is observed. Maps are implemented as lists implemented as nested dictionaries...
You can skip running all self-tests, pre- and post-conditions and invariant checking by providing -O
to the Python interpreter (this disables all Python asserts).
Using the command line option --optimize
will skip all AWFUL assert
statements as well.
> time awful examples/fibonacci.awfl
user 0m33,283s
> time python3 -OO awful/awfl.py examples/fibonacci.awfl
user 0m32,258s
> time pypy3 awful/awfl.py examples/fibonacci.awfl
user 0m12,697s
> time python3 examples/fibonacci.py
user 0m0,025s
> time pypy3 fibonacci.py
user 0m0,059s