Skip to content

hazohelet/shirp

Repository files navigation

Shirp - micro scheme interpreter

Shirp is a tree-walk interpreter of a tiny subset of Scheme programming language. I use the R7RS-small documentation as a reference, and the behavior of the built-in functions mimics that of Gauche, an existing R7RS implementation.

Usage

How to build Shirp

Running make in the root directory will build the interpreter executable named shirp. You can start REPL by typing ./shirp. Shirp REPL prints >> to prompt user input. If the REPL receives incomplete user input(has unclosed parentheses), it prints .. to prompt further input.

If you want to run a script, pass the file path to the executable shirp, for example, ./shirp test.scm. This command will run the script and start the REPL with the environment generated by the script interpretation. You can also run a script by calling load function in the REPL, for example, by calling (load "test.scm").

Note that you need to run the interpreter in the directory where prelude.scm is located. prelude.scm is a library file that defines some of the built-in procedures.

Execution Examples & Tests

You can find many examples of Shirp executing Scheme programs in the test directory, namely in test/expr.c, test/kadai1.c, test/kadai2.c, test/quote.c. You can run these tests by typing make test in the root directory.

Here are some selected examples of Shirp REPL executions.

>> (define a 42)
>> (define b (+ 10 1))
>> a
42
>> b
11
>> (if (< a b) 10 20)
20
>> (define g (lambda (a b) (+ a b)))
>> (g 3 5)
8
>> (define l (cons 1 (cons 2 (cons 3 (list)))))
>> l
(1 2 3)
>> (cadr l)
2
>> (define l (cons 4 (cons 5 '())))
>> l
(4 5)
>> (define l '(+ (+ 1 2) (- 7   8)))
>> l
(+ (+ 1 2) (- 7 8))
>> (define (fact n) (if (< n 2) 1 (* n (fact (- n 1)))))
>> (fact 5)
120
>> (define (fact-rec n acc) (if (< n 2) acc (fact-rec (- n 1) (* n acc)))
.. )
>> (fact-rec 5 1)
120
>> (cond (else 42))
42
>> (cond (#f 2) (#f))
#<undef>
>> (cond (4) (7) (else 88))
4
>> (cond (else  1 23 4 56 7))
7
>> (= 42 42 42 42 42)
#t
>> (= 42 42 42 -42 42)
#f
>> (< 1 2 3 4 5)
#t
>> (< 2 1 3 5 4)
#f
>> (> 1 2 3 4 5)
#f
>> (> 5 4 3 2 1)
#t
>> (>= 5 4 4 3 3 2 1)
#t
>> (even? 9)
#f
>> (odd? -11)
#t
>> (cadadr (cadadr (cadadr '(1 (2 (4 (5 (6 (7 3)))))))))
3
>> (reverse '(1 2 (3 4) 5 (6 7 (8 9) 10) 11))
(11 (6 7 (8 9) 10) 5 (3 4) 2 1)
>> (define lst1 (list 1 2 3))
>> (define lst2 (list 1 2 3))
>> (eq? lst1 lst2)
#f
>> (equal? lst1 lst2)
#t
>> (define a 100)
>> (define h (lambda (x) (lambda (y) (+ x (* a y)))))
>> (let ((a 0)) ((h 1) (+ a 5)))
501
>> (define f (+ h 7))
error: argument expected to be a number, but got `closure`
   (define f (+ h 7))
                ^
>> unknowon_func
error: undefined variable: unknowon_func
   unknowon_func
   ^~~~~~~~~~~~~

Implemented features & Built-in procedures

As shown in the examples above, Shirp supports the following fundamental features.

Lexical Elements

<identifier> can contain letters, digits, and special characters !$%&*+-./:<=>?@^_~.

<number> is <integer> or <floating>.

<integer> is a sequence of digits with an optional prefix character + or -.

<floating> is a sequence of digits with a single . and an optional prefix character + or -.

<string> is a sequence of characters enclosed in double quotes ". No escaping sequences are implemented.

<boolean> is #t, #true, #f or #false.

All integers are represented as int64_t, and all floating numbers as double. (eq? -.0 .0) evaluates to #t.

Syntax

Shirp is intended to parse the following Extended Backus-Naur Form (EBNF) grammar.

<program> := <command or definition>+
<command or definition> := <command> | <definition>
<command> := <expression>
<definition> := (define <identifier> <expression>) | (define (<identifier> <def formals>) <body>)
<def formals> := <identifier>* | <identifier>* . <identifier>

<expression> :=   <identifier> | <literal> | <procedure call> | <lambda expression>
                | <conditional> | <assignment> | <derived expression>

<literal> := <quotation> | <self-evaluating>
<self-evaluating> := <boolean> | <number> | <string>
<quotation> := '<datum> | (quote <datum>)
<procedure call> := (<operator> <operand>*)
<operator> := <expression>
<operand> := <expression>
<lambda expression> := (lambda <formals> <body>)
<formals> := (<identifier>*) | (<identifier>+ . <identifier>)
<body> := <definition>* <sequence>
<sequence> := <command>* <expression>
<conditional> := (if <test> <consequent> <alternate>)
<test> := <expression>
<consequent> := <expression>
<alternate> := <expression> | <empty>

<assignment> := (set! <identifier> <expression>)

<derived expression> :=   (cond <cond clause>+) | (cond <cond clause>* (else <sequence>))
                        | (and <test>*) | (or <test>*) | (let (<binding spec>*) <body>)
<cond clause> := (<test> <sequence>) | (<test>)
<binding spec> := (<identifier> <expression>)

<datum> := <simple datum> | <compound datum>
<simple datum> := <boolean> | <number> | <string> | <symbol>
<symbol> := <identifier>
<compound datum> := <list>
<list> := (<datum>*) | (<datum>+ . <datum>)

However, Shirp cannot evaluate <def formals> -> <identifier>* . <identifier> and <formals> -> (<identifier>+ . <identifier>) correctly as of now.

Proper Tail Recursion

Shirp represents the Scheme Environment as a stack of Frames. Each Frame is a hash table that maps identifiers to values.

Typically, when a procedure is called, a new Frame is pushed to the stack to save the local variables. Then, when the procedure returns, the Frame is popped from the stack to restore the previous environment.

Shirp supports proper tail recursion, which means that the interpreter does not push a new frame on the environment for procedure calls in tail contexts.

According to the R7RS-small documentation,

The last expression within the body of a lambda expression, shown as below, occurs in a tail context.

(lambda <formals> <definition>* <expression>* <tail expression>)

As is defined in the R7RS-small documentation, Shirp supports the following tail contexts.

(if <expression> <tail expression> <tail expression>)
(if <expression> <tail expression>)

(cond <cond clause>*)
(cond <cond clause>* (else <tail sequence>))

(and <expression>* <tail expression>)
(or <expression>* <tail expression>)
(let <binding spec>* <tail body>)

<cond clause> := (<test> <tail sequence>)
<tail body> := <definition>* <tail sequence>
<tail sequence> := <expression>* <tail expression>

Tail context search is implemented as a depth-first search in the AST(Abstract Syntax Tree).

Built-in procedures

  • + - * / = < > <= >= div remainder car cdr null? pair? list? symbol? number? list cons and or eq? equal? sqrt load even? odd? abs cadr caar cddr cadadr caaddr gcd reverse
  • / and div works differently from the Scheme standard for integers.

  • eq? internal does not conform to the Scheme standard.

Garbage Collection

A very simple mark-and-sweep garbage collector is implemented in Shirp. Currently, it collects objects every time the interpreter finishes evaluating user REPL input. Thus, when the user gives many inputs to the REPL, Shirp takes a long time to respond to every user input. There is much room for improvement.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published