Abstract Syntax Trees

The first thing I did to kick off the interpreter project was to determine its scope. I decided that I would only deal with integers, symbols, functions, and lists - a bare minimum required to make something that resembles Lisp. At this point I needed to convert these concepts into code Haskell would understand - exactly the same thing I'd do with any other language. Immediately I was rewarded. I was able to create a type that supported a required data structure in only four lines of Haskell code! The definition was encoded into what Haskell calls an algebraic data type and looked like this:

data Expr = BlaiseInt Integer |
            BlaiseSymbol String |
            BlaiseFn ([Expr]->Expr) |
            BlaiseList [Expr]

Algebraic data types are tricky beasts. Because they're very different from everything people with imperative programming backgrounds are used to, they tend to cause much confusion for a long time until intuitive understanding finally develops. Above code simply states than an expression may be and integer, a symbol, a function, or a list, and each version contains a single member field, respectively an integer, a string, a function that takes a list of expressions and returns an expression, and a list of expressions. The closest (very lossy) translation to Java that I can think of is this:

abstract class Expr {}

final class BlaiseInt extends Expr {
    public int _value;
}

final class BlaiseSymbol extends Expr {
    public String _value;
}

interface BlaiseFunction {
    public Expr evaluateFunction(List arguments);
}

final class BlaiseFn extends Expr {
    BlaiseFunction _value;
}

final class BlaiseList extends Expr {
    List _value;
}

Notice how the Java version is already much longer than the Haskell one despite the fact that I took obvious shortcuts and avoided writing constructors and accessor methods. Additionally, as we continue developing the interpreter, extending the Java translation appropriately will become a very costly and verbose task. I will attempt to demonstrate what I mean at various points in the article. One example I can provide immediately is actually instantiating the data structure. For example, suppose I want to use it to create a list of three integers. In Haskell I can do this:

myList = BlaiseList [BlaiseInt 1, BlaiseInt 2, BlaiseInt 3]

A Java alternative would look like this (we'll even give it an advantage, assume BlaiseList and BlaiseInt have appropriate constructors):

BlaiseList myList = new BlaiseList();
myList._value.add(new BlaiseInt(1));
myList._value.add(new BlaiseInt(2));
myList._value.add(new BlaiseInt(3));

While we could (unnecessarily) transform the above Haskell code to take up four lines, the clarity is already starting to speak for itself: you can do more with Haskell in less code that's easier to read and maintain in the long run.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Abstract Syntax Trees

Clone this wiki locally