-
Notifications
You must be signed in to change notification settings - Fork 4
Abstract Syntax Trees
The first thing I did to kick off the interpreter project was to determine its scope. I decided that I would only deal with integers, symbols, functions, and lists - a bare minimum required to make something that resembles Lisp. At this point I needed to convert these concepts into code Haskell would understand - exactly the same thing I'd do with any other language. Immediately I was rewarded. I was able to create a type that supported a required data structure in only four lines of Haskell code! The definition was encoded into what Haskell calls an algebraic data type and looked like this:
data Expr = BlaiseInt Integer |
BlaiseSymbol String |
BlaiseFn ([Expr]->Expr) |
BlaiseList [Expr]
Algebraic data types are tricky beasts. Because they're very different from everything people with imperative programming backgrounds are used to, they tend to cause much confusion for a long time until intuitive understanding finally develops. Above code simply states than an expression may be and integer, a symbol, a function, or a list, and each version contains a single member field, respectively an integer, a string, a function that takes a list of expressions and returns an expression, and a list of expressions. The closest (very lossy) translation to Java that I can think of is this:
abstract class Expr {}
final class BlaiseInt extends Expr {
public int _value;
}
final class BlaiseSymbol extends Expr {
public String _value;
}
interface BlaiseFunction {
public Expr evaluateFunction(List arguments);
}
final class BlaiseFn extends Expr {
BlaiseFunction _value;
}
final class BlaiseList extends Expr {
List _value;
}
Notice how the Java version is already much longer than the Haskell one despite the fact that I took obvious shortcuts and avoided writing constructors and accessor methods. Additionally, as we continue developing the interpreter, extending the Java translation appropriately will become a very costly and verbose task. I will attempt to demonstrate what I mean at various points in the article. One example I can provide immediately is actually instantiating the data structure. For example, suppose I want to use it to create a list of three integers. In Haskell I can do this:
myList = BlaiseList [BlaiseInt 1, BlaiseInt 2, BlaiseInt 3]
A Java alternative would look like this (we'll even give it an advantage, assume BlaiseList and BlaiseInt have appropriate constructors):
BlaiseList myList = new BlaiseList();
myList._value.add(new BlaiseInt(1));
myList._value.add(new BlaiseInt(2));
myList._value.add(new BlaiseInt(3));
While we could (unnecessarily) transform the above Haskell code to take up four lines, the clarity is already starting to speak for itself: you can do more with Haskell in less code that's easier to read and maintain in the long run.