jescript aims to be an Erlang interpreter that runs on the JVM. It is in the very early stages of development.
Build status can be found on this CloudBees Jenkins instance.
Feb 2012
The main grammar is complete up to what is specified in the 4.7 spec. The preprocessor supports constant macros, but not yet macro functions, includes, or ifdefs.
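To make "constant macros" concrete, here is a minimal sketch of expanding `-define(NAME, Value).` definitions and `?NAME` uses. It is not jescript's preprocessor: the class name `ConstantMacroSketch` and the method `expand` are hypothetical, and it works on raw text lines with regular expressions, whereas a real preprocessor works on tokens (so this sketch would wrongly expand a `?NAME` inside a string or comment).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of constant-macro expansion; not jescript's actual code. */
final class ConstantMacroSketch {
    // A constant definition on a line of its own: -define(NAME, Replacement).
    // Function-like definitions such as -define(F(X), ...) do not match and pass through unchanged.
    private static final Pattern DEFINE = Pattern.compile(
        "\\s*-define\\(\\s*([A-Za-z_][A-Za-z0-9_@]*)\\s*,\\s*(.*)\\)\\.\\s*");
    // A macro use such as ?NAME.
    private static final Pattern USE = Pattern.compile("\\?([A-Za-z_][A-Za-z0-9_@]*)");

    /** Removes constant definitions and substitutes their bodies at each later use. */
    static String expand(String source) {
        Map<String, String> macros = new HashMap<>();
        List<String> output = new ArrayList<>();
        for (String line : source.split("\n", -1)) {
            Matcher def = DEFINE.matcher(line);
            if (def.matches()) {
                macros.put(def.group(1), def.group(2).trim());
                continue; // the definition itself produces no output line
            }
            Matcher use = USE.matcher(line);
            StringBuffer expanded = new StringBuffer();
            while (use.find()) {
                String body = macros.get(use.group(1));
                if (body == null) {
                    throw new IllegalArgumentException("undefined macro: " + use.group(1));
                }
                use.appendReplacement(expanded, Matcher.quoteReplacement(body));
            }
            use.appendTail(expanded);
            output.add(expanded.toString());
        }
        return String.join("\n", output);
    }

    public static void main(String[] args) {
        String source = "-define(TIMEOUT, 5000).\nwait() -> ?TIMEOUT.\n";
        System.out.print(expand(source)); // prints "wait() -> 5000." followed by a newline
    }
}
```

Macro functions, includes, and ifdefs need argument binding, file access, and conditional state respectively, which is why they are a bigger step than the plain substitution shown here.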
One of the initial things that drew me to Erlang was that I loved the simple grammar, which reminded me of ML. In ML, programs have mathematical interpretations -- the language has formal semantics. It turns out I was only partially right about Erlang. The main syntax of top-level forms and expressions is simple indeed. Unlike Java, where there are a million ways to declare a variable, things are well defined in Erlang. These productions were about half the size (~100 lines of SableCC) of a comparable set of productions for Java (~200 lines).
However, the Erlang grammar is still big, weighing in at ~250 lines of SableCC. One pain point is the slight differences between expressions, patterns, and guards. This required making three nearly identical copies of the relevant productions. Yuck. Another icky point is the reliance on a preprocessor. Macros are central to the language (e.g. declaring constants), so there's no avoiding a preprocessor. The preprocessor grammar is not complicated, but it is bulky. Because SableCC integrates lexer and parser generation (as do many compiler-compilers), lexing expressions need to be duplicated. And, of course, preprocessing by definition means that we're adding another pass over the code.
Apart from the bulk, the grammar is relatively straightforward. No endless stream of shift/reduce conflicts like when writing a Java grammar. The hardest productions to write were for list comprehensions, which, as it turns out, is a common problem. I settled for allowing an arbitrary expression for the generator and then catching errors later in the compiler. There may be a better solution, but I haven't found it yet.
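The "parse loosely, check later" approach for list comprehensions might look roughly like the sketch below. It assumes the loosely parsed part is the left side of `<-`, which the parser accepts as any expression and a later pass then requires to be a pattern. Everything here is hypothetical: the mini-AST (`Var`, `Int`, `Tuple`, `Call`, `Generator`) is not the SableCC-generated tree, and `isPattern` covers only a few node kinds.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical post-parse check for list comprehension generators; not jescript's actual code. */
final class GeneratorCheckSketch {
    // Tiny stand-in AST. A real compiler would walk the parser's generated node classes instead.
    interface Expr {}
    static final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
    }
    static final class Int implements Expr {
        final long value;
        Int(long value) { this.value = value; }
    }
    static final class Tuple implements Expr {
        final List<Expr> elements;
        Tuple(List<Expr> elements) { this.elements = elements; }
    }
    static final class Call implements Expr {
        final String function;
        final List<Expr> args;
        Call(String function, List<Expr> args) { this.function = function; this.args = args; }
    }
    /** Represents `lhs <- source`; the parser accepted lhs as an arbitrary expression. */
    static final class Generator {
        final Expr lhs;
        final Expr source;
        Generator(Expr lhs, Expr source) { this.lhs = lhs; this.source = source; }
    }

    /** Variables, integers, and tuples of patterns are patterns; function calls are not. */
    static boolean isPattern(Expr e) {
        if (e instanceof Var || e instanceof Int) return true;
        if (e instanceof Tuple) {
            for (Expr element : ((Tuple) e).elements) {
                if (!isPattern(element)) return false;
            }
            return true;
        }
        return false;
    }

    /** The deferred check: reject a generator whose left side is not a pattern. */
    static void check(Generator g) {
        if (!isPattern(g.lhs)) {
            throw new IllegalStateException("illegal pattern on the left of '<-'");
        }
    }

    public static void main(String[] args) {
        Expr list = new Var("L");
        // [X || {X, 1} <- L] has a valid pattern on the left.
        check(new Generator(new Tuple(Arrays.<Expr>asList(new Var("X"), new Int(1))), list));
        // [X || f(X) <- L] parses under the loose grammar but fails the later check.
        try {
            check(new Generator(new Call("f", Arrays.<Expr>asList(new Var("X"))), list));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The trade-off is that the error is reported by the compiler rather than the parser, so the message has to be written by hand instead of falling out of the grammar.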
Next steps are to type check and translate the AST to an IR. I'll probably build an AST interpreter just for giggles somewhere in there too.
Lexer:
- core_scan.erl in Erlang source (definitions for "char", "integer", "float", "atom", "string", and "var")
- Erlang tutorial
- Escape sequences, Erlang reference manual
Grammar:
- Erlang 4.7.3 Reference Manual DRAFT 0.7 (especially Section E.2, The Main Grammar)
- Erlang Reference Manual
- core_parse.yrl in Erlang source
- Comments on mailing list from Joe Armstrong
- LL grammar written for Emacs integration
Intermediate Representation:
- An introduction to Core Erlang
- Core Erlang 1.0.3 language specification
- Desugaring list comprehensions and pattern matching (lecture)
- v3_core.erl, which translates from normal Erlang to Core Erlang