Content
This is a course design for Compiler Principles of NUAA CCST 2020.
It includes a PL/0 bytecode compiler and virtual machine.
This repo was used to store the code, but now let's make it more useful. This repo will help you to learn a simple implementation of compiler of a simple language PL/0, which uses LL(1) grammar.
LL(1) grammar means that every production rule can be determined by the first symbol of itself. To make it more understandable, we can use a simple example to illustrate it.
Look at the following grammar:
<code_block> ::= "begin" {<stmt>} "end"
<stmt> ::= <condition> | <loop>
<loop> ::= "while" <expr> "do" <code_block>
<condition> ::= "if" <expr> "then" <code_block> ["else" <code_block>]As we could see, <stmt> accept two kinds of statement: <condition> and <loop>. But how to distinguish them?
And abviously, <condition> alawys starts with "if",
and <loop> always starts with "while". So the answer is:
if (lookahead(tok_kind::keyword_if)) {
// parse <condition>
} else if (lookahead(tok_kind::keyword_while)) {
// parse <loop>
}And we could see that we could use lookahead() to check the next 1 token, then we will definitely know what kind of statement we will parse. That's what 1 means in LL(1).
For LL(k), k means some of the rules are determined by the next k(by maximum) tokens.
In fact most of the languages used in the industry are LL(1) with some special cases.
And of course, some of these languages are LL(k).
Languages using LL grammar are usually easier to maintain.
But LL(1) only requires one token lookahead.
So it is easy to implement the parser.
PL/0 is a simple language. It is a subset of Pascal.
The grammar is really simple, so it only requires LL(1) parser.
Here's the grammar:
<program> ::= "program" <id> ";" <block>
<decl> ::= <var-decl>
| <const-decl>
| <procedure-decl>
<var-decl> ::= "var" <id> [":=" <expr>] ";"
<const-decl> ::= "const" <id> [":=" <expr>] {"," <id> [":=" <expr>]} ";"
<procedure-decl> ::= "procedure" <id> "(" [<param-list>] ")" ";" <block>
<param-list> ::= <id> {"," <id>}
<block> = {<decl>} <body>
<body> = "begin" <stmt> {";" <stmt>} "end"
<stmt> ::= <assign-stmt>
| <if-stmt>
| <while-stmt>
| <call-stmt>
| <read-stmt>
| <write-stmt>
| <block>
<assign-stmt> ::= <id> ":=" <expr>
<if-stmt> ::= "if" <expr> "then" <stmt> [ "else" <stmt> ]
<while-stmt> ::= "while" <expr> "do" <stmt>
<call-stmt> ::= "call" <id> "(" [<actual-param-list>] ")"
<actual-param-list> ::= <expr> {"," <expr>}
<read-stmt> ::= "read" "(" <id> {"," <id>} ")"
<write-stmt> ::= "write" "(" [<expr> {"," <expr>}] ")"
<expr> ::= <term> {("+" | "-") <term>}
<term> ::= <factor> {("*" | "/") <factor>}
<factor> ::= <id>
| <number>
| "(" <expr> ")"
| "-" <factor>use this command to compile this project:
makeTo get help:
./pl0 -huse this command to use pl0 interpreter:
./pl0 -exec <filename>use this command to use pl0 interpreter(debug):
./pl0 -debug <filename>