A feature-rich interpreter implemented in Go that supports dynamic typing, functions with closures, arrays, hash maps, strings, and more. This project demonstrates the complete architecture of a modern programming language implementation, from lexical analysis to evaluation.
- Integers:
42,100,-5 - Strings:
"Hello","World","Escaped \"quotes\"" - Booleans:
true,false - Arrays:
[1, 2, 3],["a", "b", "c"] - Hash Maps:
{"key": "value", "age": 25} - Functions: First-class functions with closures
- Addition:
+(also string concatenation) - Subtraction:
- - Multiplication:
* - Division:
/
- Less than:
<(works with numbers and strings) - Greater than:
>(works with numbers and strings) - Less than or equal:
<= - Greater than or equal:
>= - Equal:
== - Not equal:
!=
- AND:
&& - OR:
|| - NOT:
!
- Negation:
-x - Logical NOT:
!x
- Array indexing:
array[0] - Hash map access:
hash["key"]
x = 10;
name = "Alice";
numbers = [1, 2, 3];
person = {"name": "Bob", "age": 30};// Function declaration
function add(x, y) {
return x + y
}
// Anonymous functions
multiply = function(x, y) { return x * y };
// Higher-order functions
function makeAdder(x) {
return function(y) { return x + y }
}
addFive = makeAdder(5);
result = addFive(10); // Returns 15Functions capture their lexical environment:
function counter() {
count = 0;
return function() {
count = count + 1;
return count
}
}
c = counter();
print(c()); // 1
print(c()); // 2function factorial(n) {
if (n <= 1) {
return 1
} else {
return n * factorial(n - 1)
}
}
print(factorial(5)); // 120numbers = [1, 2, 3, 4, 5];
print(numbers[0]); // 1
print(length(numbers)); // 5
// Array iteration
sum = 0;
i = 0;
while (i < length(numbers)) {
sum = sum + numbers[i];
i = i + 1
};person = {
"name": "Alice",
"age": 30,
"city": "New York"
};
print(person["name"]); // Alice
// Nested structures
data = {
"users": [
{"name": "Bob", "id": 1},
{"name": "Carol", "id": 2}
],
"count": 2
};greeting = "Hello, " + "World!";
print(greeting); // Hello, World!
print(length(greeting)); // 13
// String escape sequences
message = "Line 1\nLine 2\tTabbed";
quoted = "She said \"Hello\"";if (x > 10) {
print("Greater than 10")
} else {
print("Less than or equal to 10")
}counter = 0;
while (counter < 10) {
print(counter);
counter = counter + 1
}print(42);
print("Hello World");
print([1, 2, 3]);
print({"key": "value"});Returns the length of strings, arrays, or hash maps:
print(length("hello")); // 5
print(length([1, 2, 3])); // 3
print(length({"a": 1, "b": 2})); // 2simple-compiler/
├── cmd/
│ └── compiler/
│ └── main.go # Entry point with comprehensive demo
├── internal/
│ ├── token/
│ │ └── token.go # Token definitions and keywords
│ ├── lexer/
│ │ └── lexer.go # Lexical analysis with string support
│ ├── ast/
│ │ └── ast.go # AST nodes for all language features
│ ├── parser/
│ │ └── parser.go # Recursive descent parser with Pratt parsing
│ └── evaluator/
│ ├── environment.go # Lexically scoped variable storage
│ └── evaluator.go # Dynamic tree-walking interpreter
├── test/
│ └── integration_test.go # Comprehensive test suite
├── .gitignore
├── go.mod
└── README.md
- Go 1.22.4 or higher
Clone the repository:
git clone https://github.com/phillips/simple-compiler.git
cd simple-compilergo run ./cmd/compilerThis runs a comprehensive demo showcasing all language features including functions, closures, recursion, arrays, hash maps, and string operations.
go build -o compiler ./cmd/compiler
./compilergo test ./test/...// Variables and arithmetic
x = 10;
y = 20;
sum = x + y;
print(sum); // 30
// String operations
greeting = "Hello, " + "World!";
print(greeting); // Hello, World!// Simple function
function add(a, b) {
return a + b
}
print(add(5, 3)); // 8
// Closure example
function makeMultiplier(factor) {
return function(x) {
return x * factor
}
}
double = makeMultiplier(2);
triple = makeMultiplier(3);
print(double(5)); // 10
print(triple(5)); // 15// Arrays
numbers = [1, 2, 3, 4, 5];
squares = [];
i = 0;
while (i < length(numbers)) {
squares = squares + [numbers[i] * numbers[i]];
i = i + 1
};
print(squares); // [1, 4, 9, 16, 25]
// Hash maps
person = {
"name": "Alice",
"age": 30,
"hobbies": ["reading", "coding"]
};
print(person["name"]); // Alice
print(person["hobbies"][0]); // readingfunction fibonacci(n) {
if (n <= 1) {
return n
} else {
return fibonacci(n - 1) + fibonacci(n - 2)
}
}
i = 0;
while (i <= 10) {
print(fibonacci(i));
i = i + 1
}// Map-like functionality
function applyToAll(arr, f) {
result = [];
i = 0;
while (i < length(arr)) {
result = result + [f(arr[i])];
i = i + 1
};
return result
}
numbers = [1, 2, 3, 4, 5];
doubled = applyToAll(numbers, function(x) { return x * 2 });
print(doubled); // [2, 4, 6, 8, 10]The lexer (internal/lexer/lexer.go) converts source code into tokens:
- Recognizes keywords (
function,return,if,else,while,print,true,false) - Handles string literals with escape sequences (
\n,\t,\",\\) - Identifies all operators and delimiters
- Tracks line and column positions for error reporting
The parser (internal/parser/parser.go) builds an AST from tokens:
- Uses recursive descent with Pratt parsing for expressions
- Handles function definitions and calls
- Parses array and hash map literals
- Supports index expressions for both arrays and hash maps
- Distinguishes between code blocks and hash literals
The AST (internal/ast/ast.go) defines nodes for:
- Literals:
IntegerLiteral,StringLiteral,BooleanLiteral,ArrayLiteral,HashLiteral,FunctionLiteral - Expressions:
InfixExpression,PrefixExpression,CallExpression,IndexExpression - Statements:
Assignment,IfStatement,WhileStatement,PrintStatement,ReturnStatement,BlockStatement
The evaluator (internal/evaluator/evaluator.go) executes the AST:
- Dynamic typing with
interface{} - Function values with captured environments (closures)
- Array and hash map operations
- Built-in function support
- Recursive function calls
- Short-circuit evaluation for logical operators
The environment (internal/evaluator/environment.go) manages variable scope:
- Lexical scoping with environment chaining
- Closure support through captured environments
- Nested scope management for functions and blocks
Program → Statement*
Statement → FunctionDecl | IfStmt | WhileStmt | PrintStmt |
ReturnStmt | Assignment | ExprStmt | Block
FunctionDecl → "function" IDENTIFIER "(" Parameters? ")" Block
Parameters → IDENTIFIER ("," IDENTIFIER)*
IfStmt → "if" "(" Expression ")" Block ("else" Block)?
WhileStmt → "while" "(" Expression ")" Block
PrintStmt → "print" "(" Expression ")"
ReturnStmt → "return" Expression
Assignment → IDENTIFIER "=" Expression ";"?
ExprStmt → Expression ";"?
Block → "{" Statement* "}"
Expression → Logical
Logical → Comparison (("&&" | "||") Comparison)*
Comparison → Term (("==" | "!=" | "<" | ">" | "<=" | ">=") Term)*
Term → Factor (("+" | "-") Factor)*
Factor → Unary (("*" | "/") Unary)*
Unary → ("!" | "-") Unary | Postfix
Postfix → Primary (CallExpr | IndexExpr)*
CallExpr → "(" Arguments? ")"
IndexExpr → "[" Expression "]"
Arguments → Expression ("," Expression)*
Primary → INTEGER | STRING | "true" | "false" | IDENTIFIER |
FunctionLit | ArrayLit | HashLit | "(" Expression ")"
FunctionLit → "function" "(" Parameters? ")" Block
ArrayLit → "[" (Expression ("," Expression)*)? "]"
HashLit → "{" (HashPair ("," HashPair)*)? "}"
HashPair → Expression ":" Expression
- OR (
||) - AND (
&&) - Equality (
==,!=) - Comparison (
<,>,<=,>=) - Addition/Subtraction (
+,-) - Multiplication/Division (
*,/) - Prefix (
!,-) - Call/Index (
(),[])
The language uses dynamic typing with automatic type coercion where appropriate:
- Integers: 64-bit signed integers
- Strings: UTF-8 strings with escape sequence support
- Booleans:
trueorfalse - Arrays: Heterogeneous collections (can mix types)
- Hash Maps: String, integer, or boolean keys with any value type
- Functions: First-class values with closure support
- Null: Represented as
nilinternally
Values are considered "falsy" in boolean contexts:
false(boolean)0(integer)""(empty string)nil(null/undefined)
All other values are "truthy".
The compiler provides detailed error messages:
[Line 5, Column 3] expected ')' after function parameters
[Line 10, Column 15] unexpected token in array literal
error: division by zero
error: undefined variable: x
error: not a function: INTEGER
error: wrong number of arguments: expected 2, got 3
error: array index out of bounds: 10
error: unusable as hash key: ARRAY
The project includes comprehensive tests covering:
- Lexer: tokenization of all language features
- Parser: AST construction and error cases
- Evaluator: all data types, operations, and control flow
- Integration: complete programs with complex features
Run tests:
go test -v ./test/...- Lexer: O(n) where n is source code length
- Parser: O(n) for most constructs, recursive descent
- Evaluator: Tree-walking interpreter, no optimization
- Environment: Hash map lookups, O(1) average case
- Closures: Environments are referenced, not copied
Potential features to add:
- For loops and iterators
- Break/continue statements
- Object-oriented features (structs/classes)
- Module system and imports
- Standard library (file I/O, networking, etc.)
- Type annotations and checking
- Pattern matching
- REPL (Read-Eval-Print Loop)
- Bytecode compilation and VM
- JIT compilation
- Debugger support
This project is licensed under the MIT License.
This project demonstrates modern interpreter implementation techniques including:
- Lexical analysis with proper string handling
- Pratt parsing for operator precedence
- Dynamic typing with heterogeneous collections
- Lexical scoping and closure implementation
- First-class functions
- Built-in function integration
- Tree-walking interpretation
The implementation serves as a foundation for understanding programming language design and can be extended with additional features like type systems, optimization passes, or compilation to bytecode/native code.