FCC is a minimal, pedagogical Forth compiler written in a single C file that generates assembly code for the FASM assembler. It supports a Forth-like syntax and serves both as a learning tool for compiler construction and as a functional compiler.
- Author: Chris Curl
- License: MIT License (c) 2025
- Language: C
- Target: 32-bit x86 Assembly (FASM format)
- fcc.c: The Forth compiler source code
- binaries/fcl: The Forth compiler for Linux
- binaries/fcw.exe: The Forth compiler for Windows
The compiler follows a streamlined three-phase approach:
- IRL Generation - Parse source and generate Intermediate Representation Language (IRL)
- Iterative Optimization - Repeatedly perform peephole optimizations until no changes
- Code Generation - Output platform-specific assembly code (Linux/Windows)
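The "until no changes" loop in the optimization phase can be sketched as a fixed-point iteration. This is a minimal illustration, not the code in fcc.c: the `Instr` type, the opcode numbering, and the single rewrite rule (an adjacent PUSHA/POPA pair cancels out) are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical opcode subset: names follow the IRL list, numbering is invented. */
enum { OP_NOP, OP_PUSHA, OP_POPA, OP_LIT, OP_ADD };

typedef struct { int op; int arg; } Instr;   /* assumed instruction layout */

/* One peephole pass: blank out adjacent PUSHA/POPA pairs (assumed to cancel).
   Returns the number of rewrites made. */
static int peephole_pass(Instr *code, size_t n) {
    int changes = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        if (code[i].op == OP_PUSHA && code[i + 1].op == OP_POPA) {
            code[i].op = OP_NOP;
            code[i + 1].op = OP_NOP;
            changes++;
        }
    }
    return changes;
}

/* Drop NOPs so that newly adjacent instructions can match on the next pass. */
static size_t compact(Instr *code, size_t n) {
    size_t out = 0;
    for (size_t i = 0; i < n; i++)
        if (code[i].op != OP_NOP) code[out++] = code[i];
    return out;
}

/* Repeat passes until one makes no change. Returns the new instruction count. */
size_t optimize(Instr *code, size_t n) {
    while (peephole_pass(code, n) > 0)
        n = compact(code, n);
    return n;
}
```

Iterating matters because one rewrite can expose another: in `PUSHA PUSHA POPA POPA`, removing the inner pair makes the outer pair adjacent, so a second pass is needed to remove it.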
- Windows: there is a solution file, fcc.sln.
- This is a 32-bit system, so only the 32-bit configuration is supported.
- It makes a program named fcw.exe, the Forth Compiler for Windows.
- Linux: there is a Makefile.
- This is a 32-bit system, so only the 32-bit configuration is supported.
- make creates a program named fcl, the Forth Compiler for Linux.
- Run fcw or fcl, depending on whether you are running Windows or Linux.
- The programs take a single parameter, the name of a source file.
- The programs write the generated assembly source to stdout.
- Any errors detected are written to stderr.
- Redirect the output into a file (e.g. fcl pgm.fh > pgm.asm).
- Execute fasm using that file for input (e.g. fasm pgm.asm).
- Linux: see the Makefile for the 'test' target.
- Windows: see the 'make.bat' file.
#define VARS_SZ 500 // Maximum number of variables/symbols
#define STRS_SZ 500 // Maximum number of string literals
#define LOCS_SZ 500 // Size of local storage array
#define CODE_SZ 5000 // Maximum number of IRL instructions
#define HEAP_SZ 5000 // Maximum number of characters in the HEAP
typedef struct {
    char type; // 'I'=Integer, 'F'=Function, 'T'=Target
    char name[23]; // Symbol name
    char asmName[8]; // Generated assembly name
    int sz; // Size in bytes
    char *str; // String pointer
} SYM_T;
- next_ch() - Advances to next character, handles line reading and EOF
- next_line() - Reads next line from input file
- next_token() - Extracts next token, handles comments (//) and numbers
- checkNumber(char *w, int base) - Parses numbers in multiple bases:
  - Binary: %1010 (prefix %)
  - Decimal: #123 or 123 (prefix # or none)
  - Hexadecimal: $FF (prefix $)
  - Character literals: 'Y' (single quotes)
  - Supports negative numbers with - prefix
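The prefix rules listed for checkNumber can be sketched in C as follows. This is an illustrative stand-in, not the function from fcc.c: the name `parse_number`, its return convention, and the choice to accept the minus sign after the optional base prefix are all assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Value of one digit character, or -1 if it is not a digit in any base up to 16. */
static int digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical stand-in for checkNumber: parses w using the prefix rules.
   Returns true and stores the value in *out on success, false otherwise. */
bool parse_number(const char *w, int base, long *out) {
    /* Character literal: exactly one character between single quotes. */
    if (w[0] == '\'' && w[1] != '\0' && w[2] == '\'' && w[3] == '\0') {
        *out = (unsigned char)w[1];
        return true;
    }
    /* A prefix overrides the default base passed by the caller. */
    if (*w == '%')      { base = 2;  w++; }
    else if (*w == '#') { base = 10; w++; }
    else if (*w == '$') { base = 16; w++; }

    /* Assumption: the sign, if any, comes after the base prefix. */
    bool negative = false;
    if (*w == '-') { negative = true; w++; }
    if (*w == '\0') return false;   /* a bare prefix or "-" is not a number */

    long value = 0;
    for (; *w; w++) {
        int d = digit_value(*w);
        if (d < 0 || d >= base) return false;   /* not a digit in this base */
        value = value * base + d;
    }
    *out = negative ? -value : value;
    return true;
}
```

Returning false for anything that is not entirely digits lets the caller fall back to a symbol lookup, so a word such as `1+` is not mistaken for a number.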
- findSymbol(char *name, char type) - Locates symbol by name and type
- addSymbol(char *name, char type) - Adds new symbol to table
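A fixed-size symbol table built on SYM_T could look roughly like the sketch below. The function names `sym_find` and `sym_add`, the integer return values, the reuse of an existing entry on a duplicate add, the default size of 4 bytes, and the generated assembly name format are all assumptions made for illustration; only the struct layout and VARS_SZ come from the listing above.

```c
#include <stdio.h>
#include <string.h>

#define VARS_SZ 500   /* maximum number of variables/symbols */

typedef struct {
    char type;        /* 'I'=Integer, 'F'=Function, 'T'=Target */
    char name[23];    /* symbol name */
    char asmName[8];  /* generated assembly name */
    int sz;           /* size in bytes */
    char *str;        /* string pointer */
} SYM_T;

static SYM_T syms[VARS_SZ];
static int numSyms = 0;

/* Linear search by name and type. Returns the index, or -1 if not found. */
int sym_find(const char *name, char type) {
    for (int i = 0; i < numSyms; i++)
        if (syms[i].type == type && strcmp(syms[i].name, name) == 0)
            return i;
    return -1;
}

/* Adds a symbol, reusing an existing entry if one matches (a choice of this sketch).
   Returns the index, or -1 if the table is full or the name does not fit. */
int sym_add(const char *name, char type) {
    int i = sym_find(name, type);
    if (i >= 0) return i;
    if (numSyms >= VARS_SZ || strlen(name) >= sizeof syms[0].name) return -1;
    SYM_T *s = &syms[numSyms];
    s->type = type;
    strcpy(s->name, name);   /* safe: length was checked above */
    /* Hypothetical naming scheme: type letter plus table index, e.g. "I0". */
    snprintf(s->asmName, sizeof s->asmName, "%c%d", type, numSyms);
    s->sz = 4;               /* assumed default: one DWORD */
    s->str = NULL;
    return numSyms++;
}
```

Searching by both name and type means a variable and a function may share a name without colliding, which is one plausible reason for the `type` parameter in the lookup.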
The compiler uses an internal instruction set:
Stack Operations:
- PUSHA, POPA - Push/pop accumulator
- SWAP, SP4 - Stack manipulation
- POPB - Pop to second register
Memory Operations:
- STORE, FETCH - 32-bit memory store/load
- CSTORE, CFETCH - 8-bit (byte) memory store/load
- LOADSTR - Load string address
Arithmetic:
- ADD, SUB, MULT, DIVIDE - Basic arithmetic
- DIVMOD - Division with both quotient and remainder
- LT, GT, EQ, NEQ - Comparisons
- AND, OR, XOR - Bitwise operations
Control Flow:
- TESTA - Test accumulator against zero
- JMP, JMPZ, JMPNZ - Unconditional and conditional jumps
- TARGET - Jump target labels
- DEF, CALL, RETURN - Function definition and calls
Register and Pointer Operations:
- MOVAB, MOVAC, MOVAD - Copy accumulator to EBX, ECX, EDX
- ADDEDI, SUBEDI - Add/subtract constant to EDI (pointer arithmetic)
- EDIOFF - Load EDI+offset into EAX
- SYS - System call interrupt
A-Register Operations:
- AFET, ASTO - Fetch from/store to A register variable
- AINC, ADEC - Increment/decrement A register variable
Special:
- LIT - Literal values
- PLEQ - Plus-store operation (+!)
- INCTOS, DECTOS - Increment/decrement top of stack
- CODE - Embed straight FASM code into assembly file
Variables:
var myVar // Declare variable (default size 1 DWORD)
var buf 100 allot // Declare variable with size 100 DWORDs
Functions:
: myFunc // Function definition
42 myVar ! // Store 42 in myVar
; // End function
Control Structures:
condition if // Conditional execution
// code
then
begin // Loops
// code
condition
while // While loop
again // Infinite loop
until // Until loop
Stack Operations:
42 // Push literal
dup // Duplicate TOS
drop // Remove TOS
swap // Swap TOS and NOS
over // Copy second to top
Memory Operations:
@ // Fetch 32-bit value from address
! // Store 32-bit value to address
c@ // Fetch 8-bit (byte) value from address
c! // Store 8-bit (byte) value to address
+! // Add to memory location
1+ 1- // Increment/decrement TOS
Register, Locals, and System Operations:
->reg1 // Copy TOS to EAX (no-op, EAX is TOS)
->reg2 // Copy TOS to EBX
->reg3 // Copy TOS to ECX
->reg4 // Copy TOS to EDX
sys // Execute system call (INT 0x80)
+locs // Add 24 to EDI (allocate 6 locals)
-locs // Subtract 24 from EDI (free last 6 locals)
l0..l5 // Push addr of local #x to the stack
A-Register Operations:
a@ // Fetch value from A register variable
a! // Store value to A register variable
a+ // Increment A register variable
a- // Decrement A register variable
String Literals:
s" Hello" // Push string address to stack
Arithmetic and Logic:
+ - * / // Basic arithmetic
/mod // Division with quotient and remainder
< = <> > // Comparisons
AND OR XOR // Bitwise operations
Source Code Comments:
// // Comment until the end of the line
( ... ) // In-line comment
Inline Assembly Code:
: bye code
xor ebx, ebx
mov eax, 1
int 0x80
end-code
;
Linux (32-bit):
- ELF executable format
- No external library dependencies
- Direct system calls via the sys command or code
- Custom function call convention using EBP stack
Windows (32-bit):
- PE executable format
- Windows API integration
- Built-in console output support
Common Features:
- Enhanced optimization with iterative peephole passes
- Uses EDI to point to a locs array for local storage
- A-register variable for quick access operations
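The EDI-based locals scheme can be modelled in C to show why +locs and -locs must be paired. This is a sketch under assumptions: LOCS_SZ is treated as a count of 32-bit cells, local n is assumed to live at EDI + 4*n, and the function names are invented for the example.

```c
#include <stdint.h>

#define LOCS_SZ 500        /* size of local storage array (assumed to count cells) */
#define FRAME_CELLS 6      /* l0..l5 */

static int32_t locs[LOCS_SZ];   /* local storage array */
static int32_t *edi = locs;     /* models the EDI register */

/* +locs: advance EDI by 6 cells (24 bytes), giving the callee a fresh frame. */
void locs_enter(void) { edi += FRAME_CELLS; }

/* -locs: move EDI back by 6 cells, returning to the caller's frame. */
void locs_leave(void) { edi -= FRAME_CELLS; }

/* l0..l5: address of local n in the current frame (assumed to be EDI + 4*n). */
int32_t *local_addr(int n) { return edi + n; }
```

Because each frame is a fixed 24 bytes, a word that calls +locs on entry and -locs on exit can use l0..l5 freely without overwriting its caller's locals. Nothing in this model checks for running past the end of the array.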
- Syntax errors show line number, column, and source context
- Fatal errors terminate compilation
- Warnings are displayed as comments in output
: c@a+ a@ c@ a+ ;
// strlen: n = length of string at address a
: strlen ( a--n )
+locs a@ l0 ! a!
0 begin 1+ c@a+ while
1- l0 @ a! -locs ;
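For readers more comfortable with C, the strlen word above can be mirrored step by step. This is a model for explanation only: `a_reg` stands in for the A-register variable, a C local stands in for l0, and the loop assumes that begin ... while jumps back to begin as long as the tested value is non-zero.

```c
#include <stddef.h>

static const char *a_reg;   /* models the A-register variable */

/* c@a+ : fetch the byte at A, then advance A by one. */
static int fetch_byte_and_advance(void) {
    return (unsigned char)*a_reg++;
}

/* strlen ( a--n ): each line is annotated with the Forth it mirrors. */
size_t forth_strlen(const char *s) {
    const char *saved_a = a_reg;   /* +locs a@ l0 !   save the caller's A */
    a_reg = s;                     /* a!              point A at the string */
    size_t n = 0;                  /* 0 */
    do {
        n++;                       /* 1+ */
    } while (fetch_byte_and_advance() != 0);   /* c@a+ while */
    n--;                           /* 1-              undo the count for the NUL byte */
    a_reg = saved_a;               /* l0 @ a! -locs   restore the caller's A */
    return n;
}
```

The counter is incremented before each byte is tested, so the terminating NUL is counted once; the trailing `1-` corrects for that. Saving and restoring A keeps the word safe to call from code that is itself using the A register.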
var (counter)
: counter ( --n ) (counter) @ ;
: counter! ( n-- ) (counter) ! ;
: increment counter 1+ counter! ;
var (limit)
: limit ( --n ) (limit) @ ;
: limit! ( n-- ) (limit) ! ;
: mil ( n--m ) 1000 dup * * ;
: main
0 counter!
1 mil limit!
begin
increment counter limit >
until
// Program complete
;
- Input Processing - Read source file (error if no argument provided)
- IRL Generation - Parse declarations and generate intermediate representation
- Iterative Optimization - Repeatedly perform peephole optimizations until no changes
- Code Generation - Output assembly with startup code and runtime support
- Symbol Output - Generate variable declarations with proper sizing in data section
- No built-in I/O functions (must use system calls via sys)
- Limited error checking and recovery (errors output to stderr)
- No floating-point support
- Fixed-size tables and heap
- else clause not yet implemented
- Byte and 32-bit memory access (c@, c!, @, !)
- Direct system call support via register operations
- Pointer arithmetic and local array access via EDI and locs
- Variable-sized variable declarations with allot
- A-register variable for optimized frequent access
- Multi-base number literals (binary, decimal, hex, character)
- Iterative optimization passes for better code generation
- Compact, single-file, self-contained compiler
- Clean separation of IRL generation and code emission
- Stack-based execution model with register and pointer access
- Enhanced error reporting with stderr output
This compiler serves as an example of a minimal but functional compiler implementation, demonstrating core compiler concepts in a clear and understandable way.