Minimal C compiler made for Compiler Sessional (CSE310) course of BUET

OitijhyaHoque/min-c-compiler

Minimal C Compiler

This compiler translates code written in a subset of C into 8086 assembly. Demo in this video. It was made for the CSE310 (Compiler Sessional) course at BUET.

Features:

This compiler supports a subset of the C language with the following features:

Core

  • Data Types: The compiler handles the int and void types.
  • Variables: Supports both global and local variable declarations. You can declare multiple variables in a single statement (e.g., int x, y;).
  • Arrays: One-dimensional arrays of int with a constant size are supported (e.g., int data[100];). Array elements can be accessed using an index (e.g., data[5]).
  • Literals: The compiler can parse integer constants (e.g., 123).
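
As a rough illustration of the core features listed above, the sketch below declares a global variable, a global array, and several locals in one statement. The names total, data, and sum_data are made up for this example. Declarations carry no initializers, since the grammar's declaration_list does not include one. The comments are for the reader; whether the scanner accepts comments is not stated here.

```c
int total;      /* global variable */
int data[5];    /* global one-dimensional int array with a constant size */

int sum_data() {
    int i, acc;                 /* several locals declared in one statement */
    acc = 0;
    for (i = 0; i < 5; i++) {
        data[i] = i * 10;       /* array element written through an index */
        acc = acc + data[i];    /* array element read through an index */
    }
    total = acc;                /* the global is visible inside the function */
    return total;
}
```

The function fills data with 0, 10, 20, 30, 40 and returns their sum.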

Operators

Follows standard C precedence rules:

  • Arithmetic: Addition +, subtraction -, multiplication *, division /.
  • Relational: ==, !=, <, >, <=, >=.
  • Logical: &&, ||, and ! (NOT).
  • Assignment: =
  • Unary: Unary minus (e.g., -x) and unary plus (+x).
  • Increment/Decrement: Post-increment ++ (e.g., i++) and post-decrement -- (e.g., i--).
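
A small sketch of how these operators combine under standard C precedence. The function name precedence_demo and the values are chosen only for illustration, and the results in the comments follow standard C rules, which the compiler is described as following.

```c
int precedence_demo() {
    int a, b, c, ok;
    a = 2;
    b = 3;
    c = 4;
    a = a + b * c;              /* * binds tighter than +: 2 + 12 = 14 */
    b = -a + c / 2;             /* unary minus, then /, then +: -14 + 2 = -12 */
    ok = a > b && !(c == 0);    /* relational before logical: 1 && 1 = 1 */
    c++;                        /* post-increment: c becomes 5 */
    a--;                        /* post-decrement: a becomes 13 */
    return a + b + c + ok;      /* 13 + (-12) + 5 + 1 = 7 */
}
```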

Control Flow

  • Conditionals: if and if-else statements.
  • Loops: for loops (C-style for(;;)), and while loops.
  • Statement Blocks: Nested blocks using curly braces { ... } are supported to manage scope.
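
The constructs above can be sketched with three small functions. The names gcd, count_above, and scope_demo are hypothetical. The shadowing behaviour shown in scope_demo is standard C; that the compiler's scope handling matches it is an assumption based on the description of nested blocks.

```c
/* while loop with if-else: subtraction-based GCD (expects positive inputs) */
int gcd(int a, int b) {
    while (a != b) {
        if (a > b) {
            a = a - b;
        } else {
            b = b - a;
        }
    }
    return a;
}

/* for loop with a plain if: counts the values in 0..9 greater than limit */
int count_above(int limit) {
    int i, count;
    count = 0;
    for (i = 0; i < 10; i++) {
        if (i > limit)
            count++;
    }
    return count;
}

/* nested block: the inner x is a separate variable that shadows the outer one */
int scope_demo() {
    int x;
    x = 1;
    {
        int x;
        x = 99;
    }
    return x;   /* the outer x still holds 1 */
}
```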

Functions

  • Declaration & Definition: Supports both function declarations (prototypes) and definitions.
  • Parameters: Functions can be defined with multiple parameters or with no parameters.
  • Return Values: Functions can return values using the return statement.
  • Recursion: Recursive function calls are fully supported.
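
A minimal sketch of the function features above: a prototype, a function with no parameters, and a recursive definition. The names factorial and factorial_of_five are invented for this example.

```c
int factorial(int n);           /* declaration (prototype) */

int factorial_of_five() {       /* definition with no parameters */
    return factorial(5);        /* call made before the definition appears */
}

int factorial(int n) {          /* definition with one parameter */
    if (n <= 1)
        return 1;
    return n * factorial(n - 1);    /* recursive call */
}
```

The 8086's general-purpose registers are 16 bits wide, so if the generated code keeps an int in a single register (an assumption, not something stated here), larger inputs such as factorial(8) = 40320 would not fit in a signed 16-bit value and could differ from a g++ build.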

Built-in I/O

  • Printing: Includes a simple built-in println(variable); statement to print the value of a variable to the console, followed by a newline.

Requirements:

  • flex 2.6.4 (scanner generator)
  • bison 3.8.2 (parser generator)
  • g++ 11.4.0

emu8086 was used to run the generated assembly.

Directories & Scripts:

  • specs: specifications and CFG
  • src: symbol table, parser, scanner, intermediate code generator
    • symbol_info.cpp, scope_table.cpp, symbol_table.cpp: symbol table implementation
    • scanner.l: scanner
    • parsetree.cpp: parse tree implementation
    • parser.y: parser
    • asm.cpp: intermediate code generator
  • tests: sample code that can be compiled by this compiler
  • build.sh: script to build the compiler
  • debug.sh: builds the compiler with the -g flag, for debugging purposes
  • test.sh: runs the compiler on all the test files in tests folder
    • -k flag can be used to keep the log files generated during the test
    • -c cleans up the repository
    • -m makes an output directory and stores the generated assembly there
  • verify_output.sh: converts the test C files (which don't include the stdio.h header) so that g++ can compile and run them. It does the following:
    • adds #include <stdio.h> at the top
    • adds a definition of the println function (a wrapper around printf("%d\n", x)) at the top
    • writes the modified file to the output directory
    • compiles the modified file using g++ and runs the executable
    • writes the output to expected_output.log in the output directory
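
For illustration, a converted test file might look roughly like the sketch below. The exact text that verify_output.sh emits is not shown in this README, so the println signature and the test code (square, demo) are assumptions based on the description above.

```c
#include <stdio.h>

/* Assumed shape of the added shim: println wraps printf("%d\n", x). */
void println(int x) {
    printf("%d\n", x);
}

/* Hypothetical test code in the supported subset, left unchanged. */
int square(int n) {
    return n * n;
}

int demo() {
    int result;
    result = square(7);
    println(result);    /* println takes a variable, not an arbitrary expression */
    return result;
}
```

With a shim like this in place, an ordinary compiler accepts the same source, and its output can be compared with what the generated assembly prints.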

Grammar:

program : program unit
program : unit
unit : var_declaration
unit : func_declaration
unit : func_definition
func_declaration : type_specifier ID LPAREN parameter_list RPAREN SEMICOLON
func_declaration : type_specifier ID LPAREN RPAREN SEMICOLON
func_definition : type_specifier ID LPAREN parameter_list RPAREN compound_statement
func_definition : type_specifier ID LPAREN RPAREN compound_statement
parameter_list : parameter_list COMMA type_specifier ID
parameter_list : parameter_list COMMA type_specifier
parameter_list : type_specifier ID
parameter_list : type_specifier
compound_statement : LCURL statements RCURL
compound_statement : LCURL RCURL
var_declaration : type_specifier declaration_list SEMICOLON
type_specifier : INT
type_specifier : FLOAT
type_specifier : VOID
declaration_list : declaration_list COMMA ID
declaration_list : declaration_list COMMA ID LSQUARE CONST_INT RSQUARE
declaration_list : ID
declaration_list : ID LSQUARE CONST_INT RSQUARE
statements : statement
statements : statements statement
statement : var_declaration
statement : expression_statement
statement : compound_statement
statement : FOR LPAREN expression_statement expression_statement expression RPAREN statement
statement : IF LPAREN expression RPAREN statement
statement : IF LPAREN expression RPAREN statement ELSE statement
statement : WHILE LPAREN expression RPAREN statement
statement : PRINTLN LPAREN ID RPAREN SEMICOLON
statement : RETURN expression SEMICOLON
expression_statement : SEMICOLON
expression_statement : expression SEMICOLON
variable : ID
variable : ID LSQUARE expression RSQUARE
expression : logic_expression
expression : variable ASSIGNOP logic_expression
logic_expression : rel_expression
logic_expression : rel_expression LOGICOP rel_expression
rel_expression : simple_expression
rel_expression : simple_expression RELOP simple_expression
simple_expression : term
simple_expression : simple_expression ADDOP term
term : unary_expression
term : term MULOP unary_expression
unary_expression : ADDOP unary_expression
unary_expression : NOT unary_expression
unary_expression : factor
factor : variable
factor : ID LPAREN argument_list RPAREN
factor : LPAREN expression RPAREN
factor : CONST_INT
factor : CONST_FLOAT
factor : variable INCOP
factor : variable DECOP
argument_list : arguments
argument_list :
arguments : arguments COMMA logic_expression
arguments : logic_expression
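
To connect the productions above to source code, here is a small sketch annotated with the rules each line exercises. The function names max3 and all_positive are hypothetical. Going by the grammar as listed, logic_expression allows a single LOGICOP between two rel_expression operands, so a chain of three conditions is grouped with parentheses here.

```c
int max3(int a, int b, int c);      /* func_declaration with a parameter_list */

int max3(int a, int b, int c) {     /* func_definition */
    int best;                       /* var_declaration: declaration_list has no initializer */
    best = a;                       /* expression : variable ASSIGNOP logic_expression */
    if (b > best)                   /* statement : IF LPAREN expression RPAREN statement */
        best = b;
    if (c > best)
        best = c;
    return best;                    /* statement : RETURN expression SEMICOLON */
}

int all_positive(int a, int b, int c) {
    /* factor : LPAREN expression RPAREN wraps the first pair of conditions, */
    /* leaving one LOGICOP between two rel_expression operands at the top level. */
    return (a > 0 && b > 0) && c > 0;
}
```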
