A compiler built from scratch in Go for the COOL programming language, generating and optimizing LLVM IR using standard optimization passes. The project extends COOL with a basic module system and custom data structures like linked lists, all implemented from scratch.

Compiler From Scratch

Prerequisites

  1. Install LLVM:
    Download and install the latest version of LLVM from the official GitHub releases page:
    LLVM Releases

    Make sure to add LLVM to your system's PATH after installation.

Project Structure

  • The src folder contains the compiler source, including the main compiler script llvm.go in the llvm subfolder.
  • The input folder contains sample files that have been successfully compiled using this compiler.
  • The output/ir folder stores the initial, unoptimized LLVM IR and the intermediate IR files produced by each optimization pass.
  • The output/bin folder stores the final executable (named output.exe if no optimizations were applied, and output_optimized.exe if the COOL program was compiled with optimizations).

How to Run

Clone the Repository

First, clone the repository and navigate to the project's root directory:

   git clone https://github.com/imanerh/Compiler-From-Scratch.git
   cd Compiler-From-Scratch

Option 1: Run Without Optimization

  1. Navigate to the src directory

    cd src
  2. Run the compiler
    Execute main.go, specifying the input Cool file:

    go run main.go <input_file.cool>
    • The <input_file.cool> path should be relative to the src directory, e.g. ..\input\LinkedListIntExample.cool
    • This generates the LLVM IR (output.ll) in output/ir and the executable (output.exe) in output/bin, then runs output.exe.

Option 2: Run With Optimizations

To compile the LLVM IR with optimizations, follow these steps:

  1. Generate LLVM IR
    Run the src/main.go script to generate the LLVM IR file output.ll as explained in the previous section.

  2. Apply Optimizations
    Make sure you are in the project's root directory. Use the Makefile to apply a series of LLVM optimization passes and generate the optimized executable:

    make

    The optimized executable will be generated at output/bin/output_optimized.exe. To run the executable, use the command:

    ./output/bin/output_optimized.exe
  3. Clean Build Files
    To remove all intermediate files and the final executable, run:

    make clean

Optimization Passes Explained

  • mem2reg: Promotes stack allocations (alloca instructions) to SSA registers, removing unnecessary loads and stores and enabling further optimizations.
  • instcombine: Combines multiple simple instructions into fewer, more efficient instructions.
  • simplifycfg: Simplifies the control flow graph by removing unnecessary branches and unreachable blocks.
  • loop-unroll: Unrolls loops to reduce loop overhead and improve performance by executing multiple iterations in a single loop body.

These passes are applied in sequence to progressively optimize the generated LLVM IR, resulting in more efficient machine code.
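As a rough illustration of applying the passes in sequence, the Go sketch below plans one opt invocation per pass, with each step reading the IR written by the previous one. The intermediate file names and the use of the -passes= syntax (LLVM's new pass manager) are assumptions made for the example; the project's Makefile may name files and invoke opt differently.

```go
package main

import (
	"fmt"
	"strings"
)

// optStep describes one hypothetical invocation of LLVM's opt tool.
type optStep struct {
	Pass string // pass name, e.g. "mem2reg"
	In   string // IR file read by this step
	Out  string // IR file written by this step
}

// planPasses chains the passes so that each step consumes the output of
// the previous one. The output naming scheme is an assumption.
func planPasses(input string, passes []string) []optStep {
	steps := make([]optStep, 0, len(passes))
	in := input
	for _, p := range passes {
		out := strings.TrimSuffix(in, ".ll") + "_" + p + ".ll"
		steps = append(steps, optStep{Pass: p, In: in, Out: out})
		in = out
	}
	return steps
}

// command renders a step as an opt command line that emits textual IR (-S).
func (s optStep) command() string {
	return fmt.Sprintf("opt -S -passes=%s %s -o %s", s.Pass, s.In, s.Out)
}

func main() {
	passes := []string{"mem2reg", "instcombine", "simplifycfg", "loop-unroll"}
	for _, s := range planPasses("output/ir/output.ll", passes) {
		fmt.Println(s.command())
	}
}
```

Keeping one IR file per pass makes it easy to diff the IR before and after each step and see what that pass changed.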

Module System Extension

The module system allows organizing code into separate files and importing them with the import keyword. Modules are expected to reside in the input/ directory, and their filenames must match their module names. When an import ModuleName; statement is encountered, the corresponding file is located and parsed before the importing file. During compilation, all imported modules are concatenated into a single LLVM module. This simplifies handling but means separate compilation is not supported.

Currently, the system does not perform semantic analysis, meaning it does not check for undefined references, type mismatches, or incorrect module usages beyond basic syntax validation. Additionally, the system does not yet optimize or prune unused imports.

At least now, we can write some code elsewhere and import it!

Notes

  • module and import keywords are case-insensitive.
  • Module names should start with an uppercase letter (e.g. LinkedList).
  • Module filenames should match their module name.
  • All imported modules must be in the input/ directory.
  • Use import ModuleName; without specifying a path.
  • Imported modules are parsed before the importing file.
  • Concatenation is used to combine all imported modules into a single LLVM module during compilation.

Using the LinkedList Extension

The module system now supports Linked Lists, implemented in the module LinkedList. To use it:

  • Import it at the beginning of your Cool file using:

    import LinkedList;
  • The LinkedList class and its methods are defined in input/LinkedList.cool.

  • Your Cool file can be anywhere; it doesn't have to be inside the input/ directory.

  • Full examples of usage can be found in:

    • input/LinkedListIntExample.cool (for integers)
    • input/LinkedListStringExample.cool (for strings)
  • Basic usage example:

    import LinkedList;
    
    class Main inherits IO {
         list : LinkedList <- new LinkedList;
    
         main() : Object {{
            list.addFirst(2210);
            list.addLast(800);
            out_int(list.getSize()); -- Outputs 2
            list.removeFirst();
            list.removeLast();
            -- Other supported functions include: list.getSize(), list.getHead(), list.get(index), list.isEmpty() 
         }};
    };
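
For readers more familiar with Go than COOL, the sketch below models the behavior suggested by the method names in the example above. It is a hypothetical model, not the COOL implementation in the LinkedList module: the singly linked layout, the integer-only values, and the choice to make removals on an empty list a no-op are all assumptions. getHead is omitted because the source does not say what it returns.

```go
package main

import "fmt"

// node is one cell of a singly linked list of integers.
type node struct {
	value int
	next  *node
}

// LinkedList is a hypothetical Go model of the COOL LinkedList class.
type LinkedList struct {
	head *node
	size int
}

// AddFirst inserts v at the front of the list.
func (l *LinkedList) AddFirst(v int) {
	l.head = &node{value: v, next: l.head}
	l.size++
}

// AddLast appends v by walking to the final node.
func (l *LinkedList) AddLast(v int) {
	n := &node{value: v}
	if l.head == nil {
		l.head = n
	} else {
		cur := l.head
		for cur.next != nil {
			cur = cur.next
		}
		cur.next = n
	}
	l.size++
}

// RemoveFirst drops the head; on an empty list it does nothing (assumption).
func (l *LinkedList) RemoveFirst() {
	if l.head == nil {
		return
	}
	l.head = l.head.next
	l.size--
}

// RemoveLast drops the final node; on an empty list it does nothing (assumption).
func (l *LinkedList) RemoveLast() {
	if l.head == nil {
		return
	}
	if l.head.next == nil {
		l.head = nil
	} else {
		cur := l.head
		for cur.next.next != nil {
			cur = cur.next
		}
		cur.next = nil
	}
	l.size--
}

// GetSize returns the number of stored elements.
func (l *LinkedList) GetSize() int { return l.size }

// IsEmpty reports whether the list has no elements.
func (l *LinkedList) IsEmpty() bool { return l.size == 0 }

// Get returns the value at index, or false if the index is out of range.
func (l *LinkedList) Get(index int) (int, bool) {
	if index < 0 || index >= l.size {
		return 0, false
	}
	cur := l.head
	for i := 0; i < index; i++ {
		cur = cur.next
	}
	return cur.value, true
}

func main() {
	var list LinkedList
	list.AddFirst(2210)
	list.AddLast(800)
	fmt.Println(list.GetSize()) // prints 2
	list.RemoveFirst()
	list.RemoveLast()
	fmt.Println(list.IsEmpty()) // prints true
}
```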
