A guide covering LLVM including the applications, libraries and tools that will make you a better and more efficient LLVM development.
Note: You can easily convert this markdown file to a PDF in VSCode using this handy extension Markdown PDF.
LLVM is a library that has collection of modular/reusable compiler and toolchain components (assemblers, compilers, and debuggers). With these components LLVM can be used as a compiler framework, providing a front-end(parser and lexer) and a back-end (code that converts LLVM's representation to actual machine code).
-
LLVM-based compiler: This is a compiler built partially or completely with the LLVM infrastructure. For example, a compiler might use LLVM for the frontend and backend but use GCC and GNU system libraries to perform the final link.
-
LLVM libraries: This is the reusable code portion of the LLVM infrastructure.
-
LLVM core: The optimizations that happen at the intermediate language level and the backend algorithms form the LLVM core where the project started.
-
Clang is a language front-end and tooling infrastructure for languages in the C language family (C, C++, Objective C/C++, OpenCL, CUDA, and RenderScript) for the LLVM project.
- LLVM Language Reference Manual
- LLVM Programmer’s Manual
- Getting Started with the LLVM System
- Learn LLVM 17: A beginner's guide to learning LLVM compiler tools and core libraries with C++ by Kai Nacke
- LLVM API with Rust Kindle Edition by Vishal Patil
- LLVM Techniques, Tips, and Best Practices Clang and Middle-End Libraries: Design powerful and reliable compilers using the latest libraries and tools from LLVM by Min-Yih Hsu
- Crafting Interpreters by Robert Nystrom
- Writing Compilers and Interpreters: A Software Engineering Approach 3rd Edition by Ronald Mak
- Build Your Own Programming Language: A programmer's guide to designing compilers, interpreters, and DSLs for solving modern computing problems by Clinton L. Jeffery
- LLVM @ Google Scholar
- LLVM @ ACM-DL
- LLVM @ IEEEXplore
- LLVM @ DBLP
Visual Studio Code is a code editor redefined and optimized for building and debugging modern web and cloud applications.
Code Server is a tool that allows you to run VS Code on any machine anywhere and access it in the browser.
Clang-Format is a tool to format C/C++/Java/JavaScript/Objective-C/Objective-C++/Protobuf code.
Clang-Tidy is a clang-based C++ "linter" tool. Its purpose is to provide an extensible framework for diagnosing and fixing typical programming errors, like style violations, interface misuse, or bugs that can be deduced via static analysis. clang-tidy is modular and provides a convenient interface for writing new checks.
Clangd is a Visual Studio Code extension that provides C/C++ language IDE features for VS Code using clangd.
LLD is a linker from the LLVM project that is a drop-in replacement for system linkers and runs much faster than them. It also provides features that are useful for toolchain developers. The linker supports ELF (Unix), PE/COFF (Windows), Mach-O (macOS) and WebAssembly in descending order.
TinyGo is a Go compiler(based on LLVM) intended for use in small places such as microcontrollers, WebAssembly (Wasm), and command-line tools.
Codon is a high-performance Python compiler that compiles Python code to native machine code without any runtime overhead. Typical speedups over Python are on the order of 10-100x or more, on a single thread.
oneAPI DPC++ compiler is a LLVM-based compiler project that implements compiler and runtime support for the SYCL* language.
Cling is an interactive C++ interpreter, built on top of Clang and LLVM compiler infrastructure. Cling realizes the read-eval-print loop (REPL) concept, in order to leverage rapid application development.
Emu is a GPGPU library for Rust with a focus on portability, modularity, and performance. It's a CUDA-esque compute-specific abstraction over WebGPU providing specific functionality to make WebGPU feel more like CUDA.
Scalene is a high-performance CPU, GPU and memory profiler for Python that does a number of things that other Python profilers do not and cannot do. It runs orders of magnitude faster than many other profilers while delivering far more detailed information.
McSema is an executable lifter. It translates ("lifts") x86, amd64, aarch64, sparc32, and sparc64 executable binaries from native machine code to LLVM bitcode.
QuarkslaB Dynamic binary Instrumentation (QBDI) is a modular, cross-platform and cross-architecture DBI framework. It aims to support Linux, macOS, Android, iOS and Windows operating systems running on x86, x86-64, ARM and AArch64 architectures.
FileCheck is a flexible pattern matching file verifier.
tblgen is a description to C++ Code.
clang-tblgen is a description to C++ Code for Clang.
lldb-tblgen is a description to C++ Code for LLDB.
llvm-tblgen is a target description to C++ Code for LLVM.
mlir-tblgen is a description to C++ Code for MLIR.
lit is a LLVM Integrated Tester.
llvm-exegesis is a LLVM Machine Instruction Benchmark.
llvm-locstats is a calculate statistics on DWARF debug location.
llvm-pdbutil is a PDB File forensics and diagnostics.
llvm-profgen is a LLVM SPGO profile generation tool
bugpoint is a automatic test case reduction tool.
llvm-extract is a extract a function from an LLVM module.
llvm-bcanalyzer is a LLVM bitcode analyzer.
llvm-addr2line is a drop-in replacement for addr2line.
llvm-ar is a LLVM archiver.
llvm-cxxfilt is a LLVM symbol name demangler.
llvm-install-name-tool is a LLVM tool for manipulating install-names and rpaths.
llvm-nm is a list LLVM bitcode and object file’s symbol table.
llvm-objcopy is a object copying and editing tool.
llvm-objdump is a LLVM’s object file dumper.
llvm-ranlib is a generates an archive index.
llvm-readelf is a GNU-style LLVM Object Reader.
llvm-size is a print size information.
llvm-strings is a print strings.
llvm-strip is a object stripping tool.
C++ is a cross-platform language that can be used to build high-performance applications developed by Bjarne Stroustrup, as an extension to the C language.
C is a general-purpose, high-level language that was originally developed by Dennis M. Ritchie to develop the UNIX operating system at Bell Labs. It supports structured programming, lexical variable scope, and recursion, with a static type system. C also provides constructs that map efficiently to typical machine instructions, which makes it one was of the most widely used programming languages today.
Embedded C is a set of language extensions for the C programming language by the C Standards Committee to address issues that exist between C extensions for different embedded systems. The extensions hep enhance microprocessor features such as fixed-point arithmetic, multiple distinct memory banks, and basic I/O operations. This makes Embedded C the most popular embedded software language in the world.
C & C++ Developer Tools from JetBrains
Open source C++ libraries on cppreference.com
C++ Tools and Libraries Articles
Introduction C++ Education course on Google Developers
C and C++ Coding Style Guide by OpenTitan
Learn C : An Interactive C Tutorial
C++ Online Training Courses on LinkedIn Learning
Learn C Programming Online Courses on edX
Learn C++ with Online Courses on edX
Coding for Everyone: C and C++ course on Coursera
C++ For C Programmers on Coursera
Basics of Embedded C Programming for Beginners on Udemy
C++ For Programmers Course on Udacity
C++ Fundamentals Course on Pluralsight
Introduction to C++ on MIT Free Online Course Materials
Introduction to C++ for Programmers | Harvard
Online C Courses | Harvard University
C++ Client Libraries for Google Cloud Services
Visual Studio is an integrated development environment (IDE) from Microsoft; which is a feature-rich application that can be used for many aspects of software development. Visual Studio makes it easy to edit, debug, build, and publish your app. By using Microsoft software development platforms such as Windows API, Windows Forms, Windows Presentation Foundation, and Windows Store.
Visual Studio Code is a code editor redefined and optimized for building and debugging modern web and cloud applications.
Vcpkg is a C++ Library Manager for Windows, Linux, and MacOS.
ReSharper C++ is a Visual Studio Extension for C++ developers developed by JetBrains.
AppCode is constantly monitoring the quality of your code. It warns you of errors and smells and suggests quick-fixes to resolve them automatically. AppCode provides lots of code inspections for Objective-C, Swift, C/C++, and a number of code inspections for other supported languages. All code inspections are run on the fly.
CLion is a cross-platform IDE for C and C++ developers developed by JetBrains.
Code::Blocks is a free C/C++ and Fortran IDE built to meet the most demanding needs of its users. It is designed to be very extensible and fully configurable. Built around a plugin framework, Code::Blocks can be extended with plugins.
CppSharp is a tool and set of libraries which facilitates the usage of native C/C++ code with the .NET ecosystem. It consumes C/C++ header and library files and generates the necessary glue code to surface the native API as a managed API. Such an API can be used to consume an existing native library in your managed code or add managed scripting support to a native codebase.
Conan is an Open Source Package Manager for C++ development and dependency management into the 21st century and on par with the other development ecosystems.
High Performance Computing (HPC) SDK is a comprehensive toolbox for GPU accelerating HPC modeling and simulation applications. It includes the C, C++, and Fortran compilers, libraries, and analysis tools necessary for developing HPC applications on the NVIDIA platform.
Thrust is a C++ parallel programming library which resembles the C++ Standard Library. Thrust's high-level interface greatly enhances programmer productivity while enabling performance portability between GPUs and multicore CPUs. Interoperability with established technologies such as CUDA, TBB, and OpenMP integrates with existing software.
Boost is an educational opportunity focused on cutting-edge C++. Boost has been a participant in the annual Google Summer of Code since 2007, in which students develop their skills by working on Boost Library development.
Automake is a tool for automatically generating Makefile.in files compliant with the GNU Coding Standards. Automake requires the use of GNU Autoconf.
Cmake is an open-source, cross-platform family of tools designed to build, test and package software. CMake is used to control the software compilation process using simple platform and compiler independent configuration files, and generate native makefiles and workspaces that can be used in the compiler environment of your choice.
GDB is a debugger, that allows you to see what is going on `inside' another program while it executes or what another program was doing at the moment it crashed.
GCC is a compiler Collection that includes front ends for C, C++, Objective-C, Fortran, Ada, Go, and D, as well as libraries for these languages.
GSL is a numerical library for C and C++ programmers. It is free software under the GNU General Public License. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting. There are over 1000 functions in total with an extensive test suite.
OpenGL Extension Wrangler Library (GLEW) is a cross-platform open-source C/C++ extension loading library. GLEW provides efficient run-time mechanisms for determining which OpenGL extensions are supported on the target platform.
Libtool is a generic library support script that hides the complexity of using shared libraries behind a consistent, portable interface. To use Libtool, add the new generic library building commands to your Makefile, Makefile.in, or Makefile.am.
Maven is a software project management and comprehension tool. Based on the concept of a project object model (POM), Maven can manage a project's build, reporting and documentation from a central piece of information.
TAU (Tuning And Analysis Utilities) is capable of gathering performance information through instrumentation of functions, methods, basic blocks, and statements as well as event-based sampling. All C++ language features are supported including templates and namespaces.
Clang is a production quality C, Objective-C, C++ and Objective-C++ compiler when targeting X86-32, X86-64, and ARM (other targets may have caveats, but are usually easy to fix). Clang is used in production to build performance-critical software like Google Chrome or Firefox.
OpenCV is a highly optimized library with focus on real-time applications. Cross-Platform C++, Python and Java interfaces support Linux, MacOS, Windows, iOS, and Android.
Libcu++ is the NVIDIA C++ Standard Library for your entire system. It provides a heterogeneous implementation of the C++ Standard Library that can be used in and between CPU and GPU code.
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It's widely used to build languages, tools, and frameworks. From a grammar, ANTLR generates a parser that can build parse trees and also generates a listener interface that makes it easy to respond to the recognition of phrases of interest.
Oat++ is a light and powerful C++ web framework for highly scalable and resource-efficient web application. It's zero-dependency and easy-portable.
JavaCPP is a program that provides efficient access to native C++ inside Java, not unlike the way some C/C++ compilers interact with assembly language.
Cython is a language that makes writing C extensions for Python as easy as Python itself. Cython is based on Pyrex, but supports more cutting edge functionality and optimizations such as calling C functions and declaring C types on variables and class attributes.
Spdlog is a very fast, header-only/compiled, C++ logging library.
Infer is a static analysis tool for Java, C++, Objective-C, and C. Infer is written in OCaml.
Assembly is a low-level programming language. It uses mnemonic codes and labels to represent machine-level code with each instruction corresponding to just one machine operation.
RISC-V Foundation is a non-profit corporation controlled by its 500 members(NVIDIA, Google, Samsung, Raspberry Pi, SiFive, Canonical, and Western Digital) to drive forward the adoption and implementation of the free and open RISC-V instruction set architecture (ISA).
Intel® 64 and IA-32 Architectures Software Developer’s Manual
Introduction to x64 Assembly from Intel
x86 Assembly Language Reference Manual for Open Solaris
AMD64 Architecture Programmer’s Manual Volume 1-5
AMD Developer Guides, Manuals, and ISA Documents
The Assembler language on z/OS from IBM
MIPS Architecture & Technology from Wave Computing
Microsoft Macro Assembler reference
Compiler Intrinsics and Assembly Language from Microsoft
x86 and amd64 instruction Reference
Intro to x86 Assembly Language Programming
Learn Assembly Programming courses on Udemy
Assembly Languages and Assemblers courses on Coursera
Intro to Assembly Language from MIT
Arm Instruction Emulator (ArmIE) is a tool that emulates Scalable Vector Extension (SVE) and SVE2 instructions on AArch64/ARM64 platforms.
FASM (flat assembler) is an assembler for x86 processors that supports Intel-based assembly language on the IA-32 and x86-64 computer architectures.
Microsoft Assembler (MASM) for x64 is Microsoft's assembler that accepts x64 assembler language.
MASM/TASM is a VSCode extension that offers a way to run and debug DOS(80x86) assembly TASM/MASM through DOSBox and msdos-player.
NASM is an asssembler/disassembler for the x86 CPU architecture portable to nearly every modern platform, and with code generation for many platforms old and new.
GAS is the assembler used by the GNU Project for the default back-end of GCC. It is used to assemble the GNU operating system and the Linux kernel.
MIPS is a reduced instruction set computer (RISC) instruction set architecture (ISA) developed by MIPS Technologies, Inc.. In June 2018 MIPS was Acquired by AI Startup Wave Computing.
LLVM is a library that has collection of modular/reusable compiler and toolchain components (assemblers, compilers, debuggers, etc.). With these components LLVM can be used as a compiler framework, providing a front-end(parser and lexer) and a back-end (code that converts LLVM's representation to actual machine code).
TinyGo is a Go compiler(based on LLVM) intended for use in small places such as microcontrollers, WebAssembly (Wasm), and command-line tools.
Tock is an embedded operating system designed for running multiple concurrent, mutually distrustful applications on Cortex-M and RISC-V based embedded platforms. Tock's design centers around protection, both from potentially malicious applications and from device drivers.
PlatformIO is a professional collaborative platform for embedded development with no vendor lock-in. It provides support for multiplatforms and frameworks such as IoT, Arduino, CMSIS, ESP-IDF, FreeRTOS, libOpenCM3, mbed OS, Pulp OS, SPL, STM32Cube, Zephyr RTOS, ARM, AVR, Espressif (ESP8266/ESP32), FPGA, MCS-51 (8051), MSP430, Nordic (nRF51/nRF52), NXP i.MX RT, PIC32, RISC-V.
PlatformIO for VSCode is a plugin that provides support for the PlatformIO IDE on VSCode.
Keystone is a lightweight multi-platform, multi-architecture(Arm, Arm64, Hexagon, Mips, PowerPC, Sparc, SystemZ & X86) assembler framework.
Unicorn is a lightweight, multi-platform, multi-architecture CPU emulator framework(ARM, AArch64, M68K, Mips, Sparc, X86) based on QEMU.
- If would you like to contribute to this guide simply make a Pull Request.
Distributed under the Creative Commons Attribution 4.0 International (CC BY 4.0) Public License.