ZNCC

Zak's New(-ish) C(-ish) Compiler.

(Alternatively, ZNCC is a recursive acronym for ZNCC is Not a C Compiler.)

Introduction

This is the latest version of my C-like compiler, which is distantly based on LICE: https://github.com/dorktype/LICE

This is NOT meant to be a full replacement for standards-compliant C/C++ compilers. It's rather intended for bootstrapping new architectures and for use in minimalist high-level computing systems (e.g. where code needs to be strictly audited for security reasons or needs to be kept simple for educational purposes, but still needs some complex/modern functionality).

The compiler mostly targets modern, 64-bit platforms and focuses on converting simple pre-processed C-like code into simple assembler code for the target architecture. It can be easily adapted for other use-cases or bundled with additional tools.

Major Features

Supports multiple architectures and is easy to retarget
- Mostly tested on x86-64/AMD64 Linux
- Partial support for Windows ABI
- RV64 is about half-supported (basic tests work but there are many broken bits), with minimal support for RV32 (enough for "Hello world")
- Linux targets should also work for FreeBSD/OpenBSD/Solaris/etc. with minimal re-tooling (macOS may require a little more work, but not much)
- Also includes some minimal/experimental support for other/new architectures
Single compiler tool supports multiple targets (no need for per-architecture compiler builds)
Compiler is mostly self-hosting (at least on fully-supported targets)
Supports most (not all) essential C features with some extensions
- Includes some Objective C-like OOP extensions
- Basic floating-point support is included (assuming the target has such support)

NOTE: More detailed language & design features are covered in their own sections.

Building & Usage

Command-Line Build (Linux/Unix/bash/...)

Using your default host C compiler (Unix-like best practices):

cc -ozncc zncc.c

Or specifically using GCC or similar:

gcc -ozncc zncc.c

Visual Studio IDE Build (Windows)

Create a new C++ Command Line project
Copy/paste the zncc.c code into your main .cpp file
Rename your main .cpp file to zncc.c (or something else with a .c ending)
Create new header files, zncp.h and zncg.h, copying/pasting the associated code in
Build/run and enjoy!

NOTE: This trick generally works for getting simple C programs working in Visual Studio, which doesn't seem to be configured well for C programming by default.

Basic Usage

The compiler takes as input C-like program code (without any preprocessor directives) and generates as output assembler code.

For easy testing with files, the --input and --output arguments can be given:

./zncc --input mycode.c --output mycode.s

The assembler code then needs to be assembled/linked (see below).

Extended Usage/Testing/Self-Hosting (Linux/Unix Only)

First build the compiler as above. Then download the ZNLC headers; you can place these anywhere convenient (e.g. inside the compiler directory).

Then, create a preprocessed version using GCC's frontend or another preprocessor (NOTE: This differs a little between platforms):

gcc -E -Ipath/to/ZNLC/include -D_ZCC -D_ZCC_X64 zncc.c > zncc.X64.c

If this succeeds, zncc.X64.c should be the raw, preprocessed C code for the appropriate target. The next step is to run the compiler, producing assembly code:

./zncc --input zncc.X64.c --output zncc.X64.s

Then you can assemble & link; again, GCC's frontend comes in handy on Linux:

gcc -static -ozncc.X64 zncc.X64.s

This will produce the self-hosted version, zncc.X64, which you can test by having it compile itself, as above:

./zncc.X64 --input zncc.X64.c --output zncc.X64.again.s
gcc -static -ozncc.X64.again zncc.X64.again.s

Target Options

The default settings for now reflect the testing environment (future/integrated versions may detect settings a little better).

More-specific options can be relayed through environment variables:

The value of CCB_FAMILY controls the target architecture:
- x86 or X86 for commonplace Intel/AMD processors used in most PCs/laptops (currently only supported in 64-bit mode)
- risc-v/RISC-V/riscv/RISCV for RISC-V (RV32/RV64-based) or compatible targets
- arm or ARM for ARM-based targets (currently mostly unimplemented)
- Potentially other/experimental settings
The value of CCB_WORDSIZE specifies the basic word-size of the target processor:
- Only the value of 64 fully works at the moment (to target 64-bit PCs and RV64)
- The value of 32 can be used to test the RV32 target (which is less complete than the 64-bit modes)
- The value of 16 is also recognised but there are no 16-bit targets at this stage
The value of CCB_CALLCONV controls the default calling conventions:
- standard or STANDARD generally implies the "System V" or similar conventions used by Linux/BSD/Solaris systems
- windows or WINDOWS specifies Microsoft Windows (or ReactOS/WINE) conventions
- Note that there is some support for specifying calling conventions on a function-by-function basis, but this isn't fully fleshed-out
The value of CCB_ASMFMT controls the assembler format:
- gas is often most useful on Linux/similar systems, and conforms to GNU/GCC's default assembler syntax
- fasm generates code for Flat Assembler which works on x86 systems: https://flatassembler.net/ (this may also be useful for porting to NASM & other targets)
- raw uses a simplified syntax, e.g. for testing new targets without good/standard assemblers (this is mostly useless for PC & RISC-V targets for now)
The value of CCB_BINFMT controls the binary format or linker semantics assumed in the assembler code:
- elf is generally the default on modern Linux/BSD/Solaris systems, and is ideal for linking with GCC/clang code on those platforms
- flat can be used for producing small "flat binary" code snippets, particularly with Flat Assembler

NOTE: These names reflect internal naming (CCB being short for "C-like Compiler Backend") and will be updated before the final release.

Language Features & Limitations

The compiler generally accepts C-like code with some (experimental/incomplete) Objective C-style extensions. It is roughly on par with a pre-standard C compiler, but aimed at modern targets.

This means that features like integers, functions, structs, arrays, pointers, etc. generally work as per usual, but there are some C features which are not implemented in ZNCC:

Unpacking "vararg" parameters will not work
- Varargs can be declared/called but not unpacked
- This basically means you need to use a different/standard compiler to build any "printf"-like functions (but you can still access them)
Passing structures as arguments or return values of functions will not work
- In other words, arguments are expected to be either integer/pointer-sized or floating-point values
- This may need to be revised in order to target 32-bit platforms properly (which may need to pass around 64-bit integers)
String constants with the same text are not guaranteed to be == at runtime
- This kind of stuff can generally be done and may sometimes be automatic, but depends on linker features
- It's generally considered bad practice to rely on this anyway, but it may be worth noting
Literals are limited to common forms (e.g. don't expect it to support wide strings or pointers to struct literals)
There will probably never be any support for bitfields
There is currently no support for C++ style classes/namespaces/templates/etc.
- Some minimal support may be added in the future, but likely not the whole lot
- The Objective C-like features will be the main focus for OOP-like extensions in the short term

Bonus Bugs & Temporary Limitations

Floating-point support exists and should generally "work" on supported targets, but is minimal
- This can probably be expanded quite easily in future versions, but will eventually require some platform-specific options
Large numbers of function arguments will partly work, but not reliably
- Large numbers of integer/pointer arguments will "work", but may misbehave when combined with floating-point arguments (and may throw off the stack alignment for floating point inside the function)
- This is just an incomplete feature (i.e. missing float support), and should be easy to fix incrementally
Number conversions and precise signed/unsigned/etc. semantics have not been thoroughly fuzz-tested
- There are likely to be some issues with specific types or combinations, but these are usually easy to fix once identified
Pointer arithmetic exists, but is limited
- For now, this means: Use simple addition/subtraction for pointers, don't rely on increment operators or inner pointer syntax working correctly to obtain offsets
- These issues can probably be solved incrementally (if not, warnings/errors can be added to catch broken cases)
Error reporting is present, but is not ideal for catching bugs/suggesting solutions
There is no/minimal optimisation (this is intentional; the focus is on making the thing work first)
The object-oriented extensions have potential, but deciding on an exact ABI is difficult
- Efficient implementations would easily become platform-specific, while unoptimised implementations may be impractical
- This means that there are some OOP features, but they need to be tweaked and integrated into some kind of platform to be useful

Other Limitations/Design Features

There is no built-in preprocessor or associated tools. I recently began integrating such tooling, but ran into a couple of issues.

Most notably, any generally-useful preprocessor ends up more complex than the compiler, so self-hosting can become troublesome.

Secondly, the compiler itself and LICE (which was used as a foundation) are licensed under the terms of the Unlicense, while other/third-party components usually imply other (or more-vague) licensing conditions.

And thirdly, the compiler part is more-or-less essential for all use cases. The preprocessor, linker, assembler, and so on are more naturally specialised for particular platforms or combinations of technologies. For example, a regular desktop target might require a full preprocessor and other tools, whereas an embedded target may only need to recompile straightforward programs and may use the compiler directly with a special frontend & assembler limited to that target.
