2 changes: 1 addition & 1 deletion Packages.props
Expand Up @@ -15,7 +15,7 @@
<PackageReference Update="System.Memory" Version="4.5.4" />
<PackageReference Update="CppSharp" Version="0.10.1" />
<PackageReference Update="Antlr4" Version="4.6.6"/>
<PackageReference Update="OpenSoftware.DgmlBuilder" Version="1.13.0" Condition="'$(TargetFramework)'=='net47'" />
<PackageReference Update="OpenSoftware.DgmlBuilder" Version="1.14.0" />
<PackageReference Update="System.Collections.Immutable" Version="1.7.0" />
<!-- See: https://github.com/dotnet/docfx/issues/5536 before updating...-->
<PackageReference Update="docfx.console" Version="2.48.1" />
7 changes: 3 additions & 4 deletions Samples/CodeGenWithDebugInfo/CortexM3Details.cs
Expand Up @@ -8,19 +8,18 @@
using System.Collections.Generic;

using Ubiquity.NET.Llvm;
using Ubiquity.NET.Llvm.Interop;
using Ubiquity.NET.Llvm.Types;
using Ubiquity.NET.Llvm.Values;

using static Ubiquity.NET.Llvm.Interop.Library;

namespace TestDebugInfo
{
internal class CortexM3Details
: ITargetDependentDetails
{
public CortexM3Details( )
public CortexM3Details( ILibLlvm libLLVM )
{
RegisterARM( );
libLLVM.RegisterTarget( CodeGenTarget.ARM );
}

public string ShortName => "M3";
263 changes: 131 additions & 132 deletions Samples/CodeGenWithDebugInfo/Program.cs

Large diffs are not rendered by default.

7 changes: 3 additions & 4 deletions Samples/CodeGenWithDebugInfo/X64Details.cs
Expand Up @@ -8,19 +8,18 @@
using System.Collections.Generic;

using Ubiquity.NET.Llvm;
using Ubiquity.NET.Llvm.Interop;
using Ubiquity.NET.Llvm.Types;
using Ubiquity.NET.Llvm.Values;

using static Ubiquity.NET.Llvm.Interop.Library;

namespace TestDebugInfo
{
internal class X64Details
: ITargetDependentDetails
{
public X64Details( )
public X64Details( ILibLlvm libLLVM )
{
RegisterX86( );
libLLVM.RegisterTarget( CodeGenTarget.X86 );
}

public string ShortName => "x86";
Expand Down
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
---
uid: code-generation-with-debug-info
---

# CodeGenWithDebugInfo
Sample application to generate target machine code. The sample is
provided in the [source tree](https://github.com/UbiquityDotNET/Llvm.NET/tree/master/Samples/CodeGenWithDebugInfo).
Expand All @@ -15,7 +19,7 @@ The CodeGenWithDebugInfo sample will generate LLVM IR and machine code for the f
>are expected to be minor. Updating the sample to replicate the latest Clang version is left as an exercise for
>the reader :grin:

[!code-c[Main](../../../Samples/CodeGenWithDebugInfo/Support Files/test.c)]
[!code-c[Main](Support Files/test.c)]

This sample supports targeting two different processor types: x64 and ARM Cortex-M3.

Expand Down Expand Up @@ -46,14 +50,14 @@ In this sample that is handled in the constructor of the target dependent detail
would allow command line options for the CPU target variants and feature sets. For this sample those are just
hard coded into the target details class to keep things simple and focused on the rest of the code generation.

[!code-csharp[Main](../../../Samples/CodeGenWithDebugInfo/ITargetDependentDetails.cs#ITargetDependentDetails)]
[!code-csharp[Main](ITargetDependentDetails.cs#ITargetDependentDetails)]

This interface isolates the rest of the code from knowing which architecture is used, and theoretically
could include support for additional targets beyond the two in the sample source.

The sample determines which target to use based on the second command line argument to the application:

[!code-csharp[Main](../../../Samples/CodeGenWithDebugInfo/Program.cs#TargetDetailsSelection)]
[!code-csharp[Main](Program.cs#TargetDetailsSelection)]
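
The selection pattern described above, combined with the constructor-injected registration introduced in this change, can be sketched roughly as follows. This is a minimal, self-contained illustration and not the sample's code: `ITargetRegistrar`, `RecordingRegistrar`, `ITargetDetailsSketch`, the two `*Sketch` classes, `TargetSelector`, and the accepted argument spellings are all hypothetical stand-ins. Only the `ShortName` values ("M3" and "x86") are taken from the sample source.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the library object passed to the constructors
// in the sample; the real ILibLlvm interface is not reproduced here.
internal interface ITargetRegistrar
{
    void RegisterTarget( string targetName );
}

// Test double that records which targets were registered.
internal sealed class RecordingRegistrar
    : ITargetRegistrar
{
    public List<string> Registered { get; } = new List<string>( );

    public void RegisterTarget( string targetName ) => Registered.Add( targetName );
}

// Reduced stand-in for ITargetDependentDetails; only ShortName is modeled.
internal interface ITargetDetailsSketch
{
    string ShortName { get; }
}

internal sealed class CortexM3Sketch
    : ITargetDetailsSketch
{
    public CortexM3Sketch( ITargetRegistrar registrar )
    {
        // Registration happens in the constructor, as in the sample.
        registrar.RegisterTarget( "ARM" );
    }

    public string ShortName => "M3";
}

internal sealed class X64Sketch
    : ITargetDetailsSketch
{
    public X64Sketch( ITargetRegistrar registrar )
    {
        registrar.RegisterTarget( "X86" );
    }

    public string ShortName => "x86";
}

internal static class TargetSelector
{
    // Maps a command line token to a target details instance.
    // The accepted spellings are assumptions made for this sketch.
    public static ITargetDetailsSketch Select( string name, ITargetRegistrar registrar )
    {
        switch( name.ToUpperInvariant( ) )
        {
        case "M3":
            return new CortexM3Sketch( registrar );
        case "X64":
            return new X64Sketch( registrar );
        default:
            throw new ArgumentException( $"Unknown target '{name}'", nameof( name ) );
        }
    }
}
```

Passing the registrar in, rather than calling a static registration function, lets the rest of the program (or a test) decide which library instance performs the registration.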

## Creating the BitcodeModule
To generate code in Ubiquity.NET.Llvm a [BitcodeModule](xref:Ubiquity.NET.Llvm.BitcodeModule) is required as
Expand All @@ -80,7 +84,7 @@ the target and a target specific [DataLayout](xref:Ubiquity.NET.Llvm.DataLayout)
extracts these from the [TargetMachine](xref:Ubiquity.NET.Llvm.TargetMachine) provided by the target
details interface for the selected target.

[!code-csharp[Main](../../../Samples/CodeGenWithDebugInfo/Program.cs#CreatingModule)]
[!code-csharp[Main](Program.cs#CreatingModule)]

## Creating the DICompileUnit
LLVM Debug information is all scoped to a top level [DICompileUnit](xref:Ubiquity.NET.Llvm.DebugInfo.DICompileUnit).
Expand All @@ -104,7 +108,7 @@ to expose types in a consistent fashion. Ubiquity.NET.Llvm provides a set of cla
This sample uses the [DebugBasicType](xref:Ubiquity.NET.Llvm.DebugInfo.DebugBasicType) to define the basic types
used in the generated code with appropriate debug information.

[!code-csharp[Main](../../../Samples/CodeGenWithDebugInfo/Program.cs#CreatingBasicTypesWithDebugInfo)]
[!code-csharp[Main](Program.cs#CreatingBasicTypesWithDebugInfo)]

This constructs several basic types and assigns them to variables:

Expand All @@ -120,7 +124,7 @@ Creating qualified (const, volatile, etc...) and pointers is just as easy as cre
The sample needs a pointer to a const instance of the struct foo. A qualified type for constant foo is
created first, then a pointer type is created for the const type.

[!code-csharp[Main](../../../Samples/CodeGenWithDebugInfo/Program.cs#CreatingQualifiedTypes)]
[!code-csharp[Main](Program.cs#CreatingQualifiedTypes)]

## Creating structure types
As previously mentioned, the LLVM types only contain basic layout information and not full source
Expand All @@ -133,7 +137,7 @@ metadata. A collection of these is then used to create the final composite type
data in a simple single call. The sample only needs to create one such type for the `struct foo`
in the example source code.

[!code-csharp[Main](../../../Samples/CodeGenWithDebugInfo/Program.cs#CreatingStructureTypes)]
[!code-csharp[Main](Program.cs#CreatingStructureTypes)]

## Creating module metadata and global variables
The sample code contains two global instances of `struct foo`: `bar` and `baz`. Furthermore, bar
Expand All @@ -142,9 +146,9 @@ forms the initialized value of `bar.c`, the source only provides const values fo
entries of a 32 element array. The const data is created via [ConstArray](xref:Ubiquity.NET.Llvm.Values.ConstantArray).
The full initialized const data for bar is then created from [Context.CreateNamedConstantStruct](xref:Ubiquity.NET.Llvm.Context.CreateNamedConstantStruct*)

[!code-csharp[Main](../../../Samples/CodeGenWithDebugInfo/Program.cs#CreatingGlobalsAndMetadata)]
[!code-csharp[Main](Program.cs#CreatingGlobalsAndMetadata)]

[!code-csharp[Main](../../../Samples/CodeGenWithDebugInfo/Program.cs#AddModuleFlags)]
[!code-csharp[Main](Program.cs#AddModuleFlags)]

Once the constant data is available an LLVM global is created for it with a name that matches the source name
via [AddGlobal](xref:Ubiquity.NET.Llvm.BitcodeModule.AddGlobal*). To ensure the linker lays out the structure
Expand All @@ -158,15 +162,15 @@ For the `baz` instance the process is almost identical. The major difference is
structure is initialized to all zeros. That is the initialized data for the structure is created with
[NullValueFor](xref:Ubiquity.NET.Llvm.Values.Constant.NullValueFor*), which creates an all zero value of a type.

[!code-csharp[Main](../../../Samples/CodeGenWithDebugInfo/Program.cs#CreatingGlobalsAndMetadata)]
[!code-csharp[Main](Program.cs#CreatingGlobalsAndMetadata)]

LLVM modules may contain additional module flags as metadata that describe how the module is generated
or how the code generation/linker should treat the code. In this sample the dwarf version and debug metadata
versions are set along with a VersionIdentString that identifies the application that generated the module.
Additionally, any target specific metadata is added to the module. The ordering of these is generally not
relevant, however it is very specific in the sample to help ensure the generated IR is as close to the
Clang version as possible making it possible to run llvm-dis to generate the textual IR files and compare them.
[!code-csharp[Main](../../../Samples/CodeGenWithDebugInfo/Program.cs#AddModuleFlags)]
[!code-csharp[Main](Program.cs#AddModuleFlags)]

## Declaring the functions
The function declarations for the two functions are mostly the same, following a common pattern:
Expand All @@ -189,7 +193,7 @@ registers). For the two processors this sample supports Clang only uses this for
calls the TargetDetails.AddABIAttributesForByValueStructure) to add the appropriate attributes for the target
as needed.

[!code-csharp[Main](../../../Samples/CodeGenWithDebugInfo/Program.cs#FunctionDeclarations)]
[!code-csharp[Main](Program.cs#FunctionDeclarations)]

## Generating function bodies
This is where things really get interesting as this is where the actual code is generated for the functions. Up
24 changes: 24 additions & 0 deletions Samples/Kaleidoscope/Chapter2/CodeGenerator.cs
@@ -0,0 +1,24 @@
// -----------------------------------------------------------------------
// <copyright file="CodeGenerator.cs" company="Ubiquity.NET Contributors">
// Copyright (c) Ubiquity.NET Contributors. All rights reserved.
// </copyright>
// -----------------------------------------------------------------------

using Kaleidoscope.Grammar.AST;
using Kaleidoscope.Runtime;

namespace Kaleidoscope.Chapter2
{
internal sealed class CodeGenerator
: IKaleidoscopeCodeGenerator<IAstNode>
{
public void Dispose( )
{
}

public OptionalValue<IAstNode> Generate( IAstNode ast )
{
return OptionalValue.Create( ast );
}
}
}
Original file line number Diff line number Diff line change
@@ -1,10 +1,14 @@
---
uid: Kaleidoscope-ch2
---

# 2. Kaleidoscope: Implementing the parser
The chapter 2 sample doesn't actually generate any code. Instead it focuses on the general
structure of the samples and parsing of the language. The sample for this chapter enables all
language features to allow exploring the language and how it is parsed to help better understand
the rest of the chapters. It is hoped that users of this library find this helpful.

The LUbiquity.NET.Llvm version of Kaleidoscope leverages ANTLR4 to parse the language into a parse tree.
The Ubiquity.NET.Llvm version of Kaleidoscope leverages ANTLR4 to parse the language into a parse tree.
This has several advantages including logical isolation of the parsing and code generation.
Additionally, it provides a single formal definition of the grammar for the language. Understanding
the language grammar from reading the LLVM tutorials and source was a difficult task since it isn't
Expand Down Expand Up @@ -262,7 +266,7 @@ This is a simple rule for sub-expressions within parenthesis for example: `(1+2)
the addition so that it occurs before the division since, normally the precedence of division is higher.
The parse tree for that expression looks like this:

![Parse Tree](parsetree-paren-expr.svg)
![Parse Tree](./parsetree-paren-expr.svg)

### FunctionCallExpression
```antlr
Expand All @@ -271,7 +275,7 @@ Identifier LPAREN (expression[0] (COMMA expression[0])*)? RPAREN
This rule covers a function call which can have 0 or more comma delimited arguments. The parse tree
for the call `foo(1, 2, 3);` is:

![Parse Tree](parsetree-func-call.svg)
![Parse Tree](./parsetree-func-call.svg)

### VarInExpression
```antlr
Expand Down Expand Up @@ -366,20 +370,21 @@ classes so they are extensible from the parser assembly without needing to deriv
methods etc. Thus, the Kaleidoscope.Grammar assembly contains partial class extensions that provide simpler
property accessors and support methods to aid in generating the AST.

See [Kaleidoscope Parse Tree Examples](Kaleidoscope-Parsetree-examples.md) for more information and example
See [Kaleidoscope Parse Tree Examples](xref:Kaleidoscope-Parsetree-examples) for more information and example
diagrams of the parse tree for various language constructs.

## Abstract Syntax Tree (AST)
To further simplify code generators the Kaleidoscope.Runtime library contains the AstBuilder type that is
an ANTLR parse tree visitor. AstBuilder will convert a raw ANTLR IParseTree into an `IEnumerable<IFunctionNode>`.
That is, it visits the declarations and definitions in the parse tree to produce an ordered sequence of declarations
and definitions as they appeared in the source. For interactive modes - the sequence will have only a single element.
However, when parsing a whole source file, the parse tree may contain multiple declarations and definitions.
an ANTLR parse tree visitor. AstBuilder will convert a raw ANTLR IParseTree into a tree of `IAstNode` elements.
That is, it visits the declarations and definitions in the parse tree to produce a full tree of declarations
and definitions as they appeared in the source. For interactive modes - the tree will have only one top level node.
However, when parsing a whole source file, the parse tree may contain multiple declarations and definitions under
a RootNode.

The [Kaleidoscope AST](Kaleidoscope-AST.md) is a means of simplifying the original parse tree into
constructs that are easy for the code generation to use directly. In the case of Kaleidoscope there are
a few types of nodes that are used to generate LLVM IR. The AstBuilder class is responsible for
generating an AST from an ANTLR4 parse tree.
The [Kaleidoscope AST](xref:Kaleidoscope-AST) is a means of simplifying the original parse tree into
constructs that are easy for the code generation to use directly and to validate the syntax of the input source.
In the case of Kaleidoscope there are a few types of nodes that are used to generate LLVM IR. The AstBuilder class
is responsible for generating an AST from an ANTLR4 parse tree.

The major simplifying transformations performed in building the AST are:
* Convert top-level functions to a pair of FunctionDeclaration and FunctionDefinition
Expand All @@ -391,52 +396,60 @@ The major simplifying transformations performed in building the AST are:
>operators no longer exists in the AST! The AST only deals in function declarations, definitions and the built-in
>operators. All issues of precedence are implicitly resolved in the ordering of the nodes in the AST.
>Thus, the code generation doesn't need to consider the issue of user defined operators or operator
>precedence at all. ([Chapter 6](Kaleidoscope-ch6.md) covers the details of user defined operators)
>
>precedence at all. ([Chapter 6](xref:Kaleidoscope-ch6) covers the details of user defined operators and how
>the Kaleidoscope sample language uses ANTLR to implement them.)
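
To illustrate how precedence ends up encoded in the shape of the tree rather than in the code generator, here is a small self-contained sketch. The node types (`SketchNode`, `SketchConstant`, `SketchBinaryOp`) are hypothetical and are not the Kaleidoscope AST classes; they only show that `(1+2)/3` and `1+2/3` differ purely in how the nodes are nested.

```csharp
using System;
using System.Globalization;

// Hypothetical minimal AST; not the Kaleidoscope.Grammar.AST types.
internal abstract class SketchNode
{
    public abstract double Evaluate( );
}

internal sealed class SketchConstant
    : SketchNode
{
    public SketchConstant( double value ) => Value = value;

    public double Value { get; }

    public override double Evaluate( ) => Value;

    public override string ToString( ) => Value.ToString( CultureInfo.InvariantCulture );
}

internal sealed class SketchBinaryOp
    : SketchNode
{
    public SketchBinaryOp( char op, SketchNode left, SketchNode right )
    {
        Op = op;
        Left = left;
        Right = right;
    }

    public char Op { get; }

    public SketchNode Left { get; }

    public SketchNode Right { get; }

    // A consumer of the tree just walks it; no precedence table is needed here.
    public override double Evaluate( )
    {
        double lhs = Left.Evaluate( );
        double rhs = Right.Evaluate( );
        switch( Op )
        {
        case '+': return lhs + rhs;
        case '-': return lhs - rhs;
        case '*': return lhs * rhs;
        case '/': return lhs / rhs;
        default: throw new InvalidOperationException( $"Unsupported operator '{Op}'" );
        }
    }

    public override string ToString( ) => $"({Left} {Op} {Right})";
}

internal static class AstShapeDemo
{
    // (1+2)/3 : the addition is a child of the division.
    public static SketchNode Grouped( )
        => new SketchBinaryOp( '/'
                             , new SketchBinaryOp( '+', new SketchConstant( 1 ), new SketchConstant( 2 ) )
                             , new SketchConstant( 3 )
                             );

    // 1+2/3 : the division is a child of the addition.
    public static SketchNode Ungrouped( )
        => new SketchBinaryOp( '+'
                             , new SketchConstant( 1 )
                             , new SketchBinaryOp( '/', new SketchConstant( 2 ), new SketchConstant( 3 ) )
                             );
}
```

Evaluating `AstShapeDemo.Grouped( )` yields 1, while `AstShapeDemo.Ungrouped( )` yields 1 + 2/3; the evaluator is identical in both cases, and only the tree differs.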

## Basic Application Architecture

Generally speaking there are four main components to all of the sample chapter applications.
Generally speaking, there are four main components to most of the sample chapter applications.

1. The main driver application (e.g. program.cs)
2. The parser (e.g. Kaleidoscope.Grammar assembly)
3. Runtime support (e.g. Kaliedoscope.Runtime)
2. The Read-Evaluate-Print-Loop (e.g. ReplEngine.cs)
3. Runtime support (e.g. Kaleidoscope.Runtime and Kaleidoscope.Parser libraries)
4. The code generator (e.g. CodeGenerator.cs)

### Driver
While each chapter is a bit different from the others, many of the chapters are virtually identical for
the driver. In particular Chapters 3-7 only really differ in the language level support.
the driver. In particular Chapters 3-7 only really differ in the name of the app and window title etc...

[!code-csharp[Program.cs](Program.cs)]

### Read, Evaluate, Print loop
The Kaleidoscope.Runtime library contains an abstract base class for building a standard REPL engine from an
input TextReader. The base class handles converting the input reader into a sequence of statements, and
parsing them into AST nodes. The nodes are provided to an application provided generator that produces the
output result. The REPL engine base uses the abstract ShowResults method to actually show the results.

[!code-csharp[Program.cs](../../../Samples/Kaleidoscope/Chapter2/Program.cs#generatorloop)]
[!code-csharp[Program.cs](ReplEngine.cs)]
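
The division of labor described above (a base class that drives the loop, an application-provided generator, and an abstract `ShowResults`) might look roughly like the following sketch. It is not the Kaleidoscope.Runtime implementation: `ReplSketch<TResult>` and `EchoRepl` are hypothetical names, and "parsing" is reduced to trimming each input line so that the example stays self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical base class; the real REPL engine in Kaleidoscope.Runtime differs.
internal abstract class ReplSketch<TResult>
{
    private readonly Func<string, TResult> Generator;

    protected ReplSketch( Func<string, TResult> generator )
    {
        Generator = generator ?? throw new ArgumentNullException( nameof( generator ) );
    }

    // Reads the input line by line, hands each non-empty statement to the
    // generator, and lets the derived class decide how to present the result.
    public void Run( TextReader input )
    {
        string line;
        while( ( line = input.ReadLine( ) ) != null )
        {
            string statement = line.Trim( );
            if( statement.Length == 0 )
            {
                continue;
            }

            ShowResults( Generator( statement ) );
        }
    }

    protected abstract void ShowResults( TResult result );
}

// Collects results in a list instead of writing to the console so that the
// behavior is easy to observe.
internal sealed class EchoRepl
    : ReplSketch<string>
{
    public EchoRepl( )
        : base( statement => statement ) // pass-through generator, similar in spirit to Chapter 2
    {
    }

    public List<string> Shown { get; } = new List<string>( );

    protected override void ShowResults( string result ) => Shown.Add( result );
}
```

For example, running `new EchoRepl( ).Run( new StringReader( "def foo(a) a;\n\nfoo(1);" ) )` would record the two non-empty statements in `Shown`.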

### Runtime Support
The Parser contains the support for parsing the Kaleidoscope language from the REPL loop interactive
input. The parser stack also maintains the global state of the runtime, which controls the language features
enabled, and if user defined operators are enabled, contains the operators defined along with their
precedence.

After the parser is created an async enumerable sequence of statements is created for the parser to process.
After the parser is created an enumerable sequence of statements is created for the parser to process.
This results in a sequence of AST nodes. After construction, the sequence is used to iterate over all of
the nodes generated from the user input.

This use of an Async enumerator sequences is a bit of a different approach to things for running an interpreter Read,
This use of enumerable sequences is a bit of a different approach to running an interpreter Read,
Evaluate Print Loop, but once you get your head around it, the sequence provides a nice clean and flexible
mechanism for building a pipeline of transformations from the text input into the result output.
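
As a rough illustration of such a pipeline, the sketch below chains two lazy `IEnumerable<T>` stages: one that splits text into statements and one that maps statements to placeholder "nodes". Everything here is hypothetical (`PipelineSketch`, `ReadStatements`, `ToNodes`), and the assumption that a statement ends at a `;` is made only for the example; the actual parser stack is considerably richer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

internal static class PipelineSketch
{
    // Stage 1: lazily split the input text into ';'-terminated statements.
    // Text after the last ';' is treated as incomplete and is not yielded.
    public static IEnumerable<string> ReadStatements( TextReader input )
    {
        var current = new StringBuilder( );
        int ch;
        while( ( ch = input.Read( ) ) >= 0 )
        {
            if( ( char )ch == ';' )
            {
                string statement = current.ToString( ).Trim( );
                current.Clear( );
                if( statement.Length > 0 )
                {
                    yield return statement;
                }
            }
            else
            {
                current.Append( ( char )ch );
            }
        }
    }

    // Stage 2: stand-in for parsing; wraps each statement in a placeholder node string.
    public static IEnumerable<string> ToNodes( IEnumerable<string> statements )
        => statements.Select( s => $"node({s})" );
}
```

Because both stages are deferred, nothing is read from the `TextReader` until a consumer iterates the final sequence, which is what makes the same pipeline usable for interactive input and for whole files.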

### Processing generated results
The calling application will generally subscribe to the observable sequence with a `ShowResults` function to show the
results of the generation in some fashion. For the basic samples (Chapter 3-7) it indicates the value of any JITed
and executed top level expressions, or the name of any functions defined. Chapter 2 has additional support for
showing an XML representation of the tree but the same basic pattern applies. This, helps to keep the samples
### CodeGenerator
The code generator will transform the AST node into the final output for the program. For the basic samples
(Chapter 3-7) it indicates the value of any JITed and executed top level expressions, or the name of any functions
defined. Chapter 2 uses a generator that simply produces the node it was given as the app doesn't actually use LLVM
(it focuses on parsing the language only and the REPL infrastructure). This helps to keep the samples
consistent and as similar as possible to allow direct file comparisons to show the changes for a particular feature.
The separation of concerns also aids in making the grammar, runtime and code generation unit-testable without the
driver. (Although that isn't implemented yet - it is intended for the future to help broaden testing of Ubiquity.NET.Llvm to
more scenarios and catch breaking issues quicker.)
driver.

[!code-csharp[ShowResults](../../../Samples/Kaleidoscope/Chapter2/Program.cs#ShowResults)]
[!code-csharp[ShowResults](CodeGenerator.cs)]

### Special case for Chapter 2
Chapter 2 sample code, while still following the general patterns used in all of the chapters, is a bit
unique, it doesn't actually use LUbiquity.NET.Llvm at all! Instead, it is only focused on the language and parsing.
unique: it doesn't actually use Ubiquity.NET.Llvm at all! Instead, it is only focused on the language and parsing.
This helps in understanding the basic patterns of the code. Furthermore, this chapter serves as an aid in
understanding the language itself. Of particular use is the ability to generate DGML and [blockdiag](http://blockdiag.com)
representations of the parse tree for a given parse.