Directory Structure
The root project directory looks something like this:
# build system files
autoconf/
cmake/
# self explanatory...
docs/
examples/
# public headers
include/
    llvm/   # C++ headers
    llvm-c/ # C API
# source code
lib/
    # Target-specific code
    Target/
        # Our backend!
        AVR/
        # Other, less important backends
        ...
    # Implementations of LLVM libraries and classes
    ...
# unit tests
test/
# other, unimportant stuff, such as build files
...
All of the code that concerns AVR is located in the subdirectory lib/Target/AVR.
An ls in this directory will yield something to the effect of:
# Folders
MCTargetDesc/
TargetInfo/
AsmParser/
InstPrinter/
# Files
AVR.h
AVR.td
AVRAsmPrinter.cpp
AVRBranchSelector.cpp
AVRCallingConv.td
AVRExpandPseudoInsts.cpp
AVRFrameLowering.cpp
AVRFrameLowering.h
AVRISelDAGToDAG.cpp
AVRISelLowering.cpp
AVRISelLowering.h
AVRInstrFormats.td
AVRInstrInfo.cpp
AVRInstrInfo.h
AVRInstrInfo.td
AVRMCInstLower.cpp
AVRMCInstLower.h
AVRMachineFunctionInfo.cpp
AVRMachineFunctionInfo.h
AVRRegisterInfo.cpp
AVRRegisterInfo.h
AVRRegisterInfo.td
AVRSelectionDAGInfo.cpp
AVRSelectionDAGInfo.h
AVRSubtarget.cpp
AVRSubtarget.h
AVRTargetMachine.cpp
AVRTargetMachine.h
AVRTargetObjectFile.cpp
AVRTargetObjectFile.h
AVR.h is the general header file for the backend.
AVR.td is the backend's main target description (.td) file. It includes (much like a C "#include") all of the other target description files for AVR.
TODO: link to page about .td files
AVRInstrFormats.td is another target description file. This one describes the general format of AVR instructions.
In AVR, there is a class of instructions which take two register operands. For example, ADD r2, r6 adds the values of r2 and r6, placing the result in the first operand, r2. The instructions for binary AND and OR are also part of this family.
In AVR, instructions with the same operands share a common format. For the aforementioned instructions, we call this format FRdRr: F for format, and RdRr for Register-destination and Register-... actually, I do not know why we call Rr what it is. Suffice to say, Rr refers to the source register. Consult the AVR architecture manual for examples of this notation.
This file describes the core instruction formats and how they are represented in machine code. We instantiate each format when defining specific instructions in AVRInstrInfo.td.
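To make the idea of a shared format concrete, here is a minimal C++ sketch of how the two-register layout could be packed into a 16-bit word. The bit layout "oooo oord dddd rrrr" and the three opcodes are taken from the AVR instruction set manual; the function name encodeFRdRr and the Op constants are hypothetical and are not the backend's own code, which expresses this in TableGen instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: packs a 6-bit opcode and two 5-bit register numbers
// into the 16-bit layout "oooo oord dddd rrrr" shared by ADD, AND and OR.
inline std::uint16_t encodeFRdRr(unsigned opcode, unsigned rd, unsigned rr) {
  assert(opcode < 64 && rd < 32 && rr < 32 && "field out of range");
  return static_cast<std::uint16_t>((opcode << 10)        // opcode -> bits 15..10
                                    | ((rr & 0x10u) << 5) // high bit of Rr -> bit 9
                                    | (rd << 4)           // Rd -> bits 8..4
                                    | (rr & 0x0Fu));      // low bits of Rr -> bits 3..0
}

// Opcode fields for three members of the family (names are illustrative).
constexpr unsigned OpADD = 0x03; // 0000 11
constexpr unsigned OpAND = 0x08; // 0010 00
constexpr unsigned OpOR  = 0x0A; // 0010 10
```

With this sketch, encodeFRdRr(OpADD, 2, 6) produces the word for ADD r2, r6; only the opcode argument changes between the members of the family.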
AVRInstrInfo.td is the target description file describing every single variant of every single AVR instruction. This file defines each instruction as a subclass of a specific format in AVRInstrFormats.td, gives each one a mnemonic, and fills in any instruction-specific fields of the instruction family format. For example, we have an instruction format defined for branching instructions, and the format class (yes, like a C++ class) has a template argument for the sub-opcode: BRNE (branch-on-not-equal) has a different sub-opcode than BRIE (branch-on-interrupts-enabled).
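Since the format classes are compared to C++ classes, the branch example can be sketched as a C++ template. This is only an analogy, not the backend's TableGen definition: the name BranchFormat and its two template parameters are hypothetical. The encoding "1111 0Ckk kkkk ksss" (C selects branch-if-clear, sss is the status flag, k is a 7-bit signed word offset) follows the AVR instruction set manual.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a branch format class. The template arguments play
// the role of the sub-opcode: which status flag to test, and whether to branch
// when it is clear (1) or set (0).
template <unsigned ClearBit, unsigned Flag>
struct BranchFormat {
  static_assert(ClearBit < 2 && Flag < 8, "sub-opcode fields out of range");

  // offsetWords is the signed branch distance in 16-bit words.
  static std::uint16_t encode(int offsetWords) {
    assert(offsetWords >= -64 && offsetWords <= 63 && "offset must fit in 7 bits");
    const unsigned k = static_cast<unsigned>(offsetWords) & 0x7Fu;
    return static_cast<std::uint16_t>(0xF000u | (ClearBit << 10) | (k << 3) | Flag);
  }
};

// Each instruction instantiates the shared format with its own sub-opcode.
using BRNE = BranchFormat<1, 1>; // branch if the Z flag (bit 1) is clear
using BRIE = BranchFormat<0, 7>; // branch if the I flag (bit 7) is set
```

BRNE::encode(-2) and BRIE::encode(-2) differ only in the bits supplied by the template arguments, which is the point of sharing one format.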
AVRAsmPrinter.cpp contains code which takes AVR instructions which are defined programmatically (all AVR instructions are represented by an instance of llvm::MCInst) and converts them into a GCC assembler compatible file.
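As a rough illustration of turning an in-memory instruction into assembly text, the sketch below uses a made-up ToyInst struct in place of llvm::MCInst. The struct, its fields and printToyInst are all hypothetical; the real printer works from the target description files rather than hand-written string building.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for llvm::MCInst:
// a mnemonic plus a list of register operands.
struct ToyInst {
  std::string mnemonic;
  std::vector<unsigned> regs;
};

// Renders the instruction as one line of assembly, e.g. "add r2, r6".
std::string printToyInst(const ToyInst &inst) {
  assert(!inst.mnemonic.empty() && "instruction needs a mnemonic");
  std::string out = inst.mnemonic;
  for (std::size_t i = 0; i < inst.regs.size(); ++i) {
    out += (i == 0) ? " r" : ", r"; // first operand follows a space, the rest a comma
    out += std::to_string(inst.regs[i]);
  }
  return out;
}
```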
Not actually sure, I haven't looked into this file
Another target description file which describes the AVR calling convention.
An LLVM pass is created by defining a subclass of llvm::MachineFunctionPass.
This class is simply an instantiation of the visitor pattern - a MachineFunctionPass simply enumerates all machine instructions (MachineInstr objects) and performs whatever analysis or transformations it wants on the code.
One could imagine an LLVM pass which recognises multiplication by powers of two and transforms it into a binary shift left, a faster but semantically equivalent operation on most processors. Thankfully, LLVM provides this optimisation and uses it when optimisations are enabled.
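The multiply-to-shift idea can be sketched without any LLVM headers. Everything here is hypothetical: ToyInstr stands in for a machine instruction, and runMulToShiftPass mimics the shape of a pass (visit every instruction, rewrite the ones that match, report whether anything changed). It is not how LLVM itself implements the optimisation.

```cpp
#include <cassert>
#include <vector>

enum class Op { Mul, Shl, Add };

// Hypothetical toy instruction meaning "reg = reg <op> imm".
struct ToyInstr {
  Op op;
  unsigned reg;
  unsigned imm;
};

bool isPowerOfTwo(unsigned v) { return v != 0 && (v & (v - 1)) == 0; }

// Returns n such that v == 1 << n; v must be a power of two.
unsigned log2Exact(unsigned v) {
  assert(isPowerOfTwo(v));
  unsigned n = 0;
  while (v > 1) {
    v >>= 1;
    ++n;
  }
  return n;
}

// Visits every instruction in the "function" and rewrites multiplications by
// a power of two into left shifts. Returns true if anything was changed.
bool runMulToShiftPass(std::vector<ToyInstr> &function) {
  bool changed = false;
  for (ToyInstr &instr : function) {
    if (instr.op == Op::Mul && isPowerOfTwo(instr.imm)) {
      instr.op = Op::Shl;
      instr.imm = log2Exact(instr.imm);
      changed = true;
    }
  }
  return changed;
}
```

A multiplication by 8 becomes a shift left by 3, while a multiplication by 6 is left alone because 6 is not a power of two.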
TODO: link to page on pseudo instructions
I haven't looked into these files.
LLVM works by representing all code in a platform-independent language called LLVM IR (Intermediate Representation). This code is very similar to assembly language, but it is specific to LLVM.
The LLVM library performs transformations on the 'instruction graph', such as optimisations, to give slightly faster IR (by simplifying/inlining/etc), which does the same thing.
The purpose of this backend is to lower the LLVM IR graph into a graph which has all IR instructions replaced with actual AVR instructions. For example, there is an LLVM IR instruction which adds two variables together - the AVR LLVM backend would then be able to convert this IR instruction into an ADD Rd, Rr instruction.
For the computer science savvy, the instruction graph forms a Directed Acyclic Graph, a special case of a graph which has no directed cycles. See Wikipedia for more information.
The process of converting high-level, platform independent IR into low-level, platform dependent instructions is referred to as lowering.
The AVRISelDAGToDAG class performs instruction selection to choose AVR instructions to represent IR instructions, lowering the IR DAG into an AVR assembly DAG. Hence the name, AVRISelDAGToDAG (AVR Instruction Selection DAG To DAG).
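At its very simplest, instruction selection is a mapping from target-independent operations to target instructions. The sketch below shows that mapping for a handful of operations. The IROp enum and selectMnemonic are hypothetical; the real selector matches patterns over a DAG using tables generated from the target description files, rather than a hand-written switch.

```cpp
#include <cassert>
#include <string>

// Hypothetical subset of target-independent operations.
enum class IROp { Add, Sub, And, Or, Xor };

// Picks the AVR mnemonic that implements each operation on two registers.
std::string selectMnemonic(IROp op) {
  switch (op) {
  case IROp::Add: return "add";
  case IROp::Sub: return "sub";
  case IROp::And: return "and";
  case IROp::Or:  return "or";
  case IROp::Xor: return "eor"; // AVR calls exclusive-or EOR
  }
  assert(false && "unhandled IR opcode");
  return "";
}
```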
The instruction selection code uses information defined in the target description files to select the best AVR instruction for any given IR instruction. For example, LLVM IR includes an unconditional br label <label> instruction. Our instruction selection code could lower it to a jmp <label> instruction or an rjmp <label> instruction. It should choose the most suitable variant depending on the context.
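One example of "context" is the distance to the jump target. RJMP carries a 12-bit signed word offset, so it reaches only targets within -2048 to 2047 words, while JMP takes an absolute address and costs an extra word. The sketch below picks between them on that basis alone. The function selectJump is hypothetical and deliberately ignores everything else a real backend would have to consider, such as whether the chip has JMP at all.

```cpp
#include <cassert>
#include <string>

// Hypothetical selector: offsetWords is the signed distance to the target,
// measured in 16-bit words. RJMP encodes a 12-bit signed offset, so it is
// usable only when the target is close enough.
std::string selectJump(long offsetWords) {
  const bool fitsRelative = offsetWords >= -2048 && offsetWords <= 2047;
  return fitsRelative ? "rjmp" : "jmp";
}
```

A nearby label gets the shorter rjmp, and anything outside the 12-bit range falls back to jmp.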
I'm not sure specifically what these files are for, I haven't looked into them, but it's probably support code for the instruction selector.
TODO
TODO
A class to represent a function containing AVR-only instructions.
TODO
TODO
TODO
AVRSubtarget defines details about a specific AVR CPU.
Not all AVR chips are created equal - the original chips had no RAM, therefore no stack, and no PUSH or POP instructions, and no SP (Stack Pointer) register. The later chips added support for breakpoints, and the crippled dwarf they call the Tiny family removed the LPM instruction and changed the binary encoding of LD.
The AVRSubtarget class keeps track of what's what regarding the specific microcontroller we want to target. If we want to lower IR into assembly for the newer XMega chips, AVR-LLVM will be able to take advantage of the added registers, more efficient instructions, etc.
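The kind of bookkeeping described here can be pictured as a set of feature flags that the rest of the backend queries before emitting an instruction. This is a minimal sketch: ToySubtarget, its fields and both functions are hypothetical, chosen to mirror the differences named above, and do not reflect the members of the real AVRSubtarget class.

```cpp
#include <cassert>
#include <string>

// Hypothetical feature set for one specific chip.
struct ToySubtarget {
  bool hasSRAM;  // without RAM there is no stack, no PUSH/POP and no SP
  bool hasLPM;   // load-from-program-memory instruction is available
  bool hasBreak; // breakpoint support is available
};

bool canUseStack(const ToySubtarget &st) { return st.hasSRAM; }

// Picks the instruction used to save a register on the stack, or reports
// that the chosen chip cannot do it at all.
std::string spillInstruction(const ToySubtarget &st) {
  return canUseStack(st) ? "push" : "<no stack available>";
}
```

Code elsewhere in the backend would ask questions like canUseStack() instead of hard-coding assumptions about the chip.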
Any LLVM backend has to define at least two classes to function. One of them must subclass llvm::TargetMachine. This class provides virtual methods such as getRegisterInfo(), getSubtarget(), and the like. Around the backend, functions generally take an LLVMTargetMachine as an argument, so that they can access information about the current target and, through getSubtarget(), information about the specific AVR features supported.
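The pattern of passing a target machine around and asking it for the subtarget can be sketched with plain C++ virtual methods. All of the Toy names below are hypothetical and only imitate the shape of the real classes: a generic interface, a target-specific subclass, and a backend function that sees only the interface.

```cpp
#include <cassert>

// Hypothetical per-chip feature record.
struct ToyFeatures {
  bool hasSRAM;
};

// Generic interface, standing in for the target machine base class.
class ToyTargetMachine {
public:
  virtual ~ToyTargetMachine() = default;
  virtual const ToyFeatures &getSubtarget() const = 0;
};

// Target-specific subclass that knows which chip was requested.
class ToyAVRTargetMachine : public ToyTargetMachine {
public:
  explicit ToyAVRTargetMachine(ToyFeatures features) : Features(features) {}
  const ToyFeatures &getSubtarget() const override { return Features; }

private:
  ToyFeatures Features;
};

// Backend code receives the generic interface and asks it about the chip.
bool stackAvailable(const ToyTargetMachine &tm) {
  return tm.getSubtarget().hasSRAM;
}
```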
Subclass of llvm::TargetLoweringObjectFileELF, allowing the machine code descriptions in AVRInstrFormats.td and AVRInstrInfo.td to then be used to output ELF object files.
AsmParser: folder containing code for parsing GCC-compatible AVR assembly.
InstPrinter: folder containing code for printing AVR instructions in GCC-compatible AVR assembly.
MCTargetDesc abbreviates Machine Code Target Description.
This library contains descriptions of such things as AVR-specific ELF relocation types, and support functions which get the binary encodings of instructions based on the target description files.
TargetInfo contains a single function, llvm::LLVMInitializeAVRTargetInfo(), for performing initialisation of the AVR backend.