Description
This issue tracks implementation of stack maps in LLVM. The goal is to have stack maps working for unwinding for 0.3 or 0.4.
The idea is to use precise stack maps instead of landing pads to run destructors of unique pointers and resources during task failure. This will allow unwinding to work on Windows and should also lead to dramatic code size reduction. Unwinding performance should be slightly improved as well. Eventually this can be extended to support accurate tracing garbage collection for shared boxes (either as a backup for reference counting or in lieu of reference counting).
This requires fairly invasive changes to LLVM. All optimization passes will be turned off at first, and will be reenabled one by one. They should not be hard to reenable.
At the moment, I believe the LLVM steps needed here are:
- ☑ Add a llvm.gcregroot intrinsic. Translate it in the fast instruction selector. This needs to be done during call lowering, to make sure that the register roots end up within the call sequence.
- ☑ Add llvm.gcregroot intrinsics automatically after call sites based on the types of SSA values. This must be done very late in IR transformation, probably right around the CodeGenPrepare pass.
- ☑ Lower llvm.gcregroot intrinsics properly in the fast instruction selector. This requires creating a new, fake MachineInstr that tracks GC roots and their address spaces.
- ☑ Add infrastructure to the GC strategy class to support register roots.
- ☑ Translate llvm.gcroot intrinsics in the fast instruction selector as well.
- ☑ Add a generic GC strategy and metadata printer. We should be able to use this for Rust.
- ☑ Track callee-saved registers in the GC metadata object. Update the generic GC strategy to record the locations of these registers.
- ☑ Allow getelementptr instructions that reference an alloca instruction to be rooted with llvm.gcroot. This allows structs containing pointers to be rooted without complex type encoding schemes.
- ☑ Implement an LLVM pass that automatically roots allocas that contain traceable pointers with llvm.gcroot, and uses the metadata field to track the locations of the pointers therein. (I have a patch for this that needs to be resurrected.)
- ☑ Implement a pass in LLVM that computes liveness on the SSA graph. This used to be present in LLVM but was removed due to the lack of use and poor performance. It should be rewritten.
- ☑ Using this new liveness pass, augment the llvm.gcregroot insertion pass to throw out dead register roots.
- ☑ Modify the llvm.gcregroot insertion pass to root only pointer origins, not any derived pointers. Consider pointer origins and derived pointers a single value for the purposes of liveness.
- ☐ Add support to the SelectionDAG-based instruction selector for GC register roots. This requires a new SDNode. We will need to ensure that it is inside the call sequence, so this requires changing the signature of LowerCall and/or LowerCallTo. Thus all targets will need to be updated.
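
To make the liveness and pointer-origin steps above more concrete, here is a minimal sketch in Rust of how dead register roots might be thrown out at call sites. It is not the LLVM pass itself: the `Inst` type, the straight-line block, and the `roots_at_calls` and `origin` functions are all hypothetical, invented for illustration. It assumes a single basic block with no control flow, and it treats a derived pointer and its origin as one value for liveness, as the list describes.

```rust
use std::collections::BTreeSet;

/// An instruction in a toy straight-line SSA block (hypothetical IR, not LLVM's).
#[derive(Clone, Debug)]
pub enum Inst {
    /// Defines `dst` as a fresh traceable pointer (a pointer origin).
    Alloc { dst: u32 },
    /// Defines `dst` as a pointer derived from `base` (e.g. a field address).
    Derive { dst: u32, base: u32 },
    /// Reads the value `src`.
    Use { src: u32 },
    /// A call site: the point at which register roots would be recorded.
    Call,
}

/// Follows `Derive` links backwards until it reaches the pointer origin of `v`.
fn origin(block: &[Inst], v: u32) -> u32 {
    let mut cur = v;
    loop {
        let base = block.iter().find_map(|i| match i {
            Inst::Derive { dst, base } if *dst == cur => Some(*base),
            _ => None,
        });
        match base {
            Some(b) => cur = b,
            None => return cur,
        }
    }
}

/// For each `Call`, in block order, returns the set of pointer origins that
/// are live across that call. Only these would need to be rooted.
pub fn roots_at_calls(block: &[Inst]) -> Vec<BTreeSet<u32>> {
    let mut live: BTreeSet<u32> = BTreeSet::new();
    let mut out = Vec::new();
    // Standard backward liveness scan over a single block.
    for inst in block.iter().rev() {
        match inst {
            // A definition ends the live range (scanning backwards).
            Inst::Alloc { dst } => {
                live.remove(dst);
            }
            // Derived pointers are never tracked on their own: their uses
            // are charged to the origin below, so nothing to do here.
            Inst::Derive { .. } => {}
            // A use of any pointer keeps its origin live.
            Inst::Use { src } => {
                live.insert(origin(block, *src));
            }
            // Everything live after the call is live across it.
            Inst::Call => out.push(live.clone()),
        }
    }
    out.reverse();
    out
}
```

In this sketch, a value whose last use comes before a call is simply absent from that call's root set, and a use of a derived pointer after a call keeps only the origin rooted. A real pass would need to handle control flow and phi nodes, which this toy version ignores.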
We will eventually need to have a comprehensive LLVM-level test suite before all of this can be sent upstream.
On the rustc side, we will need to tag all shared and unique boxes with addrspace(1). (This currently happens for shared boxes, but not for uniques.) Unique boxes will need to become self-describing; this is not necessary in theory, but in practice it helps the LLVM side of things if all that needs to be tracked for each virtual register is a single address space value. Additionally, we will need to use the llvm.gcroot intrinsics to root (a) every enum that contains shared pointers, unique pointers, or resources; and (b) every resource.
In the Rust runtime, we will need to implement the stack crawler. (I have a prototype of this written already.) We will then need to use it to locate all of the roots during unwinding and then run their destructors. This may require fixes to (or replacement of) the shape code.
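
As a rough illustration of what the stack crawler has to do, the following Rust sketch walks a simulated stack and looks up each frame's call site in a stack map to find the live roots. This is not the prototype mentioned above. The `Frame` layout, the `StackMap` type, and `collect_roots` are all assumptions made for the example; a real crawler would read frame pointers and return addresses from the machine stack and would also consult the recorded callee-saved register locations.

```rust
use std::collections::HashMap;

/// One activation record in a simulated stack (hypothetical layout).
pub struct Frame {
    /// Return address identifying the call site this frame is suspended at.
    pub return_addr: usize,
    /// Stack slots of the frame; each slot holds one pointer-sized value.
    pub slots: Vec<usize>,
}

/// Stack map: for each call-site return address, the indices of the slots
/// that hold live roots at that call site.
pub type StackMap = HashMap<usize, Vec<usize>>;

/// Walks the frames in the order given (innermost first) and collects the
/// value stored in every slot that the stack map marks as a live root.
pub fn collect_roots(stack: &[Frame], map: &StackMap) -> Vec<usize> {
    let mut roots = Vec::new();
    for frame in stack {
        // Frames with no stack map entry (e.g. foreign code) contribute no roots.
        if let Some(indices) = map.get(&frame.return_addr) {
            for &i in indices {
                // Ignore out-of-range indices rather than panicking.
                if let Some(&value) = frame.slots.get(i) {
                    roots.push(value);
                }
            }
        }
    }
    roots
}
```

During unwinding, the runtime would then run the destructor for each collected root, innermost frame first. How the destructor is found for a given root (for example, via the shape code or a self-describing box header) is outside the scope of this sketch.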
I intend to keep this issue up to date with the latest changes to our strategy here.