Revised implementation of static allocation/initialization of globals #357
Merged
Previously, the initialization of an object was accomplished by a sequence of `Move` statements following the `Object` statement. This obscured the initialization and led MLton#328 to duplicate logic in `functor PackedRepresentation` and `functor Ssa2ToRssa`; see MLton#328 (comment).
Like `RssaTree.Statement.Object`, `RssaTree.Statement.Sequence` corresponds to a direct allocation of a sequence, including initialization of elements.
Like `SsaTree2.Exp.Object`, `SsaTree2.Exp.Sequence` corresponds to a direct allocation of a sequence, including initialization of elements. At `toSsa2`, the `Vector_vector` primitive is translated to `SsaTree2.Exp.Sequence`, rather than being translated to an `Array_alloc` `Array_update` `Array_toVector` sequence.
The `Array_array` primitive is like the `Vector_vector` primitive, although the latter has source syntax `#[e1,...,en]` and the former does not (yet). The intention is that compilation might find opportunities to optimize explicit array allocation and initialization into the `Array_array` primitive. Like the `Vector_vector` primitive, at `toSsa2`, the `Array_array` primitive is translated to `SsaTree2.Exp.Sequence`.
The `StaticHeap.Kind.t` enumeration defines three types of static heaps: `Immutable`, `Mutable`, and `Root`. The `Immutable` static heap will hold immutable objects (and will be marked `const` so as to be placed in a read-only section). The `Mutable` static heap will hold objects with mutable non-objptr fields; such objects need not be traced by the garbage collector (since they can only refer to other static heap objects). The `Root` static heap will hold objects with mutable objptr fields; such objects need to be traced by the garbage collector (since they can be updated to refer to dynamic heap objects). The `StaticHeap.Ref.t` type corresponds to a reference to an object in a static heap, represented by a byte offset.
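As a rough illustration of the three kinds and of a reference as "kind plus byte offset", here is a minimal C sketch. All names (`demo_kind`, `demo_ref`, `demo_resolve`, and so on) are hypothetical and chosen for this example; they are not taken from the MLton sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of StaticHeap.Kind.t; names are illustrative only. */
enum demo_kind { DEMO_IMMUTABLE, DEMO_MUTABLE, DEMO_ROOT, DEMO_NUM_KINDS };

/* Hypothetical mirror of StaticHeap.Ref.t: a heap kind plus a byte offset. */
struct demo_ref {
    enum demo_kind kind;
    size_t offset;
};

/* Only Root objects can be updated to point into the dynamic heap,
 * so only the Root heap has to be traced by the collector. */
bool demo_needs_gc_trace(enum demo_kind kind) {
    return kind == DEMO_ROOT;
}

/* Only the Immutable heap can be marked const (read-only section). */
bool demo_is_read_only(enum demo_kind kind) {
    return kind == DEMO_IMMUTABLE;
}

/* Resolve a reference to an address, given the base of each static heap. */
const unsigned char *demo_resolve(const unsigned char *const bases[DEMO_NUM_KINDS],
                                  struct demo_ref ref) {
    assert(ref.kind < DEMO_NUM_KINDS);
    return bases[ref.kind] + ref.offset;
}
```

Keeping the offset relative to a per-kind base means a reference stays meaningful no matter where each heap is placed in the final executable.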
…aps` The `Machine.StaticHeap.Object.t` datatype represents a static object (either `Normal` or `Sequence`), including initialization. The `Machine.Program.T#staticHeaps` is a function of type `StaticHeap.Kind.t -> StaticHeap.Object.vector`.
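To give a feel for what a statically initialized `Sequence` object might look like once emitted as C data, here is a small sketch. The layout (counter, length, and header words preceding the elements), the header value `0x21`, and every identifier are assumptions made for illustration; the real layout is defined by the MLton runtime.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ASSUMED layout for illustration: three metadata words, then the elements. */
struct demo_seq3 {
    uint64_t counter;   /* reserved for the collector */
    uint64_t length;    /* number of elements */
    uint64_t header;    /* object header; the value used below is made up */
    uint8_t elems[3];
};

/* Fully initialized at compile time: no allocation or stores run at startup. */
const struct demo_seq3 demo_seq = { 0, 3, 0x21, { 'a', 'b', 'c' } };

/* In this sketch, an object pointer to a sequence addresses its first element. */
const unsigned char *demo_seq_objptr(void) {
    return (const unsigned char *)&demo_seq + offsetof(struct demo_seq3, elems);
}

/* Read the length word back from its assumed position before the elements. */
uint64_t demo_seq_length(const unsigned char *objptr) {
    uint64_t length;
    memcpy(&length, objptr - 2 * sizeof(uint64_t), sizeof length);
    return length;
}
```

Because the object carries its own metadata, code that receives the pointer can treat it like any heap-allocated sequence.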
Extract the `Object` and `Sequence` variants of `RssaTree.Statement.t` to an `RssaTree.Object.t` datatype. Often, both variants are treated similarly, so this reduces some code duplication. More importantly, only `RssaTree.Object.t` variants are suitable for static allocation/initialization and a subsequent commit will introduce an `RssaTree.Program.T#statics: RssaTree.Object.t vector` field.
`RssaTree.Program.T#statics: RssaTree.Object.t vector` will collect objects suitable for static allocation/initialization.
Quells gcc/clang warnings.
A cast is required when a tagged word is used as an objptr.
This is consistent with the type of `GC_sequenceLength`.
The `CollectStatics.WordXVectorConsts` pass replaces `WordXVector` constants with `Var` operands to equivalent static `Sequence` objects. The `CollectStatics.RealConsts` pass accumulates `Real` constants into static `Sequence` objects and replaces them with appropriate `SequenceOffset` operands. This is an alternative to the lifting of `Real` constants to `Global` operands in `Backend`. (`Real` constants are not propagated through the Machine IR program because the native codegens do not support real literals (x86 and amd64 do not have convenient floating-point literal instructions; floating-point values must be loaded from memory) and the C and LLVM codegens incorrectly constant-fold floating-point operations when the rounding mode may change.) The `CollectStatics.Globals` pass lifts `Object` and `Sequence` objects in the `main` (`initGlobals`) function to statics.
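The effect of `CollectStatics.RealConsts` can be pictured in C as follows: the program's real constants live in one static table, and each use becomes a load at a byte offset into it rather than a literal at the use site. This is only a sketch; the table contents and the names `demo_real_consts` and `demo_load_real` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical static sequence holding every Real64 constant of a program. */
const double demo_real_consts[] = { 0.5, 2.0, 1024.0 };
enum { DEMO_NUM_REALS = sizeof demo_real_consts / sizeof demo_real_consts[0] };

/* Load element `index` by byte offset from the start of the table,
 * in the spirit of a SequenceOffset operand (base, index, scale). */
double demo_load_real(size_t index) {
    double value;
    assert(index < DEMO_NUM_REALS);
    memcpy(&value,
           (const unsigned char *)demo_real_consts + index * sizeof(double),
           sizeof value);
    return value;
}
```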
This control determines whether or not RSSA `Static` operands are introduced at `toRssa` by `translateGlobalStatics`.
Using an empty string constant for initialization triggers an "internal compiler error: in output_constructor_regular_field, at varasm.c" in gcc-7 and gcc-8.
A static object that is updated with an `Objptr` would trigger a card marking, but the address of a static object would not map to a valid card slot.
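The problem can be sketched in C. The model below assumes a card map that covers only the dynamic heap, with one slot per 256-byte card; the card size, the `demo_heap` struct, and the function names are assumptions for illustration and simplify whatever the runtime actually does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CARD_SIZE_LOG2 8   /* assumed: 256-byte cards */

/* Hypothetical description of the dynamic heap covered by the card map. */
struct demo_heap {
    uintptr_t start;   /* address of the first byte of the heap */
    size_t size;       /* heap size in bytes */
};

/* Number of card slots needed to cover the heap (rounded up). */
size_t demo_num_cards(const struct demo_heap *h) {
    return (h->size + ((size_t)1 << DEMO_CARD_SIZE_LOG2) - 1) >> DEMO_CARD_SIZE_LOG2;
}

/* True if `addr` maps to a slot inside the card map. The address of a
 * static object lies outside the heap, so it fails this check. */
bool demo_card_slot_valid(const struct demo_heap *h, uintptr_t addr) {
    if (addr < h->start)
        return false;
    return ((addr - h->start) >> DEMO_CARD_SIZE_LOG2) < demo_num_cards(h);
}
```

A write barrier that skips such a check and indexes the card map directly would write outside the map when the base object is static.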
A `Static` operand is not an lvalue.
Rather than using a collection of `bool ref`s implemented as mutable static objects, use a `bool array` implemented as an object in the mutable static heap.
When the mutator marks cards, `staticHeapR` cannot be used for objects with mutable objptr fields, because the address of an object in `staticHeapR` won't map to a valid card slot for the write barrier. Such global objects with mutable objptr fields must be placed in the dynamic heap and referenced indirectly via a `Global` operand. However, it is still possible to collect all such objects into a static heap, which is copied to the initial dynamic heap, rather than initializing them via the `initGlobals` function. See MLton#328 (comment).
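The resulting placement rule can be summarized by a small decision function. This is a sketch under assumptions: the trait flags and all names are hypothetical, `PLACE_DYNAMIC` stands for the static heap that is copied to the initial dynamic heap, and special cases for particular object shapes are not modeled.

```c
#include <stdbool.h>

/* Hypothetical placement targets for a global object. */
enum placement { PLACE_IMMUTABLE, PLACE_MUTABLE, PLACE_ROOT, PLACE_DYNAMIC };

/* Hypothetical summary of an object's mutability. */
struct demo_object_traits {
    bool has_mutable_fields;          /* some field may be written after init */
    bool has_mutable_objptr_fields;   /* some writable field holds an objptr */
};

/* Pick a placement: objects with mutable objptr fields cannot stay in a
 * static heap when the mutator marks cards, so they go to the heap that is
 * copied into the dynamic heap at startup. */
enum placement choose_placement(struct demo_object_traits t, bool mutator_marks_cards) {
    if (t.has_mutable_objptr_fields)
        return mutator_marks_cards ? PLACE_DYNAMIC : PLACE_ROOT;
    if (t.has_mutable_fields)
        return PLACE_MUTABLE;
    return PLACE_IMMUTABLE;
}
```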
This shares code for object representation between the RSSA and Machine IRs.
A "packed" `struct` may still have padding at the end, which is included in C's `sizeof` calculation. However, for `memcpy`-ing and `foreachObjptrInRange`-ing a static heap, we need a byte-accurate size. We include a sentinel `struct {} end;` field in the static heap structs to explicitly compute the size of a static heap as `(pointer)&staticHeap.end - (pointer)&staticHeap`.
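A small, self-contained sketch of the sentinel trick, assuming gcc or clang (an empty `struct` is a GNU C extension). The `aligned(16)` attribute is an assumption added here only to make the trailing padding visible; the struct and its field names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* `packed` removes interior padding, but `aligned(16)` still rounds
 * sizeof up to a multiple of 16, so sizeof over-counts the contents. */
struct __attribute__((packed, aligned(16))) demo_static_heap {
    uint64_t header;      /* 8 bytes */
    uint8_t payload[5];   /* 5 bytes */
    struct {} end;        /* zero-size sentinel marking the true end */
};

struct demo_static_heap demo_heap = {
    .header = 1,
    .payload = { 1, 2, 3, 4, 5 },
};

/* Byte-accurate size: distance from the start of the heap to the sentinel. */
size_t demo_heap_exact_size(void) {
    return (size_t)((const unsigned char *)&demo_heap.end -
                    (const unsigned char *)&demo_heap);
}
```

Here the sentinel sits at offset 13, while `sizeof` reports the padded size; copying or scanning `sizeof` bytes would therefore touch bytes that are not part of any object.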
We check the following allowed references between heap objects:
- I (immutable) -> {I, M, R}
- M (mutable) -> {I, M, R}
- R (root) -> {I, M, R, H}
- H (runtime) -> {I, M, R, H}
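The allowed-reference relation is small enough to encode as a lookup table. The following C sketch does that; the `region` enum, the table, and `reference_allowed` are hypothetical names for this example and do not reproduce the compiler's actual check.

```c
#include <stdbool.h>

/* I = immutable, M = mutable, R = root static heaps; H = runtime heap. */
enum region { REGION_IMMUTABLE, REGION_MUTABLE, REGION_ROOT, REGION_RUNTIME, REGION_COUNT };

/* demo_allowed[from][to] is true when an object in `from` may refer to
 * an object in `to`, following the relation listed above. */
static const bool demo_allowed[REGION_COUNT][REGION_COUNT] = {
    /* to:          I      M      R      H    */
    /* from I */ { true,  true,  true,  false },
    /* from M */ { true,  true,  true,  false },
    /* from R */ { true,  true,  true,  true  },
    /* from H */ { true,  true,  true,  true  },
};

bool reference_allowed(enum region from, enum region to) {
    return demo_allowed[from][to];
}
```

The only forbidden edges are from the immutable and mutable static heaps into the runtime heap, which is what lets the collector skip those two heaps entirely.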
MatthewFluet added a commit to MatthewFluet/mpl that referenced this pull request Sep 24, 2020
MLton/mlton#357 (revising MLton/mlton#328) introduced a number of "static heaps" into the compilation. Essentially, many "global" objects can be fully evaluated at compile time and represented in the compiled program as statics. The "static" objects look like regular ML objects (with proper headers, etc.), but exist outside the MLton heap. This is a "path of least resistance" commit for MaPLe. The `collectStatics.Globals` and `collectStatics.RealConsts` passes are disabled, but the `collectStatics.WordXVectorConsts` pass is enabled. In the `backend` pass (translation of RSSA to Machine), all RSSA statics are forced to the `Dynamic` static heap (and assigned a corresponding global objptr slot); similarly, any remaining `WordXVector` constants are forced to the `Dynamic` static heap. At program startup, the `Dynamic` static heap is copied into the root hierarchical heap. (This is slightly more complicated than the copy of the `Dynamic` static heap into the initial heap in MLton, because in MaPLe the `Dynamic` static heap may need to be split across multiple chunks.) See MPLLang#127 for more discussion.
#328 introduced static allocation/initialization of globals, but some complexities and issues with the implementation were noted during review:
This revised implementation tries to simplify the complexities and address the issues:
- The RSSA IR loses the `Operand.Static {static: Var.t Static.t, ty: Type.t}` variant and gains a `statics: {dst: Var.t * Type.t, obj: Object.t} vector` field in `Program.T`. The `PackedRepresentation` and `Ssa2ToRssa` passes are simplified, because the initial RSSA program is created with an empty `statics` field. The `rssaShrink1` pass takes care of constant-folding and copy-propagating of object initialization. New `collectStatics.{Globals,{WordXVector,Real}Consts}` passes introduce objects into the `statics` field.
- The Machine IR gains a `staticHeaps: StaticHeap.Kind -> StaticHeap.Object.t vector` field in `Program.T`. Each "kind" of static heap is emitted to the main `.c` file as a statically initialized data definition that "looks" like an ML heap. There are four kinds of heaps:
  - `Immutable`: for immutable objects; such objects need never be traversed by the GC. (Note that global `unit ref` objects can be placed in the `Immutable` static heap, since they will never actually be mutated.)
  - `Mutable`: for objects with mutable non-objptr fields; such objects may be mutated, but need never be traversed by the GC. (Note that global empty mutable sequences can be placed in the `Mutable` static heap, since, even if they have mutable objptr fields, the elements will never actually be mutated.)
  - `Root`: when the mutator does not mark cards, for objects with mutable objptr fields; such objects may be mutated and need to be traversed by the GC (because they may be updated to point to objects in the runtime heap). However, if card marking is used by the mutator, then the `Root` static heap cannot be used, because the write barrier with a base object in the `Root` static heap will attempt to write to an invalid card slot index. It would be possible to make the write barrier more expensive, by dynamically checking if the base is in the `Root` static heap.
  - `Dynamic`: when the mutator marks cards, for objects with mutable objptr fields; such objects may be mutated and need to be traversed by the GC. The `Dynamic` static heap is copied to the initial runtime heap at runtime initialization.
- In `Backend`, each RSSA `static` is placed in an appropriate "kind" of static heap. Objects placed in the `Dynamic` static heap are accessed by the rest of the program via `Global` operands (and incur a level of indirection).
- The `Mutable` and `Root` heaps are properly saved and loaded by `MLton.World`.

Other notable aspects of the PR:
- The SSA2 IR gains an `Exp.Sequence of {args: Var.t vector vector}` variant to represent direct allocation of arrays and vectors, including initialization of elements. At `toSsa2`, the `Vector_vector` primitive is translated to `SsaTree2.Exp.Sequence`, rather than being translated to an `Array_alloc` `Array_update` `Array_toVector` sequence. At `Ssa2ToRssa`, a `SsaTree2.Exp.Sequence` is translated to an `Rssa.Object.Sequence` (via updates to `PackedRepresentation`). This allows global `Vector_vector` objects to be collected to statics.
- A new `Array_array` primitive for literal arrays was introduced. The intention is that compilation might find opportunities to optimize explicit array allocation and initialization into the `Array_array` primitive.
- Currently, there is no support for "empty" static objects. In the previous static allocation/initialization implementation, a global `Array_alloc` (necessarily with a constant length operand) would be translated to a special kind of static that would be placed in the BSS segment of the executable and dynamically initialized. A future PR could restore this functionality as follows:
  - Introduce `MutableEmpty`, `RootEmpty`, and `DynamicEmpty` static heap kinds that simply specify a heap size, along with `mutableEmptyInit`, `rootEmptyInit`, and `dynamicEmptyInit` data to properly initialize the headers.
  - Don't lower `Array_alloc` prims in `Ssa2ToRssa`. After `rssaShrink1`, it will be possible to read off the `Array_alloc`s with constant size. All such `Array_alloc`s in the `initGlobals` function can be lifted to RSSA `statics`. Meanwhile, such `Array_alloc`s in other functions can be more cheaply implemented via direct allocation by the mutator, rather than via the `GC_sequenceAllocate` runtime call (which induces a GC safe point).

  However, "empty" static objects are only created with the (non-default) `-globalize-arrays true`, and so weren't exercised by default in the previous implementation.