Documentation Update for Enhanced Orthogonal Persistence (#4670)

dfinity · Aug 26, 2024 · commit a8ffa98 · 1 parent 6b35781

Showing 34 changed files with 684 additions and 386 deletions.
diff --git a/design/Custom-Sections.md b/design/Custom-Sections.md
@@ -34,3 +34,7 @@ let hash : string -> int32 = fun s ->
       (Lib.String.explode s)
   )
 ```
+
+Motoko generates an additional `"enhanced-orthogonal-persistence"` private custom section to
+mark Motoko Wasm binaries that rely on IC's support to retain the main Wasm memory on an upgrade, 
+cf. [Orthogonal Persistence](OrthogonalPersistence.md).
diff --git a/design/DFX-Interface.md b/design/DFX-Interface.md
@@ -118,6 +118,10 @@ used only in very specific cases.
 The above metadata is stored in the Wasm module, and is only accessible by the controllers of the canister, unless the
 metadata name is specified in the `--public-metadata` flag.
 
+Moreover, the compiler generates a special marker custom section `"enhanced-orthogonal-persistence"` if the new orthogonal 
+persistence support is enabled, see [Orthogonal Persistence](OrthogonalPersistence.md). This section is always private and
+always emitted, regardless of the compiler flag `--public-metadata`.
+
 Checking stable type compatibility
 ----------------------------------
 
@@ -130,6 +134,10 @@ a type safe way without unintentional data loss.
 
 If the check succeeds, nothing will be printed. 
 If the check fails, the error message will be printed in stderr and the command returns with exit code 1.
+The check can also emit warning messages, e.g. if stable variables are dropped.
+
+With [enhanced orthogonal persistence](OrthogonalPersistence.md), the stable compatibility check is also integrated into the runtime
+system, to atomically guarantee memory compatibility during an upgrade.
 
 Invoking the IDE
 ----------------

diff --git a/design/GraphCopyStabilization.md b/design/GraphCopyStabilization.md
@@ -1,15 +1,12 @@
 # Graph-Copy-Based Stabilization
 
-This is part of the enhanced orthogonal persistence support, see `OrthogonalPersistence.md`.
-It allows future potentially radical changes of the persistent memory layout, such as introducing a new GC, rearranging persistent metadata, or specializing arrays for small element types etc. 
+This is part of the enhanced orthogonal persistence support, see [Orthogonal Persistence](OrthogonalPersistence.md).
 
 ## Purpose
-
-This allows potentially radical changes of the persistent main memory layout, e.g. introducing a new GC or rearranging persistent metadata. 
+This allows potentially radical future changes of the persistent memory layout, such as introducing a new GC, rearranging persistent metadata, or specializing arrays for small element types.
 This also relies on precise value tagging to allow more advanced changes that require value metadata, e.g. specializing arrays for small value element types or even downgrading to 32-bit heap layouts (provided that the amount of live data fits into a 32-bit memory).
 
 ## Design
-
 Graph copy of sub-graph of stable objects from main memory to stable memory and vice versa on upgrades.
 
 ## Properties
@@ -22,9 +19,7 @@ Graph copy of sub-graph of stable objects from main memory to stable memory and
 ## Memory Compatibility Check
 Apply a memory compatibility check analogous to the enhanced orthogonal persistence, since the upgrade compatibility of the graph copy is not identical to the Candid subtype relation.
 
-
 ## Incremental Upgrade
-
 Supporting arbitrarily large upgrades beyond the instruction limit:
 * Splitting the stabilization/destabilization in multiple asynchronous messages.
 * Limiting the stabilization work units to fit the update or upgrade messages.
@@ -35,7 +30,6 @@ Supporting arbitrarily large upgrades beyond the instruction limit:
 **Note**: Graph copying needs to be explicitly initiated as the usual upgrade engages enhanced orthogonal persistence, simply retaining main memory with compatibility check.
 
 ### Usage
-
 When upgrading to a Motoko version that is not compatible with the current enhanced orthogonal persistence:
 
 1. Initiate the explicit stabilization before the upgrade:
@@ -76,15 +70,23 @@ dfx canister call CANISTER_ID __motoko_destabilize_after_upgrade "()"
 * The GC is restarted.
 
 ### Remarks
-
-* Steps 2 (explicit destabilization) may not be needed if the corresponding operation fits into the upgrade message.
+* When receiving the `dfx` error "The request timed out." during explicit stabilization, upgrade, or destabilization, one can simply repeat the call until it completes.
+* Step 3 (explicit destabilization) may not be needed if the corresponding operation fits into the upgrade message.
 * Stabilization and destabilization steps are limited to the increment limits:
 
     Operation | Message Type | IC Instruction Limit | **Increment Limit**
     ----------|--------------|----------------------|--------------------
     **Explicit (de)stabilization step** | Update | 20e9 | **16e9**
     **Actual upgrade** | Upgrade | 200e9 | **160e9**
 
+* The graph copy steps also limit the amount of processed stable data (read or write), in order not to exceed the 
+IC's stable memory access limits.
+
+    Operation | Message Type | IC Stable Access Limit | **Increment Limit**
+    ----------|--------------|----------------------|--------------------
+    **Explicit (de)stabilization step** | Update | 2 GB | **1 GB**
+    **Actual upgrade** | Upgrade | 8 GB | **6 GB**
+
 ## Graph-Copy Algorithm
 Applying Cheney’s algorithm [1, 2] for both serialization and deserialization:
 
@@ -117,6 +119,7 @@ The format is also versioned to allow future refinements of the graph copy algor
 * Incremental GC: Serialization needs to consider Brooks forwarding pointers (not to be confused with the Cheney's forwarding information), while deserialization can deal with partitioned heap that can have internal fragmentation (free space at partition ends).
 * The partitioned heap prevents linear scanning of the heap, especially in the presence of large objects that can be placed at a higher partition than subsequently allocated normal-sized objects. For this reason, a scan stack is allocated in the main memory, remembering the deserialized objects that still need to be scanned. With this, the deserialization does not need to make any assumptions of the heap structure (e.g. monotonically increasing allocations, free space markers, empty heap on deserialization start etc.).
 * If actor fields are promoted to the `Any` type in a new program version, their content is released in that variable to allow memory reclamation.
+* Both stabilization and destabilization read and write data linearly, which is beneficial for guarding a working set limit (number of accessed pages) per IC message. Destabilization is also linear because it deserializes objects in the same order in which they were serialized.
 
 ## Open Aspects
 * Unused fields in stable records that are no longer declared in a new program versions should be removed. This could be done during garbage collection, when objects are moved/evacuated. This scenario equally applies to enhanced orthogonal persistence.
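
The increment limits listed under Remarks in `GraphCopyStabilization.md` imply a minimum number of explicit (de)stabilization steps for a given workload. The sketch below is only a rough estimate, not the runtime system's actual scheduling: the function name and its inputs are hypothetical, and reading "1 GB" as 10^9 bytes is an assumption.

```python
# Increment limits for one explicit (de)stabilization step (update message),
# copied from the two tables in the Remarks section.
STEP_INSTRUCTION_LIMIT = 16_000_000_000   # "16e9" instructions
STEP_STABLE_ACCESS_LIMIT = 1_000_000_000  # "1 GB"; decimal bytes is an assumption


def ceil_div(amount: int, limit: int) -> int:
    """Integer ceiling division without floating point rounding."""
    return -(-amount // limit)


def min_explicit_steps(total_instructions: int, total_stable_bytes: int) -> int:
    """Lower bound on the number of explicit steps (hypothetical helper).

    Each step must stay within both increment limits, so the step count is
    at least the larger of the two quotients. At least one call is needed.
    """
    by_instructions = ceil_div(total_instructions, STEP_INSTRUCTION_LIMIT)
    by_stable_access = ceil_div(total_stable_bytes, STEP_STABLE_ACCESS_LIMIT)
    return max(1, by_instructions, by_stable_access)
```

For example, a workload of 40e9 instructions that touches 5 GB of stable data is bounded by the stable access limit and needs at least five explicit steps under these assumptions.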

diff --git a/design/Implementation.md b/design/Implementation.md
@@ -9,25 +9,29 @@
 
 ## Heap
 
-* Uniform representation with 32 bit word size.
+* Uniform representation with a defined word size. 
+For [enhanced orthogonal persistence](OrthogonalPersistence.md), 64-bit words are used, while for classical persistence, the word size is 32-bit.
 
-* Use pointer tagging in LSB;.
-  - 0 for pointers, 1 for scalars.
-  - Scalars are real value shifted left by 1, lowest bit set.
-
-* Q: Allocation and GC strategies?
+* Use pointer tagging in the LSB:
+  - 1 for pointers, 0 for scalars.
+  - Scalars are the real value shifted left by 1, with the lowest bit clear.
+  For [enhanced orthogonal persistence](OrthogonalPersistence.md), the types of scalars are additionally tagged.
+
+* Garbage collected.
 
 
 ## Primitive types
 
 * Nat and Int compile to heap-allocated big nums; unboxed for small numbers `<= 31` bit.
 
-* Nat8/16 compile to unboxed scalars; Nat32/64 are boxed.
+* Nat8/16 compile to unboxed scalars.
+  On a 32-bit heap, Nat32/64 are boxed.
+  On a 64-bit heap, only Nat64 is boxed, while Nat32 remains unboxed.
   - May unbox locally.
 
 * Characters are scalars (unicode code points).
 
-* Text is heap-allocated.
+* Text is heap-allocated, using ropes for concatenation.
 
 
 ## Tuples
@@ -103,6 +107,11 @@ TODO
 
 TODO
 
+## Persistence
+
+Different [persistence modes](OrthogonalPersistence.md):
+* [Enhanced orthogonal persistence](OrthogonalPersistence.md).
+* [Classical persistence](OldStableMemory.md).
 
 # Hypervisor Extensions needed
 

diff --git a/design/Memory.md b/design/Memory.md
@@ -29,8 +29,11 @@ In the future (with the GC proposal), Wasm will have a 4th form of mutable state
 
 The Heap is *not* an explicit entity that can be im/exported, only individual references to structures on the heap can be passed.
 
-Note: It is highly likely that most languages implemented on Wasm will eventually use Wasm GC.
-Various implementers are currently waiting for it to become available before they start porting their language to Wasm.
+Note: It is highly likely that several managed languages implemented on Wasm will eventually use Wasm GC.
+However, in our case, it would require snapshotting the Wasm managed heap, which is currently not possible in `wasmtime`.
+Moreover, the GC implemented on the managed heap probably does not fit the IC with its hard instruction limits.
+A fully incremental GC would be needed, which is currently not implemented in any Wasm engine (often only using reference counting or a GC that has worst-case unbounded pauses).
+Conceptually, enhanced orthogonal persistence could be implemented on Wasm GC.
 
 ### Internet Computer (IC)
 
@@ -48,19 +51,16 @@ All references are *sharable*, i.e., can be passed between actors as message arg
 Other than actors, all reference types must be pure (immutable and without identity) to prevent shared state and allow transparent copying by the implementation.
 Element buffers can encode arbitrary object trees.
 
-Once Wasm GC is available, some of these types (esp. buffers) could be replaced by proper Wasm types.
-
-
-## Language Implementation
+## Language Implementation Rationales
 
 ### Representing Data Structures
 
 There are 3 possible ways of representing structured data in Wasm/IC.
 
-#### Using Wasm Memory
+#### Using Wasm Memory <- Chosen Design
 
-All data structures are laid out and managed in Memory by the compiler and the language runtime.
-References are stored via indirections through a Table.
+All data structures are laid out and managed in Wasm memory by the compiler and the runtime system.
+Function references are stored via indirections through a Wasm table.
 
    Pros:
    1. local data access maximally efficient
@@ -69,7 +69,7 @@ References are stored via indirections through a Table.
    Cons:
    1. message arguments require de/serialisation into IC buffers on both ends (in addition to the de/serialisation steps already performed by IC)
    2. each actor must ship its own instance of a GC (for both memory and table) and de/serialisation code
-   3. all references require an indirection
+   3. all function references require an indirection
    4. more implementation effort
 
 #### Using IC API
@@ -102,88 +102,8 @@ All data structures are represented as Wasm GCed objects.
    1. Wasm GC is 1-2 years out
    2. unclear how to implement transparent persistence (see below)
 
-
 ## Persistence
 
-### Persistence models
-
-There are at least 3 general models for providing persistence.
-
-#### *Explicit* persistence
-
-IC API provides explicit system calls to manage persistent data.
-Wasm state is volatile; each message received runs in a fresh instance of the actor's module.
-
-   Pros:
-   1. easy and efficient to implement
-   2. apps have maximal control over persistent data and its layout
-
-   Cons:
-   1. bifurcation of state space
-   2. programs need to load/store and de/serialise persistent data to/from local state
-
-#### *Transparent* persistence
-
-All Wasm state is implicitly made persistent.
-Conceptually, each message received runs in the same instance of the actor's module.
-
-   Pros:
-   1. "perfect" model of infinitely running program
-   2. programmers need to "think" less
-
-   Cons:
-   1. hard to implement efficiently without knowing neither language nor application
-   2. can easily lead to space leaks or inefficiencies if programmers aren't careful
-
-#### *Hybrid* persistence
-Wasm state entities can be marked as persistent selectively.
-Conceptually, each message received runs in the same instance of the actor's module,
-but Wasm is extended with some notion of volatile state and reinitialisation.
-
-   Pros:
-   1. compromise between other two models
-
-   Cons:
-   1. compromise between other two models
-   2. creates dangling references between bifurcated state parts
-   3. incoherent with Wasm semantics (segments, start function)
-
-### Implementing Transparent persistence
-
-#### *High-level* implementation of persistence
-
-Hypervisor walks data graph (wherever it lives), turns it into merkle tree.
-
-   Pros:
-   1. agnostic to implementation details of the engine
-   2. agnostic to GC (or subsumes GC)
-
-   Cons:
-   1. requires knowledge of and access to data graph
-   2. deep mutations result in deep changes in merkle tree (mutation cost is logarithmic in depth)
-   3. unclear how to detect changes efficiently
-
-#### *Low-level* implementation of persistence
-
-Hypervisor provides memory to Wasm engine, detects dirty pages; could be memory-mapped files.
-
-   Pros:
-   1. agnostic to language and data graph
-   2. fast when mutation patterns have good locality
-   3. can potentially offload much of the implementation to existing hardware/OS/library mechanisms
-
-   Cons:
-   1. bad interaction with language-internal GC (mutates large portions of the memory at once)
-   2. does not extend to tables (contain position-dependent physical pointers)
-   3. no obvious migration path to Wasm GC
-   4. dependent on VM specifics (and internals?)
-
-#### *Selectable* implementation of persistence
-
-Provide both previous options, possibly in a mutually exclusive fashion.
-
-   Pros:
-   1. choice for implementers
-
-   Cons:
-   1. maximal complexity for platform
+Different [persistence modes](OrthogonalPersistence.md):
+* [Enhanced orthogonal persistence](OrthogonalPersistence.md).
+* [Classical persistence](OldStableMemory.md).
diff --git a/design/OldStableMemory.md b/design/OldStableMemory.md
@@ -124,10 +124,10 @@ module StableMemory {
 (I think the compiler will still optimize these nested calls to known
 function calls, but it would be worth checking).
 
-# Maintaining existing Stable Variables.
+# Maintaining existing Stable Variables (Legacy Persistence).
 
-Stable memory is currently hidden behind the abstraction of stable
-variables, which we will still need to maintain. The current
+In classical persistence, stable memory is hidden behind the abstraction of stable
+variables, which we will still need to maintain. This old
 implementation of stable variables stores all variables as a
 Candidish record of _stable_ fields, starting at stable memory address 0 with
 initial word encoding size (in bytes?) followed by contents.
@@ -170,6 +170,9 @@ and other metadata (so that initial reads after growing beyond page `size`  alwa
 This scheme avoids relocating most of StableMem and is constant time when
 there are no stable variables.
 
+[Enhanced orthogonal persistence](OrthogonalPersistence.md) introduces a new persistence implementation.
+The old mechanism is only supported for backwards compatibility.
+
 # Details:
 
 Stable memory layout (during execution):