Hierarchies

The following motivates and describes a change to the type section. (Instructions are unchanged.)

## Consolidating Too Much Information

Many (most?) well-known type systems run into problems when they have *too much* information. To illustrate, I'll pick on Kotlin because in Kotlin I can most directly recreate the examples I've discussed on this elsewhere, but be aware that the overarching issue can be demonstrated in most major languages.

[Here's a Kotlin program that type-checks.](https://pl.kotl.in/AQhH2tw1P) Notice the functions `asRed`, `asGreen`, and `asBlue`. These are simply the identify functions, so they seem like they should be no-ops. You should be able to just inline them, right? Well, unfortunately inlining is *not* type-preserving in Kotlin. These functions are serving an important purpose: they're telling the type-checker how to type-check the code by *downgrading* the information about the variable `c`'s type. By inlining them, you've removed those downgrades, and if you though inlining should be type-preserving then you've made type-checking NP-Hard per [this presentation](https://github.com/WebAssembly/meetings/blob/main/gc/2020/presentations/2020-12-01-tate-determinizing-fields.pdf).

The issue is that, in order to support common patterns, Kotlin deduces information about a variable's type within contexts where that immutable variable has been cast, and here that means that `c` is simultaneously a `Red`, a `Green`, and a `Blue`. The grammar for field accesses in Kotlin only specifies the field name (analogous to field offset in the MVP), leaving the type of the field to be synthesized from the type of the receiver. But here the receiver has an intersection type with each component specifying different types for the field, so the type-checker has *too much* information.

To address this and similar cases, Kotlin throws out subsumptive subtyping (i.e. [this core wasm rule](https://webassembly.github.io/spec/core/valid/instructions.html#non-empty-instruction-sequence-xref-syntax-instructions-syntax-instr-mathit-instr-ast-xref-syntax-instructions-syntax-instr-mathit-instr-n)) and raises an ambiguity type-checking error asking the programmer to resolve the error by downgrading types. Many languages take this approach, and if you've ever encountered an ambiguity error than you're using a language without subsumptive subtyping. This is reasonable in surface languages where programmers are interactively involved and can patch the code accordingly. But it doesn't work so well for low-level languages where programmers cannot so easily patch compiled code, and where program transformations/optimizations (such as inlining) are readily employed *but are not type-preserving without subsumption*.

Now, if you look at the Kotlin example, you'll notice that `Red`, `Green`, and `Blue` are all classes, and none of these inherit the others. Kotlin permits only single-class inheritance, which means unrelated classes can have no common instances. Because the type-checker has *complete* knowledge of the class-inheritance hierarchy, it could realize that `c` cannot exist and can be given the "[bottom](https://webassembly.github.io/reference-types/core/valid/instructions.html)" type `Nothing`. This amounts to *consolidating* information into something that is unambiguous for the type-checker. The "hierarchies" I describe here provide the means to always consolidate information and thereby avoid both ambiguity and exponential type-checking. (This doesn't work for Kotlin interfaces, but thankfully interfaces are a surface-level construct that gets lowered to distinct low-level constructs, and so the lowering process itself must resolve the ambiguity above the WebAssembly level.)

## Design

A hierarchy is defined as follows in the types section:
```
deftype ::= ... | <hierarchy>
hierarchy ::= hierarchy 1 <dependent> ; (more dependents in future)
dependent ::= fields <fieldtype>* | ... (more cases in future)
```
Notice that this is essentially a `struct` definition, but with room for more extensions, and with different import/export semantics (discussed shortly).

Besides adding this, the only other change to the (nominal) MVP would be to require `struct_subtype` and `array_subtype` to subtype only `hierarchy`/`struct_subtype`/`array_subtype` types (and extensible imported types, per below). This ensures all nominal types belong to some hierarchy. (Typically there will be only one hierarchy per module.)

### Imports and Exports

For simplicity, let's suppose that no field hiding is done by the exporting module. In that case, one exports hierarchies as follows:
```
exportdesc ::= ... | hierarchy <typeidx> | type <exportidx> <typeidx>
```
For `hierarchy <typeidex>`, the index must refer to a `hierarchy` definition.
For `type <exportidx>`, the index must refer to either an earlier `hierarchy` export or an earlier `type` export, in which case the `<typeidx>` must be a subtype of the type specified by that export.
The use of export indices specifies a subtyping relation between exported types.

One imports hierarchies and types as follows:
```
importdesc ::= ... | hierarchy extensible? 1 <dependent> | struct_subtype <importidx> extensible? <fieldtype>* | array_subtype <importidx> <fieldtype>
```
The field types in the `_subtype` imports are required to be subtypes of their counterparts in the `<importidx>`.
The use of import indices specifies a subtyping relation between imported types.

When matching individual exports with individual imports, one checks the following properties:
* Hierarchies must match with hierarchies, and the `<dependent>` of the export must be a subtype of (or, if the import is `extensible`, equivalent to) the import's `<dependent>`
* A `struct_subtype` import must match with a `type` export referencing a `struct_subtype` type, and the `<fieldtype>*` of the export must be a subtype of (or, if the import is `extensible`, equivalent to) the import's `<fieldtype>*`
* An `array_subtype` import must match with a `type` export referencing an `array_subtype` type, and the `<fieldtype>` of the export must be equivalent to the import's `<fieldtype>`
* For `_subtype` imports, the imported type referenced `<importidx>` must be a subtype in the exported hierarchy

Furthermore, one checks the following properties across imports and exports:
* For any two imports of types in the same import hierarchy, their corresponding exported types must be subtypes within the export hierarchy if *and only if* the imports are subtypes within the import hierarchy.

The "and only if" in this last step is critical as it means the importing module has *complete* knowledge of the subtyping relationships between the imported nominal types *within the same hierarchy*. This means we can employ the consolidation technique mentioned above: if ever an instance belongs to multiple unrelated nominal types, we can resolve the ambiguity by simply giving it the bottom type. And for module composition, it is important that types be grouped into hierarchies.

## Multiple Dependents

The design above leaves room for defining hierarchies with multiple dependents.
In the (current) nominal MVP, there's a common pattern where one generates two nominal hierarchies: one for instances and one for v-tables.
Furthermore, the hierarchy for instances constantly redeclares all the field types for the primary purpose of refining the v-table field to use its counterpart's nominal type in the hierarchy for v-tables.

What's really happening here is that there's one nominal hierarchy with *two* pieces of information that depend on it: the field types of instances and the field types of v-tables.
With hierarchies, we can encode this pattern more directly.
At the same time, we can use this to add support for a common feature: the v-table's contents are usually placed directly in the object descriptor rather than as a separate field.

First, let's generalize hierarchies a bit:
```
hierarchy ::= hierarchy <dependent>*
```
Then let's require that one defines a new nominal type, the definition provides a refinement/extension of *each* of the dependents on the hierarchy, e.g. specifying new instance fields and new v-table fields.

Second, we could add a new instruction `desc.get n : [(ref $t)] -> [tn]`. This loads the n'th field from the object descriptor (such as the n'th method in the v-table). The type of this field is given by the *second* dependent specified by the type `$t` (whereas instance fields are given by the first dependent). Or we could add a new heap type `desc <typeidx>` for "object descriptors", add an instruction `ref.get_descriptor : [(ref $t)] -> [(ref (desc $t))]` for getting an object's descriptor (i.e. the thing that the head of the object points to), and extend `get` instructions for accessing the fields of these descriptors.

Anyways, there's a lot of flexibility here, and we don't have to do any of this for the MVP if we don't want, so I won't go into more detail unless requested. The only thing that's pressing for the MVP is adding some sort of "hierarchy" structure.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Hierarchies #245

Consolidating Too Much Information

Design

Imports and Exports

Multiple Dependents

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Hierarchies #245

Description

Consolidating Too Much Information

Design

Imports and Exports

Multiple Dependents

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions