This repository was archived by the owner on Jan 10, 2023. It is now read-only.

Proposal: change structural representation to avoid StructuralEmpty. #5

Closed
wants to merge 1 commit into from

Conversation

saeta
Contributor

@saeta saeta commented Jul 5, 2020

This PR represents a proposal to change the way Swift types are encoded into their structural
representation to enable more flexible applications of structural generic programming. This proposal
was inspired by [a structural programming-based deep learning model composition
API](tensorflow/swift-models#613).

> tl;dr: Drop the `StructuralEmpty` from the end of the field list.

Today, we represent a `struct` as a "Cons-list" of its fields, terminated by `StructuralEmpty`. For
example:

```swift
struct Point2: Structural {
	var x: Int
	var y: Float
}
```

would have the following `StructuralRepresentation` associated type generated by the compiler:

```swift
extension Point2 {
	public typealias StructuralRepresentation =
		StructuralStruct<
			StructuralCons<StructuralProperty<Int>,
			StructuralCons<StructuralProperty<Float>,
			StructuralEmpty>>>

	// ...
}
```

or, alternatively written with a made-up syntax loosely inspired by Scala's `HList` type:

```swift
typealias StructuralRepresentation =
  StructuralStruct<
  	StructuralProperty<Int> :: StructuralProperty<Float> :: StructuralEmpty>
```

This proposal suggests we modify the representation to look as follows:

```swift
extension Point2 {
	public typealias StructuralRepresentation =
		StructuralStruct<
			StructuralCons<StructuralProperty<Int>,
			StructuralCons<StructuralProperty<Float>>>>

	// ...
}
```

or in the made-up syntax:

```swift
typealias StructuralRepresentation =
  StructuralStruct<
  	StructuralProperty<Int> :: StructuralProperty<Float>>
```

The advantages of such a shift in representation are threefold:

1. **Simplifies common inductive cases**: when providing a structural generic programming-based
   automatic conformance, an extension for `StructuralEmpty` is currently required. In the examples in
   this repository, most of these extensions are benign empty implementations (e.g. [`DecodeJSON`](https://github.com/google/swift-structural/blob/043713b88913efe79bc5041af5ba3de3c1d74517/Sources/StructuralExamples/DecodeJSON.swift#L61),
   [`DefaultInitializable`](https://github.com/google/swift-structural/blob/043713b88913efe79bc5041af5ba3de3c1d74517/Sources/StructuralExamples/DefaultInitializable.swift#L48),
   [`ScaleBy`](https://github.com/google/swift-structural/blob/043713b88913efe79bc5041af5ba3de3c1d74517/Sources/StructuralExamples/ScaleBy.swift#L85)); however,
   some require more careful thought (e.g. [`CustomComparable`](https://github.com/google/swift-structural/blob/043713b88913efe79bc5041af5ba3de3c1d74517/Sources/StructuralExamples/CustomComparable.swift#L102)).

   With this change, only conformances that want to support zero-sized types need to provide a
   conformance for `StructuralEmpty`. (See below.)

2. **Special handling for zero-sized types.** Some protocols might want distinct handling for
   zero-sized types. For example, when encoding JSON, different applications might want either to
   serialize or to skip such a field (for example, a sentinel marker field declaring a version).

   If this proposal is adopted, the JSON library could drop `StructuralEmpty`'s conformance to
   `EncodeJSON`, which would in turn remove the automatic `EncodeJSON` conformance for zero-sized
   types.

3. **Composition of computation based on fields.**

The included motivating example is a way of preserving static information when performing lazy
operations on sequences. In Swift today, escaping closures are heap-allocated and type-erased, which
effectively forms an optimization boundary. Additionally, closures operate with reference semantics
and, unlike classes or structs, cannot carry additional properties or fields, and thus cannot expose
additional entry points to access their state.

By representing the transformation not as a closure but instead as a `struct`, we can realize
better performance and build additional entry points to view and manipulate the stored state. This
PR includes a _woefully incomplete_ step in this direction. See the tests for the usage, and the
benchmark numbers below for a comparison of performance. (tl;dr: At small sizes, the existing lazy
transforms are equivalent or faster; at large sizes, the completely unoptimized structural generic
programming implementation is about 2x faster.)
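
To make this contrast concrete, here is a minimal sketch, using hypothetical names such as `Scale`
(this is not the PR's actual implementation), of replacing a captured closure with a value-semantic
struct:

```swift
// Closure form: the scale factor is captured and type-erased behind
// (Double) -> Double; nothing about it can be inspected after the fact.
let scaleByTwo: (Double) -> Double = { $0 * 2 }
let viaClosure = Array([1.0, 2.0, 3.0].lazy.map(scaleByTwo))

// Struct form: the factor is a named, value-semantic stored property, and the
// type can grow additional entry points (e.g. an inverse) without erasure.
struct Scale {
    var factor: Double
    func callAsFunction(_ input: Double) -> Double { input * factor }
    func inverse() -> Scale { Scale(factor: 1 / factor) }
}

let viaStruct = Array([1.0, 2.0, 3.0].lazy.map(Scale(factor: 2).callAsFunction))
// viaClosure == viaStruct == [2.0, 4.0, 6.0]
```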

Neural networks can be thought of as differentiable functions with machine-learned state.
Manipulating that state via value semantics works extremely well (as demonstrated by the existing
S4TF APIs). One additional important aspect of neural networks is the need to explicitly manipulate
the contained state (e.g. weight regularization, or resetting the learned parameters when doing
transfer learning). Being able to access that state explicitly, with a convenient name, is valuable
from an API ergonomics perspective.

Most neural network architectures are described as compositions of other "layers" (bottoming out in
a few common layer types, such as `Linear` (aka `Dense`), `Convolution`, and parameterless
nonlinearities such as `ReLu`). The most common form of composition is sequential composition.

By removing the `StructuralEmpty` "tail" of the HList of fields (and also extending structural
generic programming for differentiability), we can now begin to leverage the benefits of automatic
conformance implementations for this use case, such as sequential or parallel composition.

For example:

```swift
struct MyModel: Structural, StructuralLayer {
	var conv: Conv2D
	var flatten: Flatten
	var dense: Dense

	// The following explicit implementation becomes unnecessary with SGP.
	func callAsFunction(_ input: Tensor) -> Tensor {
		return dense(flatten(conv(input)))
	}
}
```

Because `StructuralEmpty` must define one and only one `associatedtype Input`, and because there are
many possible `Input` and `Output` types in neural networks (e.g. an Attention layer takes both a
key and query tensor), the presence of `StructuralEmpty` constrains the application of structural
generic programming.
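
As a minimal sketch of the constraint involved (using hypothetical `Layer`, `Last`, and `Cons` types
rather than the library's or the PR's actual API): the inductive step needs
`Head.Output == Tail.Input`, and a terminal `StructuralEmpty` has no natural `Input`/`Output` of its
own with which to satisfy it.

```swift
// Hypothetical stand-ins; names and shapes are illustrative only.
protocol Layer {
    associatedtype Input
    associatedtype Output
    func callAsFunction(_ input: Input) -> Output
}

// The last element of the composition: its Input/Output are simply those of
// the wrapped layer, so no arbitrary choice of Input is forced on it.
struct Last<L: Layer>: Layer {
    var layer: L
    func callAsFunction(_ input: L.Input) -> L.Output { layer(input) }
}

// The inductive step: the head's Output must line up with the tail's Input.
// A StructuralEmpty terminator would have to invent an Input/Output pair here.
struct Cons<Head: Layer, Tail: Layer>: Layer where Head.Output == Tail.Input {
    var head: Head
    var tail: Tail
    func callAsFunction(_ input: Head.Input) -> Tail.Output { tail(head(input)) }
}
```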

There are other problems that have a similar flavor to neural networks. One can look at neural
networks through the lens of a program generating a graph of operations, which are then executed with
as much parallelism as possible by a runtime. This lens can also be applied to a variety of other
applications, such as build systems. (E.g. Bazel has a Python-like syntax for building a graph, which
is then executed in parallel by the rest of the build system. The combination of CMake and Ninja
operates similarly. Hat-tip to clattner@.) In addition, the intermediate products of build systems
sometimes need to be named and explicitly referenced. (Perhaps the most sophisticated example of
this approach is [`sbt`](https://www.scala-sbt.org/).)

One of the non-obvious implications of this change is that it unlocks important use cases that would
not be well served by different approaches to metaprogramming. (The first use case is conditional
conformances.)

Concretely, when representing a sequential composition of computations produced and consumed by
fields of a struct, a non-HList-style representation of the corresponding types forces the generic
code to type-cast all the way to `Any`. For example, in some hypothetical hyper-specializing
compiler that would specialize reflection from runtime to compile time, a sequential composition
operation would look as follows:

```swift
extension MyProtocol {
	public func callAsFunction(_ input: FirstField.Input) -> LastField.Output {
		var intermediateResult: Any = input
		for field in self.allFields {
		    intermediateResult = field(intermediateResult)
		}
		return intermediateResult as! LastField.Output
	}
}
```

Instead, structural generic programming represents the iteration as recursion where the type of the
carry variable is explicitly represented as an induction [type] variable, ensuring type safety (at
the cognitive cost of a recursive representation).
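
For contrast, here is a minimal, self-contained sketch of that recursive formulation (hypothetical
names, closure-valued fields for brevity), where the carry type is threaded through the generic
signature instead of being erased to `Any`:

```swift
protocol FieldList {
    associatedtype Input
    associatedtype Output
    func apply(_ input: Input) -> Output
}

// Base case: a single field, applied directly.
struct LastField<I, O>: FieldList {
    var f: (I) -> O
    func apply(_ input: I) -> O { f(input) }
}

// Inductive case: the carry type is Tail.Input, checked at compile time.
struct ConsField<I, Tail: FieldList>: FieldList {
    var head: (I) -> Tail.Input
    var tail: Tail
    func apply(_ input: I) -> Tail.Output { tail.apply(head(input)) }
}

// Int -> Double -> String, with no casts through Any anywhere.
let pipeline = ConsField(head: { (x: Int) in Double(x) * 1.5 },
                         tail: LastField { (y: Double) in "result: \(y)" })
let s: String = pipeline.apply(4)  // "result: 6.0"
```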

Open questions:

- **Can we get rid of `StructuralEmpty`?** In order to handle zero-sized types, I think we must
  keep it around.
- **What other applications can we derive from SGP thanks to this change?** Please feel free to
  suggest some more! ;-)

Performance numbers comparing the existing closure-allocating lazy operations with the
structural-based approach:

```
name                                                             time        std        iterations
--------------------------------------------------------------------------------------------------
SequentialTransformer: swift lazy transform (count: 1)              149.0 ns ± 201.72 %    1000000
SequentialTransformer: swift lazy transform (count: 10)             211.0 ns ±  87.56 %    1000000
SequentialTransformer: swift lazy transform (count: 100)            736.0 ns ±  87.69 %    1000000
SequentialTransformer: swift lazy transform (count: 1000)          5727.0 ns ±  18.60 %     251236
SequentialTransformer: swift lazy transform (count: 10000)        53680.0 ns ±  11.02 %      26456
SequentialTransformer: swift lazy transform (count: 100000)      538643.0 ns ±   6.06 %       2507
SequentialTransformer: structural lazy transform (count: 1)         151.0 ns ± 157.65 %    1000000
SequentialTransformer: structural lazy transform (count: 10)        758.0 ns ±  64.62 %    1000000
SequentialTransformer: structural lazy transform (count: 100)      1501.0 ns ±  34.70 %     907048
SequentialTransformer: structural lazy transform (count: 1000)     4562.0 ns ±  19.75 %     305737
SequentialTransformer: structural lazy transform (count: 10000)   31690.0 ns ±   8.86 %      48379
SequentialTransformer: structural lazy transform (count: 100000) 287372.0 ns ±  13.87 %       4826
```
@shabalind
Contributor

Closing this experimental PR for now similarly to #10. We might revisit it later.

@shabalind shabalind closed this Sep 23, 2020