This repository was archived by the owner on Jan 10, 2023. It is now read-only.
Proposal: change structural representation to avoid StructuralEmpty. #5
Closed
This PR represents a proposal to change the way Swift types are encoded into their structural representation to enable more flexible applications of structural generic programming. This proposal was inspired by [a structural programming-based deep learning model composition API](tensorflow/swift-models#613).

> tl;dr: Drop the `StructuralEmpty` from the end of the field list.

Today, we represent a `struct` as a "Cons-list" of its fields, terminated by `StructuralEmpty`. For example:

```swift
struct Point2: Structural {
    var x: Int
    var y: Float
}
```

would have the following `StructuralRepresentation` associated type generated by the compiler:

```swift
extension Point2 {
    public typealias StructuralRepresentation =
        StructuralStruct<
            StructuralCons<StructuralProperty<Int>,
            StructuralCons<StructuralProperty<Float>,
            StructuralEmpty>>>
    // ...
}
```

or, alternatively written with a made-up syntax loosely inspired by Scala's `HList` type:

```swift
typealias StructuralRepresentation =
    StructuralStruct<
        StructuralProperty<Int> :: StructuralProperty<Float> :: StructuralEmpty>
```

This proposal suggests we modify the representation to look as follows:

```swift
extension Point2 {
    public typealias StructuralRepresentation =
        StructuralStruct<
            StructuralCons<StructuralProperty<Int>,
            StructuralCons<StructuralProperty<Float>>>>
    // ...
}
```

or in the made-up syntax:

```
typealias StructuralRepresentation =
    StructuralStruct<
        StructuralProperty<Int> :: StructuralProperty<Float>>
```

The advantages of such a shift in representation are threefold:

1. **Simplifies common inductive cases**: when providing a structural generic programming-based automatic conformance, an extension for `StructuralEmpty` is required today. In the examples in this repository, most of these extensions are benign empty implementations (e.g. [`DecodeJSON`](https://github.com/google/swift-structural/blob/043713b88913efe79bc5041af5ba3de3c1d74517/Sources/StructuralExamples/DecodeJSON.swift#L61), [`DefaultInitializable`](https://github.com/google/swift-structural/blob/043713b88913efe79bc5041af5ba3de3c1d74517/Sources/StructuralExamples/DefaultInitializable.swift#L48), [`ScaleBy`](https://github.com/google/swift-structural/blob/043713b88913efe79bc5041af5ba3de3c1d74517/Sources/StructuralExamples/ScaleBy.swift#L85)); however, some of them require more careful thought (e.g. [`CustomComparable`](https://github.com/google/swift-structural/blob/043713b88913efe79bc5041af5ba3de3c1d74517/Sources/StructuralExamples/CustomComparable.swift#L102)). With this change, only conformances that would like to support zero-sized types are required to provide a conformance for `StructuralEmpty`. (See below.)
2. **Special handling for zero-sized types.** Some protocols might want to have distinct handling for zero-sized types. For example, when encoding JSON, different applications might want either to serialize or to skip such a field. (For example, a sentinel marker field to declare a version.) If this proposal is adopted, the JSON library could drop the `EncodeJSON` conformance for `StructuralEmpty`, which would automatically remove the automatic `EncodeJSON` conformance for zero-sized types.
3. **Composition of computation based on fields.** The included motivating example is a way of preserving static information when performing lazy operations on sequences. In Swift today, escaping closures are heap allocated and type erased, which effectively forms an optimization boundary. Additionally, they operate with reference semantics and cannot have any additional properties / fields (unlike classes or structs) and thus cannot have additional entry points to access the state.
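To make the first two advantages concrete, here is a minimal, self-contained sketch. `Property`, `Cons`, `Empty`, and `FieldCount` are hypothetical stand-ins invented for illustration (they are not the library's `StructuralProperty`, `StructuralCons`, or `StructuralEmpty`), and the proposed encoding is assumed to follow the made-up `::` notation, with the last property as the tail of the final cons cell.

```swift
// Hypothetical stand-ins for the library's types; names and shapes are assumptions.
struct Property<Value> { var value: Value }
struct Cons<Head, Tail> { var head: Head; var tail: Tail }
struct Empty {}

// A toy protocol with an automatic, structurally derived implementation.
protocol FieldCount {
    static var fieldCount: Int { get }
}

// Inductive case: a cons cell counts its head plus everything in its tail.
extension Cons: FieldCount where Head: FieldCount, Tail: FieldCount {
    static var fieldCount: Int { Head.fieldCount + Tail.fieldCount }
}

// Base case shared by both encodings: one property is one field.
extension Property: FieldCount {
    static var fieldCount: Int { 1 }
}

// Needed by the current encoding. Under the proposed encoding it is only
// required when the conformance should also cover zero-sized types.
extension Empty: FieldCount {
    static var fieldCount: Int { 0 }
}

// Current encoding: the field list is terminated by Empty.
typealias Current = Cons<Property<Int>, Cons<Property<Float>, Empty>>
// Proposed encoding (assumed shape): the last property is the tail.
typealias Proposed = Cons<Property<Int>, Property<Float>>

print(Current.fieldCount, Proposed.fieldCount)
```

In this sketch, deleting the `Empty` extension leaves `Proposed` conforming to `FieldCount`, while `Current` stops conforming because its innermost tail no longer does. That is the opt-in behavior for zero-sized types described in the second point.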
By representing the transformation not as a closure, but instead as a `struct`, we can realize better performance, and build additional entry points to view and manipulate the stored state. This PR includes a _woefully incomplete_ step in this direction. See the tests for the usage, and the benchmark numbers below for a comparison of performance. (tl;dr: At small sizes, the existing lazy transforms are equivalent or faster; at large sizes, the completely unoptimized structural generic programming implementation is about 2x faster.)

Neural networks can be thought of as differentiable functions with machine-learned state. Manipulating that state via value semantics works extremely well (as demonstrated by the existing S4TF APIs). One additional important aspect of neural networks is a need to explicitly manipulate the contained state (e.g. weight regularization, resetting the learned parameters when doing transfer learning). Being able to access that state explicitly with a convenient name is valuable from an API ergonomics perspective.

Most neural network architectures are described as compositions of other "layers" (bottoming out in a few common layer types, such as `Linear` (aka `Dense`), `Convolution`, and parameterless nonlinearities such as `ReLu`). The most common form of composition is sequential composition.

By removing the `StructuralEmpty` "tail" of the HList of fields (and also extending structural generic programming for differentiability), we can now begin to leverage the benefits of automatic conformance implementations for this use case, such as sequential or parallel composition. For example:

```swift
struct MyModel: Structural, StructuralLayer {
    var conv: Conv2D
    var flatten: Flatten
    var dense: Dense

    // The following explicit implementation becomes unnecessary with SGP.
    func callAsFunction(_ input: Tensor) -> Tensor {
        return dense(flatten(conv(input)))
    }
}
```

Because `StructuralEmpty` must define one and only one `associatedtype Input`, and because there are many possible `Input` and `Output` types in neural networks (e.g. an Attention layer takes both a key and query tensor), the presence of `StructuralEmpty` constrains the application of structural generic programming.

There are other problems that have a similar flavor to neural networks. One can look at neural networks through the lens of a program generating a graph of operations which are then executed with as much parallelism as possible by a runtime. This lens can also be reapplied to a variety of other applications, such as build systems. (e.g. Bazel has a Python-like syntax for building a graph, which is then executed in parallel by the rest of the build system. The combination of CMake and Ninja operates similarly. Hat-tip to clattner@.) In addition, the intermediate products of build systems sometimes need to be named and explicitly referenced. (Perhaps the most sophisticated example of this approach is [`sbt`](https://www.scala-sbt.org/).)

One of the non-obvious implications of this change is that it unlocks important use cases that would not be well served by different approaches to metaprogramming. (The first use case is conditional conformances.)

Concretely, when representing a sequential composition of computations produced and consumed by fields of a struct, a non-HList-style representation of the corresponding types forces the generic code to type-cast all the way to `Any`.
For example, in some hypothetical hyper-specializing compiler that would specialize reflection from runtime to compile time, a sequential composition operation would look as follows:

```
extension MyProtocol {
    public func callAsFunction(_ input: FirstField.Input) -> LastField.Output {
        var intermediateResult: Any = input
        for field in self.allFields {
            intermediateResult = field(intermediateResult)
        }
        return intermediateResult as! LastField.Output
    }
}
```

Instead, structural generic programming represents the iteration as recursion where the type of the carry variable is explicitly represented as an induction [type] variable, ensuring type safety (at the cognitive cost of a recursive representation).

- **Can we get rid of `StructuralEmpty`?** In order to handle zero-sized types, I think we must keep it around.
- **What other applications can we derive from SGP thanks to this change?** Please feel free to suggest some more! ;-)

Performance numbers comparing the explicit closure-allocating lazy operations vs a structural-based approach.
```
name                                                              time          std          iterations
--------------------------------------------------------------------------------------------------
SequentialTransformer: swift lazy transform (count: 1)            149.0 ns      ± 201.72 %   1000000
SequentialTransformer: swift lazy transform (count: 10)           211.0 ns      ±  87.56 %   1000000
SequentialTransformer: swift lazy transform (count: 100)          736.0 ns      ±  87.69 %   1000000
SequentialTransformer: swift lazy transform (count: 1000)         5727.0 ns     ±  18.60 %   251236
SequentialTransformer: swift lazy transform (count: 10000)        53680.0 ns    ±  11.02 %   26456
SequentialTransformer: swift lazy transform (count: 100000)       538643.0 ns   ±   6.06 %   2507
SequentialTransformer: structural lazy transform (count: 1)       151.0 ns      ± 157.65 %   1000000
SequentialTransformer: structural lazy transform (count: 10)      758.0 ns      ±  64.62 %   1000000
SequentialTransformer: structural lazy transform (count: 100)     1501.0 ns     ±  34.70 %   907048
SequentialTransformer: structural lazy transform (count: 1000)    4562.0 ns     ±  19.75 %   305737
SequentialTransformer: structural lazy transform (count: 10000)   31690.0 ns    ±   8.86 %   48379
SequentialTransformer: structural lazy transform (count: 100000)  287372.0 ns   ±  13.87 %   4826
```
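For contrast with the `Any`-based loop shown earlier, the recursive formulation (iteration expressed as recursion, with the carried type as an induction variable) might look like the following sketch. The `Layer` protocol, the `Cons` type, and the three toy layers are hypothetical names invented for illustration; they are not the API of this repository or of S4TF.

```swift
// Hypothetical protocol and types for illustration; not the repository's API.
protocol Layer {
    associatedtype Input
    associatedtype Output
    func callAsFunction(_ input: Input) -> Output
}

struct Cons<Head, Tail> { var head: Head; var tail: Tail }

// Inductive case: the compiler checks that the head's output type is the
// tail's input type, so no cast through `Any` is needed.
extension Cons: Layer where Head: Layer, Tail: Layer, Head.Output == Tail.Input {
    typealias Input = Head.Input
    typealias Output = Tail.Output

    func callAsFunction(_ input: Head.Input) -> Tail.Output {
        tail(head(input))
    }
}

// Three toy layers whose input and output types all differ.
struct Length: Layer {
    func callAsFunction(_ input: String) -> Int { input.count }
}
struct Scale: Layer {
    var factor: Double
    func callAsFunction(_ input: Int) -> Double { Double(input) * factor }
}
struct Describe: Layer {
    func callAsFunction(_ input: Double) -> String { "value=\(input)" }
}

// No terminating empty element: the last layer is the tail of the last cons.
let pipeline = Cons(head: Length(), tail: Cons(head: Scale(factor: 0.5), tail: Describe()))
let result = pipeline("abcd")
print(result)
```

Because the base case here is simply the last layer, no terminator type has to commit to a single `Input`. Reordering the layers so that adjacent types do not line up is rejected at compile time rather than failing in an `as!` cast at run time.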
Closing this experimental PR for now, similarly to #10. We might revisit it later.