This repository was archived by the owner on Apr 23, 2025. It is now read-only.

Structural generic layers 2 #650

Closed
saeta wants to merge 6 commits

Conversation

@saeta (Contributor) commented Jul 27, 2020

Use structural generic programming to build an HParam object.

This change explores using structural generic programming to build (at compile time)
a type-safe "hyper-parameter" object that can be manipulated and subsequently
used to initialize a neural network with a minimum of fuss.

See HParamInitExample.swift for an example usage (partially replicated, and simultaneously idealized, here; there are a few bugs and issues that must be resolved before we can achieve this):

```swift
public struct MyModel: Structural, HParamInitLayer, SequentialLayer {
    var conv: Conv2D<Float>
    var flatten: Flatten<Float>
    var dense: Dense<Float>
}

// Usage:
func makeModel() -> MyModel {
    var hparams = MyModel.HParam()
    hparams.conv = .init(height: 3, width: 3, channels: 10)  // Fully typesafe!
    hparams.dense = .init(size: 10)

    return hparams.build(for: Tensor(zeros: [5, 28, 28, 1]))
}
```

Note: in order to build the full API, I needed to make some modifications to the proposed
Structural APIs. (This is a quick hack and deserves much more refinement!)

For comparison, writing this out using the existing APIs would look something like the following, and would require (1) explicit shape calculations, (2) a custom definition of hyperparameters, and (3) redundant specification of the forward pass and initialization logic:

```swift
public struct MyModel: Layer {
  var conv: Conv2D<Float>
  var flatten: Flatten<Float>
  var dense: Dense<Float>

  // TODO: Pass through the other hyperparameters to `conv` and `dense`!
  /// - Precondition: The input size of the dense layer must exactly correspond to the size of the
  ///   tensor fed to it after running through the convolution. In this case, it will be
  ///   `(inputTensor.shape[1] - convShape.0 + 1) * (inputTensor.shape[2] - convShape.1 + 1) * convShape.3`,
  ///   as no padding is being used and the strides are (1, 1). (Note: this is checked only when
  ///   actually using the model to perform inference or training.)
  /// - Precondition: `convShape.2` must be exactly equal to `input.shape[-1]` from the
  ///   `callAsFunction` parameter.
  public init(convShape: (Int, Int, Int, Int), denseShape: (Int, Int)) {
    conv = Conv2D(filterShape: convShape)
    flatten = Flatten()
    dense = Dense(inputSize: denseShape.0, outputSize: denseShape.1)
  }

  @differentiable
  public func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
    return input.sequenced(through: conv, flatten, dense)
  }
}
```
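To make the precondition concrete, here is a hypothetical worked example of the shape arithmetic (the values are illustrative and not taken from the PR): for a `[5, 28, 28, 1]` input and a `(3, 3, 1, 10)` convolution filter with valid padding and `(1, 1)` strides:

```swift
// Worked example of the precondition arithmetic (illustrative values).
let inputShape = [5, 28, 28, 1]   // [batch, height, width, channels]
let convShape = (3, 3, 1, 10)     // (height, width, inChannels, outChannels)

let outHeight = inputShape[1] - convShape.0 + 1   // 28 - 3 + 1 = 26
let outWidth = inputShape[2] - convShape.1 + 1    // 28 - 3 + 1 = 26
let denseInputSize = outHeight * outWidth * convShape.3
// 26 * 26 * 10 = 6760: the `inputSize` the Dense layer must be given by hand.
```

Getting this number wrong is only caught when the model is first run, which is exactly the kind of bookkeeping the HParam approach eliminates.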

Note: for an accurate comparison, the initializer should provide a variety of additional hyperparameters, such as strides. The issue with struct-and-init composition is that default arguments don't compose: they must instead be re-specified at every level of composition. By representing the information within an explicit alternative type, we sidestep this issue.
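A minimal sketch of the default-argument problem (the `ConvConfigInit`, `ConvHParam`, and related types below are hypothetical, invented for illustration; they are not from the PR):

```swift
// With initializers, every wrapping init must restate the defaults it
// wants to expose:
struct ConvConfigInit {
  init(height: Int = 3, width: Int = 3, channels: Int,
       strides: (Int, Int) = (1, 1)) { /* ... */ }
}
struct ModelInit {
  // To let callers control `strides`, this init must redeclare its
  // default too, and so must every further level of wrapping:
  init(channels: Int, strides: (Int, Int) = (1, 1)) {
    _ = ConvConfigInit(channels: channels, strides: strides)
  }
}

// With an explicit hparam value type, defaults live on stored properties
// and compose for free; callers override only what they care about:
struct ConvHParam {
  var height = 3
  var width = 3
  var channels = 10
  var strides = (1, 1)
}
struct ModelHParam {
  var conv = ConvHParam()
}

var hparams = ModelHParam()
hparams.conv.strides = (2, 2)   // everything else keeps its default
```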

Further ideas to explore include:

  1. Using extensions to add convenience initializers for the HParam objects, such as:

```swift
extension MyModel.HParam {
  public enum Flavor {
    case mnist
    case imageNet
  }
  public init(_ flavor: Flavor) {
    self.init()
    self.conv = .init(height: 3, width: 3, channels: 10)
    switch flavor {
    case .mnist:
      self.dense = .init(size: 10)
    case .imageNet:
      self.dense = .init(size: 1000)
    }
  }
}
```

with example usage:

```swift
var model = MyModel.HParam(.mnist).build(for: Tensor(zeros: [10, 28, 28, 1]))
```
  2. Using a combination of data structures & types to represent skip connections, repeated layers, etc.

Note: of course, we can always fall back to explicitly writing `init` and `func callAsFunction` as desired for arbitrarily complicated control flow.
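One way the skip-connection idea in item 2 might look (an entirely hypothetical sketch; `Residual` and `MyResNetBlock` do not exist in the PR): a generic wrapper type whose structure encodes the architecture, so that HParam derivation could follow the same shape.

```swift
// Hypothetical sketch (not from the PR): a skip connection encoded as a
// generic wrapper type.
public struct Residual<Body: Layer>: Layer
where Body.Input == Tensor<Float>, Body.Output == Tensor<Float> {
  public var body: Body

  @differentiable
  public func callAsFunction(_ input: Tensor<Float>) -> Tensor<Float> {
    // The wrapped layer's output is added back onto its input.
    return input + body(input)
  }
}

// A model could then spell out the skip connection in its stored-property
// types, and structural derivation would see it:
// public struct MyResNetBlock: Structural, HParamInitLayer, SequentialLayer {
//   var conv: Conv2D<Float>
//   var residual: Residual<Conv2D<Float>>
// }
```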

  3. Using structural generic programming on the HParam types themselves (e.g. for easy serialization / deserialization, checkpointing, flag parsing, etc.).

  4. Taking an explicit random number generator when building the model from the HParam type, to allow random number generation to be made deterministic.
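Item 4 might look something like the following (these signatures are hypothetical and not part of the PR; `ARC4RandomNumberGenerator` is assumed from Swift for TensorFlow):

```swift
// Hypothetical API sketch (not from the PR): `build` could thread an
// explicit RNG so that parameter initialization is reproducible.
//
// extension HParamInitLayer {
//   public func build<RNG: RandomNumberGenerator>(
//     for input: SequentialInput, using rng: inout RNG
//   ) -> Self { ... }
// }
//
// Usage sketch, with a seeded generator for determinism:
// var rng = ARC4RandomNumberGenerator(seed: 42)
// let model = MyModel.HParam()
//   .build(for: Tensor(zeros: [5, 28, 28, 1]), using: &rng)
```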

Related PR: #613

saeta added 6 commits June 22, 2020 23:50
This change is a first attempt at leveraging structural generic programming
to implement some sugar for a higher-level API.
This change explores using structural generic programming to build (at compile time)
a type-safe "hyper-parameter" object that can be manipulated and subsequently
used to initialize a NN with the minimum of fuss.

See HParamInitExample.swift for an example usage (partially replicated here):

```swift

public struct MyInitModel {
    var conv: Conv2D<Float>
    var flatten: Flatten<Float>
    var dense: Dense<Float>
}

// Thanks to `DifferentiableStructural` conformances, we can derive these protocols automagically!
extension MyInitModel: HParamInitLayer, Layer, SequentialLayer {
    // Must specify typealiases because they are not inferred automatically. :-(
    public typealias Input = Tensor<Float>
    public typealias Output = Tensor<Float>
    public typealias SequentialInput = Input
    public typealias SequentialOutput = Output
    public typealias HParam = StaticStructuralRepresentation.HParam
}

// Usage:
func makeExplicitModel() -> MyInitModel {
    var hparams = MyInitModel.HParam()
    hparams.conv = .init(height: 3, width: 3, channels: 10)  // Fully typesafe!
    hparams.dense = .init(size: 10)

    return hparams.build(for: Tensor<Float>(zeros: [5, 28, 28, 1]))
}
```

Note: in order to build the full API, I needed to make some modifications to the proposed
Structural APIs. (This is a quick hack and deserves much more refinement!)

Related PR: #613
@saeta (Contributor, Author) commented Jul 27, 2020

CC @shadaj @shabalind @dabrahams

saeta added a commit to google/swift-structural that referenced this pull request Jul 31, 2020
Based on explorations of structural generic programming in
tensorflow/swift-models#650 we would like to support
smooth interactions between KeyPaths and structural generic programming.
Based on discussions with @shabalind, I have put together a quick
proof-of-concept demonstrating how to unify a "static structural"
implementation with the "instance"-level structural generic programming we
have been exploring so far.

This is definitely an incomplete implementation, but this would allow us to
easily implement KeyPathIterable and its ilk on top of structural generic
programming (something not possible previously), in addition to unifying with
the existing KeyPath world.

There are a couple further extensions:

 1. **Solution for `HNil`** To use a proper cons-list construction, this will
    require a bottom type (`Never`) that conforms to every protocol, and
    multiple conformances. Alternatively, an encoding that appears to work
    is a cons-list construction that doesn't use a special end type.
 2. **Typealias inference**: Swift does not currently infer typealiases for
    these kinds of derived conformances.
 3. **Enum support**: This current implementation doesn't include enum support,
    although it is straightforward to add.
 4. **Simpler structural representations**: With the single static structural
    derived conformance, we can easily implement multiple simpler "structural"
    encodings (e.g. a simple cons-list of values in addition to the current
    structural representation).
 5. Extensions to KeyPaths could enable StaticStructural to simply be the key
    paths themselves. (i.e. if you could retrieve the field name corresponding
    to the key path.)
@dabrahams (Contributor) commented:
Closing until structural GP is available in the upstream compiler.

@dabrahams dabrahams closed this Sep 2, 2020
@saeta saeta deleted the structural-generic-layers2 branch September 2, 2020 17:17