
[AutoDiff] Directly SILGen @derivative attributes to diff witnesses. #28621


Merged
8 commits merged on Dec 10, 2019

Conversation

dan-zheng
Contributor

dan-zheng commented Dec 6, 2019


Previously, `@derivative` attribute type-checking created implicit
`@differentiable` attributes on the original declaration. This was a
longstanding hack powering `@derivative` attribute derivative registration.

#28608 made these changes:
- Derivative function configurations (from `@differentiable` and `@derivative`
  attributes) are serialized in modules and are loaded from imported modules.
- The differentiation transform uses these derivative function configurations
  for derivative function lookup instead of `@differentiable` attributes.

Now, `@derivative` attributes are directly lowered to differentiability
witnesses during SILGen, and implicit `@differentiable` attribute generation
is removed.
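
For concreteness, here's a minimal sketch of the kind of registration this lowering handles (hypothetical function names, not from this patch):

func square(_ x: Float) -> Float { x * x }

// With this patch, this attribute is SILGen'd directly to a
// `sil_differentiability_witness` for `square` (with `squareVJP` as the VJP),
// instead of creating an implicit `@differentiable` attribute on `square`.
@derivative(of: square)
func squareVJP(_ x: Float) -> (value: Float, pullback: (Float) -> Float) {
  (value: square(x), pullback: { v in 2 * x * v })
}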

Type-checking changes:
- "Overlapping" `@differentiable` and `@derivative` attributes (for the same
  original declaration and parameter indices) are now disallowed. They
  semantically conflict because the first "requests derivative generation"
  while the second "registers a derivative".
  - "Overlapping" `@differentiable` and `@derivative` attributes are allowed
    for protocol requirements. Requirement `@differentiable` attributes
    mean "add JVP/VJP witness table entries" - not "request derivative
    generation", because there is no function body.
  - Note that relaxing the "overlapping" condition to consider derivative
    generic signatures is possible after derivative generic signature mangling
    for derivative functions: TF-680.
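
A minimal sketch of the overlap that the rule above rejects (hypothetical declarations):

@differentiable
func cube(_ x: Float) -> Float { x * x * x }

// Now rejected: same original declaration and same parameter indices as the
// `@differentiable` attribute above.
@derivative(of: cube)
func cubeVJP(_ x: Float) -> (value: Float, pullback: (Float) -> Float) {
  (value: cube(x), pullback: { v in 3 * x * x * v })
}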

Resolves TF-835.

Unblocks TF-1021: lifting the "same-file derivative registration only"
limitation in `@derivative` attribute type-checking. This should be possible
without much work, but needs testing!

Exposes TF-1037: crash due to missing derivative generic signature mangling (TF-680).
Exposes TF-1040: `@differentiable` attribute limitations for class methods.
Exposes TF-1041: untested protocol requirement `@differentiable` attribute
type-checking logic.
dan-zheng added the tensorflow label ("This is for \"tensorflow\" branch PRs.") Dec 6, 2019
dan-zheng requested review from rxwei and marcrasi December 6, 2019 21:35
Fix "a derivative already exists" error for `Layer.inferring(from:)`.
@rxwei
Contributor

rxwei commented Dec 6, 2019

"Overlapping" @differentiable and @derivative attributes (for the same
original declaration and parameter indices) are now disallowed.

This doesn't make sense to me.

In protocol conformances, it is perfectly legal to declare a conformance and then provide implementations in a different extension in the same module. Differentiable functions should behave the same way.
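
A sketch of that analogy with a plain protocol (hypothetical types):

protocol Greeter {
  func greet() -> String
}

struct Robot {}

// The conformance is declared here...
extension Robot: Greeter {}

// ...and the witness is provided in a different extension in the same module.
extension Robot {
  func greet() -> String { "beep" }
}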

@rxwei
Contributor

rxwei commented Dec 6, 2019

They semantically conflict because the first "requests derivative generation" while the second "registers a derivative".

The semantics of @differentiable is not requesting derivative generation. It is declaring a function to be differentiable. With a declaration, the compiler then tries to verify it, either by differentiating it, or by checking whether there's a derivative (@derivative(of:)) already.
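
Under those semantics, both verification paths would be acceptable; a sketch with hypothetical functions:

@differentiable // verified by differentiating the body
func double(_ x: Float) -> Float { 2 * x }

@differentiable // verified by finding the registered derivative below
func relu(_ x: Float) -> Float { x < 0 ? 0 : x }

@derivative(of: relu)
func reluVJP(_ x: Float) -> (value: Float, pullback: (Float) -> Float) {
  (value: relu(x), pullback: { v in x < 0 ? 0 : v })
}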

@dan-zheng
Contributor Author

dan-zheng commented Dec 6, 2019

The semantics of @differentiable is not requesting derivative generation. It is declaring a function to be differentiable. With a declaration, the compiler then tries to verify it, either by differentiating it, or by checking whether there's a derivative (@derivative(of:)) already.

That's a good point, I think I agree! This type-checking change seems misguided: the original motivation was to avoid a crash regarding "`@differentiable` + `@derivative` attribute for the same original declaration and parameter indices but different derivative generic signatures".

Reproducer:

protocol P: Differentiable {}
extension P {
  @differentiable
  func foo() -> Float { 1 }
}

extension P {
  @derivative(of: foo)
  func vjpFoo() -> (value: Float, pullback: (Float) -> (TangentVector)) {
    fatalError()
  }
}

Previously, @derivative attribute type-checking looked for a @differentiable attribute with the same parameter indices and filled in the JVP/VJP.

// differentiability witness for P.foo()
sil_differentiability_witness hidden [parameters 0] [results 0] @$s1p1PPAAE3fooSfyF : $@convention(method) <Self where Self : P> (@in_guaranteed Self) -> Float {
  vjp: @AD__$s1p1PPAAE3fooSfyF__vjp_src_0_wrt_0 : $@convention(method) <τ_0_0 where τ_0_0 : P> (@in_guaranteed τ_0_0) -> (Float, @owned @callee_guaranteed (Float) -> @out τ_0_0.TangentVector)
}

This generated a single differentiability witness, no problem.


However, with the current logic in this patch, the `@differentiable` attribute has no derivative generic signature, while the `@derivative` attribute does have one (the generic signature of the `@derivative` declaration is used, which is `<Self where Self : P>`).

This causes two differentiability witnesses to be SILGen'd, each with a different derivative generic signature:

// differentiability witness for P.foo()
sil_differentiability_witness hidden [parameters 0] [results 0] @$s1p1PPAAE3fooSfyF : $@convention(method) <Self where Self : P> (@in_guaranteed Self) -> Float {
}

// differentiability witness for P.foo()
sil_differentiability_witness hidden [parameters 0] [results 0] <Self where Self : P> @$s1p1PPAAE3fooSfyF : $@convention(method) <Self where Self : P> (@in_guaranteed Self) -> Float {
  vjp: @AD__$s1p1PPAAE3fooSfyF__vjp_src_0_wrt_0 : $@convention(method) <τ_0_0 where τ_0_0 : P> (@in_guaranteed τ_0_0) -> (Float, @owned @callee_guaranteed (Float) -> @out τ_0_0.TangentVector)
}

Since derivative generic signature mangling is not yet implemented for derivative functions (TF-680), there's a name conflict issue for derivative functions with the same parameter indices but different derivative generic signatures.

SILGen generates a VJP thunk for the second differentiability witness above, called `AD__$s1p1PPAAE3fooSfyF__vjp_src_0_wrt_0`. The differentiation transform then attempts to create an empty VJP function for the first differentiability witness above, but encounters a name clash:

// AD__$s1p1PPAAE3fooSfyF__vjp_src_0_wrt_0
sil hidden [thunk] [always_inline] [ossa] @AD__$s1p1PPAAE3fooSfyF__vjp_src_0_wrt_0 : $@convention(method) <τ_0_0 where τ_0_0 : P> (@in_guaranteed τ_0_0) -> (Float, @owned @callee_guaranteed (Float) -> @out τ_0_0.TangentVector) {
// %0                                             // user: %2
bb0(%0 : $*τ_0_0):
  // function_ref P.vjpFoo()
  %1 = function_ref @$s1p1PPAAE6vjpFooSf5value_13TangentVectorQzSfc8pullbacktyF : $@convention(method) <τ_0_0 where τ_0_0 : P> (@in_guaranteed τ_0_0) -> (Float, @owned @callee_guaranteed (Float) -> @out τ_0_0.TangentVector) // user: %2
  %2 = apply %1<τ_0_0>(%0) : $@convention(method) <τ_0_0 where τ_0_0 : P> (@in_guaranteed τ_0_0) -> (Float, @owned @callee_guaranteed (Float) -> @out τ_0_0.TangentVector) // user: %3
  return %2 : $(Float, @callee_guaranteed (Float) -> @out τ_0_0.TangentVector) // id: %3
} // end sil function 'AD__$s1p1PPAAE3fooSfyF__vjp_src_0_wrt_0'

Assertion failed: (!entry->getValue() && "function already exists"), function create, file /Users/danielzheng/swift-bart/swift/lib/SIL/SILFunction.cpp, line 74.
Stack dump:
0.	Program arguments: /Users/danielzheng/swift-bart/build/Ninja-ReleaseAssert+stdlib-Release/swift-macosx-x86_64/bin/swift -frontend -interpret p.swift -enable-objc-interop -sdk /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk -color-diagnostics -module-name p
1.	Swift version 5.1.1-dev (Swift f70940798d)
2.	While running pass #31 SILModuleTransform "Differentiation".
3.	While processing // differentiability witness for P.foo()
sil_differentiability_witness hidden [parameters 0] [results 0] @$s1p1PPAAE3fooSfyF : $@convention(method) <Self where Self : P> (@in_guaranteed Self) -> Float {
}

According to the semantics of `@differentiable`, the correct fix seems to be to "find a way to continue generating a single differentiability witness" for this program. I briefly tried various techniques without success. I can try again if we agree this is the right approach.


Tangents:

  • Derivative generic signature mangling (TF-680) is not a robust fix for this
    program because, with the current logic, it would actually trigger VJP
    generation for the first differentiability witness, which seems surprising
    (`P.foo` may not even be differentiable).

  • Current crashers like TF-1037 should be fixed by creating multiple
    differentiability witnesses when `@differentiable` and `@derivative`
    derivative generic signatures are truly different.

@rxwei
Contributor

rxwei commented Dec 6, 2019

If there's a parameter indices mismatch or generic signature mismatch between a @differentiable and a @derivative(of:), they should be treated as being completely separate.

If parameter indices and generic signatures do match, then preventively check whether such a differentiability witness already exists before adding a new one.
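
A toy Swift sketch of that existence check (purely illustrative; the real check would live in the compiler's C++ witness-emission code):

struct WitnessKey: Hashable {
  var originalName: String                // mangled name of the original function
  var parameterIndices: [Int]             // differentiability parameter indices
  var derivativeGenericSignature: String? // nil if none
}

var emittedWitnesses = Set<WitnessKey>()

/// Returns true if a witness for this configuration should be emitted,
/// false if one with the same key already exists.
func shouldEmitWitness(for key: WitnessKey) -> Bool {
  emittedWitnesses.insert(key).inserted
}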

llvm::DenseMap<
    std::tuple<Decl *, IndexSubset *, AutoDiffDerivativeFunctionKind>,
    DerivativeAttr *>
    DerivativeAttrs;

Would it be possible to use the derivative config list that you created in the previous serialization PR, so that we do not have to maintain another list?

Could the derivative config list actually also replace DifferentiableAttrs?

Contributor Author

dan-zheng, Dec 6, 2019

I wondered the same. The derivative function configuration doesn't currently store where the configuration came from (a `@differentiable` attribute or a `@derivative` attribute), and that information is important for users of `DifferentiableAttrs` and `DerivativeAttrs`, which check for duplicate attributes of a specific kind, not just configurations.

I haven't thought deeply about this. I'll file an issue tracking this question if it isn't resolved by the time this PR is merged.


Edit: filed TF-1042 to track "Investigate removing/moving `ASTContext::{Differentiable,Derivative}Attrs`".

Using AbstractFunctionDecl::getDerivativeFunctionConfigurations to detect duplicate @differentiable and @derivative attributes may be significant for cross-file duplicate derivative registration (TF-1021).

If you import a derivative for func foo, you shouldn't be able to register a new derivative for func foo with the same configuration.

This requires changing AbstractFunctionDecl::getDerivativeFunctionConfigurations to return more information than ArrayRef<AutoDiffConfig>.

  • Minimally, it needs to return an OptionSet per AutoDiffConfig, specifying
    where the AutoDiffConfig came from (a sketch follows the diagnostic example below):
    • @differentiable attribute
    • @derivative JVP
    • @derivative VJP
    • Any combination of the above (three bits)
  • For good "duplicate attribute" diagnostics, it also needs to store something
    from which we can recover a `@differentiable`/`@derivative` attribute SourceLoc.
dup.swift:1:2: error: duplicate '@differentiable' attribute with same parameters
@differentiable
~^~~~~~~~~~~~~~
dup.swift:2:2: note: other attribute declared here << need SourceLoc to generate this note
@differentiable
 ^
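
A hypothetical Swift sketch of that per-configuration origin set (illustrative names only; the real type would be C++ in the compiler):

struct DerivativeConfigOrigin: OptionSet {
  let rawValue: UInt8
  static let differentiableAttr = DerivativeConfigOrigin(rawValue: 1 << 0)
  static let derivativeJVP = DerivativeConfigOrigin(rawValue: 1 << 1)
  static let derivativeVJP = DerivativeConfigOrigin(rawValue: 1 << 2)
}

// A configuration registered by both a JVP and a VJP `@derivative` attribute:
let origin: DerivativeConfigOrigin = [.derivativeJVP, .derivativeVJP]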

- Revert `@differentiable` + `@derivative` attribute type-checking changes.
  `@differentiable` + `@derivative` attributes with the same original declaration
  and parameter indices are not in conflict.
- Simplify TBDGen for AutoDiff symbols using
  `AbstractFunctionDecl::getDerivativeFunctionConfigurations`.
- Add TF-1037 negative tests.
dan-zheng requested a review from marcrasi December 7, 2019 02:07
@dan-zheng
Contributor Author

The extended test suite fails weirdly:

/swift-models/FastStyleTransfer/Models/TransformerNet.swift:49:32: error: type of expression is ambiguous without more context
        let convolved1 = input.sequenced(through: conv1, in1, relu)
                         ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/swift-models/FastStyleTransfer/Models/TransformerNet.swift:91:22: error: type of expression is ambiguous without more context
        return input.sequenced(through: reflectionPad, conv2d)
               ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/swift-models/FastStyleTransfer/Models/TransformerNet.swift:125:30: error: type of expression is ambiguous without more context
        return input + input.sequenced(through: conv1, in1, relu, conv2, in2)
                       ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/swift-models/FastStyleTransfer/Models/TransformerNet.swift:174:29: error: type of expression is ambiguous without more context
        return resizedInput.sequenced(through: reflectionPad, conv2d)
               ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This may be because this PR updates the commit hash for tensorflow/swift-apis. The `type of expression is ambiguous without more context` errors may be unrelated to this patch; I'll create a PR that just updates tensorflow/swift-apis to verify.

@dan-zheng
Contributor Author

The `type of expression is ambiguous without more context` errors are fixed in 571e9d9.

The fix: protocol witness `@differentiable` attributes in non-primary files need to be type-checked via `DifferentiableAttr::getParameterIndices` so that the witness derivative configurations are updated. This makes sense, and similar logic exists in `SILFunctionBuilder::addFunctionAttributes`.


Ideally, the diagnostics should have read `candidate is missing attribute '@differentiable'`. Poor diagnostics involving associated type inference are tracked by TF-1014. I investigated TF-1014 and left some details in a comment.

auto implicitDiffAttr = false;
if (reqDiffAttrSupersetMatch) {
  auto *witnessAFD = cast<AbstractFunctionDecl>(witness);
  (void)reqDiffAttr->getParameterIndices();
Contributor Author
Note: I found that calling `DifferentiableAttr::getParameterIndices` on witness attributes was sufficient to make test/AutoDiff/differentiable_attr_type_checking_primary_file.swift pass. For good measure, I called it on requirement attributes too.

@dan-zheng
Contributor Author

@swift-ci Please clean test tensorflow

Fix test/Serialization/derivative_attr.swift.
@dan-zheng
Contributor Author

@swift-ci Please test tensorflow

@dan-zheng
Contributor Author

@swift-ci Please test tensorflow

@rxwei
Contributor

rxwei commented Dec 9, 2019

Does this implementation handle duplicate but private-to-separate-files derivatives?

@dan-zheng
Contributor Author

dan-zheng commented Dec 9, 2019

Does this implementation handle duplicate but private-to-separate-files derivatives?

I believe the `@derivative` attribute SILGen logic should work correctly for that case: the `SILModule` for each file should have its own differentiability witness.

The same-file derivative registration limitation hasn't been lifted from Sema yet (TF-1021), but I verified that it can be lifted trivially after this patch. Lifting the restriction requires more testing and will be the next follow-up.

@dan-zheng
Contributor Author

dan-zheng commented Dec 9, 2019

This patch causes a fastai/fastai_dev breakage:

// AD__$s27FastaiNotebook_07_batchnorm22LearningPhaseDependentPAAE7forwardy6OutputQz5InputQzF__vjp_src_0_wrt_0_1
sil [always_inline] [ossa] @AD__$s27FastaiNotebook_07_batchnorm22LearningPhaseDependentPAAE7forwardy6OutputQz5InputQzF__vjp_src_0_wrt_0_1 : $@convention(method) <τ_0_0 where τ_0_0 : LearningPhaseDependent> (@in_guaranteed τ_0_0.Input, @in_guaranteed τ_0_0) -> (@out τ_0_0.Output, @owned @callee_guaranteed (@in_guaranteed τ_0_0.Output.TangentVector) -> (@out τ_0_0.Input.TangentVector, @out τ_0_0.TangentVector)) {
// %0                                             // user: %4
// %1                                             // user: %4
// %2                                             // user: %4
bb0(%0 : $*τ_0_0.Output, %1 : $*τ_0_0.Input, %2 : $*τ_0_0):
  // function_ref LearningPhaseDependent.gradForward(_:)
  %3 = function_ref @$s27FastaiNotebook_07_batchnorm22LearningPhaseDependentPAAE11gradForwardy6OutputQz5value_13TangentVectorQz_5Input_AHQZtAE_AHQZc8pullbacktAJQzF : $@convention(method) <τ_0_0 where τ_0_0 : LearningPhaseDependent> (@in_guaranteed τ_0_0.Input, @in_guaranteed τ_0_0) -> (@out τ_0_0.Output, @owned @callee_guaranteed (@in_guaranteed τ_0_0.Output.TangentVector) -> (@out τ_0_0.TangentVector, @out τ_0_0.Input.TangentVector)) // user: %4
  %4 = apply %3<τ_0_0>(%0, %1, %2) : $@convention(method) <τ_0_0 where τ_0_0 : LearningPhaseDependent> (@in_guaranteed τ_0_0.Input, @in_guaranteed τ_0_0) -> (@out τ_0_0.Output, @owned @callee_guaranteed (@in_guaranteed τ_0_0.Output.TangentVector) -> (@out τ_0_0.TangentVector, @out τ_0_0.Input.TangentVector)) // user: %6
  // function_ref AD__$s6Output_13TangentVectorQZABQz5Input_ABQZIegnrr_AcfDIegnrr_27FastaiNotebook_07_batchnorm22LearningPhaseDependentRzlTR_pullback_self_reordering_thunk
  %5 = function_ref @AD__$s6Output_13TangentVectorQZABQz5Input_ABQZIegnrr_AcfDIegnrr_27FastaiNotebook_07_batchnorm22LearningPhaseDependentRzlTR_pullback_self_reordering_thunk : $@convention(thin) <τ_0_0 where τ_0_0 : LearningPhaseDependent> (@in_guaranteed τ_0_0.Output.TangentVector, @guaranteed @callee_guaranteed (@in_guaranteed τ_0_0.Output.TangentVector) -> (@out τ_0_0.TangentVector, @out τ_0_0.Input.TangentVector)) -> (@out τ_0_0.Input.TangentVector, @out τ_0_0.TangentVector) // user: %6
  %6 = partial_apply [callee_guaranteed] %5<τ_0_0>(%4) : $@convention(thin) <τ_0_0 where τ_0_0 : LearningPhaseDependent> (@in_guaranteed τ_0_0.Output.TangentVector, @guaranteed @callee_guaranteed (@in_guaranteed τ_0_0.Output.TangentVector) -> (@out τ_0_0.TangentVector, @out τ_0_0.Input.TangentVector)) -> (@out τ_0_0.Input.TangentVector, @out τ_0_0.TangentVector) // user: %7
  return %6 : $@callee_guaranteed (@in_guaranteed τ_0_0.Output.TangentVector) -> (@out τ_0_0.Input.TangentVector, @out τ_0_0.TangentVector) // id: %7
} // end sil function 'AD__$s27FastaiNotebook_07_batchnorm22LearningPhaseDependentPAAE7forwardy6OutputQz5InputQzF__vjp_src_0_wrt_0_1'

swift: /swift-base/swift/lib/SIL/SILFunction.cpp:74: static swift::SILFunction *swift::SILFunction::create(swift::SILModule &, swift::SILLinkage, llvm::StringRef, swift::CanSILFunctionType, swift::GenericEnvironment *, Optional<swift::SILLocation>, swift::IsBare_t, swift::IsTransparent_t, swift::IsSerialized_t, swift::ProfileCounter, swift::IsDynamicallyReplaceable_t, swift::IsExactSelfClass_t, swift::IsThunk_t, swift::SubclassScope, swift::Inline_t, swift::EffectsKind, swift::SILFunction *, const swift::SILDebugScope *): Assertion `!entry->getValue() && "function already exists"' failed.
Stack dump:
0.	Program arguments: /swift-tensorflow-toolchain/usr/bin/swift -frontend -c /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/00_load_data.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/01_matmul.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/01a_fastai_layers.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/02_fully_connected.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/02a_why_sqrt5.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/03_minibatch_training.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/04_callbacks.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/05_anneal.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/05b_early_stopping.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/06_cuda.swift -primary-file /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/07_batchnorm.swift -emit-module-path /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug/FastaiNotebook_07_batchnorm.build/07_batchnorm~partial.swiftmodule -emit-module-doc-path /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug/FastaiNotebook_07_batchnorm.build/07_batchnorm~partial.swiftdoc -emit-module-source-info-path /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug/FastaiNotebook_07_batchnorm.build/07_batchnorm~partial.swiftsourceinfo -emit-dependencies-path /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug/FastaiNotebook_07_batchnorm.build/07_batchnorm.d -emit-reference-dependencies-path /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug/FastaiNotebook_07_batchnorm.build/07_batchnorm.swiftdeps -target x86_64-unknown-linux-gnu -disable-objc-interop -I /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug -enable-testing -g -module-cache-path /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug/ModuleCache -swift-version 4.2 -Onone -D SWIFT_PACKAGE -D DEBUG -enable-anonymous-context-mangled-names -parse-as-library -module-name FastaiNotebook_07_batchnorm -o /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug/FastaiNotebook_07_batchnorm.build/07_batchnorm.swift.o -index-store-path /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug/index/store -index-system-modules 
1.	Swift version 5.1.1-dev (Swift 571e9d9692)
2.	While running pass #4630 SILModuleTransform "Differentiation".
3.	While processing // differentiability witness for LearningPhaseDependent.forward(_:)
sil_differentiability_witness [serialized] [parameters 0 1] [results 0] @$s27FastaiNotebook_07_batchnorm22LearningPhaseDependentPAAE7forwardy6OutputQz5InputQzF : $@convention(method) <Self where Self : LearningPhaseDependent> (@in_guaranteed Self.Input, @in_guaranteed Self) -> @out Self.Output {
}

 on SIL function "@$s27FastaiNotebook_07_batchnorm22LearningPhaseDependentPAAE7forwardy6OutputQz5InputQzF".
 for 'forward(_:)' (at /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/07_batchnorm.swift:32:12)
4.	While creating SIL function "@AD__$s27FastaiNotebook_07_batchnorm22LearningPhaseDependentPAAE7forwardy6OutputQz5InputQzF__vjp_src_0_wrt_0_1".
 for 'gradForward(_:)' (at /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/07_batchnorm.swift:40:5)

The issue seems similar to #28621 (comment) and should be fixable with a patch to fastai/fastai_dev. I'll investigate.


Workaround in fastai/fastai_dev#309.
TF-1037 tracks the underlying issue (@differentiable and @derivative attributes with different derivative generic signatures); negative tests for TF-1037 are added in this patch.

dan-zheng added a commit to fastai/fastai_dev that referenced this pull request Dec 9, 2019
Work around issues with `@differentiable` + `@derivative` attributes with
different derivative generic signatures.

Related discussion: swiftlang/swift#28621 (comment)
https://bugs.swift.org/browse/TF-1037 tracks this issue.
dan-zheng added a commit to fastai/fastai_dev that referenced this pull request Dec 9, 2019
…_:)`. (#309)

Work around issues with `@differentiable` + `@derivative` attributes with
different derivative generic signatures.

Automatic differentiation can handle this enum `switch` now, so a custom
derivative is no longer necessary.

https://bugs.swift.org/browse/TF-1037 tracks this issue.
Related discussion: swiftlang/swift#28621 (comment)
@rxwei
Contributor

rxwei commented Dec 9, 2019

Your test cases didn't cover the scenario where duplicate `@derivative(of:)` attributes for the same function are defined in different files in the same Swift module.

@rxwei
Contributor

rxwei commented Dec 9, 2019

Once the restriction is lifted, the following should compile without issues:

// File A.swift
func foo(x: Float) -> Float
// File B.swift
@derivative(of: foo)
private func foo(x: Float) -> (value: Float, differential: ...)
// File C.swift
@derivative(of: foo)
private func foo(x: Float) -> (value: Float, differential: ...)

Only derivative registrations in the same Swift module that have overlapping access levels should be diagnosed.
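
A compilable version of that scenario might look like this (hypothetical derivative names and bodies, filling in the parts elided above):

// File A.swift
func foo(x: Float) -> Float { x }

// File B.swift
@derivative(of: foo)
private func fooDerivativeB(x: Float) -> (value: Float, differential: (Float) -> Float) {
  (value: foo(x: x), differential: { dx in dx })
}

// File C.swift
@derivative(of: foo)
private func fooDerivativeC(x: Float) -> (value: Float, differential: (Float) -> Float) {
  (value: foo(x: x), differential: { dx in dx })
}

Because each derivative is private to its own file, the two registrations' access scopes don't overlap, so neither should be diagnosed.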

@dan-zheng
Contributor Author

Only derivative registrations in the same Swift module that have overlapping access levels should be diagnosed.

Thanks for sharing the nuance, that makes sense! Let's definitely add a test for this in the proper patch for TF-1021.

@dan-zheng
Contributor Author

Confirmed that the extended test suite passes. Merging!
