
[AutoDiff] Directly SILGen @derivative attributes to diff witnesses. #28621


Merged
8 commits merged on Dec 10, 2019

Conversation

dan-zheng
Contributor

dan-zheng commented Dec 6, 2019


Previously, `@derivative` attribute type-checking created implicit
`@differentiable` attributes on the original declaration. This was a
longstanding hack powering `@derivative` attribute derivative registration.

#28608 made these changes:
- Derivative function configurations (from `@differentiable` and `@derivative`
  attributes) are serialized in modules and are loaded from imported modules.
- The differentiation transform uses these derivative function configurations
  for derivative function lookup instead of `@differentiable` attributes.

Now, `@derivative` attributes are directly lowered to differentiability
witnesses during SILGen, and implicit `@differentiable` attribute generation
is removed.
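
For concreteness, here's a minimal sketch of the kind of registration this lowering handles (hypothetical function names, not from this patch):

func square(_ x: Float) -> Float { x * x }

// With this patch, this attribute is SILGen'd directly to a
// `sil_differentiability_witness` for `square` (with `squareVJP` as the VJP),
// instead of creating an implicit `@differentiable` attribute on `square`.
@derivative(of: square)
func squareVJP(_ x: Float) -> (value: Float, pullback: (Float) -> Float) {
  (value: square(x), pullback: { v in 2 * x * v })
}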

Type-checking changes:
- "Overlapping" `@differentiable` and `@derivative` attributes (for the same
  original declaration and parameter indices) are now disallowed. They
  semantically conflict because the first "requests derivative generation"
  while the second "registers a derivative".
  - "Overlapping" `@differentiable` and `@derivative` attributes are allowed
    for protocol requirements. Requirement `@differentiable` attributes
    mean "add JVP/VJP witness table entries" - not "request derivative
    generation", because there is no function body.
  - Note that relaxing the "overlapping" condition to consider derivative
    generic signatures is possible after derivative generic signature mangling
    for derivative functions: TF-680.
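
A minimal sketch of the overlap that the rule above rejects (hypothetical declarations):

@differentiable
func cube(_ x: Float) -> Float { x * x * x }

// Now rejected: same original declaration and same parameter indices as the
// `@differentiable` attribute above.
@derivative(of: cube)
func cubeVJP(_ x: Float) -> (value: Float, pullback: (Float) -> Float) {
  (value: cube(x), pullback: { v in 3 * x * x * v })
}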

Resolves TF-835.

Unblocks TF-1021: lifting the "same-file derivative registration only"
limitation in `@derivative` attribute type-checking. This should be possible
without much work, but needs testing!

Exposes TF-1037: crash due to missing derivative generic signature mangling (TF-680).
Exposes TF-1040: `@differentiable` attribute limitations for class methods.
Exposes TF-1041: untested protocol requirement `@differentiable` attribute
type-checking logic.
dan-zheng added the tensorflow label ("This is for \"tensorflow\" branch PRs.") Dec 6, 2019
dan-zheng requested review from rxwei and marcrasi December 6, 2019 21:35
Fix "a derivative already exists" error for `Layer.inferring(from:)`.
@rxwei
Contributor

rxwei commented Dec 6, 2019

"Overlapping" @differentiable and @derivative attributes (for the same
original declaration and parameter indices) are now disallowed.

This doesn't make sense to me.

In protocol conformances, it is perfectly legal to declare a conformance and then provide implementations in a different extension in the same module. Differentiable functions should behave the same way.
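
A sketch of that analogy with a plain protocol (hypothetical types):

protocol Greeter {
  func greet() -> String
}

struct Robot {}

// The conformance is declared here...
extension Robot: Greeter {}

// ...and the witness is provided in a different extension in the same module.
extension Robot {
  func greet() -> String { "beep" }
}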

@rxwei
Contributor

rxwei commented Dec 6, 2019

They semantically conflict because the first "requests derivative generation" while the second "registers a derivative".

The semantics of @differentiable is not requesting derivative generation. It is declaring a function to be differentiable. With a declaration, the compiler then tries to verify it, either by differentiating it, or by checking whether there's a derivative (@derivative(of:)) already.
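
Under those semantics, both verification paths would be acceptable; a sketch with hypothetical functions:

@differentiable // verified by differentiating the body
func double(_ x: Float) -> Float { 2 * x }

@differentiable // verified by finding the registered derivative below
func relu(_ x: Float) -> Float { x < 0 ? 0 : x }

@derivative(of: relu)
func reluVJP(_ x: Float) -> (value: Float, pullback: (Float) -> Float) {
  (value: relu(x), pullback: { v in x < 0 ? 0 : v })
}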

@dan-zheng
Contributor Author

dan-zheng commented Dec 6, 2019

The semantics of @differentiable is not requesting derivative generation. It is declaring a function to be differentiable. With a declaration, the compiler then tries to verify it, either by differentiating it, or by checking whether there's a derivative (@derivative(of:)) already.

That's a good point, I think I agree! This type-checking change seems misguided: the original motivation was to avoid a crash regarding "`@differentiable` + `@derivative` attribute for the same original declaration and parameter indices but different derivative generic signatures".

Reproducer:

protocol P: Differentiable {}
extension P {
  @differentiable
  func foo() -> Float { 1 }
}

extension P {
  @derivative(of: foo)
  func vjpFoo() -> (value: Float, pullback: (Float) -> (TangentVector)) {
    fatalError()
  }
}

Previously, @derivative attribute type-checking looked for a @differentiable attribute with the same parameter indices and filled in the JVP/VJP.

// differentiability witness for P.foo()
sil_differentiability_witness hidden [parameters 0] [results 0] @$s1p1PPAAE3fooSfyF : $@convention(method) <Self where Self : P> (@in_guaranteed Self) -> Float {
  vjp: @AD__$s1p1PPAAE3fooSfyF__vjp_src_0_wrt_0 : $@convention(method) <τ_0_0 where τ_0_0 : P> (@in_guaranteed τ_0_0) -> (Float, @owned @callee_guaranteed (Float) -> @out τ_0_0.TangentVector)
}

This generated a single differentiability witness, no problem.


However, with the current logic in this patch, the `@differentiable` attribute has no derivative generic signature, while the `@derivative` attribute does have one (the generic signature of the `@derivative` declaration is used, which is `<Self where Self : P>`).

This causes two differentiability witnesses to be SILGen'd, each with a different derivative generic signature:

// differentiability witness for P.foo()
sil_differentiability_witness hidden [parameters 0] [results 0] @$s1p1PPAAE3fooSfyF : $@convention(method) <Self where Self : P> (@in_guaranteed Self) -> Float {
}

// differentiability witness for P.foo()
sil_differentiability_witness hidden [parameters 0] [results 0] <Self where Self : P> @$s1p1PPAAE3fooSfyF : $@convention(method) <Self where Self : P> (@in_guaranteed Self) -> Float {
  vjp: @AD__$s1p1PPAAE3fooSfyF__vjp_src_0_wrt_0 : $@convention(method) <τ_0_0 where τ_0_0 : P> (@in_guaranteed τ_0_0) -> (Float, @owned @callee_guaranteed (Float) -> @out τ_0_0.TangentVector)
}

Since derivative generic signature mangling is not yet implemented for derivative functions (TF-680), there's a name conflict issue for derivative functions with the same parameter indices but different derivative generic signatures.

SILGen generates a VJP thunk for the second differentiability witness above, called `AD__$s1p1PPAAE3fooSfyF__vjp_src_0_wrt_0`. The differentiation transform then attempts to create an empty VJP function for the first differentiability witness above, but encounters a name clash:

// AD__$s1p1PPAAE3fooSfyF__vjp_src_0_wrt_0
sil hidden [thunk] [always_inline] [ossa] @AD__$s1p1PPAAE3fooSfyF__vjp_src_0_wrt_0 : $@convention(method) <τ_0_0 where τ_0_0 : P> (@in_guaranteed τ_0_0) -> (Float, @owned @callee_guaranteed (Float) -> @out τ_0_0.TangentVector) {
// %0                                             // user: %2
bb0(%0 : $*τ_0_0):
  // function_ref P.vjpFoo()
  %1 = function_ref @$s1p1PPAAE6vjpFooSf5value_13TangentVectorQzSfc8pullbacktyF : $@convention(method) <τ_0_0 where τ_0_0 : P> (@in_guaranteed τ_0_0) -> (Float, @owned @callee_guaranteed (Float) -> @out τ_0_0.TangentVector) // user: %2
  %2 = apply %1<τ_0_0>(%0) : $@convention(method) <τ_0_0 where τ_0_0 : P> (@in_guaranteed τ_0_0) -> (Float, @owned @callee_guaranteed (Float) -> @out τ_0_0.TangentVector) // user: %3
  return %2 : $(Float, @callee_guaranteed (Float) -> @out τ_0_0.TangentVector) // id: %3
} // end sil function 'AD__$s1p1PPAAE3fooSfyF__vjp_src_0_wrt_0'

Assertion failed: (!entry->getValue() && "function already exists"), function create, file /Users/danielzheng/swift-bart/swift/lib/SIL/SILFunction.cpp, line 74.
Stack dump:
0.	Program arguments: /Users/danielzheng/swift-bart/build/Ninja-ReleaseAssert+stdlib-Release/swift-macosx-x86_64/bin/swift -frontend -interpret p.swift -enable-objc-interop -sdk /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk -color-diagnostics -module-name p
1.	Swift version 5.1.1-dev (Swift f70940798d)
2.	While running pass #31 SILModuleTransform "Differentiation".
3.	While processing // differentiability witness for P.foo()
sil_differentiability_witness hidden [parameters 0] [results 0] @$s1p1PPAAE3fooSfyF : $@convention(method) <Self where Self : P> (@in_guaranteed Self) -> Float {
}

According to the semantics of `@differentiable`, the correct fix seems to be to "find a way to continue generating a single differentiability witness" for this program. I briefly tried various techniques without success. I can try again if we agree this is the right approach.


Tangents:

  • Derivative generic signature mangling (TF-680) is not a robust fix for this
    program because, with the current logic, it would actually trigger VJP
    generation for the first differentiability witness, which seems surprising
    (`P.foo` may not even be differentiable).

  • Current crashers like TF-1037 should be fixed by creating multiple
    differentiability witnesses when `@differentiable` and `@derivative`
    derivative generic signatures are truly different.

@rxwei
Contributor

rxwei commented Dec 6, 2019

If there's a parameter indices mismatch or generic signature mismatch between a @differentiable and a @derivative(of:), they should be treated as being completely separate.

If parameter indices and generic signatures do match, then preventively check whether such a differentiability witness already exists before adding a new one.
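
A toy Swift sketch of that existence check (purely illustrative; the real check would live in the compiler's C++ witness-emission code):

struct WitnessKey: Hashable {
  var originalName: String                // mangled name of the original function
  var parameterIndices: [Int]             // differentiability parameter indices
  var derivativeGenericSignature: String? // nil if none
}

var emittedWitnesses = Set<WitnessKey>()

/// Returns true if a witness for this configuration should be emitted,
/// false if one with the same key already exists.
func shouldEmitWitness(for key: WitnessKey) -> Bool {
  emittedWitnesses.insert(key).inserted
}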

llvm::DenseMap<
    std::tuple<Decl *, IndexSubset *, AutoDiffDerivativeFunctionKind>,
    DerivativeAttr *>
    DerivativeAttrs;

Would it be possible to use the derivative config list that you created in the previous serialization PR, so that we do not have to maintain another list?

Could the derivative config list actually also replace DifferentiableAttrs?

Contributor Author

dan-zheng, Dec 6, 2019

I wondered the same. The derivative function configuration doesn't currently store where the configuration came from (a `@differentiable` attribute or a `@derivative` attribute), and that information is important for users of `DifferentiableAttrs` and `DerivativeAttrs`, which check for duplicate attributes of a specific kind, not just configurations.

I haven't thought deeply about this. I'll file an issue tracking this question if it isn't resolved by the time this PR is merged.


Edit: filed TF-1042 to track "Investigate removing/moving `ASTContext::{Differentiable,Derivative}Attrs`".

Using AbstractFunctionDecl::getDerivativeFunctionConfigurations to detect duplicate @differentiable and @derivative attributes may be significant for cross-file duplicate derivative registration (TF-1021).

If you import a derivative for func foo, you shouldn't be able to register a new derivative for func foo with the same configuration.

This requires changing AbstractFunctionDecl::getDerivativeFunctionConfigurations to return more information than ArrayRef<AutoDiffConfig>.

  • Minimally, it needs to return an OptionSet per AutoDiffConfig, specifying
    where the AutoDiffConfig came from (a sketch follows the diagnostic example below):
    • @differentiable attribute
    • @derivative JVP
    • @derivative VJP
    • Any combination of the above (three bits)
  • For good "duplicate attribute" diagnostics, it also needs to store something
    from which we can recover a `@differentiable`/`@derivative` attribute SourceLoc.
dup.swift:1:2: error: duplicate '@differentiable' attribute with same parameters
@differentiable
~^~~~~~~~~~~~~~
dup.swift:2:2: note: other attribute declared here << need SourceLoc to generate this note
@differentiable
 ^
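
A hypothetical Swift sketch of that per-configuration origin set (illustrative names only; the real type would be C++ in the compiler):

struct DerivativeConfigOrigin: OptionSet {
  let rawValue: UInt8
  static let differentiableAttr = DerivativeConfigOrigin(rawValue: 1 << 0)
  static let derivativeJVP = DerivativeConfigOrigin(rawValue: 1 << 1)
  static let derivativeVJP = DerivativeConfigOrigin(rawValue: 1 << 2)
}

// A configuration registered by both a JVP and a VJP `@derivative` attribute:
let origin: DerivativeConfigOrigin = [.derivativeJVP, .derivativeVJP]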

- Revert `@differentiable` + `@derivative` attribute type-checking changes.
  `@differentiable` + `@derivative` attributes with the same original declaration
  and parameter indices are not in conflict.
- Simplify TBDGen for AutoDiff symbols using
  `AbstractFunctionDecl::getDerivativeFunctionConfigurations`.
- Add TF-1037 negative tests.
dan-zheng requested a review from marcrasi December 7, 2019 02:07
@dan-zheng
Contributor Author

The extended test suite fails weirdly:

/swift-models/FastStyleTransfer/Models/TransformerNet.swift:49:32: error: type of expression is ambiguous without more context
        let convolved1 = input.sequenced(through: conv1, in1, relu)
                         ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/swift-models/FastStyleTransfer/Models/TransformerNet.swift:91:22: error: type of expression is ambiguous without more context
        return input.sequenced(through: reflectionPad, conv2d)
               ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/swift-models/FastStyleTransfer/Models/TransformerNet.swift:125:30: error: type of expression is ambiguous without more context
        return input + input.sequenced(through: conv1, in1, relu, conv2, in2)
                       ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/swift-models/FastStyleTransfer/Models/TransformerNet.swift:174:29: error: type of expression is ambiguous without more context
        return resizedInput.sequenced(through: reflectionPad, conv2d)
               ~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This may be because this PR updates the commit hash for tensorflow/swift-apis. The `type of expression is ambiguous without more context` errors may be unrelated to this patch; I'll create a PR that just updates tensorflow/swift-apis to verify.

@dan-zheng
Contributor Author

The `type of expression is ambiguous without more context` errors are fixed in 571e9d9.

The fix: protocol witness `@differentiable` attributes in non-primary files need to be type-checked via `DifferentiableAttr::getParameterIndices` so that the witness derivative configurations are updated. This makes sense, and similar logic exists in `SILFunctionBuilder::addFunctionAttributes`.


Ideally, the diagnostics should have read `candidate is missing attribute '@differentiable'`. Poor diagnostics involving associated type inference are tracked by TF-1014. I investigated TF-1014 and left some details in a comment.

auto implicitDiffAttr = false;
if (reqDiffAttrSupersetMatch) {
  auto *witnessAFD = cast<AbstractFunctionDecl>(witness);
  (void)reqDiffAttr->getParameterIndices();
Contributor Author
Note: I found that calling `DifferentiableAttr::getParameterIndices` on witness attributes was sufficient to make test/AutoDiff/differentiable_attr_type_checking_primary_file.swift pass. For good measure, I called it on requirement attributes too.

@dan-zheng
Contributor Author

@swift-ci Please clean test tensorflow

Fix test/Serialization/derivative_attr.swift.
@dan-zheng
Contributor Author

@swift-ci Please test tensorflow

@dan-zheng
Contributor Author

@swift-ci Please test tensorflow

@rxwei
Contributor

rxwei commented Dec 9, 2019

Does this implementation handle duplicate but private-to-separate-files derivatives?

@dan-zheng
Contributor Author

dan-zheng commented Dec 9, 2019

Does this implementation handle duplicate but private-to-separate-files derivatives?

I believe the `@derivative` attribute SILGen logic should work correctly for that case: the `SILModule` for each file should have its own differentiability witness.

The same-file derivative registration limitation hasn't been lifted from Sema yet (TF-1021), but I verified that it can be lifted trivially after this patch. Lifting the restriction requires more testing and will be the next follow-up.

@dan-zheng
Contributor Author

dan-zheng commented Dec 9, 2019

This patch causes a fastai/fastai_dev breakage:

// AD__$s27FastaiNotebook_07_batchnorm22LearningPhaseDependentPAAE7forwardy6OutputQz5InputQzF__vjp_src_0_wrt_0_1
sil [always_inline] [ossa] @AD__$s27FastaiNotebook_07_batchnorm22LearningPhaseDependentPAAE7forwardy6OutputQz5InputQzF__vjp_src_0_wrt_0_1 : $@convention(method) <τ_0_0 where τ_0_0 : LearningPhaseDependent> (@in_guaranteed τ_0_0.Input, @in_guaranteed τ_0_0) -> (@out τ_0_0.Output, @owned @callee_guaranteed (@in_guaranteed τ_0_0.Output.TangentVector) -> (@out τ_0_0.Input.TangentVector, @out τ_0_0.TangentVector)) {
// %0                                             // user: %4
// %1                                             // user: %4
// %2                                             // user: %4
bb0(%0 : $*τ_0_0.Output, %1 : $*τ_0_0.Input, %2 : $*τ_0_0):
  // function_ref LearningPhaseDependent.gradForward(_:)
  %3 = function_ref @$s27FastaiNotebook_07_batchnorm22LearningPhaseDependentPAAE11gradForwardy6OutputQz5value_13TangentVectorQz_5Input_AHQZtAE_AHQZc8pullbacktAJQzF : $@convention(method) <τ_0_0 where τ_0_0 : LearningPhaseDependent> (@in_guaranteed τ_0_0.Input, @in_guaranteed τ_0_0) -> (@out τ_0_0.Output, @owned @callee_guaranteed (@in_guaranteed τ_0_0.Output.TangentVector) -> (@out τ_0_0.TangentVector, @out τ_0_0.Input.TangentVector)) // user: %4
  %4 = apply %3<τ_0_0>(%0, %1, %2) : $@convention(method) <τ_0_0 where τ_0_0 : LearningPhaseDependent> (@in_guaranteed τ_0_0.Input, @in_guaranteed τ_0_0) -> (@out τ_0_0.Output, @owned @callee_guaranteed (@in_guaranteed τ_0_0.Output.TangentVector) -> (@out τ_0_0.TangentVector, @out τ_0_0.Input.TangentVector)) // user: %6
  // function_ref AD__$s6Output_13TangentVectorQZABQz5Input_ABQZIegnrr_AcfDIegnrr_27FastaiNotebook_07_batchnorm22LearningPhaseDependentRzlTR_pullback_self_reordering_thunk
  %5 = function_ref @AD__$s6Output_13TangentVectorQZABQz5Input_ABQZIegnrr_AcfDIegnrr_27FastaiNotebook_07_batchnorm22LearningPhaseDependentRzlTR_pullback_self_reordering_thunk : $@convention(thin) <τ_0_0 where τ_0_0 : LearningPhaseDependent> (@in_guaranteed τ_0_0.Output.TangentVector, @guaranteed @callee_guaranteed (@in_guaranteed τ_0_0.Output.TangentVector) -> (@out τ_0_0.TangentVector, @out τ_0_0.Input.TangentVector)) -> (@out τ_0_0.Input.TangentVector, @out τ_0_0.TangentVector) // user: %6
  %6 = partial_apply [callee_guaranteed] %5<τ_0_0>(%4) : $@convention(thin) <τ_0_0 where τ_0_0 : LearningPhaseDependent> (@in_guaranteed τ_0_0.Output.TangentVector, @guaranteed @callee_guaranteed (@in_guaranteed τ_0_0.Output.TangentVector) -> (@out τ_0_0.TangentVector, @out τ_0_0.Input.TangentVector)) -> (@out τ_0_0.Input.TangentVector, @out τ_0_0.TangentVector) // user: %7
  return %6 : $@callee_guaranteed (@in_guaranteed τ_0_0.Output.TangentVector) -> (@out τ_0_0.Input.TangentVector, @out τ_0_0.TangentVector) // id: %7
} // end sil function 'AD__$s27FastaiNotebook_07_batchnorm22LearningPhaseDependentPAAE7forwardy6OutputQz5InputQzF__vjp_src_0_wrt_0_1'

swift: /swift-base/swift/lib/SIL/SILFunction.cpp:74: static swift::SILFunction *swift::SILFunction::create(swift::SILModule &, swift::SILLinkage, llvm::StringRef, swift::CanSILFunctionType, swift::GenericEnvironment *, Optional<swift::SILLocation>, swift::IsBare_t, swift::IsTransparent_t, swift::IsSerialized_t, swift::ProfileCounter, swift::IsDynamicallyReplaceable_t, swift::IsExactSelfClass_t, swift::IsThunk_t, swift::SubclassScope, swift::Inline_t, swift::EffectsKind, swift::SILFunction *, const swift::SILDebugScope *): Assertion `!entry->getValue() && "function already exists"' failed.
Stack dump:
0.	Program arguments: /swift-tensorflow-toolchain/usr/bin/swift -frontend -c /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/00_load_data.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/01_matmul.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/01a_fastai_layers.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/02_fully_connected.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/02a_why_sqrt5.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/03_minibatch_training.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/04_callbacks.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/05_anneal.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/05b_early_stopping.swift /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/06_cuda.swift -primary-file /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/07_batchnorm.swift -emit-module-path /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug/FastaiNotebook_07_batchnorm.build/07_batchnorm~partial.swiftmodule -emit-module-doc-path /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug/FastaiNotebook_07_batchnorm.build/07_batchnorm~partial.swiftdoc -emit-module-source-info-path /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug/FastaiNotebook_07_batchnorm.build/07_batchnorm~partial.swiftsourceinfo -emit-dependencies-path /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug/FastaiNotebook_07_batchnorm.build/07_batchnorm.d -emit-reference-dependencies-path /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug/FastaiNotebook_07_batchnorm.build/07_batchnorm.swiftdeps -target x86_64-unknown-linux-gnu -disable-objc-interop -I /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug -enable-testing -g -module-cache-path /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug/ModuleCache -swift-version 4.2 -Onone -D SWIFT_PACKAGE -D DEBUG -enable-anonymous-context-mangled-names -parse-as-library -module-name FastaiNotebook_07_batchnorm -o /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug/FastaiNotebook_07_batchnorm.build/07_batchnorm.swift.o -index-store-path /fastai_dev/swift/FastaiNotebook_11_imagenette/.build/x86_64-unknown-linux-gnu/debug/index/store -index-system-modules 
1.	Swift version 5.1.1-dev (Swift 571e9d9692)
2.	While running pass #4630 SILModuleTransform "Differentiation".
3.	While processing // differentiability witness for LearningPhaseDependent.forward(_:)
sil_differentiability_witness [serialized] [parameters 0 1] [results 0] @$s27FastaiNotebook_07_batchnorm22LearningPhaseDependentPAAE7forwardy6OutputQz5InputQzF : $@convention(method) <Self where Self : LearningPhaseDependent> (@in_guaranteed Self.Input, @in_guaranteed Self) -> @out Self.Output {
}

 on SIL function "@$s27FastaiNotebook_07_batchnorm22LearningPhaseDependentPAAE7forwardy6OutputQz5InputQzF".
 for 'forward(_:)' (at /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/07_batchnorm.swift:32:12)
4.	While creating SIL function "@AD__$s27FastaiNotebook_07_batchnorm22LearningPhaseDependentPAAE7forwardy6OutputQz5InputQzF__vjp_src_0_wrt_0_1".
 for 'gradForward(_:)' (at /fastai_dev/swift/FastaiNotebook_07_batchnorm/Sources/FastaiNotebook_07_batchnorm/07_batchnorm.swift:40:5)

The issue seems similar to #28621 (comment) and should be fixable with a patch to fastai/fastai_dev. I'll investigate.


Workaround in fastai/fastai_dev#309.
TF-1037 tracks the underlying issue (@differentiable and @derivative attributes with different derivative generic signatures); negative tests for TF-1037 are added in this patch.

dan-zheng added a commit to fastai/fastai_dev that referenced this pull request Dec 9, 2019
Work around issues with `@differentiable` + `@derivative` attributes with
different derivative generic signatures.

Related discussion: swiftlang/swift#28621 (comment)
https://bugs.swift.org/browse/TF-1037 tracks this issue.
dan-zheng added a commit to fastai/fastai_dev that referenced this pull request Dec 9, 2019
…_:)`. (#309)

Work around issues with `@differentiable` + `@derivative` attributes with
different derivative generic signatures.

Automatic differentiation can handle this enum `switch` now, so a custom
derivative is no longer necessary.

https://bugs.swift.org/browse/TF-1037 tracks this issue.
Related discussion: swiftlang/swift#28621 (comment)
@rxwei
Contributor

rxwei commented Dec 9, 2019

Your test cases didn't cover the scenario where duplicate `@derivative(of:)` attributes for the same function are defined in different files in the same Swift module.

@rxwei
Contributor

rxwei commented Dec 9, 2019

Once the restriction is lifted, the following should compile without issues:

// File A.swift
func foo(x: Float) -> Float
// File B.swift
@derivative(of: foo)
private func foo(x: Float) -> (value: Float, differential: ...)
// File C.swift
@derivative(of: foo)
private func foo(x: Float) -> (value: Float, differential: ...)

Only derivative registrations in the same Swift module that have overlapping access levels should be diagnosed.
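
A compilable version of that scenario might look like this (hypothetical derivative names and bodies, filling in the parts elided above):

// File A.swift
func foo(x: Float) -> Float { x }

// File B.swift
@derivative(of: foo)
private func fooDerivativeB(x: Float) -> (value: Float, differential: (Float) -> Float) {
  (value: foo(x: x), differential: { dx in dx })
}

// File C.swift
@derivative(of: foo)
private func fooDerivativeC(x: Float) -> (value: Float, differential: (Float) -> Float) {
  (value: foo(x: x), differential: { dx in dx })
}

Because each derivative is private to its own file, the two registrations' access scopes don't overlap, so neither should be diagnosed.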

@dan-zheng
Contributor Author

Only derivative registrations in the same Swift module that have overlapping access levels should be diagnosed.

Thanks for sharing the nuance, that makes sense! Let's definitely add a test for this in the proper patch for TF-1021.

@dan-zheng
Contributor Author

Confirmed that the extended test suite passes. Merging!
