Skip to content

[AutoDiff] Serialize derivative function configurations per module. #28608

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 6, 2019

Conversation

dan-zheng
Copy link
Contributor

@dan-zheng dan-zheng commented Dec 6, 2019

@differentiable and @derivative attributes register derivatives for
AbstractFunctionDecls for a particular "derivative function configuration":
parameter indices and derivative generic signature.

To find @derivative functions registered in other Swift modules, derivative
function configurations must be serialized per module. When configurations for
a AbstractFunctionDecl are requested, all configurations from imported
modules are deserialized. This module serialization technique has precedent: it
is used for protocol conformances (e.g. extension declarations for a nominal
type) and Obj-C members for a class type.

Add AbstractFunctionDecl::getDerivativeFunctionConfigurations entry point
for accessing derivative function configurations.

Use AbstractFunctionDecl::getDerivativeFunctionConfigurations to
implement findMinimalDerivativeConfiguration for canonical derivative
function configuration lookup, replacing getMinimalASTDifferentiableAttr.

Unblocks TF-835: lowering @derivative attributes directly to SIL
differentiability witnesses without generating implicit @differentiable
attributes.

@dan-zheng dan-zheng added the tensorflow This is for "tensorflow" branch PRs. label Dec 6, 2019
});
}

ArrayRef<AutoDiffConfig>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: even though @differentiable and @derivative attributes do not currently store result indices, I decided to use AutoDiffConfig (which contains result indices) to represent AST derivative function configurations. I also tried creating a new "ASTAutoDiffConfig" struct containing just parameter indices and derivative generic signature, but decided against the code duplication.

I think the direction is to eventually plumb result indices through the differentiation system (TF-1038), so preemptively using result indices (AutoDiffConfig) seems good.

@dan-zheng dan-zheng force-pushed the derivative-config-lookup branch 2 times, most recently from d38c72b to 7ab66b9 Compare December 6, 2019 14:08
`@differentiable` and `@derivative` attributes register derivatives for
`AbstractFunctionDecl`s for a particular "derivative function configuration":
parameter indices and dervative generic signature.

To find `@derivative` functions registered in other Swift modules, derivative
function configurations must be serialized per module. When configurations for
a `AbstractFunctionDecl` are requested, all configurations from imported
modules are deserialized. This module serialization technique has precedent: it
is used for protocol conformances (e.g. extension declarations for a nominal
type) and Obj-C members for a class type.

Add `AbstractFunctionDecl::getDerivativeFunctionConfigurations` entry point
for accessing derivative function configurations.

Use `AbstractFunctionDecl::getDerivativeFunctionConfigurations` to
implement `findMinimalDerivativeConfiguration` for canonical derivative
function configuration lookup, replacing `getMinimalASTDifferentiableAttr`.

Unblocks TF-815: lowering `@derivative` attributes directly to SIL
differentiability witnesses without generating implicit `@differentiable`
attributes.
@dan-zheng dan-zheng force-pushed the derivative-config-lookup branch from 7ab66b9 to ceb8ce0 Compare December 6, 2019 14:55
Copy link

@marcrasi marcrasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cooool!

Change `findMinimalDerivativeConfiguration` to return `Optional<AutoDiffConfig>`.
@dan-zheng
Copy link
Contributor Author

@swift-ci Please test tensorflow

@dan-zheng
Copy link
Contributor Author

Extended test suite passed. Merging to unblock progress.

@dan-zheng dan-zheng merged commit ee7644d into swiftlang:tensorflow Dec 6, 2019
@dan-zheng dan-zheng deleted the derivative-config-lookup branch December 6, 2019 21:09
@rxwei
Copy link
Contributor

rxwei commented Dec 6, 2019

Very nice. Thanks for pushing this through :)

dan-zheng added a commit to dan-zheng/swift that referenced this pull request Dec 6, 2019
Previously, `@derivative` attribute type-checking created implicit
`@differentiable` attributes on the original declaration. This was a
longstanding hack powering `@derivative` attribute derivative registration.

swiftlang#28608 made these changes:
- Derivative function configurations (from `@differentiable` and `@derivative`
  attributes) are serialized in modules and are loaded from imported modules.
- The differentiation transform uses these derivative function configurations
  for derivative function lookup instead of `@differentiable` attributes.

Now, `@derivative` attributes are directly lowered to differentiability
witnesses during SILGen, and implicit `@differentiable` attribute generation
is removed.

Type-checking changes:
- "Overlapping" `@differentiable` and `@derivative` attributes (for the same
  original declaration and parameter indices) are now disallowed. They
  semantically conflict because the first "requests derivative generation"
  while the second "registers a derivative".
  - "Overlapping" `@differentiable` and `@derivative` attributes are allowed
    for protocol requirements. Requirement `@differentiable` attributes
    mean "add JVP/VJP witness table entries" - not "request derivative
    generation", because there is no function body.
  - Note that relaxing the "overlapping" condition to consider derivative
    generic signatures is possible after derivative generic signature mangling
    for derivative functions: TF-680.

Resolves TF-835.

Unblocks TF-1021: lifting the "same-file derivative registration only"
limitation in `@derivative` attribute type-checking. This should be possible
without much work, but needs testing!

Exposes TF-1040: `@differentiable` attribute limitations for class methods.
Exposes TF-1041: untested protocol requirement `@differentiable` attribute
type-checking logic.
dan-zheng added a commit that referenced this pull request Dec 6, 2019
Previously, `@derivative` attribute type-checking created implicit
`@differentiable` attributes on the original declaration. This was a
longstanding hack powering `@derivative` attribute derivative registration.

#28608 made these changes:
- Derivative function configurations (from `@differentiable` and `@derivative`
  attributes) are serialized in modules and are loaded from imported modules.
- The differentiation transform uses these derivative function configurations
  for derivative function lookup instead of `@differentiable` attributes.

Now, `@derivative` attributes are directly lowered to differentiability
witnesses during SILGen, and implicit `@differentiable` attribute generation
is removed.

Type-checking changes:
- "Overlapping" `@differentiable` and `@derivative` attributes (for the same
  original declaration and parameter indices) are now disallowed. They
  semantically conflict because the first "requests derivative generation"
  while the second "registers a derivative".
  - "Overlapping" `@differentiable` and `@derivative` attributes are allowed
    for protocol requirements. Requirement `@differentiable` attributes
    mean "add JVP/VJP witness table entries" - not "request derivative
    generation", because there is no function body.
  - Note that relaxing the "overlapping" condition to consider derivative
    generic signatures is possible after derivative generic signature mangling
    for derivative functions: TF-680.

Resolves TF-835.

Unblocks TF-1021: lifting the "same-file derivative registration only"
limitation in `@derivative` attribute type-checking. This should be possible
without much work, but needs testing!

Exposes TF-1040: `@differentiable` attribute limitations for class methods.
Exposes TF-1041: untested protocol requirement `@differentiable` attribute
type-checking logic.
dan-zheng added a commit that referenced this pull request Dec 10, 2019
#28621)

Previously, `@derivative` attribute type-checking created implicit
`@differentiable` attributes on the original declaration. This was a
longstanding hack powering `@derivative` attribute derivative registration.

#28608 made these changes:
- Derivative function configurations (from `@differentiable` and `@derivative`
  attributes) are serialized in modules and are loaded from imported modules.
- The differentiation transform uses these derivative function configurations
  for derivative function lookup instead of `@differentiable` attributes.

Now, `@derivative` attributes are directly lowered to differentiability
witnesses during SILGen, and implicit `@differentiable` attribute generation
is removed. `@derivative` attributes are now serialized.

Resolves TF-835.

Unblocks TF-1021: lifting the "same-file derivative registration only"
limitation in `@derivative` attribute type-checking. This should be trivially
possible but requires more testing.

Exposes TF-1037: crash due to `@differentiable` + `@derivative` attribute with
different derivative generic signatures.

Exposes TF-1040: `@differentiable` attribute limitations for class methods.
Exposes TF-1041: untested protocol requirement `@differentiable` attribute
type-checking logic.

Tracks TF-1042: remove `ASTContext::{Differentiable,Derivative}Attrs` and use
`AbstractFunctionDecl::getDerivativeFunctionConfigurations` to detect duplicate
`@differentiable` + `@derivative` attributes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tensorflow This is for "tensorflow" branch PRs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants