functorch-based per sample gradients don't match with vanilla #511

Open
@ffuuugor

Description

See #510

One of the new tests introduced in PR #510 fails. When running a module with two different custom implementations of a "linear-like" layer, per sample gradients computed by functorch-based hooks don't match the per sample gradients obtained by microbatching.

Interesting observations:

  • gradients are mismatched for only one parameter tensor (out of 5)
  • gradients differ by a factor of exactly 2 (batch_size=64, so it's not a batch-size scaling issue)

I've verified the test and believe it is working correctly, so the mismatch is likely a genuine bug.
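For reference, the equivalence the test checks can be sketched with a plain `nn.Linear` via `torch.func` (the successor to functorch). This is a hypothetical minimal repro, not the custom "linear-like" layers from #510: per sample gradients from `vmap(grad(...))` should match a per-sample (microbatching) loop exactly.

```python
import torch
from torch.func import functional_call, grad, vmap

torch.manual_seed(0)
model = torch.nn.Linear(4, 2)
params = {k: v.detach() for k, v in model.named_parameters()}

x = torch.randn(8, 4)
y = torch.randn(8, 2)

def loss_fn(params, sample, target):
    # re-add a batch dim of 1, since we operate on a single sample
    pred = functional_call(model, params, (sample.unsqueeze(0),))
    return torch.nn.functional.mse_loss(pred, target.unsqueeze(0))

# vectorized per sample gradients: vmap over the batch dim of x and y
per_sample_grads = vmap(grad(loss_fn), in_dims=(None, 0, 0))(params, x, y)

# microbatching baseline: one gradient computation per sample
micro_grads = {k: [] for k in params}
for i in range(x.shape[0]):
    g = grad(loss_fn)(params, x[i], y[i])
    for k in params:
        micro_grads[k].append(g[k])
micro_grads = {k: torch.stack(v) for k, v in micro_grads.items()}

# the two methods must agree for every parameter tensor
for k in params:
    assert torch.allclose(per_sample_grads[k], micro_grads[k], atol=1e-6), k
```

In the failing test the same comparison holds for 4 of the 5 parameter tensors; the remaining one is off by a factor of 2, which points at a double-counted contribution in the functorch-based hook path rather than a tolerance issue.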

Labels: bug