🚀 Feature
Please make the Opacus grad_sampler compatible with torch.cat operations in activation functions.
Motivation
I've been trying to use the grad_sampler module with networks containing the CReLU activation function. However, CReLU concatenates the output of the layer with its negation (and then applies ReLU), thus doubling the effective output size of the layer. This can be very useful and parameter-saving in networks that tend to develop mirrored filters (see https://arxiv.org/pdf/1603.05201v2.pdf).
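For concreteness, here is roughly what I mean by CReLU. This is my own minimal sketch (the class name and default dim are my choices), not code from Opacus or the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CReLU(nn.Module):
    """Concatenated ReLU: relu([x, -x]), which doubles the feature dimension."""

    def __init__(self, dim: int = -1):
        super().__init__()
        self.dim = dim

    def forward(self, x):
        # Concatenating x with -x is the torch.cat that trips up the grad sampler.
        return F.relu(torch.cat([x, -x], dim=self.dim))
```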
Furthermore, using the CReLU activation function it is possible to initialize fully connected networks so that they appear linear at initialization (see the photo in Additional context). This has been shown to be an extremely powerful initialization pattern, allowing fully connected networks with over 200 layers to be trained. That's incredible! Typical fully connected networks often struggle to learn appreciably at even 20+ layers (see https://arxiv.org/pdf/1702.08591.pdf).
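To sketch the "looks linear" idea (my own illustration, not the exact scheme from the paper): if the linear layer that consumes a CReLU output is initialized with mirrored weights [W, -W], the composition reduces to W·x at initialization, since W·relu(x) − W·relu(−x) = W·x.

```python
import torch
import torch.nn as nn


def looks_linear_init(linear: nn.Linear) -> None:
    """Mirror-initialize a Linear layer that follows a CReLU (sketch only)."""
    out_f, in_f = linear.weight.shape
    assert in_f % 2 == 0, "input width should be the doubled CReLU output"
    with torch.no_grad():
        # The first half of the columns multiplies relu(x), the second half relu(-x);
        # making the second half the negation of the first gives W @ x at init.
        w = torch.randn(out_f, in_f // 2) * (2.0 / in_f) ** 0.5
        linear.weight.copy_(torch.cat([w, -w], dim=1))
        if linear.bias is not None:
            linear.bias.zero_()
```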
Because of the symmetric initialization, the discontinuities in the CReLU activation function are dramatically smaller than in comparable networks with ReLU or other activation functions. I've been studying gradient conditioning and stability in a variety of architectures using Opacus, but it's broken for activation functions that use torch.cat. In the case of CReLU, weight.grad_sample comes back half the size of the weight itself (ignoring the batch dimension); a rough reproduction sketch follows.
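Here is roughly the pattern that shows the problem for me. It's a hedged sketch (the layer sizes, the CReLU class from above, and the expected shapes are mine), assuming GradSampleModule from opacus.grad_sample is the right entry point:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from opacus.grad_sample import GradSampleModule


class CReLU(nn.Module):
    def forward(self, x):
        return F.relu(torch.cat([x, -x], dim=-1))


model = GradSampleModule(
    nn.Sequential(
        nn.Linear(8, 16),
        CReLU(),          # feature width 16 -> 32
        nn.Linear(32, 4),
    )
)

x = torch.randn(5, 8)     # batch of 5
model(x).sum().backward()

for name, p in model.named_parameters():
    # I expect grad_sample to have shape (batch, *p.shape); with CReLU in the
    # network, the second Linear's weight.grad_sample comes back with half the
    # expected size (ignoring the batch dimension).
    print(name, tuple(p.shape), tuple(p.grad_sample.shape))
```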
Pitch
Implementing (or fixing) Opacus grad_sampler compatibility with torch.cat would allow it to be used with a wider variety of activation functions, including CReLU, which would be really cool (see the Motivation section).
I didn't file this as a bug report because I'm not sure that torch.cat compatibility was ever intentionally implemented.
Alternatives
I can't think of any alternatives.
Additional context
Thank you for filing this issue and explaining it really well.
Can you please provide more details on the error you're getting? Specifically, can you provide a minimal reproducing example? We have Colab templates for a minimal example when you create the issue.