
To allow batchnorm layers with frozen running stats #472

Open
EmmyFang opened this issue Aug 17, 2022 · 3 comments
Labels: enhancement (New feature or request)

Comments

@EmmyFang

🚀 Feature

Is it possible for the privacy engine to allow batchnorm layers when we freeze their running stats (i.e., keep all batchnorm layers in .eval() mode)?

Motivation

When we want to use transfer learning and the pretrained model has batchnorm layers, it would be helpful if we could still use the privacy engine by freezing their running stats and treating the stats as constants.

Pitch

It would be helpful if the privacy engine allowed modules like batchnorm with frozen running stats, especially when we want to use pretrained models that contain batchnorm layers. Thank you in advance!

@pierrestock
Contributor

Hey EmmyFang,

Thanks for your interest. I think that's a good enhancement to include in a future release.

However, one question remains: how exactly do we initialize the (frozen) running statistics? If we use the defaults (mean 0, std 1), then we might as well replace this layer with a channel-wise affine transform (and either define its grad sample as usual or leverage functorch).
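For concreteness, here is a minimal sketch of what such a channel-wise affine layer could look like. The name `FrozenBatchNorm2d` and its defaults are illustrative only, not part of Opacus: the statistics live in buffers so they never receive gradients or updates, and only `weight` and `bias` are trainable parameters.

```python
import torch
from torch import nn


class FrozenBatchNorm2d(nn.Module):
    """Channel-wise affine transform with frozen (buffered) statistics.

    Sketch only: running mean/var are buffers, so they are never updated;
    only `weight` and `bias` are trained, so per-sample gradients can be
    computed for them as for any other affine layer.
    """

    def __init__(self, num_features: int, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(num_features))
        self.bias = nn.Parameter(torch.zeros(num_features))
        self.register_buffer("running_mean", torch.zeros(num_features))
        self.register_buffer("running_var", torch.ones(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Fold the frozen statistics into a per-channel scale and shift.
        scale = self.weight * (self.running_var + self.eps).rsqrt()
        shift = self.bias - self.running_mean * scale
        return x * scale[None, :, None, None] + shift[None, :, None, None]
```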

Pierre

@EmmyFang
Author

Hi Pierre,

I'm not sure if I understood your question correctly. I think replacing the BN layers with a channel-wise affine transform is a good idea, since the key is just to make sure that (1) we can load the pretrained information, and (2) the running stats will not get updated any further.

I'm imagining something like the following:

  • initialize a model (with the BN modules) and load the pretrained weights
  • for each BN module in the original model, initialize a channel-wise affine transformation module and load the mean and variance, as well as the weight and bias, from the BN module
  • replace the BN layers in the original model with the channel-wise affine transformations

The user can perform the first step to load the pretrained weights, and then ModuleValidator.fix() could provide an option to perform the second and third steps, as sketched below.
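A rough sketch of what steps 2 and 3 could look like, reusing the `FrozenBatchNorm2d` module sketched in the comment above. The `freeze_bn` helper and its integration into `ModuleValidator.fix()` are hypothetical, not existing Opacus API, and the torchvision loading step assumes the `weights=` argument (torchvision >= 0.13) and BN layers with affine parameters.

```python
import torch
from torch import nn
from torchvision import models

# FrozenBatchNorm2d is the module sketched in the comment above.


def freeze_bn(module: nn.Module) -> nn.Module:
    """Hypothetical helper: recursively replace BatchNorm2d layers with
    FrozenBatchNorm2d, copying the pretrained statistics and affine params."""
    for name, child in module.named_children():
        if isinstance(child, nn.BatchNorm2d):
            frozen = FrozenBatchNorm2d(child.num_features, eps=child.eps)
            # Step 2: load weight/bias and the running stats from the BN module.
            frozen.weight.data.copy_(child.weight.data)
            frozen.bias.data.copy_(child.bias.data)
            frozen.running_mean.copy_(child.running_mean)
            frozen.running_var.copy_(child.running_var)
            # Step 3: swap the BN layer for the frozen affine module.
            setattr(module, name, frozen)
        else:
            freeze_bn(child)
    return module


# Step 1: initialize the model and load the pretrained weights (BN stats included).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
# Steps 2 and 3: replace every BN layer while keeping the pretrained information.
model = freeze_bn(model)
```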

Thanks,
Emmy

@pierrestock
Contributor

Hey Emmy,

Apologies for the late reply. Yes, you understood my point correctly. We'll add this to our roadmap.

Thanks,
Pierre
