
GoogleNet and Inception return different types depending on train / eval mode #1273

Open
fmassa opened this issue Aug 29, 2019 · 3 comments


fmassa commented Aug 29, 2019

GoogLeNet and Inception have two different return types, depending on whether the model is in train or eval mode.

This divergence exists only because the original models have auxiliary classifiers, which are used during training but not for inference.
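To make the divergence concrete, here is a hypothetical, dependency-free sketch of the pattern (plain floats and a `training` flag stand in for real tensors, layers, and `nn.Module.training`; the `GoogLeNetOutputs` namedtuple mirrors the field names torchvision uses, but everything else is illustrative):

```python
from collections import namedtuple

# Simplified stand-in for torchvision's GoogLeNet forward logic.
GoogLeNetOutputs = namedtuple(
    "GoogLeNetOutputs", ["logits", "aux_logits2", "aux_logits1"]
)

class ToyGoogLeNet:
    def __init__(self):
        self.training = True  # mimics nn.Module.training

    def forward(self, x):
        main = x * 2.0   # stand-in for the main classifier head
        aux1 = x * 0.5   # stand-in for the first auxiliary head
        aux2 = x * 0.7   # stand-in for the second auxiliary head
        if self.training:
            # train mode: a namedtuple carrying the auxiliary logits
            return GoogLeNetOutputs(main, aux2, aux1)
        # eval mode: just the main logits
        return main

model = ToyGoogLeNet()
train_out = model.forward(1.0)  # namedtuple with three fields
model.training = False
eval_out = model.forward(1.0)   # single value
print(type(train_out).__name__, type(eval_out).__name__)
```

The two calls return entirely different types, which is exactly what makes a single static return annotation (and hence scripting) awkward.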

The current structure of the models can be seen as an artifact of how they were trained.
Another way of looking at it: if we could return arbitrary intermediate outputs, then during training the auxiliary classifiers could be plugged in outside of the model and the losses computed there, so the model itself would not contain the auxiliary classifiers at all.
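A hypothetical sketch of that alternative design, again with plain floats standing in for tensors and layers (the feature names and the 0.3 aux-loss weight are illustrative assumptions, loosely echoing the original GoogLeNet paper):

```python
# The model exposes intermediate outputs; the training script owns
# the auxiliary heads and combines the losses itself.
def toy_backbone(x):
    inception_4a = x + 1.0   # stand-in intermediate feature
    inception_4d = x + 2.0   # stand-in intermediate feature
    logits = x * 3.0         # main head output
    return logits, {"4a": inception_4a, "4d": inception_4d}

def aux_head(feature):
    # stand-in auxiliary classifier, defined outside the model
    return feature * 0.1

logits, feats = toy_backbone(1.0)
aux1 = aux_head(feats["4a"])
aux2 = aux_head(feats["4d"])
# combined training objective; at inference time only `logits` is used
loss = logits + 0.3 * (aux1 + aux2)
```

With this structure the model's return type is the same in train and eval mode, and the auxiliary machinery lives entirely in the training code.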

We should consider if we would be willing to change this, which would be a BC-breaking change.

Advantages:

  • simplifies torchscript support for those models
  • unifies the return value of the model during training and testing

Disadvantages:

  • BC-breaking
  • training those models would require custom code to add the auxiliary heads. This is not a real disadvantage, though, because we haven't reproduced those models with the current training code anyway.

eellison commented Aug 29, 2019

+1 from a Torchscript perspective.

The only way to enable scripting currently would be to introduce a divergence in behavior between the eager and scripted models, which isn't ideal.

Maybe we could add a separate train_with_logits method that returns the tuple, or something along those lines.
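A hypothetical sketch of that suggestion (`train_with_logits` and the toy heads are illustrative, not an existing torchvision API): `forward` always returns the main output, and the tuple is available only through the dedicated method.

```python
class ToyInception:
    def forward(self, x):
        return self._main(x)  # single, stable return type

    def train_with_logits(self, x):
        # tuple only when the caller explicitly asks for it
        return self._main(x), self._aux(x)

    def _main(self, x):
        return x * 2.0  # stand-in main classifier

    def _aux(self, x):
        return x * 0.5  # stand-in auxiliary classifier

m = ToyInception()
out = m.forward(1.0)                  # always one value
main, aux = m.train_with_logits(1.0)  # tuple for the training loop
```

Since `forward` then has a single return type, scripting it becomes straightforward, at the cost of training loops needing to call a different method.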

@eellison

Would it be possible to make a decision on this ?


fmassa commented Sep 19, 2019

Given that there is no way to have a soft transition period, we will not make the hard BC-breaking change that this issue proposes.

Instead, in order to support torchscript, we might need to give those models different behavior depending on whether or not they are being scripted.
This is not ideal, but it would be a compromise.
