
Add vision encoder decoder model in exporters #588

Merged

Conversation

@mht-sharma (Contributor) commented Dec 14, 2022

What does this PR do?

Adds support for exporting vision-encoder-decoder models to ONNX in Optimum.

Changes

  • Adds a generic class for exporting encoder-decoder type models (doc)
  • Adds ONNX export for vision-encoder-decoder models using the above class (see the example sketch below)
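
For illustration, a minimal export invocation could look like the following. This is a sketch: it drives the exporters CLI from Python, uses the tiny test checkpoint added in this PR, and assumes a hypothetical output directory; depending on the optimum version, an explicit --task flag may also be needed.

# Sketch: export a vision-encoder-decoder checkpoint to ONNX via the
# exporters CLI. The checkpoint is the tiny test model added in this PR;
# "vit_gpt2_onnx/" is a hypothetical output directory.
import subprocess
import sys

subprocess.run(
    [
        sys.executable, "-m", "optimum.exporters.onnx",
        "--model", "hf-internal-testing/tiny-random-VisionEncoderDecoderModel-vit-gpt2",
        "vit_gpt2_onnx/",
    ],
    check=True,
)

The encoder and decoder are exported as separate ONNX files (see the discussion of the normalized config below).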

Limitations

  • Past key values export is not supported with TrOCR
  • The Donut model is not supported

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@HuggingFaceDocBuilderDev commented Dec 14, 2022

The documentation is not available anymore as the PR was closed or merged.

@mht-sharma force-pushed the add-support-vision-encoder-decoder-models branch from d6fef2f to 7e5932e on January 23, 2023 12:17
@mht-sharma changed the title from "Add support vision encoder decoder models" to "Add vision encoder decoder model in exporters" on January 23, 2023
@mht-sharma marked this pull request as ready for review on January 23, 2023 12:35
@mht-sharma force-pushed the add-support-vision-encoder-decoder-models branch from d333eb4 to 7f3b18f on January 23, 2023 12:45
@@ -110,6 +110,7 @@
"speech-to-text": "hf-internal-testing/tiny-random-Speech2TextModel",
"xlm": "hf-internal-testing/tiny-random-XLMModel",
"xlm-roberta": "hf-internal-testing/tiny-xlm-roberta",
"vision-encoder-decoder": "hf-internal-testing/tiny-random-VisionEncoderDecoderModel-vit-gpt2",
Contributor:

Could you add a test for trocr and donut-swin as well?

Contributor Author:

Added for trocr; donut-swin has a threshold issue while exporting, I will look into it separately.

Contributor:

Then could we not add it in tasks.py? And handle adding donut-swin in a separate PR altogether?

Contributor Author:

The test model has a high atol requirement (4e-2). The export for the model referenced in the documentation is slow but works with a 1e-3 threshold, so I am not putting that one here.
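
For context, export validation compares the PyTorch and ONNX outputs within an absolute tolerance; the made-up numbers below just illustrate what passing at 4e-2 but failing at 1e-3 means.

# Illustrative only (made-up outputs, not values from the actual test):
import numpy as np

torch_out = np.array([0.500, 0.100])
onnx_out = np.array([0.510, 0.095])  # hypothetical drift after ONNX export

print(np.allclose(torch_out, onnx_out, atol=1e-3))  # False: fails a 1e-3 check
print(np.allclose(torch_out, onnx_out, atol=4e-2))  # True: passes a 4e-2 check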

Contributor:

What do you mean by the model in the doc?

Contributor:

I'd just like what is said to be supported in tasks.py to actually be supported, and we can make sure of it by adding a unit test.

Contributor Author:

Commented out Donut. Will handle this separately.

behavior=behavior,
)

from ..tasks import TasksManager
Contributor:

Could we have it at the top level?

Contributor Author:

It causes a circular import at the top level.

Contributor:

This could mean that some refactoring is needed, I guess.

Contributor Author:

Deferring the import would still be OK IMO, but if there is a better way I will look into it.

Comment on lines 104 to 118
def __getattr__(self, attr_name):
    if (
        self.ENCODER_CONFIG is not None
        and self.ENCODER_NORMALIZED_CONFIG_CLASS is not None
        and attr_name.upper() in dir(self.ENCODER_NORMALIZED_CONFIG_CLASS)
    ):
        return self.ENCODER_NORMALIZED_CONFIG_CLASS.__getattr__(attr_name)
    if (
        self.DECODER_CONFIG is not None
        and self.DECODER_NORMALIZED_CONFIG_CLASS is not None
        and attr_name.upper() in dir(self.DECODER_NORMALIZED_CONFIG_CLASS)
    ):
        return self.DECODER_NORMALIZED_CONFIG_CLASS.__getattr__(attr_name)

    return super().__getattr__(attr_name)
Contributor:

why do we need this?

Contributor Author:

These types of models can have any encoder and decoder head, and it is difficult to maintain a normalized class for every combination, so I created this helper class to work with these models.
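
As a sketch of how this helper is meant to be used (the exact wiring in the PR may differ; NormalizedVisionConfig and NormalizedTextConfig are optimum's existing normalized config classes):

# Sketch: compose a normalized config for a ViT encoder + GPT-2 decoder.
# Attribute lookups are delegated to whichever sub-config declares them.
from transformers import AutoConfig
from optimum.utils.normalized_config import (
    NormalizedEncoderDecoderConfig,
    NormalizedTextConfig,
    NormalizedVisionConfig,
)

config = AutoConfig.from_pretrained(
    "hf-internal-testing/tiny-random-VisionEncoderDecoderModel-vit-gpt2"
)

normalized = NormalizedEncoderDecoderConfig(config)
normalized.ENCODER_NORMALIZED_CONFIG_CLASS = NormalizedVisionConfig(config.encoder)
normalized.DECODER_NORMALIZED_CONFIG_CLASS = NormalizedTextConfig(config.decoder)

# image_size resolves on the ViT encoder, num_attention_heads on the decoder.
print(normalized.image_size, normalized.num_attention_heads)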

optimum/exporters/onnx/model_configs.py (outdated review thread, resolved)
optimum/utils/normalized_config.py (outdated review thread, resolved)
    DummySeq2SeqDecoderTextInputGenerator,
    DummyPastKeyValuesGenerator,
)

Member (@michaelbenayoun, Jan 24, 2023):

I would add the normalized config creation here instead of setting attributes in the if branches. I would also add a type check on it.

Contributor Author:

Is there a reason to set the values in the constructor rather than setting attributes? Also, why would a type check be necessary?

optimum/exporters/onnx/config.py (review thread, resolved)
optimum/exporters/onnx/config.py (outdated review thread, resolved)
optimum/exporters/onnx/config.py (outdated review thread, resolved)
@mht-sharma force-pushed the add-support-vision-encoder-decoder-models branch from d949251 to 9b32631 on January 30, 2023 11:01
Comment on lines +104 to +118
class NormalizedEncoderDecoderConfig(NormalizedConfig):
    ENCODER_NORMALIZED_CONFIG_CLASS = None
    DECODER_NORMALIZED_CONFIG_CLASS = None

    def __getattr__(self, attr_name):
        if self.ENCODER_NORMALIZED_CONFIG_CLASS is not None and attr_name.upper() in dir(
            self.ENCODER_NORMALIZED_CONFIG_CLASS
        ):
            return self.ENCODER_NORMALIZED_CONFIG_CLASS.__getattr__(attr_name)
        if self.DECODER_NORMALIZED_CONFIG_CLASS is not None and attr_name.upper() in dir(
            self.DECODER_NORMALIZED_CONFIG_CLASS
        ):
            return self.DECODER_NORMALIZED_CONFIG_CLASS.__getattr__(attr_name)

        return super().__getattr__(attr_name)
Contributor:

Couldn't the attr_name be in both the encoder and decoder configs? E.g. bos_token_id in https://huggingface.co/microsoft/trocr-small-handwritten/blob/main/config.json. Could this result in unexpected behavior? We should rather be able to access attributes recursively, no? Like normalized_config.encoder.bos_token_id or normalized_config.decoder.bos_token_id, if this makes sense!

Contributor Author (@mht-sharma, Jan 31, 2023):

Yes, it is possible. This seems to be a common problem for the other hybrid configs as well; I will make a note of it and handle it separately.
Having something like normalized_config.encoder.bos_token_id or normalized_config.decoder.bos_token_id would require changes across the entire input generator, so I would need to think of a better way.
Currently, this does not affect the ORT export (where we export the encoder and decoder separately), as one of the configs is always None, so I am considering it low priority.
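
To make the shadowing concern concrete, here is a minimal, self-contained toy version of the delegation pattern (toy classes, not optimum code):

# Toy reimplementation of the delegation above: when both sub-configs expose
# the same attribute, the encoder silently wins.
class SubConfig:
    def __init__(self, **values):
        self.values = {name.upper(): value for name, value in values.items()}

    def __dir__(self):
        return list(self.values)

    def __getattr__(self, name):
        return self.values[name.upper()]

class EncoderDecoderConfig:
    def __init__(self, encoder, decoder):
        self.encoder = encoder
        self.decoder = decoder

    def __getattr__(self, name):
        # The encoder is checked first, mirroring NormalizedEncoderDecoderConfig.
        for sub in (self.encoder, self.decoder):
            if name.upper() in dir(sub):
                return getattr(sub, name)
        raise AttributeError(name)

config = EncoderDecoderConfig(SubConfig(bos_token_id=0), SubConfig(bos_token_id=2))
print(config.bos_token_id)  # prints 0 -- the decoder's value (2) is shadowed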

Contributor (@fxmarty) left a review:

LGTM, thanks!

@mht-sharma merged commit b2bba71 into huggingface:main on Jan 31, 2023
@Mir-Umar commented:

Hi @mht-sharma, can you kindly provide some insight into how to remove the following limitation from the above PR:

  1. Past key values export is not supported with TrOCR
    I am working with a custom-trained TrOCR model and want to convert it to ONNX for faster inference.

@fxmarty (Contributor) commented Apr 11, 2023

@Mir-Umar Feel free to open an issue as well.

@mht-sharma (Contributor Author) commented:

@Mir-Umar There is an existing issue for this: #744. I have not had the time to work on this yet.

The use_cache argument for the TrOCR model does not work in Transformers, so the export was failing. The TrOCR code in Transformers probably needs to be looked at to debug the issue further. Let me know if you would like to work on the above issue.
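
For anyone picking this up, one way to probe the limitation directly in Transformers might be the following sketch (the checkpoint choice, the 384x384 input size, and the bos_token_id-based start token are assumptions for illustration):

# Sketch: check whether TrOCR actually returns past_key_values when
# use_cache=True. Checkpoint and the dummy input shape are assumptions.
import torch
from transformers import VisionEncoderDecoderModel

model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-small-handwritten")
model.eval()

pixel_values = torch.randn(1, 3, 384, 384)  # dummy image batch
decoder_input_ids = torch.tensor([[model.config.decoder.bos_token_id]])

with torch.no_grad():
    outputs = model(
        pixel_values=pixel_values,
        decoder_input_ids=decoder_input_ids,
        use_cache=True,
    )

# If the cache is not returned (None) or the flag is ignored, past key values
# cannot be wired into the ONNX export.
print(outputs.past_key_values is None)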
