Conversation

lewtun (Member) commented Apr 28, 2022

What does this PR do?

This PR removes masked image modeling from the list of supported features in the ONNX exporter. As explained by @NielsRogge, BEiT cannot be loaded with the AutoModelForMaskedImageModeling class:

Well yeah that's because BEiT does masked image modeling by predicting visual tokens of a VQ-VAE, whereas the other ones predict pixel values (RGB) as in the SimMIM paper. So I'm afraid BEiT cannot be added to this auto class.
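For concreteness, a minimal sketch of the mismatch between the two heads (illustrative only, not part of this PR; it assumes the `use_mask_token` flag on `BeitConfig` and the SimMIM-style decoder on `ViTForMaskedImageModeling`):

from transformers import BeitConfig, BeitForMaskedImageModeling
from transformers import ViTConfig, ViTForMaskedImageModeling

# BEiT's MIM head classifies each patch over a VQ-VAE codebook of visual tokens
beit = BeitForMaskedImageModeling(BeitConfig(use_mask_token=True))
print(beit.lm_head)  # Linear(in_features=768, out_features=8192, bias=True)

# A SimMIM-style head instead reconstructs RGB pixel values with a small decoder
vit = ViTForMaskedImageModeling(ViTConfig())
print(vit.decoder)  # Conv2d + PixelShuffle back to pixel space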

I've also added a note to the BEiT docs to help users who aren't aware of these details, and checked that the slow ONNX tests pass with:

RUN_SLOW=1 pytest tests/onnx/test_onnx_v2.py -s

Edit: we should merge this after #16981 to ensure the RoFormer tests pass first

lewtun requested review from sgugger and NielsRogge on Apr 28, 2022 at 05:16
HuggingFaceDocBuilderDev commented Apr 28, 2022

The documentation is not available anymore as the PR was closed or merged.

NielsRogge (Contributor) commented

Hi,

There's a reason I haven't added BEiT to the auto classes: it can't be used with the run_mim.py script, because BEiT handles masked image modeling differently from the other models (which follow the approach defined in the SimMIM paper). This may confuse users, so maybe we should properly document that BEiT is not the same as the others.

lewtun (Member, Author) commented Apr 28, 2022

Ah I see, but isn't it a bit odd to exclude BEiT just because it isn't compatible with our example scripts?

For instance, is there anything fundamentally wrong with loading BeitForMaskedImageModeling via the autoclass if I'm rolling my own masked image modeling code?

If not, I'd prefer to keep BEiT in the auto classes and emit the warning inside the run_mim.py script if a user tries to run it with this architecture.
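For example, something like the following runs fine with the dedicated class (a rough sketch with a randomly initialised model; it assumes the `use_mask_token` flag on `BeitConfig`, and the shapes follow the default config):

import torch
from transformers import BeitConfig, BeitForMaskedImageModeling

config = BeitConfig(use_mask_token=True)
model = BeitForMaskedImageModeling(config)

pixel_values = torch.rand(1, 3, config.image_size, config.image_size)
num_patches = (config.image_size // config.patch_size) ** 2
bool_masked_pos = torch.zeros(1, num_patches, dtype=torch.bool)
bool_masked_pos[:, : num_patches // 2] = True  # mask the first half of the patches

outputs = model(pixel_values=pixel_values, bool_masked_pos=bool_masked_pos)
# Each masked patch gets logits over the 8192-entry visual-token vocabulary
print(outputs.logits.shape)  # torch.Size([1, 196, 8192])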

lewtun (Member, Author) commented Apr 28, 2022

Hmm, maybe there is a fundamental issue with using BEiT in the auto classes, as I'm seeing the torch tests fail with:

self = BeitEmbeddings(
  (patch_embeddings): PatchEmbeddings(
    (projection): Conv2d(3, 32, kernel_size=(2, 2), stride=(2, 2))
  )
  (dropout): Dropout(p=0.1, inplace=False)
)
pixel_values = tensor([[[[7.7614e-01, 1.7656e-01, 6.0460e-01,  ..., 3.9106e-01,
           5.2019e-01, 8.9339e-01],
          [2.7568...1, 9.9367e-01],
          [9.4963e-01, 1.6943e-01, 9.7946e-01,  ..., 1.9085e-01,
           1.9910e-01, 4.6059e-02]]]])
bool_masked_pos = tensor([[0, 0, 0,  ..., 0, 0, 0],
        [0, 0, 0,  ..., 0, 0, 0],
        [0, 0, 0,  ..., 0, 0, 0],
        ...,
        [0, 0, 0,  ..., 0, 0, 0],
        [0, 0, 0,  ..., 0, 0, 0],
        [0, 0, 0,  ..., 0, 0, 0]])

    def forward(self, pixel_values: torch.Tensor, bool_masked_pos: Optional[torch.BoolTensor] = None) -> torch.Tensor:
    
        embeddings = self.patch_embeddings(pixel_values)
        batch_size, seq_len, _ = embeddings.size()
    
        cls_tokens = self.cls_token.expand(batch_size, -1, -1)
        if bool_masked_pos is not None:
>           mask_tokens = self.mask_token.expand(batch_size, seq_len, -1)
E           AttributeError: 'NoneType' object has no attribute 'expand'

NielsRogge (Contributor) commented Apr 28, 2022

Well yeah that's because BEiT does masked image modeling by predicting visual tokens of a VQ-VAE, whereas the other ones predict pixel values (RGB) as in the SimMIM paper. So I'm afraid BEiT cannot be added to this auto class.

lewtun (Member, Author) commented Apr 28, 2022

OK, thanks for the clarification. I'll remove this feature from the ONNX export and add a note to the BEiT docs :)

@@ -59,6 +59,12 @@ Tips:
`use_relative_position_bias` attribute of [`BeitConfig`] to `True` in order to add
position embeddings.

<Tip warning={true}>

BEiT does masked image modeling by predicting the visual tokens of a Vector-Quantized Variational Autoencoder (VQ-VAE), whereas other vision models like ViT and DeiT predict RGB pixel values. The [`AutoModelForMaskedImageModeling`] class only supports pixel-based masked image modeling, so you will need to use [`BeitForMaskedImageModeling`] directly if you wish to do masked image modeling with BEiT.
lewtun (Member, Author) commented Apr 28, 2022

Not sure if this should be on the main doc page or within the docstring for the BeitForMaskedImageModeling class. Happy to move it if you want!

Edit: decided it made more sense to put this in the docstring itself in eca26be
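For readers landing on this tip, the recommended pattern looks roughly like this (a hedged sketch; the checkpoint name is the MIM-pretrained BEiT checkpoint on the Hub):

from transformers import AutoModelForMaskedImageModeling, BeitForMaskedImageModeling

# The auto class raises a ValueError for BEiT, since it is deliberately
# absent from the pixel-based masked image modeling mapping:
# model = AutoModelForMaskedImageModeling.from_pretrained("microsoft/beit-base-patch16-224-pt22k")

# Load the dedicated class directly instead:
model = BeitForMaskedImageModeling.from_pretrained("microsoft/beit-base-patch16-224-pt22k")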

lewtun changed the title from "Add BEIT to masked image modeling autoclasses" to "Remove masked image modeling from BEIT ONNX export" on Apr 28, 2022
"default": OrderedDict({"last_hidden_state": {0: "batch", 1: "sequence"}}),
"image-classification": OrderedDict({"logits": {0: "batch", 1: "sequence"}}),
"masked-im": OrderedDict({"logits": {0: "batch", 1: "sequence"}}),
lewtun (Member, Author) commented

Since I had to add this feature to support masked image modeling in general, I also went ahead and rearranged these features alphabetically, as it was getting annoying to inspect what was available.

onnx_config_cls=MBartOnnxConfig,
),
# BEiT cannot be used with the masked image modeling autoclass, so this feature is excluded here
"beit": supported_features_mapping("default", "image-classification", onnx_config_cls=BeitOnnxConfig),
lewtun (Member, Author) commented

Since I had to edit the features here, I also went ahead and reordered all the models alphabetically, since the list is now quite long and annoying to navigate.
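As a quick sanity check of the new mapping (a hedged snippet relying on the exporter's public feature lookup):

from transformers.onnx import FeaturesManager

# After this change, "masked-im" should no longer be listed for BEiT.
beit_features = FeaturesManager.get_supported_features_for_model_type("beit")
print(sorted(beit_features))  # expected: ['default', 'image-classification']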

sgugger (Collaborator) commented

This is hard to review as a diff, so I'll trust you didn't forget any of them ;-)

lewtun (Member, Author) commented Apr 28, 2022

Yeah, sorry about that. I did a sanity check that all the features agree with those on the main branch:

from transformers.onnx import FeaturesManager

# NB: run each capture on its respective checkout; a single process only
# sees whichever branch is installed.
# From current branch
features_new = FeaturesManager._SUPPORTED_MODEL_TYPE
# From main branch
features_old = FeaturesManager._SUPPORTED_MODEL_TYPE

for k, v in features_new.items():
    # Skip beit since its features are different on `main`
    if k == "beit":
        continue
    assert features_old[k].keys() == v.keys()
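
For anyone repeating the check, a variant that actually runs across two checkouts by persisting the mapping from `main` first (the file name is illustrative):

import json
from transformers.onnx import FeaturesManager

features = {k: sorted(v.keys()) for k, v in FeaturesManager._SUPPORTED_MODEL_TYPE.items()}

# Step 1, with `main` checked out:
# json.dump(features, open("features_main.json", "w"), indent=2)

# Step 2, with this branch checked out:
features_main = json.load(open("features_main.json"))
for model_type, feats in features.items():
    if model_type == "beit":  # BEiT's features intentionally differ after this PR
        continue
    assert feats == features_main[model_type]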

sgugger (Collaborator) left a comment

Thanks for working on this!


lewtun and others added 3 commits April 28, 2022 13:35
lewtun merged commit 675e2d1 into main on May 4, 2022
lewtun deleted the fix-beit-masked-im branch on May 4, 2022 at 08:05
elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022
* Add masked image modelling to task mapping

* Refactor ONNX features to be listed alphabetically

* Add warning about BEiT masked image modeling

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>