Conversation

@NormXU (Contributor) commented Sep 13, 2023

What does this PR do?

The current implementation of the 1d_pos_bias/2d_pos_bias calculation in LayoutLMv2 and LayoutLMv3 consumes a lot of VRAM because it materializes a large one-hot matrix.

Since the idea behind 1d_pos_bias/2d_pos_bias is to categorize all relative positions into a number of buckets — each position is assigned to a bucket based on its relative distance to another token, and the bucket id is then embedded into a feature — we can drop the large one-hot matrix and index the Linear layer's weight directly, like an nn.Embedding.

In my tests, for an input sequence of shape $[10, 1024]$ (bz, nseq), this saves about 3 GB of VRAM in the 2d_pos_bias calculation.
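For illustration, a minimal sketch of the idea (sizes and variable names below are hypothetical, not the actual LayoutLMv3 code): instead of building a float one-hot tensor over all buckets and feeding it through the bias Linear layer, the integer bucket ids can index the Linear weight directly, producing the same result without the intermediate tensor.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration (not the model defaults)
num_buckets, num_heads = 32, 12
rel_pos_bias = nn.Linear(num_buckets, num_heads, bias=False)

# Bucketized relative positions: (bz, seq, seq) integer ids
bucket_ids = torch.randint(0, num_buckets, (2, 64, 64))

# Before: materialize a large float one-hot tensor, then a matmul
one_hot = nn.functional.one_hot(bucket_ids, num_classes=num_buckets).float()
bias_onehot = rel_pos_bias(one_hot)                 # (2, 64, 64, num_heads)

# After: index the Linear weight like an nn.Embedding -- no one-hot tensor
bias_indexed = rel_pos_bias.weight.t()[bucket_ids]  # (2, 64, 64, num_heads)

print(torch.allclose(bias_onehot, bias_indexed))  # True
```

The one-hot matmul just selects one row of the transposed weight per position pair, so plain indexing recovers it exactly while skipping the `(bz, seq, seq, num_buckets)` intermediate.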

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests? # This PR can reuse previous tests

Who can review?

@ArthurZucker and @younesbelkada

@ArthurZucker (Collaborator) commented:

Hey! Thanks for opening a PR, pinging @rafaelpadilla for a review here 😉

@rafaelpadilla (Contributor) left a comment:

Nice catch, @NormXU!
Sorry for the delay in reviewing it; I had problems with the pytesseract dependency.
I dug into the code and confirmed that your changes simplify the process and produce the same outputs.
Just make sure the other tests pass.

@NormXU (Contributor, Author) commented Sep 28, 2023

@rafaelpadilla I've reformatted the code. It's ready to merge.

@LysandreJik (Member) left a comment:

Thank you for your PR @NormXU and for your review @rafaelpadilla

@LysandreJik LysandreJik merged commit a7e0ed8 into huggingface:main Sep 28, 2023
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

ArthurZucker pushed a commit that referenced this pull request May 24, 2024
Revert "optimize VRAM for calculating pos_bias in LayoutLM v2, v3 (#26139)" (#30988)

* Revert "optimize VRAM for calculating pos_bias in LayoutLM v2, v3 (#26139)"

This reverts commit a7e0ed8.

* Instead of reverting commit, wrap indexing in torch.no_grad context

* Apply wrapping in LayoutLMv2

* Add comments explaining reason for no_grad

* Fix code format

---------

Co-authored-by: Kevin Koehncke <kevin.koehncke@uipath.com>
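The pattern adopted in the follow-up commit — wrapping the bucket indexing in a torch.no_grad context — can be sketched roughly as below. The bucketing function is modeled on the T5-style relative_position_bucket these models use; it is not copied from the upstream code, and the exact implementation may differ. Since bucketing is pure integer arithmetic with no useful gradient, computing it under no_grad keeps those ops out of the autograd graph.

```python
import math
import torch

def relative_position_bucket(relative_position, num_buckets=32, max_distance=128):
    """T5-style bucketing sketch; the exact upstream implementation may differ."""
    # Bucket ids are integers with no useful gradient, so compute them
    # under no_grad to keep these ops out of the autograd graph.
    with torch.no_grad():
        num_buckets //= 2
        n = -relative_position
        ret = (n < 0).long() * num_buckets  # separate buckets for each sign
        n = torch.abs(n)
        max_exact = num_buckets // 2
        is_small = n < max_exact
        # Larger distances are mapped logarithmically into the remaining buckets
        val_if_large = max_exact + (
            torch.log(n.float() / max_exact)
            / math.log(max_distance / max_exact)
            * (num_buckets - max_exact)
        ).long()
        val_if_large = torch.min(
            val_if_large, torch.full_like(val_if_large, num_buckets - 1)
        )
        ret = ret + torch.where(is_small, n, val_if_large)
    return ret

rel = torch.arange(-5, 6).view(1, -1)  # toy relative positions
buckets = relative_position_bucket(rel)
print(buckets.min().item() >= 0 and buckets.max().item() < 32)  # True
```

The resulting bucket ids can then index the bias weight as in the optimized path, while gradients still flow to the weight itself.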