-
Notifications
You must be signed in to change notification settings - Fork 27.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Nystromformer #14659
Add Nystromformer #14659
Conversation
Added Nystromformer-specific attributes to config and removed all decoder functionality from modelling.
Added Nystrom approximation and removed decoder tests.
Initial commits to conversion script, modeling changes.
…formers into add_nystromformer
Hey @novice03 , really cool PR, can't wait to fine-tune models with it :) I have some questions about the tokenization part: it seems that the ALBERT tokenizer (and an albert spm model) is used and uploaded to the model hub, so I think in But I've looked at the original pre-processing and pre-training code and it seems that the RoBERTa tokenizer is used for pre-training a Nyströmformer model, why is an albert spm model used here 🤔 |
Hello, thanks for taking a look at this PR. I've added the tokenizer entry to the config.json file. It is true that the code seems to use RoBERTa tokenizer. However, the model checkpoint released is of a model with |
src/transformers/models/nystromformer/modeling_nystromformer.py
Outdated
Show resolved
Hide resolved
src/transformers/models/nystromformer/configuration_nystromformer.py
Outdated
Show resolved
Hide resolved
src/transformers/models/nystromformer/configuration_nystromformer.py
Outdated
Show resolved
Hide resolved
src/transformers/models/nystromformer/configuration_nystromformer.py
Outdated
Show resolved
Hide resolved
src/transformers/models/nystromformer/configuration_nystromformer.py
Outdated
Show resolved
Hide resolved
src/transformers/models/nystromformer/configuration_nystromformer.py
Outdated
Show resolved
Hide resolved
Very cool to have this new model @novice03 🤗 . Let me share with you a little information about the tokenizers of @NielsRogge showed me that there is currently an inconsistency in the treatment of special tokens with the slow and fast versions of I think that to generate into the from transformers import AlbertTokenizer
tokenizer = AlbertTokenizer.from_pretrained("uw-madison/nystromformer-512")
tokenizer.save_pretrained("slow_tokenizer")
fast_tokenizer = AlbertTokenizerFast.from_pretrained(
"slow_tokenizer",
bos_token="[CLS]",
eos_token="[SEP]",
unk_token="<unk>",
sep_token="[SEP]",
pad_token="<pad>",
cls_token="[CLS]",
mask_token="[MASK]"
)
fast_tokenizer.save_pretrained("slow_and_fast") I hope this is helpful! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! Only left a few comments.
Would you be open to also contribute the TensorFlow version of the model, or would you rather we start with the PyTorch model?
src/transformers/models/nystromformer/configuration_nystromformer.py
Outdated
Show resolved
Hide resolved
src/transformers/models/nystromformer/modeling_nystromformer.py
Outdated
Show resolved
Hide resolved
src/transformers/models/nystromformer/modeling_nystromformer.py
Outdated
Show resolved
Hide resolved
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
…formers into add_nystromformer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for working on this model, this PR is in great shape! I just left styling suggestions.
src/transformers/models/nystromformer/configuration_nystromformer.py
Outdated
Show resolved
Hide resolved
...formers/models/nystromformer/convert_nystromformer_original_pytorch_checkpoint_to_pytorch.py
Outdated
Show resolved
Hide resolved
src/transformers/models/nystromformer/modeling_nystromformer.py
Outdated
Show resolved
Hide resolved
src/transformers/models/nystromformer/modeling_nystromformer.py
Outdated
Show resolved
Hide resolved
src/transformers/models/nystromformer/modeling_nystromformer.py
Outdated
Show resolved
Hide resolved
src/transformers/models/nystromformer/modeling_nystromformer.py
Outdated
Show resolved
Hide resolved
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
…formers into add_nystromformer
What does this PR do?
This PR adds the Nystromformer transformer model to the repository.
Paper: https://arxiv.org/abs/2102.03902
Code: https://github.com/mlpen/Nystromformer
Checkpoint: Nystromformer sequence length 512
Who can review?
@NielsRogge