Add support for XLM-R XL and XXL models by modeling_xlm_roberta_xl.py #13727
Conversation
Thanks for the PR @Soonhwan-Kwon! Could you also add a test file and some integration tests? :-)
@patrickvonplaten I started to work on the test file. It seems the tests need models uploaded to the official repo, but how can I upload the model files for xlm-roberta-xl or xlm-roberta-xxl to the official repo?
Hey @Soonhwan-Kwon, thanks a lot for working on this and sorry for replying so late! Let me know if you need some help fixing the last steps :-)
Thank you for the reply. I'm in the middle of uploading the models, but it takes time for the xxlarge (over 24 GB) model.
@patrickvonplaten I have uploaded all the models, but I have no idea how to fix the last steps because I'm kind of a newbie here. How can I fix the last steps? Thank you in advance.
@Soonhwan-Kwon, could you maybe also add a configuration file (just copy the xlm-roberta one) and a full test suite for the model? :-)
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Sorry for the late response. I already added config.json, so is it the tokenizer.json you're talking about? I also added a simple test for the models (tests/test_modeling_xlm_roberta_xl.py), but where can I find the full test suite?
Hey @Soonhwan-Kwon, I meant more a new test suite. @sgugger @LysandreJik - this PR adds the checkpoints of https://ai.facebook.com/blog/-xlm-r-state-of-the-art-cross-lingual-understanding-through-self-supervision/ @Soonhwan-Kwon - there are some failing tests which I think can partly be solved by rebasing onto current master. Otherwise, if it's ok for you, I'm also happy to dive into the PR and help you finish the last parts of it. Let me know what you prefer :-)
Hey @Soonhwan-Kwon, thanks a lot for your PR!! @patrickvonplaten, I prefer
Also, since it falls under our "new architecture" policy, there should be a new folder grouping this modeling file and configuration file, instead of putting everything in the xlm-roberta folder.
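For reference, the layout of such a dedicated model folder would look roughly like this (the exact file set is an assumption based on how other model folders in the repository are organized):

```
src/transformers/models/xlm_roberta_xl/
├── __init__.py
├── configuration_xlm_roberta_xl.py
├── convert_xlm_roberta_xl_original_pytorch_checkpoint_to_pytorch.py
└── modeling_xlm_roberta_xl.py
```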
@patrickvonplaten Sure, I would be glad if you helped with the last parts - feel free to dive into this PR.
@patrickvonplaten I added you as a collaborator on my repo, in case you need access.
Thanks @Soonhwan-Kwon - I'll try to tackle this tomorrow :-)
src/transformers/models/xlm_roberta_xl/configuration_xlm_roberta_xl.py
Thanks for working on that @Soonhwan-Kwon. I made some minor suggestions and will look at the tokenization part now (to check whether there are any differences between XLM-R and XLM-R-XL/XXL :)
The tokenization part is working as expected. Here are some details:
See transformers/src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py, lines 163 to 167 at 2e9af29.
Fairseq comes with its own dictionary file. The dictionary file for XLM-R-XL is the same as for XLM-R, but it contains
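The vocabulary alignment referenced above can be sketched as follows. This is a minimal, illustrative sketch of the id mapping performed in tokenization_xlm_roberta.py; the offset logic mirrors the referenced lines, but treat the helper function name as an assumption:

```python
# Sketch of how the XLM-R tokenizer aligns sentencepiece ids with the
# fairseq dictionary (illustrative; mirrors tokenization_xlm_roberta.py).

# fairseq reserves the first four ids for special tokens:
fairseq_tokens_to_ids = {"<s>": 0, "<pad>": 1, "</s>": 2, "<unk>": 3}

# sentencepiece puts <unk> at id 0, so every sentencepiece id is shifted
# by an offset of 1 to line up with the fairseq dictionary layout.
fairseq_offset = 1

def sp_id_to_fairseq_id(sp_id: int) -> int:
    """Map a raw sentencepiece id to the fairseq-aligned vocabulary id."""
    return sp_id + fairseq_offset

print(sp_id_to_fairseq_id(10))  # -> 11
```

Since XLM-R-XL/XXL reuse the same dictionary layout, the same offset applies unchanged.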
Looking good! Careful with some `# Copied from` statements that haven't been adapted to the model name.
src/transformers/models/xlm_roberta_xl/modeling_xlm_roberta_xl.py
It does not require `lang` tensors to understand which language is used, and should be able to determine the correct language from the input ids.

This model was contributed by [Soonhwan-Kwon](https://github.com/Soonhwan-Kwon) and [stefan-it](https://huggingface.co/stefan-it). The original code can be found [here](https://github.com/pytorch/fairseq/tree/master/examples/xlmr).
Mentioned you here @Soonhwan-Kwon and @stefan-it
UPDATE: The PR should be ready for merge now. The checkpoints have been moved to the Facebook org: https://huggingface.co/models?other=xlm-roberta-xl and I've added some model cards. @Soonhwan-Kwon, I've made you the "main" contributor for this model.
@stefan-it - it would also be great if you could do a final review :-)
Looking good!
src/transformers/models/xlm_roberta_xl/configuration_xlm_roberta_xl.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Really cool! I'm currently running experiments on token classification with that new model 🤗
@patrickvonplaten @sgugger @stefan-it Thank you for the merge. It was a great experience, and I came to respect the committers of transformers. The revert below was just a misclick, sorry.
This PR adds support for the newly released XL and XXL models for XLM-R. These models are described in the "Larger-Scale Transformers for Multilingual Masked Language Modeling" paper.
Thank you @patrickvonplaten and @stefan-it; following the review I received on #13210,
I added modeling_xlm_roberta_xl.py and convert_xlm_roberta_xl_original_pytorch_checkpoint_to_pytorch.py as the conversion script.
I compared the fairseq and transformers models side by side
and verified that their outputs match:
```
torch.Size([1, 11, 250880]) torch.Size([1, 11, 250880])
max_absolute_diff = 0.000186920166015625
Do both models output the same tensors? 🔥
Saving model to converted_xlmr_xl
Configuration saved in converted_xlmr_xl/config.json
Model weights saved in converted_xlmr_xl/pytorch_model.bin
```
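The kind of check that produces a log like this can be sketched in plain Python. The real conversion script compares torch tensors from both models; here `max_absolute_diff` is simply the element-wise maximum difference between two (hypothetical, made-up) logit sequences, and the 1e-3 tolerance is an assumption for illustration:

```python
def max_absolute_diff(a, b):
    """Element-wise maximum absolute difference between two flat sequences."""
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical logits from the original fairseq model and the converted model:
their_output = [0.12, -2.31, 4.50, 0.007]
our_output = [0.1201, -2.3099, 4.5001, 0.0071]

diff = max_absolute_diff(our_output, their_output)
success = diff < 1e-3  # tolerance chosen for illustration
print(f"max_absolute_diff = {diff}")
print("Do both models output the same tensors?", "🔥" if success else "💩")
```

If the maximum difference stays below the tolerance, the converted weights can be trusted to reproduce the original model's behavior.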
Since the fairseq-RoBERTa-to-transformers conversion script was written a long time ago,
the transformers architecture has diverged considerably from the fairseq code it originally started from,
which made writing correct conversion code quite confusing.
I aligned the transformers code with the fairseq model structure.
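As an illustration of the kind of alignment a conversion script performs, fairseq state-dict keys can be remapped to transformers-style names with simple substring substitutions. The key names and mapping table below are assumptions for illustration, not the script's actual mapping:

```python
# Hypothetical fairseq -> transformers key renames; the real conversion
# script's mapping table may differ.
RENAMES = [
    ("decoder.sentence_encoder.layers.", "encoder.layer."),
    (".self_attn.q_proj.", ".attention.self.query."),
    (".self_attn.k_proj.", ".attention.self.key."),
    (".self_attn.v_proj.", ".attention.self.value."),
]

def convert_key(fairseq_key: str) -> str:
    """Rewrite one fairseq state-dict key into a transformers-style name."""
    for old, new in RENAMES:
        fairseq_key = fairseq_key.replace(old, new)
    return fairseq_key

print(convert_key("decoder.sentence_encoder.layers.0.self_attn.q_proj.weight"))
# -> encoder.layer.0.attention.self.query.weight
```

Applying such a function over every key of the loaded fairseq state dict, then loading the result into the transformers model, is the core of the conversion step.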