Add support for XLM-R XL and XXL models by modeling_xlm_roberta_xl.py #13727
Merged
27 commits
70f0f04 add xlm roberta xl (Soonhwan-Kwon)
b831d60 add convert xlm xl fairseq checkpoint to pytorch (Soonhwan-Kwon)
123aab4 fix init and documents for xlm-roberta-xl (Soonhwan-Kwon)
1514f67 fix indention (Soonhwan-Kwon)
57d72ca add test for XLM-R xl,xxl (Soonhwan-Kwon)
bd19941 fix model hub name (Soonhwan-Kwon)
d2d2715 fix some stuff (patrickvonplaten)
6be7307 Merge branch 'master' of https://github.com/huggingface/transformers … (patrickvonplaten)
9b4203f up (patrickvonplaten)
5fca25a correct init (patrickvonplaten)
df499c7 fix more (patrickvonplaten)
6a9c09f fix as suggestions (Soonhwan-Kwon)
83852d1 add torch_device (Soonhwan-Kwon)
21bcebb fix default values of doc strings (Soonhwan-Kwon)
7b058be fix leftovers (patrickvonplaten)
351ada4 Merge branch 'master' of https://github.com/huggingface/transformers … (patrickvonplaten)
c4af533 merge to master (patrickvonplaten)
864620b up (patrickvonplaten)
a9b13b8 correct hub names (patrickvonplaten)
1525c94 fix docs (patrickvonplaten)
4762f20 fix model (patrickvonplaten)
316a750 up (patrickvonplaten)
f9ad5ff finalize (patrickvonplaten)
b9b80f4 last fix (patrickvonplaten)
1d49d20 Apply suggestions from code review (patrickvonplaten)
ad3e260 add copied from (patrickvonplaten)
9142af2 make style (patrickvonplaten)
@@ -0,0 +1,69 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# XLM-RoBERTa-XL

## Overview

The XLM-RoBERTa-XL model was proposed in [Larger-Scale Transformers for Multilingual Masked Language Modeling](https://arxiv.org/abs/2105.00572) by Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau.

The abstract from the paper is the following:

*Recent work has demonstrated the effectiveness of cross-lingual language model pretraining for cross-lingual understanding. In this study, we present the results of two larger multilingual masked language models, with 3.5B and 10.7B parameters. Our two new models dubbed XLM-R XL and XLM-R XXL outperform XLM-R by 1.8% and 2.4% average accuracy on XNLI. Our model also outperforms the RoBERTa-Large model on several English tasks of the GLUE benchmark by 0.3% on average while handling 99 more languages. This suggests pretrained models with larger capacity may obtain both strong performance on high-resource languages while greatly improving low-resource languages. We make our code and models publicly available.*

Tips:

- XLM-RoBERTa-XL is a multilingual model trained on 100 different languages. Unlike some XLM multilingual models, it does
not require `lang` tensors to understand which language is used, and should be able to determine the correct
language from the input ids.

This model was contributed by [Soonhwan-Kwon](https://github.com/Soonhwan-Kwon) and [stefan-it](https://huggingface.co/stefan-it). The original code can be found [here](https://github.com/pytorch/fairseq/tree/master/examples/xlmr).


## XLMRobertaXLConfig

[[autodoc]] XLMRobertaXLConfig

## XLMRobertaXLModel

[[autodoc]] XLMRobertaXLModel
- forward

## XLMRobertaXLForCausalLM

[[autodoc]] XLMRobertaXLForCausalLM
- forward

## XLMRobertaXLForMaskedLM

[[autodoc]] XLMRobertaXLForMaskedLM
- forward

## XLMRobertaXLForSequenceClassification

[[autodoc]] XLMRobertaXLForSequenceClassification
- forward

## XLMRobertaXLForMultipleChoice

[[autodoc]] XLMRobertaXLForMultipleChoice
- forward

## XLMRobertaXLForTokenClassification

[[autodoc]] XLMRobertaXLForTokenClassification
- forward

## XLMRobertaXLForQuestionAnswering

[[autodoc]] XLMRobertaXLForQuestionAnswering
- forward
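
The tip in the doc page above (no `lang` tensors needed) can be sanity-checked with a minimal sketch like the one below. It is not part of this PR's diff. It assumes `torch` and a `transformers` release that already includes the `XLMRobertaXLConfig` and `XLMRobertaXLModel` classes documented above. The tiny config values and the token ids are made up so the sketch runs without downloading the released multi-billion-parameter checkpoints, and the weights are randomly initialized, so only the output shape is meaningful.

```python
import inspect

import torch
from transformers import XLMRobertaXLConfig, XLMRobertaXLModel

# Hypothetical, deliberately tiny sizes (NOT the real XL/XXL settings),
# chosen only so the model can be built and run quickly on CPU.
config = XLMRobertaXLConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=64,
    max_position_embeddings=64,
)
model = XLMRobertaXLModel(config)  # randomly initialized weights
model.eval()

# Arbitrary token ids standing in for tokenizer output.
# Note that only input ids and an attention mask are passed: no language ids.
input_ids = torch.tensor([[0, 5, 6, 7, 2]])
attention_mask = torch.ones_like(input_ids)

with torch.no_grad():
    outputs = model(input_ids=input_ids, attention_mask=attention_mask)

# (batch_size, sequence_length, hidden_size)
hidden_shape = tuple(outputs.last_hidden_state.shape)

# The forward signature has no `langs` argument, unlike the original XLM models.
has_langs = "langs" in inspect.signature(model.forward).parameters

print(hidden_shape, has_langs)
```

With a real checkpoint, the same call pattern applies after tokenizing text; the language is inferred from the input ids alone.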
@@ -119,6 +119,7 @@
    xlm,
    xlm_prophetnet,
    xlm_roberta,
    xlm_roberta_xl,
    xlnet,
    yoso,
)
Mentioned you here @Soonhwan-Kwon and @stefan-it