
Add Perceiver IO #14487

Merged 147 commits on Dec 8, 2021
Commits
beef8c1
First draft
NielsRogge Aug 2, 2021
28f9541
Style and remove mlm
NielsRogge Sep 6, 2021
7f70799
Make forward pass work
NielsRogge Sep 6, 2021
7574fc0
More improvements
NielsRogge Sep 6, 2021
77d55ec
More improvements
NielsRogge Sep 7, 2021
bdccd62
Fix bug
NielsRogge Sep 7, 2021
7b7dcd2
More improvements
NielsRogge Sep 7, 2021
25d7725
More improvements
NielsRogge Sep 7, 2021
4a804b6
Add PerceiverTokenizer first draft
NielsRogge Sep 8, 2021
9a84428
Improve conversion script
NielsRogge Sep 8, 2021
65e4edd
More improvements
NielsRogge Sep 8, 2021
649c66a
Make conversion script work for the encoder
NielsRogge Sep 8, 2021
df1c0c9
Make conversion script work with local pickle files
NielsRogge Sep 8, 2021
6a8a981
Style & quality, fix-copies
NielsRogge Sep 8, 2021
79b3f9d
Add dummy input to conversion script
NielsRogge Sep 8, 2021
6d1fb56
Add absolute position embeddings to TextPreProcessor
NielsRogge Sep 8, 2021
9ef09dc
Make forward pass of encoder work
NielsRogge Sep 9, 2021
8e15a42
More improvements
NielsRogge Sep 10, 2021
8852bd6
Move text preprocessor to separate script
NielsRogge Sep 10, 2021
e003753
More improvements
NielsRogge Sep 10, 2021
cfe4d01
More improvements
NielsRogge Sep 10, 2021
2eb4869
Add post processor
NielsRogge Sep 10, 2021
091903e
Make MLM model work
NielsRogge Sep 10, 2021
4f6c31d
Style
NielsRogge Sep 10, 2021
edaf54d
Add PerceiverForMaskedLM
NielsRogge Sep 10, 2021
5a1dea3
Add PerceiverImagePreprocessor
NielsRogge Sep 13, 2021
af33282
Make style
NielsRogge Sep 13, 2021
63b556a
Make PerceiverForImageClassification work
NielsRogge Sep 13, 2021
54d5335
More improvements
NielsRogge Sep 14, 2021
853268e
More improvements
NielsRogge Sep 14, 2021
d579251
Use tokenizer in conversion script
NielsRogge Sep 14, 2021
e8a8772
Use PerceiverForMaskedLM in conversion script
NielsRogge Sep 14, 2021
f8293b9
Define custom PerceiverModelOutput
NielsRogge Sep 14, 2021
3a62362
Improve PerceiverAttention to make it work for both MLM and image cla…
NielsRogge Sep 14, 2021
7795f6d
More improvements
NielsRogge Sep 14, 2021
2c3342f
More improvements
NielsRogge Sep 15, 2021
3151607
More improvements to the conversion script
NielsRogge Sep 15, 2021
a2e6b0e
Make conversion script work for both MLM and image classification
NielsRogge Sep 15, 2021
c1dbe7c
Add PerceiverFeatureExtractor
NielsRogge Sep 15, 2021
e6d9122
More improvements
NielsRogge Sep 15, 2021
cfd32c6
Style and quality
NielsRogge Sep 15, 2021
07b090f
Add center cropping
NielsRogge Sep 15, 2021
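The "Add center cropping" commit concerns image preprocessing. As a rough illustration of the operation (a minimal sketch, not the PR's actual implementation, which operates on PIL images/tensors), center cropping keeps only the central region of an image:

```python
def center_crop(image, crop_height, crop_width):
    """Crop the central crop_height x crop_width region of an image
    given as a nested list indexed [row][column]."""
    height, width = len(image), len(image[0])
    top = (height - crop_height) // 2
    left = (width - crop_width) // 2
    return [row[left:left + crop_width] for row in image[top:top + crop_height]]

# Example: crop a 4x4 "image" of pixel indices down to its central 2x2 region.
img = [[r * 4 + c for c in range(4)] for r in range(4)]
print(center_crop(img, 2, 2))  # [[5, 6], [9, 10]]
```

In a real feature extractor the same index arithmetic is applied per channel, typically after resizing the shorter side.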
4cd722c
Fix bug
NielsRogge Sep 15, 2021
4ed297e
Small fix
NielsRogge Sep 15, 2021
8d4b748
Add print statement
NielsRogge Sep 15, 2021
2bb92b7
Fix bug in image preprocessor
NielsRogge Sep 15, 2021
4248229
Fix bug with conversion script
NielsRogge Sep 15, 2021
a7f75a2
Make output position embeddings an nn.Parameter layer instead of nn.E…
NielsRogge Sep 15, 2021
4592338
Comment out print statements
NielsRogge Sep 16, 2021
dd91215
Add position encoding classes
NielsRogge Sep 16, 2021
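The position-encoding commits around here add the Fourier-feature encodings Perceiver uses for images, audio, and flow. A minimal sketch of the idea — sin/cos features at a bank of frequencies per position — is below; the band spacing and feature ordering are illustrative assumptions, not the PR's code:

```python
import math

def fourier_position_features(position, num_bands, max_resolution):
    """Map a scalar position in [-1, 1] to sin/cos features at num_bands
    frequencies spaced linearly from 1 up to the Nyquist rate
    (max_resolution / 2), roughly as in the Perceiver paper's scheme."""
    features = []
    for k in range(num_bands):
        freq = 1.0 + k * (max_resolution / 2.0 - 1.0) / max(num_bands - 1, 1)
        features.append(math.sin(math.pi * freq * position))
        features.append(math.cos(math.pi * freq * position))
    return features

feats = fourier_position_features(0.5, num_bands=4, max_resolution=32)
print(len(feats))  # 8: one sin and one cos per frequency band
```

For a 2D image each pixel gets such features per coordinate axis, concatenated (optionally with the raw position), which is what makes the encoding usable by both the encoder and the decoder queries.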
ac82fce
More improvements
NielsRogge Sep 16, 2021
b369c09
Use position_encoding_kwargs
NielsRogge Sep 17, 2021
7d1863f
Add PerceiverForImageClassificationFourier
NielsRogge Sep 17, 2021
e77c6b4
Make style & quality
NielsRogge Sep 17, 2021
0a7c3f0
Add PerceiverForImageClassificationConvProcessing
NielsRogge Sep 17, 2021
d3bcf09
Style & quality
NielsRogge Sep 17, 2021
0e4241c
Add flow model
NielsRogge Sep 18, 2021
92c7c62
Move processors to modeling file
NielsRogge Sep 20, 2021
9933942
Make position encodings modular
NielsRogge Sep 20, 2021
00d2ce3
Make basic decoder use modular position encodings
NielsRogge Sep 20, 2021
f1276f8
Add PerceiverForOpticalFlow to conversion script
NielsRogge Sep 20, 2021
15ded27
Add AudioPreprocessor
NielsRogge Sep 21, 2021
1347c20
Make it possible for the basic decoder to use Fourier position embedd…
NielsRogge Sep 21, 2021
8bb1289
Add PerceiverForMultimodalAutoencoding
NielsRogge Sep 21, 2021
8c5d100
Improve model for optical flow
NielsRogge Sep 22, 2021
5dbea95
Improve _build_network_inputs method
NielsRogge Sep 22, 2021
5472500
Add print statement
NielsRogge Sep 22, 2021
fea12e6
Fix device issue
NielsRogge Sep 22, 2021
3daed24
Fix device of Fourier embeddings
NielsRogge Sep 23, 2021
a45c064
Add print statements for debugging
NielsRogge Sep 23, 2021
1e7b1c9
Add another print statement
NielsRogge Sep 23, 2021
8c0f886
Add another print statement
NielsRogge Sep 23, 2021
32cca82
Add another print statement
NielsRogge Sep 23, 2021
f1c3720
Add another print statement
NielsRogge Sep 23, 2021
275a59f
Improve PerceiverAudioPreprocessor
NielsRogge Sep 24, 2021
aedb68e
Improve conversion script for multimodal model
NielsRogge Sep 24, 2021
adc1205
More improvements
NielsRogge Sep 24, 2021
89da95d
More improvements
NielsRogge Sep 25, 2021
a7f4870
Improve multimodal model
NielsRogge Sep 27, 2021
54021d3
Make forward pass multimodal model work
NielsRogge Sep 28, 2021
327d16c
More improvements
NielsRogge Sep 29, 2021
f3a2d0c
Improve tests
NielsRogge Oct 6, 2021
1f34526
Fix some more tests
NielsRogge Oct 6, 2021
7c4cbbc
Add output dataclasses
NielsRogge Oct 6, 2021
2a4dab2
Make more tests pass
NielsRogge Oct 7, 2021
1205dd9
Add print statements for debugging
NielsRogge Oct 7, 2021
4408a69
Add tests for image classification
NielsRogge Oct 7, 2021
1a60c6a
Add PerceiverClassifierOutput
NielsRogge Oct 7, 2021
0a1bfcd
More improvements
NielsRogge Oct 7, 2021
27f7190
Make more tests pass for the optical flow model
NielsRogge Oct 7, 2021
6815bf7
Make style & quality
NielsRogge Oct 7, 2021
d7fedc7
Small improvements
NielsRogge Oct 7, 2021
06839cb
Don't support training for optical flow model for now
NielsRogge Oct 11, 2021
5acb88c
Fix _prepare_for_class for tests
NielsRogge Oct 11, 2021
db7b6bb
Make more tests pass, add some docs
NielsRogge Oct 12, 2021
0264043
Add multimodal model to tests
NielsRogge Oct 12, 2021
107c971
Minor fixes
NielsRogge Nov 3, 2021
ed7d7ea
Fix tests
NielsRogge Nov 4, 2021
f62a6f5
Improve conversion script
NielsRogge Nov 4, 2021
d32808b
Make fixup
NielsRogge Nov 4, 2021
08b67de
Remove pos_dim argument
NielsRogge Nov 4, 2021
e7f8329
Fix device issue
NielsRogge Nov 4, 2021
0a93591
Potential fix for OOM
NielsRogge Nov 4, 2021
1091cfe
Revert previous commit
NielsRogge Nov 4, 2021
4c10a9d
Fix test_initialization
NielsRogge Nov 5, 2021
06c7b06
Add print statements for debugging
NielsRogge Nov 5, 2021
adfda8f
Fix print statement
NielsRogge Nov 5, 2021
927dd92
Add print statement
NielsRogge Nov 5, 2021
786f57f
Add print statement
NielsRogge Nov 5, 2021
bde8cf3
Add print statement
NielsRogge Nov 5, 2021
d832391
Add print statement
NielsRogge Nov 8, 2021
8aa3228
Add print statement
NielsRogge Nov 8, 2021
5a84a3e
Add print statement
NielsRogge Nov 8, 2021
8887f98
Remove need for output_shape
NielsRogge Nov 8, 2021
f9800c5
Comment out output_shape
NielsRogge Nov 8, 2021
134bfc4
Remove unnecessary code
NielsRogge Nov 8, 2021
d5187fb
Improve docs
NielsRogge Nov 10, 2021
e9003fb
Fix make fixup
NielsRogge Nov 19, 2021
d965bca
Remove PerceiverTextProcessor from init
NielsRogge Nov 19, 2021
42630e7
Improve docs
NielsRogge Nov 19, 2021
29037ba
Small improvement
NielsRogge Nov 22, 2021
4a2b81a
Apply first batch of suggestions from code review
NielsRogge Nov 30, 2021
3235318
Apply more suggestions from code review
NielsRogge Nov 30, 2021
22becd9
Update docstrings
NielsRogge Nov 30, 2021
dc95e00
Define dicts beforehand for readability
NielsRogge Nov 30, 2021
31ae669
Rename task to architecture in conversion script, include PerceiverMo…
NielsRogge Dec 1, 2021
fa41b1a
Add print statements for debugging
NielsRogge Dec 1, 2021
a3f16f2
Fix tests on GPU
NielsRogge Dec 1, 2021
afcb875
Remove preprocessors, postprocessors and decoders from main init
NielsRogge Dec 1, 2021
c5e3af7
Add integration test
NielsRogge Dec 1, 2021
dc68fed
Fix docs
NielsRogge Dec 1, 2021
ffc6fde
Replace einops by torch
NielsRogge Dec 2, 2021
83a6776
Update for new docs frontend
NielsRogge Dec 2, 2021
46c8e04
Rename PerceiverForImageClassification
NielsRogge Dec 2, 2021
a358e38
Improve docs
NielsRogge Dec 2, 2021
c5ae758
Improve docs
NielsRogge Dec 2, 2021
48503c0
Improve docs of PerceiverModel
NielsRogge Dec 2, 2021
ec0e016
Fix some more tests
NielsRogge Dec 3, 2021
da79d8a
Improve center_crop
NielsRogge Dec 3, 2021
2a3c57c
Add PerceiverForSequenceClassification
NielsRogge Dec 3, 2021
60eefd7
Small improvements
NielsRogge Dec 6, 2021
b36ba76
Fix tests
NielsRogge Dec 6, 2021
e8cf21a
Add integration test for optical flow model
NielsRogge Dec 7, 2021
e084c05
Clean up
NielsRogge Dec 7, 2021
d1c0245
Add tests for tokenizer
NielsRogge Dec 7, 2021
520f132
Fix tokenizer by adding special tokens properly
NielsRogge Dec 8, 2021
cf534be
Fix CI
NielsRogge Dec 8, 2021
Files changed
README.md (1 addition, 0 deletions)

@@ -266,6 +266,7 @@ Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih.
1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (from Microsoft Research) released with the paper [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297) by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (from Google AI) released with the paper [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
1. **[Pegasus](https://huggingface.co/docs/transformers/model_doc/pegasus)** (from Google) released with the paper [PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization](https://arxiv.org/abs/1912.08777) by Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu.
1. **[Perceiver IO](https://huggingface.co/docs/transformers/model_doc/perceiver)** (from Deepmind) released with the paper [Perceiver IO: A General Architecture for Structured Inputs & Outputs](https://arxiv.org/abs/2107.14795) by Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira.
1. **[PhoBERT](https://huggingface.co/docs/transformers/model_doc/phobert)** (from VinAI Research) released with the paper [PhoBERT: Pre-trained language models for Vietnamese](https://www.aclweb.org/anthology/2020.findings-emnlp.92/) by Dat Quoc Nguyen and Anh Tuan Nguyen.
1. **[ProphetNet](https://huggingface.co/docs/transformers/model_doc/prophetnet)** (from Microsoft Research) released with the paper [ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) by Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou.
1. **[QDQBert](https://huggingface.co/docs/transformers/model_doc/qdqbert)** (from NVIDIA) released with the paper [Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation](https://arxiv.org/abs/2004.09602) by Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev and Paulius Micikevicius.
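The README entry added above describes Perceiver IO's core idea: a small fixed-size latent array cross-attends to arbitrarily large inputs, is processed by self-attention, and is then read out by output queries. A toy numpy sketch of that read-process-write shape flow (single head, no learned projections or MLPs — nothing here is the PR's code):

```python
import numpy as np

def attend(queries, keys_values):
    """Single-head scaled dot-product attention; keys_values serves as
    both keys and values, since this sketch omits learned projections."""
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ keys_values

rng = np.random.default_rng(0)
inputs = rng.normal(size=(1000, 64))   # long input sequence (e.g. bytes, pixels)
latents = rng.normal(size=(16, 64))    # small fixed-size latent array
queries = rng.normal(size=(3, 64))     # output queries (e.g. 3 target positions)

latents = attend(latents, inputs)      # encode: latents cross-attend to inputs
latents = attend(latents, latents)     # process: self-attention among latents
outputs = attend(queries, latents)     # decode: queries read out the latents
print(outputs.shape)  # (3, 64)
```

Because compute in the process stage depends on the latent size (16 here) rather than the input length (1000), the same architecture scales to the text, image, optical-flow, and multimodal tasks covered by this PR.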
README_ko.md (1 addition, 0 deletions)

@@ -264,6 +264,7 @@ Flax, PyTorch, TensorFlow 설치 페이지에서 이들을 conda로 설치하는
1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (from Microsoft Research) released with the paper [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297) by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (from Google AI) released with the paper [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
1. **[Pegasus](https://huggingface.co/docs/transformers/model_doc/pegasus)** (from Google) released with the paper [PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization](https://arxiv.org/abs/1912.08777) by Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu.
1. **[Perceiver IO](https://huggingface.co/docs/transformers/model_doc/perceiver)** (from Deepmind) released with the paper [Perceiver IO: A General Architecture for Structured Inputs & Outputs](https://arxiv.org/abs/2107.14795) by Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira.
1. **[PhoBERT](https://huggingface.co/docs/transformers/model_doc/phobert)** (from VinAI Research) released with the paper [PhoBERT: Pre-trained language models for Vietnamese](https://www.aclweb.org/anthology/2020.findings-emnlp.92/) by Dat Quoc Nguyen and Anh Tuan Nguyen.
1. **[ProphetNet](https://huggingface.co/docs/transformers/model_doc/prophetnet)** (from Microsoft Research) released with the paper [ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) by Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou.
1. **[QDQBert](https://huggingface.co/docs/transformers/model_doc/qdqbert)** (from NVIDIA) released with the paper [Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation](https://arxiv.org/abs/2004.09602) by Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev and Paulius Micikevicius.
README_zh-hans.md (1 addition, 0 deletions)

@@ -288,6 +288,7 @@ conda install -c huggingface transformers
1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (来自 Microsoft Research) 伴随论文 [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297) 由 Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu 发布。
1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (来自 Google AI) 伴随论文 [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) 由 Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel 发布。
1. **[Pegasus](https://huggingface.co/docs/transformers/model_doc/pegasus)** (来自 Google) 伴随论文 [PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization](https://arxiv.org/abs/1912.08777) 由 Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu 发布。
1. **[Perceiver IO](https://huggingface.co/docs/transformers/model_doc/perceiver)** (来自 Deepmind) 伴随论文 [Perceiver IO: A General Architecture for Structured Inputs & Outputs](https://arxiv.org/abs/2107.14795) 由 Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira 发布。
1. **[PhoBERT](https://huggingface.co/docs/transformers/model_doc/phobert)** (来自 VinAI Research) 伴随论文 [PhoBERT: Pre-trained language models for Vietnamese](https://www.aclweb.org/anthology/2020.findings-emnlp.92/) 由 Dat Quoc Nguyen and Anh Tuan Nguyen 发布。
1. **[ProphetNet](https://huggingface.co/docs/transformers/model_doc/prophetnet)** (来自 Microsoft Research) 伴随论文 [ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) 由 Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou 发布。
1. **[QDQBert](https://huggingface.co/docs/transformers/model_doc/qdqbert)** (来自 NVIDIA) 伴随论文 [Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation](https://arxiv.org/abs/2004.09602) 由 Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev and Paulius Micikevicius 发布。
README_zh-hant.md (1 addition, 0 deletions)

@@ -300,6 +300,7 @@ conda install -c huggingface transformers
1. **[MPNet](https://huggingface.co/docs/transformers/model_doc/mpnet)** (from Microsoft Research) released with the paper [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297) by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
1. **[MT5](https://huggingface.co/docs/transformers/model_doc/mt5)** (from Google AI) released with the paper [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
1. **[Pegasus](https://huggingface.co/docs/transformers/model_doc/pegasus)** (from Google) released with the paper [PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization](https://arxiv.org/abs/1912.08777) by Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu.
1. **[Perceiver IO](https://huggingface.co/docs/transformers/model_doc/perceiver)** (from Deepmind) released with the paper [Perceiver IO: A General Architecture for Structured Inputs & Outputs](https://arxiv.org/abs/2107.14795) by Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira.
1. **[PhoBERT](https://huggingface.co/docs/transformers/model_doc/phobert)** (from VinAI Research) released with the paper [PhoBERT: Pre-trained language models for Vietnamese](https://www.aclweb.org/anthology/2020.findings-emnlp.92/) by Dat Quoc Nguyen and Anh Tuan Nguyen.
1. **[ProphetNet](https://huggingface.co/docs/transformers/model_doc/prophetnet)** (from Microsoft Research) released with the paper [ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) by Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou.
1. **[QDQBert](https://huggingface.co/docs/transformers/model_doc/qdqbert)** (from NVIDIA) released with the paper [Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation](https://arxiv.org/abs/2004.09602) by Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev and Paulius Micikevicius.
docs/source/index.mdx (2 additions, 0 deletions)

@@ -139,6 +139,7 @@ conversion utilities for the following models.
1. **[MPNet](model_doc/mpnet)** (from Microsoft Research) released with the paper [MPNet: Masked and Permuted Pre-training for Language Understanding](https://arxiv.org/abs/2004.09297) by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
1. **[MT5](model_doc/mt5)** (from Google AI) released with the paper [mT5: A massively multilingual pre-trained text-to-text transformer](https://arxiv.org/abs/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
1. **[Pegasus](model_doc/pegasus)** (from Google) released with the paper [PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization](https://arxiv.org/abs/1912.08777) by Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu.
1. **[Perceiver IO](model_doc/perceiver)** (from Deepmind) released with the paper [Perceiver IO: A General Architecture for Structured Inputs & Outputs](https://arxiv.org/abs/2107.14795) by Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira.
1. **[PhoBERT](model_doc/phobert)** (from VinAI Research) released with the paper [PhoBERT: Pre-trained language models for Vietnamese](https://www.aclweb.org/anthology/2020.findings-emnlp.92/) by Dat Quoc Nguyen and Anh Tuan Nguyen.
1. **[ProphetNet](model_doc/prophetnet)** (from Microsoft Research) released with the paper [ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training](https://arxiv.org/abs/2001.04063) by Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou.
1. **[QDQBert](model_doc/qdqbert)** (from NVIDIA) released with the paper [Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation](https://arxiv.org/abs/2004.09602) by Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev and Paulius Micikevicius.
@@ -227,6 +228,7 @@ Flax), PyTorch, and/or TensorFlow.
| OpenAI GPT | ✅ | ✅ | ✅ | ✅ | ❌ |
| OpenAI GPT-2 | ✅ | ✅ | ✅ | ✅ | ✅ |
| Pegasus | ✅ | ✅ | ✅ | ✅ | ✅ |
| Perceiver | ✅ | ❌ | ✅ | ❌ | ❌ |
| ProphetNet | ✅ | ❌ | ✅ | ❌ | ❌ |
| QDQBert | ❌ | ❌ | ✅ | ❌ | ❌ |
| RAG | ✅ | ❌ | ✅ | ✅ | ❌ |