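🤗 Transformers provides thousands of pretrained models to perform tasks on text such as classification, information extraction, question answering, summarization, translation, and text generation, in over 100 languages. Its aim is to make cutting-edge NLP easy for everyone to use.
🤗 Transformers provides APIs to quickly download those pretrained models and use them on a given text, fine-tune them on your own data, and share them with the community on our model hub. In addition, each Python module that defines an architecture is fully standalone, so it can easily be modified for research experiments.
🤗 Transformers is backed by the three most popular deep learning libraries, Jax, PyTorch, and TensorFlow, with seamless integration between them. You can train a model with one of these libraries and then load it for inference with another.
You can test most of the models directly on their pages on the model hub. We also offer private model hosting, versioning, and an inference API for public and private models.
Here are a few examples:
- Masked word completion with BERT
- Named entity recognition with Electra
- Text generation with GPT-2
- Natural language inference with RoBERTa
- Summarization with BART
- Question answering with DistilBERT
- Translation with T5
Write With Transformer is the Hugging Face team's official demo of this repository's text generation capabilities.
To use a model on a given text right away, we provide the pipeline API. A pipeline groups together a pretrained model with the preprocessing that was applied when that model was trained. Here is a quick example of using a pipeline to classify positive versus negative texts: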
>>> from transformers import pipeline
# Allocate a pipeline for sentiment-analysis
>>> classifier = pipeline('sentiment-analysis')
>>> classifier('We are very happy to introduce pipeline to the transformers repository.')
[{'label': 'POSITIVE', 'score': 0.9996980428695679}]
The second line of code downloads and caches the pretrained model used by the pipeline, and the third line evaluates it on the given text. Here, the model rates the text as positive with a confidence of 99.97%.
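As a small illustration of how the returned value can be read, the sketch below reuses the exact output printed in the sentiment-analysis example and formats its score as a percentage. It does not call the library; the variable names `result`, `top`, and `summary` are chosen here for illustration only.

```python
# Output copied verbatim from the sentiment-analysis example.
result = [{'label': 'POSITIVE', 'score': 0.9996980428695679}]

# One input string produced a one-element list, so take the first entry.
top = result[0]

# Format the score as a percentage with two decimal places.
summary = f"{top['label']} ({top['score']:.2%})"
print(summary)  # POSITIVE (99.97%)
```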
Many NLP tasks can be run directly with a pipeline. For example, given a question and its context, you can easily extract the answer:
>>> from transformers import pipeline
# Allocate a pipeline for question-answering
>>> question_answerer = pipeline('question-answering')
>>> question_answerer({
... 'question': 'What is the name of the repository ?',
... 'context': 'Pipeline has been included in the huggingface/transformers repository'
... })
{'score': 0.30970096588134766, 'start': 34, 'end': 58, 'answer': 'huggingface/transformers'}
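In addition to the answer, the pretrained model used here returns its confidence score, along with the start and end positions of the answer in the context. You can learn more about the tasks supported by the pipeline API in this tutorial.
To download and use any of the pretrained models on your task, all it takes is three lines of code. Here is the PyTorch version: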
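To make the start and end values concrete, the sketch below reuses the context string and the output from the question-answering example. In that particular output, the two numbers behave as character offsets into the context; this is an observation about the example shown, not a guarantee about every pipeline configuration. The name `span` is introduced here for illustration.

```python
# Context and output copied from the question-answering example.
context = 'Pipeline has been included in the huggingface/transformers repository'
result = {'score': 0.30970096588134766, 'start': 34, 'end': 58,
          'answer': 'huggingface/transformers'}

# Slice the context with the returned positions to recover the answer text.
span = context[result['start']:result['end']]
print(span)
```

Slicing this way is handy when you want to highlight the answer inside the original text rather than only display it.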
>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
>>> model = AutoModel.from_pretrained("google-bert/bert-base-uncased")
>>> inputs = tokenizer("Hello world!", return_tensors="pt")
>>> outputs = model(**inputs)
And here is the equivalent code for TensorFlow:
>>> from transformers import AutoTokenizer, TFAutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
>>> model = TFAutoModel.from_pretrained("google-bert/bert-base-uncased")
>>> inputs = tokenizer("Hello world!", return_tensors="tf")
>>> outputs = model(**inputs)
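The tokenizer is responsible for all the preprocessing the pretrained model expects. It can be called on a single string (as in the examples above) or on a list. It returns a dictionary that you can use in downstream code or pass directly to your model with the ** argument unpacking operator.
The model itself is a regular PyTorch nn.Module or TensorFlow tf.keras.Model. This tutorial explains how to integrate such a model into a standard PyTorch or TensorFlow training loop, or how to use the Trainer API to fine-tune it on a new dataset.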
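The ** unpacking mentioned above is plain Python, so it can be illustrated without loading a model. In the sketch below, `toy_model` is a hypothetical stand-in for a model call and `inputs` is a hand-written dictionary shaped like a tokenizer output; neither comes from the library, and the token ids are illustrative values only.

```python
def toy_model(input_ids, attention_mask=None):
    """Hypothetical stand-in for a model call; it only reports what it received."""
    return {"num_tokens": len(input_ids[0]), "has_mask": attention_mask is not None}

# Hand-written stand-in for a tokenizer output (illustrative values).
inputs = {
    "input_ids": [[101, 7592, 2088, 999, 102]],
    "attention_mask": [[1, 1, 1, 1, 1]],
}

# Unpacking the dictionary passes each key as a keyword argument...
unpacked = toy_model(**inputs)

# ...which is equivalent to spelling the keyword arguments out by hand.
explicit = toy_model(input_ids=inputs["input_ids"],
                     attention_mask=inputs["attention_mask"])
print(unpacked == explicit)
```

Because the keys of the dictionary become argument names, any key the callee does not accept would raise a TypeError, which is why the tokenizer output and the model signature need to agree.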
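- Easy-to-use state-of-the-art models:
  - High performance on NLU and NLG tasks.
  - Low barrier to entry for educators and practitioners.
  - Only three classes to learn before you can get started.
  - A single API for using all of the pretrained models.
- Lower compute costs, smaller carbon footprint:
  - Researchers can share trained models instead of retraining them again and again.
  - Practitioners can save the time and cost required for training.
  - Dozens of architectures, over 2,000 pretrained models, and models trained in more than 100 languages.
- The right framework for every stage of a model's lifetime:
  - Train state-of-the-art models in three lines of code.
  - Move a model freely between the TF2.0 and PyTorch frameworks.
  - Pick the framework that suits each stage, such as training, evaluation, and release.
- Customize a model or an example to your needs:
  - We provide examples for each architecture to reproduce the results published by its authors.
  - Model internals are exposed as consistently as possible.
  - Model files can be used independently of the library for quick experiments.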
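There are also cases where this library is not the right fit:
- This library is not a modular toolbox of building blocks for neural networks. The code in the model files is deliberately kept at a moderate level of abstraction, so that researchers can work with each model directly without digging through many files.
- The training API is not intended to work on any model; it is optimized for the models provided by the library. For generic machine learning, use another library.
- We want to show as many use cases as possible, so we provide the scripts in the examples folder. These scripts may not work on your specific problem without changes, and you may need to modify some of the code to fit your needs.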
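This repository is tested on Python 3.8+, Flax 0.4.1+, PyTorch 1.11+, and TensorFlow 2.6+.
Install 🤗 Transformers in a virtual environment. If you are unfamiliar with Python virtual environments, check out the user guide.
First, create a virtual environment with the version of Python you are going to use and activate it.
Then, install at least one of Flax, PyTorch, or TensorFlow. Refer to the TensorFlow installation page, PyTorch installation page, and Flax installation page for the installation command for your platform.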
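Before installing, it can help to confirm that the environment meets these prerequisites. The following is a minimal sketch using only the standard library; the helper `available_backends` and the `BACKENDS` mapping are hypothetical names introduced here, and the import names `torch`, `tensorflow`, and `flax` are assumed to be the top-level packages of the three backends. It checks the Python version and whether a backend package can be found, not the backend version numbers.

```python
import importlib.util
import sys

# Assumed top-level import names for the three supported backends.
BACKENDS = {"PyTorch": "torch", "TensorFlow": "tensorflow", "Flax": "flax"}


def available_backends():
    """Return the backends whose top-level package can be located."""
    return [name for name, module in BACKENDS.items()
            if importlib.util.find_spec(module) is not None]


# The repository states it is tested on Python 3.8+.
python_ok = sys.version_info >= (3, 8)
found = available_backends()

print("Python 3.8+:", python_ok)
print("Backends found:", found or "none, install at least one first")
```

`find_spec` only locates a package without importing it, so the check stays fast even when a large framework is installed.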
Once at least one of these backends is installed, 🤗 Transformers can be installed using pip as follows:
pip install transformers
If you would like to play with the examples, need the bleeding edge of the code, or cannot wait for a new release, you must install the library from source.
🤗 Transformers can also be installed using conda as follows:
conda install conda-forge::transformers
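Note: Installing transformers from the huggingface channel is deprecated.
Follow the installation pages of Flax, PyTorch, or TensorFlow to see how to install them with conda.
All the model checkpoints provided by 🤗 Transformers are seamlessly integrated with the huggingface.co model hub. Individuals and organizations can upload models to the hub directly.
🤗 Transformers provides the following models (see here for a summary of each of them):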
- ALBERT (from Google Research and the Toyota Technological Institute at Chicago) released with the paper ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut.
- ALIGN (from Google Research) released with the paper Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision by Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig.
- AltCLIP (from BAAI) released with the paper AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities by Chen, Zhongzhi and Liu, Guang and Zhang, Bo-Wen and Ye, Fulong and Yang, Qinghong and Wu, Ledell.
- Audio Spectrogram Transformer (from MIT) released with the paper AST: Audio Spectrogram Transformer by Yuan Gong, Yu-An Chung, James Glass.
- Autoformer (from Tsinghua University) released with the paper Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting by Haixu Wu, Jiehui Xu, Jianmin Wang, Mingsheng Long.
- Bark (from Suno) released in the repository suno-ai/bark by Suno AI team.
- BART (from Facebook) released with the paper BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer.
- BARThez (from École polytechnique) released with the paper BARThez: a Skilled Pretrained French Sequence-to-Sequence Model by Moussa Kamal Eddine, Antoine J.-P. Tixier, Michalis Vazirgiannis.
- BARTpho (from VinAI Research) released with the paper BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese by Nguyen Luong Tran, Duong Minh Le and Dat Quoc Nguyen.
- BEiT (from Microsoft) released with the paper BEiT: BERT Pre-Training of Image Transformers by Hangbo Bao, Li Dong, Furu Wei.
- BERT (from Google) released with the paper BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.
- BERT For Sequence Generation (from Google) released with the paper Leveraging Pre-trained Checkpoints for Sequence Generation Tasks by Sascha Rothe, Shashi Narayan, Aliaksei Severyn.
- BERTweet (from VinAI Research) released with the paper BERTweet: A pre-trained language model for English Tweets by Dat Quoc Nguyen, Thanh Vu and Anh Tuan Nguyen.
- BigBird-Pegasus (from Google Research) released with the paper Big Bird: Transformers for Longer Sequences by Manzil Zaheer, Guru Guruganesh, Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed.
- BigBird-RoBERTa (from Google Research) released with the paper Big Bird: Transformers for Longer Sequences by Manzil Zaheer, Guru Guruganesh, Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed.
- BioGpt (from Microsoft Research AI4Science) released with the paper BioGPT: generative pre-trained transformer for biomedical text generation and mining by Renqian Luo, Liai Sun, Yingce Xia, Tao Qin, Sheng Zhang, Hoifung Poon and Tie-Yan Liu.
- BiT (from Google AI) released with the paper Big Transfer (BiT) by Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, Neil Houlsby.
- Blenderbot (from Facebook) released with the paper Recipes for building an open-domain chatbot by Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, Jason Weston.
- BlenderbotSmall (from Facebook) released with the paper Recipes for building an open-domain chatbot by Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, Jason Weston.
- BLIP (from Salesforce) released with the paper BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation by Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi.
- BLIP-2 (from Salesforce) released with the paper BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models by Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi.
- BLOOM (from BigScience workshop) released by the BigScience Workshop.
- BORT (from Alexa) released with the paper Optimal Subarchitecture Extraction For BERT by Adrian de Wynter and Daniel J. Perry.
- BridgeTower (from Harbin Institute of Technology/Microsoft Research Asia/Intel Labs) released with the paper BridgeTower: Building Bridges Between Encoders in Vision-Language Representation Learning by Xiao Xu, Chenfei Wu, Shachar Rosenman, Vasudev Lal, Wanxiang Che, Nan Duan.
- BROS (from NAVER CLOVA) released with the paper BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents by Teakgyu Hong, Donghyun Kim, Mingi Ji, Wonseok Hwang, Daehyun Nam, Sungrae Park.
- ByT5 (from Google Research) released with the paper ByT5: Towards a token-free future with pre-trained byte-to-byte models by Linting Xue, Aditya Barua, Noah Constant, Rami Al-Rfou, Sharan Narang, Mihir Kale, Adam Roberts, Colin Raffel.
- CamemBERT (from Inria/Facebook/Sorbonne) released with the paper CamemBERT: a Tasty French Language Model by Louis Martin*, Benjamin Muller*, Pedro Javier Ortiz Suárez*, Yoann Dupont, Laurent Romary, Éric Villemonte de la Clergerie, Djamé Seddah and Benoît Sagot.
- CANINE (from Google Research) released with the paper CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation by Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting.
- Chinese-CLIP (from OFA-Sys) released with the paper Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese by An Yang, Junshu Pan, Junyang Lin, Rui Men, Yichang Zhang, Jingren Zhou, Chang Zhou.
- CLAP (from LAION-AI) released with the paper Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation by Yusong Wu, Ke Chen, Tianyu Zhang, Yuchen Hui, Taylor Berg-Kirkpatrick, Shlomo Dubnov.
- CLIP (from OpenAI) released with the paper Learning Transferable Visual Models From Natural Language Supervision by Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever.
- CLIPSeg (from University of Göttingen) released with the paper Image Segmentation Using Text and Image Prompts by Timo Lüddecke and Alexander Ecker.
- CLVP released with the paper Better speech synthesis through scaling by James Betker.
- CodeGen (from Salesforce) released with the paper A Conversational Paradigm for Program Synthesis by Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, Caiming Xiong.
- CodeLlama (from MetaAI) released with the paper Code Llama: Open Foundation Models for Code by Baptiste Rozière, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, Xiaoqing Ellen Tan, Yossi Adi, Jingyu Liu, Tal Remez, Jérémy Rapin, Artyom Kozhevnikov, Ivan Evtimov, Joanna Bitton, Manish Bhatt, Cristian Canton Ferrer, Aaron Grattafiori, Wenhan Xiong, Alexandre Défossez, Jade Copet, Faisal Azhar, Hugo Touvron, Louis Martin, Nicolas Usunier, Thomas Scialom, Gabriel Synnaeve.
- Cohere (from Cohere) released with the paper Command-R: Retrieval Augmented Generation at Production Scale by Cohere.
- Conditional DETR (from Microsoft Research Asia) released with the paper Conditional DETR for Fast Training Convergence by Depu Meng, Xiaokang Chen, Zejia Fan, Gang Zeng, Houqiang Li, Yuhui Yuan, Lei Sun, Jingdong Wang.
- ConvBERT (from YituTech) released with the paper ConvBERT: Improving BERT with Span-based Dynamic Convolution by Zihang Jiang, Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan.
- ConvNeXT (from Facebook AI) released with the paper A ConvNet for the 2020s by Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie.
- ConvNeXTV2 (from Facebook AI) released with the paper ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders by Sanghyun Woo, Shoubhik Debnath, Ronghang Hu, Xinlei Chen, Zhuang Liu, In So Kweon, Saining Xie.
- CPM (from Tsinghua University) released with the paper CPM: A Large-scale Generative Chinese Pre-trained Language Model by Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun.
- CPM-Ant (from OpenBMB) released by the OpenBMB.
- CTRL (from Salesforce) released with the paper CTRL: A Conditional Transformer Language Model for Controllable Generation by Nitish Shirish Keskar*, Bryan McCann*, Lav R. Varshney, Caiming Xiong and Richard Socher.
- CvT (from Microsoft) released with the paper CvT: Introducing Convolutions to Vision Transformers by Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang.
- Data2Vec (from Facebook) released with the paper Data2Vec: A General Framework for Self-supervised Learning in Speech, Vision and Language by Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli.
- DeBERTa (from Microsoft) released with the paper DeBERTa: Decoding-enhanced BERT with Disentangled Attention by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
- DeBERTa-v2 (from Microsoft) released with the paper DeBERTa: Decoding-enhanced BERT with Disentangled Attention by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen.
- Decision Transformer (from Berkeley/Facebook/Google) released with the paper Decision Transformer: Reinforcement Learning via Sequence Modeling by Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch.
- Deformable DETR (from SenseTime Research) released with the paper Deformable DETR: Deformable Transformers for End-to-End Object Detection by Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, Jifeng Dai.
- DeiT (from Facebook) released with the paper Training data-efficient image transformers & distillation through attention by Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou.
- DePlot (from Google AI) released with the paper DePlot: One-shot visual language reasoning by plot-to-table translation by Fangyu Liu, Julian Martin Eisenschlos, Francesco Piccinno, Syrine Krichene, Chenxi Pang, Kenton Lee, Mandar Joshi, Wenhu Chen, Nigel Collier, Yasemin Altun.
- Depth Anything (from University of Hong Kong and TikTok) released with the paper Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data by Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao.
- DETA (from The University of Texas at Austin) released with the paper NMS Strikes Back by Jeffrey Ouyang-Zhang, Jang Hyun Cho, Xingyi Zhou, Philipp Krähenbühl.
- DETR (from Facebook) released with the paper End-to-End Object Detection with Transformers by Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, Sergey Zagoruyko.
- DialoGPT (from Microsoft Research) released with the paper DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation by Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan.
- DiNAT (from SHI Labs) released with the paper Dilated Neighborhood Attention Transformer by Ali Hassani and Humphrey Shi.
- DINOv2 (from Meta AI) released with the paper DINOv2: Learning Robust Visual Features without Supervision by Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jegou, Julien Mairal, Patrick Labatut, Armand Joulin, Piotr Bojanowski.
- DistilBERT (from HuggingFace) released with the paper DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter by Victor Sanh, Lysandre Debut and Thomas Wolf. The same method has been applied to compress GPT2 into DistilGPT2, RoBERTa into DistilRoBERTa, Multilingual BERT into DistilmBERT and a German version of DistilBERT.
- DiT (from Microsoft Research) released with the paper DiT: Self-supervised Pre-training for Document Image Transformer by Junlong Li, Yiheng Xu, Tengchao Lv, Lei Cui, Cha Zhang, Furu Wei.
- Donut (from NAVER) released with the paper OCR-free Document Understanding Transformer by Geewook Kim, Teakgyu Hong, Moonbin Yim, Jeongyeon Nam, Jinyoung Park, Jinyeong Yim, Wonseok Hwang, Sangdoo Yun, Dongyoon Han, Seunghyun Park.
- DPR (from Facebook) released with the paper Dense Passage Retrieval for Open-Domain Question Answering by Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih.
- DPT (from Intel Labs) released with the paper Vision Transformers for Dense Prediction by René Ranftl, Alexey Bochkovskiy, Vladlen Koltun.
- EfficientFormer (from Snap Research) released with the paper EfficientFormer: Vision Transformers at MobileNetSpeed by Yanyu Li, Geng Yuan, Yang Wen, Ju Hu, Georgios Evangelidis, Sergey Tulyakov, Yanzhi Wang, Jian Ren.
- EfficientNet (from Google Brain) released with the paper EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks by Mingxing Tan, Quoc V. Le.
- ELECTRA (from Google Research/Stanford University) released with the paper ELECTRA: Pre-training text encoders as discriminators rather than generators by Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning.
- EnCodec (from Meta AI) released with the paper High Fidelity Neural Audio Compression by Alexandre Défossez, Jade Copet, Gabriel Synnaeve, Yossi Adi.
- EncoderDecoder (from Google Research) released with the paper Leveraging Pre-trained Checkpoints for Sequence Generation Tasks by Sascha Rothe, Shashi Narayan, Aliaksei Severyn.
- ERNIE (from Baidu) released with the paper ERNIE: Enhanced Representation through Knowledge Integration by Yu Sun, Shuohuan Wang, Yukun Li, Shikun Feng, Xuyi Chen, Han Zhang, Xin Tian, Danxiang Zhu, Hao Tian, Hua Wu.
- ErnieM (from Baidu) released with the paper ERNIE-M: Enhanced Multilingual Representation by Aligning Cross-lingual Semantics with Monolingual Corpora by Xuan Ouyang, Shuohuan Wang, Chao Pang, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang.
- ESM (from Meta AI) are transformer protein language models. ESM-1b was released with the paper Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences by Alexander Rives, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Jason Liu, Demi Guo, Myle Ott, C. Lawrence Zitnick, Jerry Ma, and Rob Fergus. ESM-1v was released with the paper Language models enable zero-shot prediction of the effects of mutations on protein function by Joshua Meier, Roshan Rao, Robert Verkuil, Jason Liu, Tom Sercu and Alexander Rives. ESM-2 was released with the paper Language models of protein sequences at the scale of evolution enable accurate structure prediction by Zeming Lin, Halil Akin, Roshan Rao, Brian Hie, Zhongkai Zhu, Wenting Lu, Allan dos Santos Costa, Maryam Fazel-Zarandi, Tom Sercu, Sal Candido, Alexander Rives.
- Falcon (from Technology Innovation Institute) by Almazrouei, Ebtesam and Alobeidli, Hamza and Alshamsi, Abdulaziz and Cappelli, Alessandro and Cojocaru, Ruxandra and Debbah, Merouane and Goffinet, Etienne and Heslow, Daniel and Launay, Julien and Malartic, Quentin and Noune, Badreddine and Pannier, Baptiste and Penedo, Guilherme.
- FastSpeech2Conformer (ESPnet and Microsoft Research 에서 제공)은 Pengcheng Guo, Florian Boyer, Xuankai Chang, Tomoki Hayashi, Yosuke Higuchi, Hirofumi Inaguma, Naoyuki Kamo, Chenda Li, Daniel Garcia-Romero, Jiatong Shi, Jing Shi, Shinji Watanabe, Kun Wei, Wangyou Zhang, and Yuekai Zhang.의 Recent Developments On Espnet Toolkit Boosted By Conformer논문과 함께 발표했습니다.
- FLAN-T5 (from Google AI) released in the repository google-research/t5x by Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, and Jason Wei
- FLAN-UL2 (from Google AI) released in the repository google-research/t5x by Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Eric Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, Andrew Dai, Hongkun Yu, Slav Petrov, Ed H. Chi, Jeff Dean, Jacob Devlin, Adam Roberts, Denny Zhou, Quoc V. Le, and Jason Wei
- FlauBERT (from CNRS) released with the paper FlauBERT: Unsupervised Language Model Pre-training for French by Hang Le, Loïc Vial, Jibril Frej, Vincent Segonne, Maximin Coavoux, Benjamin Lecouteux, Alexandre Allauzen, Benoît Crabbé, Laurent Besacier, Didier Schwab.
- FLAVA (from Facebook AI) released with the paper FLAVA: A Foundational Language And Vision Alignment Model by Amanpreet Singh, Ronghang Hu, Vedanuj Goswami, Guillaume Couairon, Wojciech Galuba, Marcus Rohrbach, and Douwe Kiela.
- FNet (from Google Research) released with the paper FNet: Mixing Tokens with Fourier Transforms by James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, Santiago Ontanon.
- FocalNet (from Microsoft Research) released with the paper Focal Modulation Networks by Jianwei Yang, Chunyuan Li, Xiyang Dai, Lu Yuan, Jianfeng Gao.
- Funnel Transformer (from CMU/Google Brain) released with the paper Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing by Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le.
- Fuyu (from ADEPT) released with a blog post by Rohan Bavishi, Erich Elsen, Curtis Hawthorne, Maxwell Nye, Augustus Odena, Arushi Somani, Sağnak Taşırlar.
- Gemma (from Google) released with the paper Gemma: Open Models Based on Gemini Technology and Research by the Gemma Google team.
- GIT (from Microsoft Research) released with the paper GIT: A Generative Image-to-text Transformer for Vision and Language by Jianfeng Wang, Zhengyuan Yang, Xiaowei Hu, Linjie Li, Kevin Lin, Zhe Gan, Zicheng Liu, Ce Liu, Lijuan Wang.
- GLPN (from KAIST) released with the paper Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth by Doyeon Kim, Woonghyun Ga, Pyungwhan Ahn, Donggyu Joo, Sehwan Chun, Junmo Kim.
- GPT (from OpenAI) released with the paper Improving Language Understanding by Generative Pre-Training by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever.
- GPT Neo (from EleutherAI) released in the repository EleutherAI/gpt-neo by Sid Black, Stella Biderman, Leo Gao, Phil Wang and Connor Leahy.
- GPT NeoX (from EleutherAI) released with the paper GPT-NeoX-20B: An Open-Source Autoregressive Language Model by Sid Black, Stella Biderman, Eric Hallahan, Quentin Anthony, Leo Gao, Laurence Golding, Horace He, Connor Leahy, Kyle McDonell, Jason Phang, Michael Pieler, USVSN Sai Prashanth, Shivanshu Purohit, Laria Reynolds, Jonathan Tow, Ben Wang, Samuel Weinbach.
- GPT NeoX Japanese (from ABEJA) released by Shinya Otani, Takayoshi Makabe, Anuj Arora, and Kyo Hattori.
- GPT-2 (from OpenAI) released with the paper Language Models are Unsupervised Multitask Learners by Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei and Ilya Sutskever.
- GPT-J (from EleutherAI) released in the repository kingoflolz/mesh-transformer-jax by Ben Wang and Aran Komatsuzaki.
- GPT-Sw3 (from AI-Sweden) released with the paper Lessons Learned from GPT-SW3: Building the First Large-Scale Generative Language Model for Swedish by Ariel Ekgren, Amaru Cuba Gyllensten, Evangelia Gogoulou, Alice Heiman, Severine Verlinden, Joey Öhman, Fredrik Carlsson, Magnus Sahlgren.
- GPTBigCode (from BigCode) released with the paper SantaCoder: don't reach for the stars! by Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, Mayank Mishra, Alex Gu, Manan Dey, Logesh Kumar Umapathi, Carolyn Jane Anderson, Yangtian Zi, Joel Lamy Poirier, Hailey Schoelkopf, Sergey Troshin, Dmitry Abulkhanov, Manuel Romero, Michael Lappert, Francesco De Toni, Bernardo García del Río, Qian Liu, Shamik Bose, Urvashi Bhattacharyya, Terry Yue Zhuo, Ian Yu, Paulo Villegas, Marco Zocca, Sourab Mangrulkar, David Lansky, Huu Nguyen, Danish Contractor, Luis Villa, Jia Li, Dzmitry Bahdanau, Yacine Jernite, Sean Hughes, Daniel Fried, Arjun Guha, Harm de Vries, Leandro von Werra.
- GPTSAN-japanese released in the repository tanreinama/GPTSAN by Toshiyuki Sakamoto(tanreinama).
- Graphormer (from Microsoft) released with the paper Do Transformers Really Perform Bad for Graph Representation? by Chengxuan Ying, Tianle Cai, Shengjie Luo, Shuxin Zheng, Guolin Ke, Di He, Yanming Shen, Tie-Yan Liu.
- GroupViT (from UCSD, NVIDIA) released with the paper GroupViT: Semantic Segmentation Emerges from Text Supervision by Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, Xiaolong Wang.
- HerBERT (from Allegro.pl, AGH University of Science and Technology) released with the paper KLEJ: Comprehensive Benchmark for Polish Language Understanding by Piotr Rybak, Robert Mroczkowski, Janusz Tracz, Ireneusz Gawlik.
- Hubert (from Facebook) released with the paper HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units by Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed.
- I-BERT (from Berkeley) released with the paper I-BERT: Integer-only BERT Quantization by Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer.
- IDEFICS (from HuggingFace) released with the paper OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents by Hugo Laurençon, Lucile Saulnier, Léo Tronchon, Stas Bekman, Amanpreet Singh, Anton Lozhkov, Thomas Wang, Siddharth Karamcheti, Alexander M. Rush, Douwe Kiela, Matthieu Cord, Victor Sanh.
- ImageGPT (from OpenAI) released with the paper Generative Pretraining from Pixels by Mark Chen, Alec Radford, Rewon Child, Jeffrey Wu, Heewoo Jun, David Luan, Ilya Sutskever.
- Informer (from Beihang University, UC Berkeley, Rutgers University, SEDD Company) released with the paper Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting by Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong, and Wancai Zhang.
- InstructBLIP (from Salesforce) released with the paper InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning by Wenliang Dai, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Junqi Zhao, Weisheng Wang, Boyang Li, Pascale Fung, Steven Hoi.
- Jukebox (from OpenAI) released with the paper Jukebox: A Generative Model for Music by Prafulla Dhariwal, Heewoo Jun, Christine Payne, Jong Wook Kim, Alec Radford, Ilya Sutskever.
- KOSMOS-2 (from Microsoft Research Asia) released with the paper Kosmos-2: Grounding Multimodal Large Language Models to the World by Zhiliang Peng, Wenhui Wang, Li Dong, Yaru Hao, Shaohan Huang, Shuming Ma, Furu Wei.
- LayoutLM (from Microsoft Research Asia) released with the paper LayoutLM: Pre-training of Text and Layout for Document Image Understanding by Yiheng Xu, Minghao Li, Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou.
- LayoutLMv2 (from Microsoft Research Asia) released with the paper LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding by Yang Xu, Yiheng Xu, Tengchao Lv, Lei Cui, Furu Wei, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Wanxiang Che, Min Zhang, Lidong Zhou.
- LayoutLMv3 (from Microsoft Research Asia) released with the paper LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking by Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei.
- LayoutXLM (from Microsoft Research Asia) released with the paper LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding by Yiheng Xu, Tengchao Lv, Lei Cui, Guoxin Wang, Yijuan Lu, Dinei Florencio, Cha Zhang, Furu Wei.
- LED (from AllenAI) released with the paper Longformer: The Long-Document Transformer by Iz Beltagy, Matthew E. Peters, Arman Cohan.
- LeViT (from Meta AI) released with the paper LeViT: A Vision Transformer in ConvNet's Clothing for Faster Inference by Ben Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Hervé Jégou, Matthijs Douze.
- LiLT (from South China University of Technology) released with the paper LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding by Jiapeng Wang, Lianwen Jin, Kai Ding.
- LLaMA (from The FAIR team of Meta AI) released with the paper LLaMA: Open and Efficient Foundation Language Models by Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample.
- Llama2 (from The FAIR team of Meta AI) released with the paper Llama2: Open Foundation and Fine-Tuned Chat Models by Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, Thomas Scialom.
- LLaVa (from Microsoft Research & University of Wisconsin-Madison) released with the paper Visual Instruction Tuning by Haotian Liu, Chunyuan Li, Yuheng Li and Yong Jae Lee.
- LLaVA-NeXT (from Microsoft Research & University of Wisconsin-Madison) released with the paper Improved Baselines with Visual Instruction Tuning by Haotian Liu, Chunyuan Li, Yuheng Li and Yong Jae Lee.
- Longformer (from AllenAI) released with the paper Longformer: The Long-Document Transformer by Iz Beltagy, Matthew E. Peters, Arman Cohan.
- LongT5 (from Google AI) released with the paper LongT5: Efficient Text-To-Text Transformer for Long Sequences by Mandy Guo, Joshua Ainslie, David Uthus, Santiago Ontanon, Jianmo Ni, Yun-Hsuan Sung, Yinfei Yang.
- LUKE (from Studio Ousia) released with the paper LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention by Ikuya Yamada, Akari Asai, Hiroyuki Shindo, Hideaki Takeda, Yuji Matsumoto.
- LXMERT (from UNC Chapel Hill) released with the paper LXMERT: Learning Cross-Modality Encoder Representations from Transformers for Open-Domain Question Answering by Hao Tan and Mohit Bansal.
- M-CTC-T (from Facebook) released with the paper Pseudo-Labeling For Massively Multilingual Speech Recognition by Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve, and Ronan Collobert.
- M2M100 (from Facebook) released with the paper Beyond English-Centric Multilingual Machine Translation by Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin.
- MADLAD-400 (from Google) released with the paper MADLAD-400: A Multilingual And Document-Level Large Audited Dataset by Sneha Kudugunta, Isaac Caswell, Biao Zhang, Xavier Garcia, Christopher A. Choquette-Choo, Katherine Lee, Derrick Xin, Aditya Kusupati, Romi Stella, Ankur Bapna, Orhan Firat.
- Mamba (from Albert Gu and Tri Dao) released with the paper Mamba: Linear-Time Sequence Modeling with Selective State Spaces by Albert Gu and Tri Dao.
- MarianMT Machine translation models trained using OPUS data by Jörg Tiedemann. The Marian Framework is being developed by the Microsoft Translator Team.
- MarkupLM (from Microsoft Research Asia) released with the paper MarkupLM: Pre-training of Text and Markup Language for Visually-rich Document Understanding by Junlong Li, Yiheng Xu, Lei Cui, Furu Wei.
- Mask2Former (from FAIR and UIUC) released with the paper Masked-attention Mask Transformer for Universal Image Segmentation by Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar.
- MaskFormer (from Meta and UIUC) released with the paper Per-Pixel Classification is Not All You Need for Semantic Segmentation by Bowen Cheng, Alexander G. Schwing, Alexander Kirillov.
- MatCha (from Google AI) released with the paper MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering by Fangyu Liu, Francesco Piccinno, Syrine Krichene, Chenxi Pang, Kenton Lee, Mandar Joshi, Yasemin Altun, Nigel Collier, Julian Martin Eisenschlos.
- mBART (from Facebook) released with the paper Multilingual Denoising Pre-training for Neural Machine Translation by Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer.
- mBART-50 (from Facebook) released with the paper Multilingual Translation with Extensible Multilingual Pretraining and Finetuning by Yuqing Tang, Chau Tran, Xian Li, Peng-Jen Chen, Naman Goyal, Vishrav Chaudhary, Jiatao Gu, Angela Fan.
- MEGA (from Facebook) released with the paper Mega: Moving Average Equipped Gated Attention by Xuezhe Ma, Chunting Zhou, Xiang Kong, Junxian He, Liangke Gui, Graham Neubig, Jonathan May, and Luke Zettlemoyer.
- Megatron-BERT (from NVIDIA) released with the paper Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism by Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper and Bryan Catanzaro.
- Megatron-GPT2 (from NVIDIA) released with the paper Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism by Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper and Bryan Catanzaro.
- MGP-STR (from Alibaba Research) released with the paper Multi-Granularity Prediction for Scene Text Recognition by Peng Wang, Cheng Da, and Cong Yao.
- Mistral (from Mistral AI) by The Mistral AI team: Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.
- Mixtral (from Mistral AI) by The Mistral AI team: Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.
- mLUKE (from Studio Ousia) released with the paper mLUKE: The Power of Entity Representations in Multilingual Pretrained Language Models by Ryokan Ri, Ikuya Yamada, and Yoshimasa Tsuruoka.
- MMS (from Facebook) released with the paper Scaling Speech Technology to 1,000+ Languages by Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Zhaoheng Ni, Apoorv Vyas, Maryam Fazel-Zarandi, Alexei Baevski, Yossi Adi, Xiaohui Zhang, Wei-Ning Hsu, Alexis Conneau, Michael Auli.
- MobileBERT (from CMU/Google Brain) released with the paper MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices by Zhiqing Sun, Hongkun Yu, Xiaodan Song, Renjie Liu, Yiming Yang, and Denny Zhou.
- MobileNetV1 (from Google Inc.) released with the paper MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications by Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam.
- MobileNetV2 (from Google Inc.) released with the paper MobileNetV2: Inverted Residuals and Linear Bottlenecks by Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen.
- MobileViT (from Apple) released with the paper MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer by Sachin Mehta and Mohammad Rastegari.
- MobileViTV2 (from Apple) released with the paper Separable Self-attention for Mobile Vision Transformers by Sachin Mehta and Mohammad Rastegari.
- MPNet (from Microsoft Research) released with the paper MPNet: Masked and Permuted Pre-training for Language Understanding by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu.
- MPT (from MosaicML) released with the repository llm-foundry by the MosaicML NLP Team.
- MRA (from the University of Wisconsin - Madison) released with the paper Multi Resolution Analysis (MRA) for Approximate Self-Attention by Zhanpeng Zeng, Sourav Pal, Jeffery Kline, Glenn M Fung, Vikas Singh.
- MT5 (from Google AI) released with the paper mT5: A massively multilingual pre-trained text-to-text transformer by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
- MusicGen (from Meta) released with the paper Simple and Controllable Music Generation by Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi and Alexandre Défossez.
- MusicGen Melody (from Meta) released with the paper Simple and Controllable Music Generation by Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi and Alexandre Défossez.
- MVP (from RUC AI Box) released with the paper MVP: Multi-task Supervised Pre-training for Natural Language Generation by Tianyi Tang, Junyi Li, Wayne Xin Zhao and Ji-Rong Wen.
- NAT (from SHI Labs) released with the paper Neighborhood Attention Transformer by Ali Hassani, Steven Walton, Jiachen Li, Shen Li, and Humphrey Shi.
- Nezha (from Huawei Noah’s Ark Lab) released with the paper NEZHA: Neural Contextualized Representation for Chinese Language Understanding by Junqiu Wei, Xiaozhe Ren, Xiaoguang Li, Wenyong Huang, Yi Liao, Yasheng Wang, Jiashu Lin, Xin Jiang, Xiao Chen and Qun Liu.
- NLLB (from Meta) released with the paper No Language Left Behind: Scaling Human-Centered Machine Translation by the NLLB team.
- NLLB-MOE (from Meta) released with the paper No Language Left Behind: Scaling Human-Centered Machine Translation by the NLLB team.
- Nougat (from Meta AI) released with the paper Nougat: Neural Optical Understanding for Academic Documents by Lukas Blecher, Guillem Cucurull, Thomas Scialom, Robert Stojnic.
- Nyströmformer (from the University of Wisconsin - Madison) released with the paper Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention by Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, Glenn Fung, Yin Li, Vikas Singh.
- OneFormer (from SHI Labs) released with the paper OneFormer: One Transformer to Rule Universal Image Segmentation by Jitesh Jain, Jiachen Li, MangTik Chiu, Ali Hassani, Nikita Orlov, Humphrey Shi.
- OpenLlama (from s-JoL) released on GitHub (now removed).
- OPT (from Meta AI) released with the paper OPT: Open Pre-trained Transformer Language Models by Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen et al.
- OWL-ViT (from Google AI) released with the paper Simple Open-Vocabulary Object Detection with Vision Transformers by Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf, and Neil Houlsby.
- OWLv2 (from Google AI) released with the paper Scaling Open-Vocabulary Object Detection by Matthias Minderer, Alexey Gritsenko, Neil Houlsby.
- PatchTSMixer (from IBM Research) released with the paper TSMixer: Lightweight MLP-Mixer Model for Multivariate Time Series Forecasting by Vijay Ekambaram, Arindam Jati, Nam Nguyen, Phanwadee Sinthong, Jayant Kalagnanam.
- PatchTST (from IBM) released with the paper A Time Series is Worth 64 Words: Long-term Forecasting with Transformers by Yuqi Nie, Nam H. Nguyen, Phanwadee Sinthong, Jayant Kalagnanam.
- Pegasus (from Google) released with the paper PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization by Jingqing Zhang, Yao Zhao, Mohammad Saleh and Peter J. Liu.
- PEGASUS-X (from Google) released with the paper Investigating Efficiently Extending Transformers for Long Input Summarization by Jason Phang, Yao Zhao, Peter J. Liu.
- Perceiver IO (from Deepmind) released with the paper Perceiver IO: A General Architecture for Structured Inputs & Outputs by Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira.
- Persimmon (from ADEPT) released in a blog post by Erich Elsen, Augustus Odena, Maxwell Nye, Sağnak Taşırlar, Tri Dao, Curtis Hawthorne, Deepak Moparthi, Arushi Somani.
- Phi (from Microsoft) released with the papers - Textbooks Are All You Need by Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Harkirat Singh Behl, Xin Wang, Sébastien Bubeck, Ronen Eldan, Adam Tauman Kalai, Yin Tat Lee and Yuanzhi Li, Textbooks Are All You Need II: phi-1.5 technical report by Yuanzhi Li, Sébastien Bubeck, Ronen Eldan, Allie Del Giorno, Suriya Gunasekar and Yin Tat Lee.
- PhoBERT (from VinAI Research) released with the paper PhoBERT: Pre-trained language models for Vietnamese by Dat Quoc Nguyen and Anh Tuan Nguyen.
- Pix2Struct (from Google) released with the paper Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding by Kenton Lee, Mandar Joshi, Iulia Turc, Hexiang Hu, Fangyu Liu, Julian Eisenschlos, Urvashi Khandelwal, Peter Shaw, Ming-Wei Chang, Kristina Toutanova.
- PLBart (from UCLA NLP) released with the paper Unified Pre-training for Program Understanding and Generation by Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang.
- PoolFormer (from Sea AI Labs) released with the paper MetaFormer is Actually What You Need for Vision by Yu, Weihao and Luo, Mi and Zhou, Pan and Si, Chenyang and Zhou, Yichen and Wang, Xinchao and Feng, Jiashi and Yan, Shuicheng.
- Pop2Piano released with the paper Pop2Piano : Pop Audio-based Piano Cover Generation by Jongho Choi, Kyogu Lee.
- ProphetNet (from Microsoft Research) released with the paper ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training by Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou.
- PVT (from Nanjing University, The University of Hong Kong etc.) released with the paper Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions by Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao.
- PVTv2 (from Shanghai AI Laboratory, Nanjing University, The University of Hong Kong etc.) released with the paper PVT v2: Improved Baselines with Pyramid Vision Transformer by Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao.
- QDQBert (from NVIDIA) released with the paper Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation by Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev and Paulius Micikevicius.
- Qwen2 (from the Qwen team, Alibaba Group) released with the paper Qwen Technical Report by Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou and Tianhang Zhu.
- Qwen2MoE (from the Qwen team, Alibaba Group) released in a blog post by Bo Zheng, Dayiheng Liu, Rui Men, Junyang Lin, Zhou San, Bowen Yu, An Yang, Mingfeng Xue, Fei Huang, Binyuan Hui, Mei Li, Tianyu Liu, Xingzhang Ren, Xuancheng Ren, Kexin Yang, Chang Zhou, Jingren Zhou.
- RAG (from Facebook) released with the paper Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks by Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Douwe Kiela.
- REALM (from Google Research) released with the paper REALM: Retrieval-Augmented Language Model Pre-Training by Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat and Ming-Wei Chang.
- Reformer (from Google Research) released with the paper Reformer: The Efficient Transformer by Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya.
- RegNet (from META Research) released with the paper Designing Network Design Spaces by Ilija Radosavovic, Raj Prateek Kosaraju, Ross Girshick, Kaiming He, Piotr Dollár.
- RemBERT (from Google Research) released with the paper Rethinking embedding coupling in pre-trained language models by Hyung Won Chung, Thibault Févry, Henry Tsai, M. Johnson, Sebastian Ruder.
- ResNet (from Microsoft Research) released with the paper Deep Residual Learning for Image Recognition by Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.
- RoBERTa (from Facebook) released with the paper RoBERTa: A Robustly Optimized BERT Pretraining Approach by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov.
- RoBERTa-PreLayerNorm (from Facebook) released with the paper fairseq: A Fast, Extensible Toolkit for Sequence Modeling by Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, Michael Auli.
- RoCBert (from WeChatAI) released with the paper RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining by Hui Su, Weiwei Shi, Xiaoyu Shen, Xiao Zhou, Tuo Ji, Jiarui Fang, Jie Zhou.
- RoFormer (from ZhuiyiTechnology) released with the paper RoFormer: Enhanced Transformer with Rotary Position Embedding by Jianlin Su and Yu Lu and Shengfeng Pan and Bo Wen and Yunfeng Liu.
- RWKV (from Bo Peng) released in this repo by Bo Peng.
- SeamlessM4T (from Meta AI) released with the paper SeamlessM4T — Massively Multilingual & Multimodal Machine Translation by the Seamless Communication team.
- SeamlessM4Tv2 (from Meta AI) released with the paper Seamless: Multilingual Expressive and Streaming Speech Translation by the Seamless Communication team.
- SegFormer (from NVIDIA) released with the paper SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers by Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M. Alvarez, Ping Luo.
- SegGPT (from Beijing Academy of Artificial Intelligence (BAAI)) released with the paper SegGPT: Segmenting Everything In Context by Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang.
- Segment Anything (from Meta AI) released with the paper Segment Anything by Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex Berg, Wan-Yen Lo, Piotr Dollar, Ross Girshick.
- SEW (from ASAPP) released with the paper Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition by Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, Kilian Q. Weinberger, Yoav Artzi.
- SEW-D (from ASAPP) released with the paper Performance-Efficiency Trade-offs in Unsupervised Pre-training for Speech Recognition by Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Han, Kilian Q. Weinberger, Yoav Artzi.
- SigLIP (from Google AI) released with the paper Sigmoid Loss for Language Image Pre-Training by Xiaohua Zhai, Basil Mustafa, Alexander Kolesnikov, Lucas Beyer.
- SpeechT5 (from Microsoft Research) released with the paper SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing by Junyi Ao, Rui Wang, Long Zhou, Chengyi Wang, Shuo Ren, Yu Wu, Shujie Liu, Tom Ko, Qing Li, Yu Zhang, Zhihua Wei, Yao Qian, Jinyu Li, Furu Wei.
- SpeechToTextTransformer (from Facebook) released with the paper fairseq S2T: Fast Speech-to-Text Modeling with fairseq by Changhan Wang, Yun Tang, Xutai Ma, Anne Wu, Dmytro Okhonko, Juan Pino.
- SpeechToTextTransformer2 (from Facebook) released with the paper Large-Scale Self- and Semi-Supervised Learning for Speech Translation by Changhan Wang, Anne Wu, Juan Pino, Alexei Baevski, Michael Auli, Alexis Conneau.
- Splinter (from Tel Aviv University) released with the paper Few-Shot Question Answering by Pretraining Span Selection by Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson, Omer Levy.
- SqueezeBERT (from Berkeley) released with the paper SqueezeBERT: What can computer vision teach NLP about efficient neural networks? by Forrest N. Iandola, Albert E. Shaw, Ravi Krishna, and Kurt W. Keutzer.
- StableLm (from Stability AI) released with the paper StableLM 3B 4E1T (Technical Report) by Jonathan Tow, Marco Bellagente, Dakota Mahan, Carlos Riquelme Ruiz, Duy Phung, Maksym Zhuravinskyi, Nathan Cooper, Nikhil Pinnaparaju, Reshinth Adithyan, and James Baicoianu.
- Starcoder2 (from BigCode team) released with the paper StarCoder 2 and The Stack v2: The Next Generation by Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, Evgenii Zheltonozhskii, Nii Osae Osae Dade, Wenhao Yu, Lucas Krauß, Naman Jain, Yixuan Su, Xuanli He, Manan Dey, Edoardo Abati, Yekun Chai, Niklas Muennighoff, Xiangru Tang, Muhtasham Oblokulov, Christopher Akiki, Marc Marone, Chenghao Mou, Mayank Mishra, Alex Gu, Binyuan Hui, Tri Dao, Armel Zebaze, Olivier Dehaene, Nicolas Patry, Canwen Xu, Julian McAuley, Han Hu, Torsten Scholak, Sebastien Paquet, Jennifer Robinson, Carolyn Jane Anderson, Nicolas Chapados, Mostofa Patwary, Nima Tajbakhsh, Yacine Jernite, Carlos Muñoz Ferrandis, Lingming Zhang, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, and Harm de Vries.
- SuperPoint (from MagicLeap) released with the paper SuperPoint: Self-Supervised Interest Point Detection and Description by Daniel DeTone, Tomasz Malisiewicz and Andrew Rabinovich.
- SwiftFormer (from MBZUAI) released with the paper SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications by Abdelrahman Shaker, Muhammad Maaz, Hanoona Rasheed, Salman Khan, Ming-Hsuan Yang, Fahad Shahbaz Khan.
- Swin Transformer (from Microsoft) released with the paper Swin Transformer: Hierarchical Vision Transformer using Shifted Windows by Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo.
- Swin Transformer V2 (from Microsoft) released with the paper Swin Transformer V2: Scaling Up Capacity and Resolution by Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, Yixuan Wei, Jia Ning, Yue Cao, Zheng Zhang, Li Dong, Furu Wei, Baining Guo.
- Swin2SR (from University of Würzburg) released with the paper Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration by Marcos V. Conde, Ui-Jin Choi, Maxime Burchi, Radu Timofte.
- SwitchTransformers (from Google) released with the paper Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity by William Fedus, Barret Zoph, Noam Shazeer.
- T5 (from Google AI) released with the paper Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer by Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu.
- T5v1.1 (from Google AI) released in the repository google-research/text-to-text-transfer-transformer by Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu.
- Table Transformer (from Microsoft Research) released with the paper PubTables-1M: Towards Comprehensive Table Extraction From Unstructured Documents by Brandon Smock, Rohith Pesala, Robin Abraham.
- TAPAS (from Google AI) released with the paper TAPAS: Weakly Supervised Table Parsing via Pre-training by Jonathan Herzig, Paweł Krzysztof Nowak, Thomas Müller, Francesco Piccinno and Julian Martin Eisenschlos.
- TAPEX (from Microsoft Research) released with the paper TAPEX: Table Pre-training via Learning a Neural SQL Executor by Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, Jian-Guang Lou.
- Time Series Transformer (from HuggingFace).
- TimeSformer (from Facebook) released with the paper Is Space-Time Attention All You Need for Video Understanding? by Gedas Bertasius, Heng Wang, Lorenzo Torresani.
- Trajectory Transformer (from the University of California at Berkeley) released with the paper Offline Reinforcement Learning as One Big Sequence Modeling Problem by Michael Janner, Qiyang Li, Sergey Levine.
- Transformer-XL (from Google/CMU) released with the paper Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context by Zihang Dai*, Zhilin Yang*, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov.
- TrOCR (from Microsoft) released with the paper TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models by Minghao Li, Tengchao Lv, Lei Cui, Yijuan Lu, Dinei Florencio, Cha Zhang, Zhoujun Li, Furu Wei.
- TVLT (from UNC Chapel Hill) released with the paper TVLT: Textless Vision-Language Transformer by Zineng Tang, Jaemin Cho, Yixin Nie, Mohit Bansal.
- TVP (from Intel) released with the paper Text-Visual Prompting for Efficient 2D Temporal Video Grounding by Yimeng Zhang, Xin Chen, Jinghan Jia, Sijia Liu, Ke Ding.
- UDOP (from Microsoft Research) released with the paper Unifying Vision, Text, and Layout for Universal Document Processing by Zineng Tang, Ziyi Yang, Guoxin Wang, Yuwei Fang, Yang Liu, Chenguang Zhu, Michael Zeng, Cha Zhang, Mohit Bansal.
- UL2 (from Google Research) released with the paper Unifying Language Learning Paradigms by Yi Tay, Mostafa Dehghani, Vinh Q. Tran, Xavier Garcia, Dara Bahri, Tal Schuster, Huaixiu Steven Zheng, Neil Houlsby, Donald Metzler.
- UMT5 (from Google Research) released with the paper UniMax: Fairer and More Effective Language Sampling for Large-Scale Multilingual Pretraining by Hyung Won Chung, Xavier Garcia, Adam Roberts, Yi Tay, Orhan Firat, Sharan Narang, Noah Constant.
- UniSpeech (from Microsoft Research) released with the paper UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data by Chengyi Wang, Yu Wu, Yao Qian, Kenichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng, Xuedong Huang.
- UniSpeechSat (from Microsoft Research) released with the paper UNISPEECH-SAT: UNIVERSAL SPEECH REPRESENTATION LEARNING WITH SPEAKER AWARE PRE-TRAINING by Sanyuan Chen, Yu Wu, Chengyi Wang, Zhengyang Chen, Zhuo Chen, Shujie Liu, Jian Wu, Yao Qian, Furu Wei, Jinyu Li, Xiangzhan Yu.
- UnivNet (from Kakao Corporation) released with the paper UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation by Won Jang, Dan Lim, Jaesam Yoon, Bongwan Kim, and Juntae Kim.
- UPerNet (from Peking University) released with the paper Unified Perceptual Parsing for Scene Understanding by Tete Xiao, Yingcheng Liu, Bolei Zhou, Yuning Jiang, Jian Sun.
- VAN (from Tsinghua University and Nankai University) released with the paper Visual Attention Network by Meng-Hao Guo, Cheng-Ze Lu, Zheng-Ning Liu, Ming-Ming Cheng, Shi-Min Hu.
- VideoMAE (from Multimedia Computing Group, Nanjing University) released with the paper VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training by Zhan Tong, Yibing Song, Jue Wang, Limin Wang.
- ViLT (from NAVER AI Lab/Kakao Enterprise/Kakao Brain) released with the paper ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision by Wonjae Kim, Bokyung Son, Ildoo Kim.
- VipLlava (from University of Wisconsin–Madison) released with the paper Making Large Multimodal Models Understand Arbitrary Visual Prompts by Mu Cai, Haotian Liu, Siva Karthik Mustikovela, Gregory P. Meyer, Yuning Chai, Dennis Park, Yong Jae Lee.
- Vision Transformer (ViT) (from Google AI) released with the paper An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale by Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby.
- VisualBERT (from UCLA NLP) released with the paper VisualBERT: A Simple and Performant Baseline for Vision and Language by Liunian Harold Li, Mark Yatskar, Da Yin, Cho-Jui Hsieh, Kai-Wei Chang.
- ViT Hybrid (from Google AI) released with the paper An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale by Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby.
- VitDet (from Meta AI) released with the paper Exploring Plain Vision Transformer Backbones for Object Detection by Yanghao Li, Hanzi Mao, Ross Girshick, Kaiming He.
- ViTMAE (from Meta AI) released with the paper Masked Autoencoders Are Scalable Vision Learners by Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross Girshick.
- ViTMatte (from HUST-VL) released with the paper ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformers by Jingfeng Yao, Xinggang Wang, Shusheng Yang, Baoyuan Wang.
- ViTMSN (from Meta AI) released with the paper Masked Siamese Networks for Label-Efficient Learning by Mahmoud Assran, Mathilde Caron, Ishan Misra, Piotr Bojanowski, Florian Bordes, Pascal Vincent, Armand Joulin, Michael Rabbat, Nicolas Ballas.
- VITS (from Kakao Enterprise) released with the paper Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech by Jaehyeon Kim, Jungil Kong, Juhee Son.
- ViViT (from Google Research) released with the paper ViViT: A Video Vision Transformer by Anurag Arnab, Mostafa Dehghani, Georg Heigold, Chen Sun, Mario Lučić, Cordelia Schmid.
- Wav2Vec2 (from Facebook AI) released with the paper wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations by Alexei Baevski, Henry Zhou, Abdelrahman Mohamed, Michael Auli.
- Wav2Vec2-BERT (from Meta AI) released with the paper Seamless: Multilingual Expressive and Streaming Speech Translation by the Seamless Communication team.
- Wav2Vec2-Conformer (from Facebook AI) released with the paper FAIRSEQ S2T: Fast Speech-to-Text Modeling with FAIRSEQ by Changhan Wang, Yun Tang, Xutai Ma, Anne Wu, Sravya Popuri, Dmytro Okhonko, Juan Pino.
- Wav2Vec2Phoneme (from Facebook AI) released with the paper Simple and Effective Zero-shot Cross-lingual Phoneme Recognition by Qiantong Xu, Alexei Baevski, Michael Auli.
- WavLM (from Microsoft Research) released with the paper WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing by Sanyuan Chen, Chengyi Wang, Zhengyang Chen, Yu Wu, Shujie Liu, Zhuo Chen, Jinyu Li, Naoyuki Kanda, Takuya Yoshioka, Xiong Xiao, Jian Wu, Long Zhou, Shuo Ren, Yanmin Qian, Yao Qian, Jian Wu, Michael Zeng, Furu Wei.
- Whisper (from OpenAI) released with the paper Robust Speech Recognition via Large-Scale Weak Supervision by Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, Ilya Sutskever.
- X-CLIP (from Microsoft Research) released with the paper Expanding Language-Image Pretrained Models for General Video Recognition by Bolin Ni, Houwen Peng, Minghao Chen, Songyang Zhang, Gaofeng Meng, Jianlong Fu, Shiming Xiang, Haibin Ling.
- X-MOD (from Meta AI) released with the paper Lifting the Curse of Multilinguality by Pre-training Modular Transformers by Jonas Pfeiffer, Naman Goyal, Xi Lin, Xian Li, James Cross, Sebastian Riedel, Mikel Artetxe.
- XGLM (from Facebook AI) released with the paper Few-shot Learning with Multilingual Language Models by Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona Diab, Veselin Stoyanov, Xian Li.
- XLM (from Facebook) released with the paper Cross-lingual Language Model Pretraining by Guillaume Lample and Alexis Conneau.
- XLM-ProphetNet (from Microsoft Research) released with the paper ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training by Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou.
- XLM-RoBERTa (from Facebook AI) released with the paper Unsupervised Cross-lingual Representation Learning at Scale by Alexis Conneau*, Kartikay Khandelwal*, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer and Veselin Stoyanov.
- XLM-RoBERTa-XL (from Facebook AI) released with the paper Larger-Scale Transformers for Multilingual Masked Language Modeling by Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau.
- XLM-V (from Meta AI) released with the paper XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models by Davis Liang, Hila Gonen, Yuning Mao, Rui Hou, Naman Goyal, Marjan Ghazvininejad, Luke Zettlemoyer, Madian Khabsa.
- XLNet (from Google/CMU) released with the paper XLNet: Generalized Autoregressive Pretraining for Language Understanding by Zhilin Yang*, Zihang Dai*, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le.
- XLS-R (from Facebook AI) released with the paper XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale by Arun Babu, Changhan Wang, Andros Tjandra, Kushal Lakhotia, Qiantong Xu, Naman Goyal, Kritika Singh, Patrick von Platen, Yatharth Saraf, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli.
- XLSR-Wav2Vec2 (from Facebook AI) released with the paper Unsupervised Cross-Lingual Representation Learning For Speech Recognition by Alexis Conneau, Alexei Baevski, Ronan Collobert, Abdelrahman Mohamed, Michael Auli.
- YOLOS (from Huazhong University of Science & Technology) released with the paper You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection by Yuxin Fang, Bencheng Liao, Xinggang Wang, Jiemin Fang, Jiyang Qi, Rui Wu, Jianwei Niu, Wenyu Liu.
- YOSO (from the University of Wisconsin - Madison) released with the paper You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling by Zhanpeng Zeng, Yunyang Xiong, Sathya N. Ravi, Shailesh Acharya, Glenn Fung, Vikas Singh.
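- Want to contribute a new model? We provide a detailed guide and templates to help you add one. You can find them in the templates folder of this repository. Be sure to read the contributing guidelines, and contact the maintainers or open an issue to collect feedback before opening your PR.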
To check whether a model has an implementation in Flax, PyTorch, or TensorFlow, or whether it uses a tokenizer backed by the 🤗 Tokenizers library, refer to this table.
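Before picking a model from that table, it can help to know which of the three backends are installed locally. The snippet below is a minimal sketch using only the Python standard library, not part of the 🤗 Transformers API. It assumes the usual top-level import names (`torch`, `tensorflow`, `flax`), and the helper names `is_installed` and `available_backends` are hypothetical.

```python
from importlib.util import find_spec

# Assumed top-level import names for the three supported backends.
BACKENDS = {"PyTorch": "torch", "TensorFlow": "tensorflow", "Flax": "flax"}


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return find_spec(module_name) is not None


def available_backends() -> dict:
    """Map each backend label to whether its module is found in this environment."""
    return {label: is_installed(module) for label, module in BACKENDS.items()}


if __name__ == "__main__":
    for label, present in available_backends().items():
        print(f"{label}: {'installed' if present else 'not installed'}")
```

This only reports what is importable in the current environment; whether a given model ships weights or code for that backend still has to be read from the table.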
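These implementations have been tested on several datasets (see the example scripts) and should match the performance of the original implementations. You can find more details on performance in the Examples section of the documentation.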
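Section | Description |
---|---|
Documentation | Full API documentation and tutorials |
Task summary | Tasks supported by 🤗 Transformers |
Preprocessing tutorial | Using the Tokenizer class to prepare data for the models |
Training and fine-tuning | Using the models provided by 🤗 Transformers in a PyTorch/TensorFlow training loop and with the Trainer API |
Quick tour: Fine-tuning/usage scripts | Example scripts for fine-tuning models on a wide range of tasks |
Model sharing and uploading | Upload and share your fine-tuned models with the community |
Migration | Migrate to 🤗 Transformers from pytorch-transformers or pytorch-pretrained-bert |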
If you want to cite the 🤗 Transformers library, please cite this paper:
@inproceedings{wolf-etal-2020-transformers,
title = "Transformers: State-of-the-Art Natural Language Processing",
author = "Thomas Wolf and Lysandre Debut and Victor Sanh and Julien Chaumond and Clement Delangue and Anthony Moi and Pierric Cistac and Tim Rault and Rémi Louf and Morgan Funtowicz and Joe Davison and Sam Shleifer and Patrick von Platen and Clara Ma and Yacine Jernite and Julien Plu and Canwen Xu and Teven Le Scao and Sylvain Gugger and Mariama Drame and Quentin Lhoest and Alexander M. Rush",
booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
month = oct,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.emnlp-demos.6",
pages = "38--45"
}