@renovate renovate bot commented May 6, 2021

WhiteSource Renovate

This PR contains the following updates:

| Package | Change |
| --- | --- |
| transformers | ==4.3.3 -> ==4.5.1 |

Release Notes

huggingface/transformers

v4.5.1

Compare Source

  • Fix pipeline when used with private models (#11123)
  • Fix loading an architecture in another (#11207)

v4.5.0

Compare Source

v4.5.0: BigBird, GPT Neo, Examples, Flax support

BigBird (@​vasudevgupta7)

Seven new models are released as part of the BigBird implementation: BigBirdModel, BigBirdForPreTraining, BigBirdForMaskedLM, BigBirdForCausalLM, BigBirdForSequenceClassification, BigBirdForMultipleChoice, BigBirdForQuestionAnswering in PyTorch.

BigBird is a sparse-attention-based transformer that extends Transformer-based models, such as BERT, to much longer sequences. In addition to sparse attention, BigBird also applies global attention as well as random attention to the input sequence.

The BigBird model was proposed in Big Bird: Transformers for Longer Sequences by Manzil Zaheer, Guru Guruganesh, Avinava Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed.

It is released with an accompanying blog post: Understanding BigBird's Block Sparse Attention

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=big_bird
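
As a quick orientation (not taken from the release notes), the sketch below loads one of the public BigBird checkpoints for masked language modeling; the google/bigbird-roberta-base checkpoint name and the example sentence are illustrative assumptions.

```python
# Minimal sketch: a masked-language-modeling forward pass with BigBird.
# Assumes the public google/bigbird-roberta-base checkpoint (not named in the
# release notes); the tokenizer requires the sentencepiece package.
from transformers import BigBirdTokenizer, BigBirdForMaskedLM

tokenizer = BigBirdTokenizer.from_pretrained("google/bigbird-roberta-base")
model = BigBirdForMaskedLM.from_pretrained("google/bigbird-roberta-base")

inputs = tokenizer("BigBird scales attention to [MASK] longer sequences.", return_tensors="pt")
logits = model(**inputs).logits
print(logits.shape)  # (batch_size, sequence_length, vocab_size)
```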

GPT Neo (@​patil-suraj)

Two new models are released as part of the GPT Neo implementation: GPTNeoModel, GPTNeoForCausalLM in PyTorch.

GPT-Neo is the code name for a family of transformer-based language models loosely styled around the GPT architecture. EleutherAI's primary goal is to replicate a GPT-3 DaVinci-sized model and open-source it to the public.

The implementation within Transformers is a GPT2-like causal language model trained on the Pile dataset.

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=gpt_neo
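
For illustration, here is a hedged sketch of causal generation with one of the EleutherAI checkpoints; the 125M checkpoint name and the sampling settings are assumptions, not part of the release notes.

```python
# Minimal sketch: text generation with GPT Neo. The EleutherAI/gpt-neo-125M
# checkpoint is assumed here because it is the lightest public variant.
from transformers import GPT2Tokenizer, GPTNeoForCausalLM

tokenizer = GPT2Tokenizer.from_pretrained("EleutherAI/gpt-neo-125M")
model = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")

input_ids = tokenizer("GPT-Neo is a family of models that", return_tensors="pt").input_ids
generated = model.generate(input_ids, max_length=40, do_sample=True, temperature=0.9)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```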

Examples

Features have been added to some examples, and additional examples have been added.

Raw training loop examples

Based on the accelerate library, examples that completely expose the training loop are now part of the library, making them easy to customize if you want to try a new research idea.
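
To give a sense of the shape of such a loop, here is a hedged, minimal sketch built directly on accelerate; the model name, toy data, and hyperparameters are placeholders rather than a reproduction of the bundled scripts.

```python
# Minimal sketch of a raw training loop driven by accelerate (not one of the
# bundled example scripts). The toy dataset below stands in for real data.
import torch
from torch.optim import AdamW
from torch.utils.data import DataLoader
from accelerate import Accelerator
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Toy data standing in for a real tokenized dataset.
texts, labels = ["great library", "not my thing"], [1, 0]
encodings = tokenizer(texts, padding=True, return_tensors="pt")
dataset = [
    {**{k: v[i] for k, v in encodings.items()}, "labels": torch.tensor(labels[i])}
    for i in range(len(texts))
]
train_dataloader = DataLoader(dataset, shuffle=True, batch_size=2)

accelerator = Accelerator()  # handles device placement and distributed wrapping
optimizer = AdamW(model.parameters(), lr=5e-5)
model, optimizer, train_dataloader = accelerator.prepare(model, optimizer, train_dataloader)

model.train()
for batch in train_dataloader:
    outputs = model(**batch)            # loss is computed because "labels" is in the batch
    accelerator.backward(outputs.loss)  # replaces loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```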

Standardize examples with Trainer

Thanks to the amazing contributions of @bhadreshpsavani, all examples that use the Trainer are now standardized: they all support the predict stage and return/save metrics in the same fashion.

Trainer & SageMaker Model Parallelism

The Trainer now supports SageMaker model parallelism out of the box; the old SageMakerTrainer is deprecated as a consequence and will be removed in version 5.

FLAX

FLAX support has been widened to cover all model heads of the BERT architecture, alongside a general conversion script for using PyTorch checkpoints in FLAX.

Auto models now have a FLAX implementation.
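
As a hedged illustration of the widened Flax coverage, the sketch below runs the Flax BERT implementation; it assumes a checkpoint that ships Flax weights on the Hub (bert-base-cased does) and that flax and jax are installed.

```python
# Minimal sketch: a forward pass through FlaxBertModel. Assumes flax and jax
# are installed and that the checkpoint provides Flax weights on the Hub.
from transformers import BertTokenizerFast, FlaxBertModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
model = FlaxBertModel.from_pretrained("bert-base-cased")

inputs = tokenizer("Flax support now covers the BERT model heads.", return_tensors="np")
outputs = model(**inputs)
print(outputs[0].shape)  # last hidden state: (batch, seq_len, hidden_size)
```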

General improvements and bugfixes

v4.4.2

Compare Source

  • Add support for detecting intel-tensorflow version
  • Fix distributed evaluation on SageMaker

v4.4.1

Compare Source

v4.4.0

Compare Source

v4.4.0: S2T, M2M100, I-BERT, mBART-50, DeBERTa-v2, XLSR-Wav2Vec2
SpeechToText

Two new models are released as part of the S2T implementation: Speech2TextModel and Speech2TextForConditionalGeneration, in PyTorch.

Speech2Text is a speech model that accepts a float tensor of log-mel filter-bank features extracted from the speech signal. It’s a transformer-based seq2seq model, so the transcripts/translations are generated autoregressively.

The Speech2Text model was proposed in fairseq S2T: Fast Speech-to-Text Modeling with fairseq by Changhan Wang, Yun Tang, Xutai Ma, Anne Wu, Dmytro Okhonko, Juan Pino.

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=speech_to_text
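
A hedged sketch of the inference flow follows; the facebook/s2t-small-librispeech-asr checkpoint and the random placeholder audio are assumptions, and the feature extraction additionally requires torchaudio and sentencepiece.

```python
# Minimal sketch: transcription with Speech2Text. Uses a random placeholder
# array instead of real audio; real input should be 16 kHz mono float samples.
# Requires torchaudio (filter-bank features) and sentencepiece (tokenizer).
import numpy as np
from transformers import Speech2TextProcessor, Speech2TextForConditionalGeneration

processor = Speech2TextProcessor.from_pretrained("facebook/s2t-small-librispeech-asr")
model = Speech2TextForConditionalGeneration.from_pretrained("facebook/s2t-small-librispeech-asr")

speech = np.random.randn(16000).astype(np.float32)  # placeholder for 1 s of audio
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")
generated_ids = model.generate(inputs["input_features"], attention_mask=inputs["attention_mask"])
print(processor.batch_decode(generated_ids, skip_special_tokens=True))
```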

M2M100

Two new models are released as part of the M2M100 implementation: M2M100Model and M2M100ForConditionalGeneration, in PyTorch.

M2M100 is a multilingual encoder-decoder (seq-to-seq) model primarily intended for translation tasks.

The M2M100 model was proposed in Beyond English-Centric Multilingual Machine Translation by Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin.

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=m2m_100
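
For a concrete picture, here is a hedged sketch of English-to-French translation; the facebook/m2m100_418M checkpoint name is an assumption (it is the smaller public variant), and the tokenizer requires sentencepiece.

```python
# Minimal sketch: English-to-French translation with M2M100. The target
# language is selected by forcing its language id as the first generated token.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

tokenizer.src_lang = "en"
encoded = tokenizer("Life is like a box of chocolates.", return_tensors="pt")
generated = model.generate(**encoded, forced_bos_token_id=tokenizer.get_lang_id("fr"))
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```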

I-BERT

Six new models are released as part of the I-BERT implementation: IBertModel, IBertForMaskedLM, IBertForSequenceClassification, IBertForMultipleChoice, IBertForTokenClassification and IBertForQuestionAnswering, in PyTorch.

I-BERT is a quantized version of RoBERTa running inference up to four times faster.

The I-BERT framework in PyTorch makes it possible to identify the best parameters for quantization. Once the model is exported to a framework that supports int8 execution (such as TensorRT), a speedup of up to 4x is observed, with no loss in performance thanks to the parameter search.

The I-BERT model was proposed in I-BERT: Integer-only BERT Quantization by Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney and Kurt Keutzer.

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=ibert
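
As a hedged sketch, the snippet below loads one of the checkpoints behind the Hub filter above; the kssteven/ibert-roberta-base name is an assumption, and quant_mode toggles the integer-only (simulated int8) operations.

```python
# Minimal sketch: I-BERT loads like its RoBERTa counterpart; quant_mode=True
# switches the model to its integer-only (simulated int8) operations.
from transformers import AutoTokenizer, IBertModel

tokenizer = AutoTokenizer.from_pretrained("kssteven/ibert-roberta-base")
model = IBertModel.from_pretrained("kssteven/ibert-roberta-base", quant_mode=True)

inputs = tokenizer("I-BERT searches for the best quantization parameters.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```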

mBART-50

mBART-50 is created from the original mbart-large-cc25 checkpoint by extending its embedding layers with randomly initialized vectors for an extra set of 25 language tokens, and is then pretrained on 50 languages.

The MBart model was presented in Multilingual Translation with Extensible Multilingual Pretraining and Finetuning by Yuqing Tang, Chau Tran, Xian Li, Peng-Jen Chen, Naman Goyal, Vishrav Chaudhary, Jiatao Gu, Angela Fan.

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=mbart-50
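
As a hedged sketch of the many-to-many variant, the snippet below translates English to French; the facebook/mbart-large-50-many-to-many-mmt checkpoint name is an assumption, and the download is large.

```python
# Minimal sketch: translation with an mBART-50 checkpoint. The source language
# is set on the tokenizer and the target language is forced as the first token.
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50-many-to-many-mmt", src_lang="en_XX")
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")

encoded = tokenizer("The weather is nice today.", return_tensors="pt")
generated = model.generate(**encoded, forced_bos_token_id=tokenizer.lang_code_to_id["fr_XX"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```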

DeBERTa-v2

Five new models are released as part of the DeBERTa-v2 implementation: DebertaV2Model, DebertaV2ForMaskedLM, DebertaV2ForSequenceClassification, DebertaV2ForTokenClassification and DebertaV2ForQuestionAnswering, in PyTorch.

The DeBERTa model was proposed in DeBERTa: Decoding-enhanced BERT with Disentangled Attention by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen. It is based on Google’s BERT model released in 2018 and Facebook’s RoBERTa model released in 2019.

It builds on RoBERTa with disentangled attention and an enhanced mask decoder, trained with half of the data used for RoBERTa.

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=deberta-v2
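
For orientation, here is a hedged sketch of a plain forward pass; the microsoft/deberta-v2-xlarge checkpoint name is an assumption (and a large download), and the tokenizer requires sentencepiece.

```python
# Minimal sketch: forward pass through DebertaV2Model. The xlarge checkpoint is
# assumed here; it is a large download, so swap in a smaller one if needed.
from transformers import DebertaV2Tokenizer, DebertaV2Model

tokenizer = DebertaV2Tokenizer.from_pretrained("microsoft/deberta-v2-xlarge")
model = DebertaV2Model.from_pretrained("microsoft/deberta-v2-xlarge")

inputs = tokenizer("DeBERTa-v2 uses disentangled attention.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```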

Wav2Vec2
XLSR-Wav2Vec2

The XLSR-Wav2Vec2 model was proposed in Unsupervised Cross-Lingual Representation Learning For Speech Recognition by Alexis Conneau, Alexei Baevski, Ronan Collobert, Abdelrahman Mohamed, Michael Auli.

The checkpoint corresponding to that model is added to the model hub: facebook/wav2vec2-large-xlsr-53
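
Since XLSR-53 is a pretrained (not fine-tuned) model, the hedged sketch below only extracts contextual representations from placeholder audio; the random input and the assumption that the checkpoint ships a saved feature-extractor config are illustrative.

```python
# Minimal sketch: extract contextual representations with the XLSR-53 checkpoint.
# Uses random placeholder audio; real input should be 16 kHz mono float samples.
# If the checkpoint lacks a saved feature-extractor config, instantiate
# Wav2Vec2FeatureExtractor() with its defaults instead.
import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-large-xlsr-53")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-large-xlsr-53")

speech = np.random.randn(16000).astype(np.float32)  # placeholder for 1 s of audio
inputs = feature_extractor(speech, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    hidden_states = model(inputs.input_values).last_hidden_state
print(hidden_states.shape)  # (batch, frames, hidden_size)
```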

Training script

A fine-tuning script showcasing how the Wav2Vec2 model can be trained has been added.

Further improvements

Several changes make the Wav2Vec2 architecture more stable. This release introduces feature extractors and feature processors as the pre-processing layer for multi-modal speech models.

AMP & XLA Support for TensorFlow models

Most of the TensorFlow models are now compatible with automatic mixed precision and have XLA support.
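
As a hedged sketch of what this enables on the user side, the snippet below turns on Keras mixed precision and XLA compilation around a TensorFlow BERT model; the model name is illustrative, and jit_compile requires TF 2.5+ (use experimental_compile on older versions).

```python
# Minimal sketch: AMP via the Keras mixed-precision policy plus an XLA-compiled
# forward pass. Model and tokenizer names are illustrative assumptions.
import tensorflow as tf
from transformers import BertTokenizerFast, TFBertModel

tf.keras.mixed_precision.set_global_policy("mixed_float16")  # AMP

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = TFBertModel.from_pretrained("bert-base-uncased")

@tf.function(jit_compile=True)  # XLA; TF 2.5+ (experimental_compile before that)
def forward(input_ids, attention_mask):
    # Index 0 is the last hidden state whether the model returns a tuple or a ModelOutput.
    return model(input_ids=input_ids, attention_mask=attention_mask)[0]

inputs = tokenizer("An XLA-compiled forward pass.", return_tensors="tf")
print(forward(inputs["input_ids"], inputs["attention_mask"]).shape)
```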

SageMaker Trainer for model parallelism

We are rolling out experimental support for model parallelism on SageMaker with a new SageMakerTrainer that can be used in place of the regular Trainer. This is a temporary class that will be removed in a future version; the end goal is to have Trainer support this feature out of the box.

General improvements and bugfixes


Configuration

📅 Schedule: At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

♻️ Rebasing: Renovate will not automatically rebase this PR, because other commits have been found.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box.

This PR has been generated by WhiteSource Renovate. View repository job log here.

renovate bot commented Aug 28, 2021

Autoclosing Skipped

This PR has been flagged for autoclosing. However, it is being skipped due to the branch being already modified. Please close/delete it manually or report a bug if you think this is in error.

renovate bot changed the title from "Update dependency transformers to v4.5.1" to "Update dependency transformers to v4.5.1 - abandoned" on Nov 10, 2022