Skip to content

T5 Conversion from Original Tensorflow Produce rubbish Text #7791

Closed
@agemagician

Description

@agemagician

Environment info

  • transformers version: 3.0.2
  • Platform: Linux-4.19.112+-x86_64-with-Ubuntu-18.04-bionic
  • Python version: 3.6.9
  • PyTorch version (GPU?): 1.6.0+cu101 (False)
  • Tensorflow version (GPU?): 2.3.0 (False)
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no

Who can help

Text Generation: @TevenLeScao
T5: @patrickvonplaten

Information

Model I am using (Bert, XLNet ...):
T5

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

The tasks I am working on is:

  • an official GLUE/SQUaD task: (give the name)
  • my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

https://colab.research.google.com/drive/112Jt7VFwHHT-QmMxFPJ764GNJBn0d5eX?usp=sharing

Expected behavior

We have started a big project for source code tasks (generation, summarisation, documentation, etc.) using language models. Using T5 text to text library, the model can predict the input correctly, However, after we converted the Tensorflow checkpoint to huggingface the output text is rubbish.
I am not sure if we are doing something wrong during conversion or there is a problem in loading and converting the weights from the original Tensorflow checkpoint to Pytorch.

The above Colab re-produce the issue.
Important Note: We are using a copy of "adapt_t5_for_covid_19_3b" branch which should fix the conversion problem with only one small modification, setting is_tied to false.

Your help is highly appreciated.

Metadata

Metadata

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions