Description
Environment info
transformers
version: 3.0.2- Platform: Linux-4.19.112+-x86_64-with-Ubuntu-18.04-bionic
- Python version: 3.6.9
- PyTorch version (GPU?): 1.6.0+cu101 (False)
- Tensorflow version (GPU?): 2.3.0 (False)
- Using GPU in script?: yes
- Using distributed or parallel set-up in script?: no
Who can help
Text Generation: @TevenLeScao
T5: @patrickvonplaten
Information
Model I am using (Bert, XLNet ...):
T5
The problem arises when using:
- the official example scripts: (give details below)
- my own modified scripts: (give details below)
The tasks I am working on is:
- an official GLUE/SQUaD task: (give the name)
- my own task or dataset: (give details below)
To reproduce
Steps to reproduce the behavior:
https://colab.research.google.com/drive/112Jt7VFwHHT-QmMxFPJ764GNJBn0d5eX?usp=sharing
Expected behavior
We have started a big project for source code tasks (generation, summarisation, documentation, etc.) using language models. Using T5 text to text library, the model can predict the input correctly, However, after we converted the Tensorflow checkpoint to huggingface the output text is rubbish.
I am not sure if we are doing something wrong during conversion or there is a problem in loading and converting the weights from the original Tensorflow checkpoint to Pytorch.
The above Colab re-produce the issue.
Important Note: We are using a copy of "adapt_t5_for_covid_19_3b" branch which should fix the conversion problem with only one small modification, setting is_tied to false.
Your help is highly appreciated.