This is a helper repo for verifying implementation equivalence while porting the LongT5 model into the transformers package.
The LongT5Model has been successfully merged into transformers (huggingface#16792) and is available in the package as of v4.20.
This repo can still be used, however, as a simple example of how to use the t5x repo and related Google repositories to run a forward pass.
- Run the `setup.sh` file. This script downloads some original LongT5 checkpoints and takes care of installing the necessary Google packages, like `t5x` and `flaxformer`.
- Then use the example Python file `forward_pass`, which contains the logic for evaluating both the original model and the proposed HF implementation. For this purpose, there are some example bash scripts in the `running_scripts/` folder.
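The core of such an equivalence check is comparing the forward-pass outputs of the two models numerically. A minimal sketch of that comparison (the arrays below are stand-ins for real model logits; `compare_outputs` and the tolerance are illustrative, not part of this repo's API):

```python
import numpy as np

def compare_outputs(original_logits, hf_logits, atol=1e-4):
    """Return (matches, max_abs_diff) for two forward-pass outputs.

    A hypothetical helper: in practice the arrays would come from the
    original t5x/Flaxformer model and the HF-ported model, respectively.
    """
    original_logits = np.asarray(original_logits)
    hf_logits = np.asarray(hf_logits)
    max_diff = float(np.max(np.abs(original_logits - hf_logits)))
    matches = bool(np.allclose(original_logits, hf_logits, atol=atol))
    return matches, max_diff

# Stand-in arrays in place of real model outputs
a = np.array([[0.1, 0.2], [0.3, 0.4]])
b = a + 1e-6  # tiny numerical noise, as expected between frameworks
matches, max_diff = compare_outputs(a, b)
```

Reporting the maximum absolute difference alongside the boolean result makes it easier to distinguish framework-level numerical noise from a genuine implementation discrepancy.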
Currently, there is a problem with the TGlobal model: a discrepancy occurs when the sequence length is larger than `global_block_size`.
This can be illustrated by running the (1) `longt5-tglobal_forward_pass.sh` and (2) `longt5-tglobal_forward_pass_seq25.sh` scripts
in the `running_scripts/` folder. The output of (1) is fine; however, (2) does not work as expected.
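One way to see why the sequence length matters here: TGlobal attention summarizes fixed-size blocks of the input into global tokens, so a sequence longer than `global_block_size` produces more than one global block, and that is the regime where the discrepancy shows up. A rough sketch (the block size of 16 is an assumption for illustration, not necessarily the checkpoint's actual configuration):

```python
import math

def num_global_blocks(seq_len, global_block_size=16):
    # TGlobal attention pools each fixed-size block of the input into one
    # global token; the last partial block is padded, hence the ceiling.
    return math.ceil(seq_len / global_block_size)

# seq_len <= global_block_size -> a single global block (scenario (1))
# seq_len  > global_block_size -> multiple global blocks (scenario (2),
#                                 e.g. the seq25 script)
```

This suggests the mismatch is likely in how the multi-block global attention path (block pooling, padding, or the global attention bias) is handled in one of the two implementations.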