Fix beam search generation for GPT2 and T5 on model parallelism #9219
What does this PR do?
This PR fixes a beam search generation crash that occurs when the model layers are distributed across several devices (model parallelism). I have also added a test which showcases the bug on master and passes with the provided fix. Fixes issue #9200.
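For context, the crash comes from the cached past states living on different devices under model parallelism while the beam reordering index stays on a single device. Below is a minimal sketch of the kind of change involved; the `_reorder_cache` name and the cache layout are assumptions for illustration, not the exact diff:

```python
import torch

def _reorder_cache(past, beam_idx):
    # Hypothetical sketch: under model parallelism each layer's cached
    # states may live on a different GPU, while beam_idx lives on one
    # device. Moving beam_idx to each state's device before index_select
    # avoids the device-mismatch error raised during beam search.
    return tuple(
        layer_past.index_select(1, beam_idx.to(layer_past.device))
        for layer_past in past
    )
```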
This is my first contribution to Transformers, so please let me know if something seems wrong or can be improved! The added test doesn't assert anything at the moment; it simply raises an error on master and completes with the fix. Looking forward to some feedback on how you'd like to see that done.
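For reference, here is a sketch of the scenario the added test exercises (the exact test code in the PR differs; this assumes at least two GPUs are available):

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.parallelize()  # spread the transformer blocks across available GPUs

input_ids = tokenizer("Hello, my dog is", return_tensors="pt").input_ids.to("cuda:0")

# On master this call crashes with a device-mismatch error during beam
# search; with the fix it generates normally.
output = model.generate(input_ids, num_beams=2, max_length=20)
print(tokenizer.decode(output[0]))
```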
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@LysandreJik
@alexorona
@patrickvonplaten