
Complete merge Seq-2-Seq generation into default generation #3225

Conversation

@patrickvonplaten (Contributor) commented Mar 11, 2020

This is a follow-up PR to finalize #3140 .

There was still no conclusion on how to handle the fairseq tricks for generation. To summarize:

I think we have three options:

  1. Remove all fairseq tricks. Here the ROUGE score is: 20.285
  2. Implement the fairseq tricks EXCEPT leaving the starting decoder input token as the BOS token instead of EOS. Here the ROUGE score is: 19.369
  3. Add all fairseq tricks and add a new argument to generate() called decoder_start_token_id, which defaults to bos_token_id but can be overridden to eos_token_id in the case of Bart. Here the ROUGE score is: 21.072
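The decoder_start_token_id mechanism from option 3. can be sketched in a few lines of plain Python. This is a simplified illustration, not the actual transformers code, and the token ids are made up:

```python
# Option 3 in a nutshell: generate() gains a decoder_start_token_id argument
# that defaults to bos_token_id but can be overridden (to eos_token_id for
# Bart, matching the fairseq behavior). Sketch only, not transformers code.
def resolve_decoder_start_token_id(decoder_start_token_id, bos_token_id):
    return bos_token_id if decoder_start_token_id is None else decoder_start_token_id

BOS, EOS = 0, 2  # hypothetical token ids, for illustration only

assert resolve_decoder_start_token_id(None, BOS) == BOS  # default: start decoding at BOS
assert resolve_decoder_start_token_id(EOS, BOS) == EOS   # Bart override: start at EOS
```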

ROUGE scores from @sshleifer

For comparison:

[attached image: ROUGE score comparison]

UPDATE:

Given the above scores, option 3. was chosen for the moment to get the same scores as fairseq.
This means that we have to start the decoder input ids with an EOS token (which might be weird and fairseq-specific). Therefore, a new argument decoder_start_token_id was added to the generate() function, which defaults to bos_token_id. When using Bart to generate, this argument should be set to eos_token_id to get good results. To see how Bart.generate() should be used, take a look at:

hypotheses_batch = hf.generate(
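Purely for illustration, a complete call in that shape could look like the following, run against a hypothetical stub model so no weights are needed. The decoding parameters and token ids are assumptions, not values from this PR:

```python
# Illustration only: a stand-in model that records its generate() kwargs,
# so the call shape can be shown without downloading Bart weights.
class StubBart:
    eos_token_id = 2  # hypothetical id

    def generate(self, input_ids, **kwargs):
        self.last_kwargs = kwargs
        return [[self.eos_token_id]]  # dummy output

hf = StubBart()
input_ids = [[0, 31414, 232, 2]]  # pretend-encoded source sentence

# The Bart-specific part: decoding must start from EOS, not BOS.
hypotheses_batch = hf.generate(
    input_ids,
    num_beams=4,
    max_length=140,
    decoder_start_token_id=hf.eos_token_id,
)

assert hf.last_kwargs["decoder_start_token_id"] == 2
```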

At the moment, option 2. is implemented, which seems to give the worst results and is also not the cleanest option.
This PR implements option 3.

For me, either option 1. or option 3. is fine. Up for discussion @thomwolf , @julien-c , @LysandreJik , @sshleifer

@patrickvonplaten requested review from sshleifer and thomwolf and removed the request for sshleifer Mar 11, 2020 13:38
@patrickvonplaten force-pushed the finalize_merge_bart_generate_into_default_generate branch from f5a6bd0 to 6047f46 Mar 12, 2020 19:18
@codecov-io commented Mar 12, 2020

Codecov Report

Merging #3225 into master will increase coverage by 0.1%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master    #3225     +/-   ##
=========================================
+ Coverage   77.93%   78.03%   +0.1%     
=========================================
  Files          98       98             
  Lines       16666    16668      +2     
=========================================
+ Hits        12988    13007     +19     
+ Misses       3678     3661     -17
Impacted Files                           Coverage   Δ
src/transformers/modeling_utils.py       93.86%     <100%> (+0.15%) ⬆️
src/transformers/modeling_tf_utils.py    88.55%     <0%> (+2.86%) ⬆️

Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2e81b9d...6a82f77.

@sshleifer (Contributor)

Another issue that came up today from @yjernite: Bart does not support do_sample=True. Before #3140 this was clear, and now it isn't :(

@patrickvonplaten (Contributor, Author) commented Mar 12, 2020

Bart.generate() with do_sample=True does not throw any errors. If it should not be used this way, we can simply state in the docs that do_sample should be set to False and show an example. I don't see a problem here. If it's just for a prettier API, we could think about a Bart.summarize() that calls Bart.generate() with the correct parameters.
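The Bart.summarize() idea can be sketched as a thin wrapper that forwards to generate() with fixed defaults. All names and parameter values below are illustrative assumptions, not an actual transformers API:

```python
# Sketch of a hypothetical Bart.summarize() convenience wrapper: it just
# calls generate() with do_sample=False and Bart's other decoding quirks
# baked in. Every parameter value here is an illustrative assumption.
def summarize(model, input_ids, **overrides):
    params = dict(
        do_sample=False,                            # deterministic decoding
        num_beams=4,
        decoder_start_token_id=model.eos_token_id,  # Bart starts from EOS
    )
    params.update(overrides)                        # callers may still override
    return model.generate(input_ids, **params)

# Tiny stand-in model so the wrapper can be exercised without real weights.
class FakeModel:
    eos_token_id = 2

    def generate(self, input_ids, **kwargs):
        self.kwargs = kwargs
        return input_ids

m = FakeModel()
summarize(m, [[1, 2, 3]])
assert m.kwargs["do_sample"] is False
assert m.kwargs["decoder_start_token_id"] == 2
```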

@sshleifer (Contributor)

I prefer option #3, it looks like very little code and the best metrics.
Could you also fix the kwarg in examples/summarization/bart/evaluate_cnn.py?

@patrickvonplaten (Contributor, Author)

> I prefer option #3, it looks like very little code and the best metrics.
> Could you also fix the kwarg in examples/summarization/bart/evaluate_cnn.py?

done :-)

@patrickvonplaten (Contributor, Author)

Linked to PR: #3264
Might need to fix eos_token_id conflicts when rebasing/merging.

@thomwolf (Member)

This one is ok for me.
Is it currently option 3 which is implemented?

Also, I guess we could default to do_sample=False in generate(). Simple greedy decoding seems like the default expectation from the user to me.

@patrickvonplaten (Contributor, Author)

Yeah, currently option 3 is implemented.
do_sample used to default to False, but was changed to True by @LysandreJik (can't find the issue/PR anymore :-/). It does not matter too much to me what is chosen; I would just need to update some tests in modeling_utils.py.

@sshleifer (Contributor)

Yeah, I'd love to merge this! Having trouble connecting to brutasse to run ROUGE, but afaict it will be the same as pre-#3140 :)

@patrickvonplaten changed the title from "[Discussion] Seq-2-Seq generation merge" to "Complete merge Seq-2-Seq generation into default generation" Mar 14, 2020
@patrickvonplaten (Contributor, Author)

Good to merge for me!
Changing the default to do_sample=False can be done in another PR, I think.

IMPORTANT:
the config.json files of:

  • bart-large-cnn
  • bart-large-mnli
  • bart-large

have to be updated on AWS to pass all slow tests. All special_tokens_id parameters should be deleted there. For the moment, we will go with the solution in #3264 .

@thomwolf (Member)

Ok merging. I let you continue in other PRs and ping me.

@thomwolf merged commit 3814e16 into huggingface:master Mar 14, 2020
@patrickvonplaten (Contributor, Author)

Changed the configs on AWS. All slow tests pass now.
