Implement coverage penalty in fast beam search #1277

Merged 9 commits on Feb 9, 2019

@flauted commented on Feb 9, 2019

Per the discussion on #1268.

All slow and fast beam search scores now match, and the coverage parameters get better validation. This also fixes #994, since the slow and fast scores don't line up unless that is fixed. The fix makes the slow code appreciably faster (about 40 s down to about 21 s total translation time in the runs below), but still not as fast as the fast code (about 9 s).


Scores on 694c8fc (master)

Slow baseline

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time
PRED AVG SCORE: -0.4390, PRED PPL: 1.5512
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 40.024521
Average translation time (s): 0.080049
Tokens per second: 301.690056
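
The reported perplexities look like the exponential of the negated average score, which makes a quick sanity check when comparing runs. A minimal sketch, assuming that relationship (it is inferred from the numbers in this thread, not quoted from the OpenNMT-py source; `ppl_from_avg_score` is a hypothetical helper):

```python
import math


def ppl_from_avg_score(avg_score: float) -> float:
    """Perplexity implied by an average per-token log-probability.

    Assumption: PPL = exp(-average score), inferred from the logs above.
    """
    return math.exp(-avg_score)


# Values copied from the slow baseline run above.
pred_ppl = ppl_from_avg_score(-0.4390)
gold_ppl = ppl_from_avg_score(-11.2287)
print(f"pred ppl ~ {pred_ppl:.4f}, gold ppl ~ {gold_ppl:.1f}")
```

Because the logged average scores are rounded to four decimals, the recomputed values should only agree with the logged perplexities approximately.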

Slow summary cov

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -coverage_penalty summary -beta 2
PRED AVG SCORE: -1.5997, PRED PPL: 4.9518
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 41.000031
Average translation time (s): 0.082000
Tokens per second: 290.024173
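
For readers unfamiliar with `-coverage_penalty summary -beta 2`, here is a rough sketch of what such a penalty could compute. The formula is an assumption on my part (a beta-weighted sum of accumulated attention in excess of 1.0 per source token), not something stated in this thread, and `summary_coverage_penalty` is a hypothetical plain-Python helper rather than the library's function:

```python
def summary_coverage_penalty(coverage, beta):
    """Penalize source tokens whose accumulated attention exceeds 1.0.

    coverage: accumulated attention per source token for one hypothesis.
    beta: penalty weight (assumed to correspond to the -beta flag).
    Assumed formula: beta * (sum(max(c, 1.0)) - number of source tokens).
    """
    excess = sum(max(c, 1.0) for c in coverage) - len(coverage)
    return beta * excess


# Two of the three source tokens are over-attended (1.5 and 2.0).
penalty = summary_coverage_penalty([0.5, 1.5, 2.0], beta=2.0)

# A hypothesis that never attends to any token more than once is not penalized.
no_penalty = summary_coverage_penalty([0.2, 0.9, 1.0], beta=2.0)
```

Under this reading the penalty would be subtracted from a hypothesis score, which is at least consistent with the lower PRED AVG SCORE in the coverage runs compared with the baseline.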

Slow summary cov stepwise

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -coverage_penalty summary -beta 2 --stepwise_penalty
PRED AVG SCORE: -1.5862, PRED PPL: 4.8854
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 42.441436
Average translation time (s): 0.084883
Tokens per second: 272.304641

Slow length avg penalty

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -length_penalty avg
PRED AVG SCORE: -0.0160, PRED PPL: 1.0161
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 39.979531
Average translation time (s): 0.079959
Tokens per second: 311.234268

Slow length wu penalty

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -length_penalty wu -alpha 0.7
PRED AVG SCORE: -0.1305, PRED PPL: 1.1394
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 39.943928
Average translation time (s): 0.079888
Tokens per second: 306.254305
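
As a reference for the two length penalty runs, the following sketch shows the normalizers I would expect `avg` and `wu` to denote. These formulas are assumptions (plain length for `avg`, and the Wu et al. 2016 form `((5 + length) / 6) ** alpha` for `wu`); the function names are hypothetical and not the project's API:

```python
def length_penalty_avg(length: int) -> float:
    """Assumed 'avg' normalizer: the hypothesis length itself."""
    return float(length)


def length_penalty_wu(length: int, alpha: float) -> float:
    """Assumed 'wu' normalizer, after Wu et al. (2016)."""
    return ((5.0 + length) / 6.0) ** alpha


def normalized_score(sum_logprob: float, length: int,
                     kind: str = "wu", alpha: float = 0.7) -> float:
    """Divide a summed log-probability by the chosen length normalizer."""
    if kind == "avg":
        return sum_logprob / length_penalty_avg(length)
    if kind == "wu":
        return sum_logprob / length_penalty_wu(length, alpha)
    raise ValueError(f"unknown length penalty: {kind}")


avg_score = normalized_score(-6.0, 3, kind="avg")
wu_score = normalized_score(-6.0, 7, kind="wu", alpha=1.0)
```

Either normalizer shrinks the magnitude of the score for longer hypotheses, which fits the smaller PRED AVG SCORE magnitudes in the penalized runs.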

Fast baseline

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -fast
PRED AVG SCORE: -0.4390, PRED PPL: 1.5512
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 9.512364
Average translation time (s): 0.019025
Tokens per second: 1269.400488

Fast length avg penalty

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -fast -length_penalty avg
PRED AVG SCORE: -0.0172, PRED PPL: 1.0174
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 9.149232
Average translation time (s): 0.018298
Tokens per second: 1334.428923

Fast length wu penalty

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -fast -length_penalty wu -alpha 0.7
PRED AVG SCORE: -0.1346, PRED PPL: 1.1441
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 8.986004
Average translation time (s): 0.017972
Tokens per second: 1353.660667

"Evidence" of #994 fix

(Note that the gold score changes with batch size too: -177.9969 at batch size 1 versus -178.0805 at the default. This PR doesn't fix that!)

Batch size 1 (baseline) 7999890

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -length_penalty avg -batch_size 1
SENT 1: ['▁28', '-', 'Y', 'ear', '-', 'O', 'ld', '▁Chef', '▁Found', '▁Dead', '▁at', '▁San', '▁Francisco', '▁Mal', 'l']
PRED 1: ▁28 - Jahr - O ld ▁Chef ▁Found ▁Dead ▁in ▁San ▁Francisco ▁Mal l
PRED SCORE: -0.5435
GOLD 1: ▁28 - jährige r ▁Koch ▁in ▁San ▁Francisco ▁Mal l ▁to t ▁auf gefunden
GOLD SCORE: -177.9969
python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -length_penalty avg -batch_size 1 -fast
SENT 1: ['▁28', '-', 'Y', 'ear', '-', 'O', 'ld', '▁Chef', '▁Found', '▁Dead', '▁at', '▁San', '▁Francisco', '▁Mal', 'l']
PRED 1: ▁28 - Jahr - O ld ▁Chef ▁Found ▁Dead ▁in ▁San ▁Francisco ▁Mal l
PRED SCORE: -0.5435
GOLD 1: ▁28 - jährige r ▁Koch ▁in ▁San ▁Francisco ▁Mal l ▁to t ▁auf gefunden
GOLD SCORE: -177.9969

Batch size default

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -length_penalty avg
SENT 1: ['▁28', '-', 'Y', 'ear', '-', 'O', 'ld', '▁Chef', '▁Found', '▁Dead', '▁at', '▁San', '▁Francisco', '▁Mal', 'l']
PRED 1: ▁28 - Jahr - O ld - C he f ▁To te ▁in ▁San ▁Francisco ▁Mal l
PRED SCORE: -0.5009
GOLD 1: ▁28 - jährige r ▁Koch ▁in ▁San ▁Francisco ▁Mal l ▁to t ▁auf gefunden
GOLD SCORE: -178.0805
python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -length_penalty avg -fast
SENT 1: ['▁28', '-', 'Y', 'ear', '-', 'O', 'ld', '▁Chef', '▁Found', '▁Dead', '▁at', '▁San', '▁Francisco', '▁Mal', 'l']
PRED 1: ▁28 - Jahr - O ld ▁Chef ▁Found ▁Dead ▁in ▁San ▁Francisco ▁Mal l
PRED SCORE: -0.5435
GOLD 1: ▁28 - jährige r ▁Koch ▁in ▁San ▁Francisco ▁Mal l ▁to t ▁auf gefunden
GOLD SCORE: -178.0805

Batch size default after fix a03250e

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -length_penalty avg
SENT 1: ['▁28', '-', 'Y', 'ear', '-', 'O', 'ld', '▁Chef', '▁Found', '▁Dead', '▁at', '▁San', '▁Francisco', '▁Mal', 'l']
PRED 1: ▁28 - Jahr - O ld ▁Chef ▁Found ▁Dead ▁in ▁San ▁Francisco ▁Mal l
PRED SCORE: -0.5435
GOLD 1: ▁28 - jährige r ▁Koch ▁in ▁San ▁Francisco ▁Mal l ▁to t ▁auf gefunden
GOLD SCORE: -178.0805
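
The comparison above was done by eye; a small script can do the same check over whole `-verbose` logs. This is only a sketch: `pred_scores` and `scores_match` are hypothetical helpers, and the log format is assumed to be exactly the `PRED SCORE:` lines shown in this thread.

```python
import re

# Matches per-sentence lines such as "PRED SCORE: -0.5435".
# Anchored at line start so "PRED AVG SCORE:" summary lines are ignored.
PRED_SCORE = re.compile(r"^PRED SCORE:\s*(-?\d+(?:\.\d+)?)", re.MULTILINE)


def pred_scores(log_text):
    """Extract per-sentence prediction scores from a verbose log."""
    return [float(s) for s in PRED_SCORE.findall(log_text)]


def scores_match(log_a, log_b, tol=1e-4):
    """True if both logs hold the same number of scores, equal within tol."""
    a, b = pred_scores(log_a), pred_scores(log_b)
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


# Scores for sentence 1 at the default batch size, copied from the logs above.
slow_before_fix = "PRED SCORE: -0.5009\nGOLD SCORE: -178.0805\n"
fast = "PRED SCORE: -0.5435\nGOLD SCORE: -178.0805\n"
slow_after_fix = "PRED SCORE: -0.5435\nGOLD SCORE: -178.0805\n"

before_ok = scores_match(slow_before_fix, fast)
after_ok = scores_match(slow_after_fix, fast)
```

With the values from this thread, the slow and fast scores disagree before the fix and agree after it.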

Scores on 03b3539 (this PR)

Slow baseline

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time
PRED AVG SCORE: -0.4390, PRED PPL: 1.5512
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 21.382824
Average translation time (s): 0.042766
Tokens per second: 564.705580

Slow summary cov

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -coverage_penalty summary -beta 2
PRED AVG SCORE: -1.5999, PRED PPL: 4.9525
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 21.606813
Average translation time (s): 0.043214
Tokens per second: 550.289388

Slow summary cov stepwise

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -coverage_penalty summary -beta 2 --stepwise_penalty
PRED AVG SCORE: -1.5862, PRED PPL: 4.8854
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 22.788975
Average translation time (s): 0.045578
Tokens per second: 507.131184

Slow length avg penalty

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -length_penalty avg
PRED AVG SCORE: -0.0164, PRED PPL: 1.0165
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 21.549666
Average translation time (s): 0.043099
Tokens per second: 566.226875

Slow length wu penalty

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -length_penalty wu -alpha 0.7
PRED AVG SCORE: -0.1314, PRED PPL: 1.1404
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 22.088348
Average translation time (s): 0.044177
Tokens per second: 550.652325

Fast baseline

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -fast
PRED AVG SCORE: -0.4390, PRED PPL: 1.5512
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 9.145718
Average translation time (s): 0.018291
Tokens per second: 1320.290018

Fast length avg penalty

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -fast -length_penalty avg
PRED AVG SCORE: -0.0164, PRED PPL: 1.0165
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 9.202680
Average translation time (s): 0.018405
Tokens per second: 1325.918155

Fast length wu penalty

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -fast -length_penalty wu -alpha 0.7
PRED AVG SCORE: -0.1314, PRED PPL: 1.1404
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 9.181109
Average translation time (s): 0.018362
Tokens per second: 1324.785533

Fast summary cov

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -coverage_penalty summary -beta 2 -fast
PRED AVG SCORE: -1.5999, PRED PPL: 4.9525
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 9.390767
Average translation time (s): 0.018782
Tokens per second: 1266.137215

Fast summary cov stepwise

python translate.py -model /home/dylan/Downloads/averaged-10-epoch.pt -src /home/dylan/Downloads/test_short.en -tgt /home/dylan/Downloads/test_short.de -verbose -output pred.txt -gpu 0 -report_time -coverage_penalty summary -beta 2 -fast -stepwise_penalty
PRED AVG SCORE: -1.5862, PRED PPL: 4.8854
GOLD AVG SCORE: -11.2287, GOLD PPL: 75260.5316
Total translation time (s): 9.198415
Average translation time (s): 0.018397
Tokens per second: 1256.412072
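
To put the timing claims in numbers, here is a small calculation over the baseline runs reported above. The dictionary keys are just labels chosen for this sketch; the times are copied from the "Total translation time" lines.

```python
# Total translation time in seconds for the baseline runs in this thread.
times = {
    ("master", "slow"): 40.024521,
    ("master", "fast"): 9.512364,
    ("pr", "slow"): 21.382824,
    ("pr", "fast"): 9.145718,
}

# How much faster the slow path became with this PR.
slow_speedup = times[("master", "slow")] / times[("pr", "slow")]

# How much slower the slow path still is than the fast path after this PR.
remaining_gap = times[("pr", "slow")] / times[("pr", "fast")]

print(f"slow speedup: {slow_speedup:.2f}x, remaining gap: {remaining_gap:.2f}x")
```

These are single runs on one machine, so the ratios are indicative rather than a benchmark.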

@flauted changed the title from "[WIP] Implement coverage penalty in fast beam search" to "Implement coverage penalty in fast beam search" on Feb 9, 2019

vince62s commented Feb 9, 2019

Many thanks for this.

@vince62s merged commit 55bc871 into OpenNMT:master on Feb 9, 2019
ItaySofer pushed a commit to ItaySofer/OpenNMT-py that referenced this pull request on Mar 17, 2019:
* Test length penalty.
* Fix OpenNMT#994 - now seems length penalty scores are consistent for fast and slow.
* Get matching summary cov score on fast.
* Get stepwise coverage penalty scores matching.
* Better document beam search.

Successfully merging this pull request may close these issues.

Batch_size dependent beam search