Fix transformer_moe model has wrong logic in pre/postprocessing #1233
The transformer_moe model has a bug in its pre/postprocessing logic that prevents the training loss from decreasing and makes decoding always generate empty output.
By comparing the logic with transformer.py and common_attention.py, I found that 'dp_postprocess' should receive the input 'x' from before it is passed through 'dp_preprocess', so that the residual wiring matches the transformer model. I changed the logic accordingly in this commit and ran test data to confirm that the training loss decreases and decoding generates correct results (see the sketch below).
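For reference, a minimal sketch of the residual wiring the fix restores. The functions below are stand-ins, not the actual tensor2tensor code; only the call order of dp_preprocess/dp_postprocess follows the description above.

```python
# Minimal sketch (assumed stand-ins, not the actual T2T implementation) of the
# pre/postprocess residual pattern in transformer.py that this fix restores.
import numpy as np

def dp_preprocess(x):
    # Stand-in for layer_preprocess (e.g. layer norm before the sublayer).
    return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + 1e-6)

def dp_postprocess(x, y):
    # Stand-in for layer_postprocess: residual connection around the sublayer.
    return x + y

def sublayer(x):
    # Stand-in for the attention / MoE / feed-forward sublayer.
    return 0.5 * x

x = np.random.randn(2, 4, 8)

# Buggy wiring: the residual is taken from the *preprocessed* tensor, so the
# normalized value, rather than the running hidden state, is carried forward.
x_pre = dp_preprocess(x)
buggy = dp_postprocess(x_pre, sublayer(x_pre))

# Fixed wiring (matches transformer.py): preprocess feeds only the sublayer,
# while dp_postprocess receives the original x for the residual connection.
fixed = dp_postprocess(x, sublayer(dp_preprocess(x)))
```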
Unit Testing Result:
Before Fix:

After Fix:
