This repository was archived by the owner on Jul 7, 2023. It is now read-only.

Avoid error in beam search when "f" is in cache #1302

Merged
merged 4 commits into from
Jan 9, 2019

Conversation

aeloyq
Contributor

@aeloyq aeloyq commented Dec 14, 2018

If ffn_layer is "dense_relu_dense" or "conv_hidden_relu", then the cache entry "f" won't be used,
which means that the shape of cache["f"] won't be changed to [beam_size * batch_size, decode_length, hparams.hidden_size], and this may cause an error when applying the nest.map reshape function to it.

This only happens in eager execution mode, because this part of the graph may not be executed in non-eager (graph) mode.
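To illustrate the failure mode and the fix, here is a minimal, hypothetical sketch (not the actual tensor2tensor code, and using NumPy plus a tiny stand-in for `nest.map_structure`): a stale "f" entry with an un-tiled shape is dropped from the cache before beam search maps a reshape over every cached tensor. The names `build_cache` and `map_structure` are illustrative assumptions.

```python
# Hypothetical sketch of the PR's fix, not the real tensor2tensor code.
import numpy as np

def map_structure(fn, structure):
    # Minimal stand-in for nest.map_structure: apply fn to every
    # array leaf in a nested dict, as beam search does to the cache.
    if isinstance(structure, dict):
        return {k: map_structure(fn, v) for k, v in structure.items()}
    return fn(structure)

def build_cache(ffn_layer, batch_size, hidden_size):
    # Per-layer decoder cache; "f" is only written to by some ffn layers.
    cache = {
        "layer_0": {
            "k": np.zeros((batch_size, 0, hidden_size)),
            "v": np.zeros((batch_size, 0, hidden_size)),
            # "f" keeps a stale shape when the ffn layer never updates it.
            "f": np.zeros((0, 0, hidden_size)),
        }
    }
    if ffn_layer in ["dense_relu_dense", "conv_hidden_relu"]:
        # These ffn layers never use "f", so drop it to keep the
        # beam-search reshape over the cache from hitting a bad shape.
        for layer in cache.values():
            layer.pop("f", None)
    return cache

beam_size = 4
cache = build_cache("dense_relu_dense", batch_size=2, hidden_size=8)
# Beam search tiles every remaining cache tensor along the batch dim.
tiled = map_structure(lambda t: np.tile(t, [beam_size, 1, 1]), cache)
```

With "f" removed, the tiling succeeds and each surviving entry ends up with a leading dimension of `beam_size * batch_size`.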

@googlebot googlebot added the cla: yes PR author has signed CLA label Dec 14, 2018
Contributor

@lukaszkaiser lukaszkaiser left a comment


Looks good, thanks!

@lukaszkaiser lukaszkaiser merged commit 7ba9f5a into tensorflow:master Jan 9, 2019
tensorflow-copybara pushed a commit that referenced this pull request Jan 9, 2019
PiperOrigin-RevId: 228581110
kpe pushed a commit to kpe/tensor2tensor that referenced this pull request Mar 2, 2019
* add caching mechanism support for fast decoding with relative_dot_product in transformer model

* fix typo

* remove f in cache if use dense_relu_dense or conv_hidden_relu so that errors won't occur in beamsearch (nest.map function on cache)
kpe pushed a commit to kpe/tensor2tensor that referenced this pull request Mar 2, 2019
PiperOrigin-RevId: 228581110