Closed
Hi there, I have a question about calculating `dp_mask` for `x_t` and `m_dp_mask` for `m_t` in your GRU-D implementation (file `gru_d.py`).
First, `dp_mask` is generated by the built-in `GRUCell` function `get_dropout_mask_for_cell`: code
Then, the dropout mask `m_dp_mask` for the masking vector `m_t` is generated by calling `_generate_dropout_mask`: code
As a result, `dp_mask` and `m_dp_mask` zero out different elements of the two inputs `x_t` and `m_t`. I can reproduce your results; however, I think the dropout mask should be the same for `x_t` and `m_t`. Can you please clarify this for me? Did I misunderstand something in the core TensorFlow implementation or in your implementation?
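To illustrate what I mean, here is a minimal NumPy sketch. It is not the code from `gru_d.py` or from TensorFlow: the helper `make_dropout_mask`, the shapes, and the dropout rate are assumptions made up for this example. It only shows that two independently sampled masks drop different positions, whereas one shared mask drops the same positions in both inputs.

```python
import numpy as np


def make_dropout_mask(rng, shape, rate):
    """Hypothetical helper: inverted-dropout mask.

    Entries are 0 where a unit is dropped and 1 / (1 - rate) where it is kept.
    """
    keep = rng.random(shape) >= rate
    return keep.astype(np.float32) / (1.0 - rate)


rng = np.random.default_rng(0)
rate = 0.5                                  # assumed dropout rate
x_t = np.ones((4, 16), dtype=np.float32)    # toy input values
m_t = np.ones((4, 16), dtype=np.float32)    # toy masking vector

# Case 1: two masks sampled independently, as I understand the current code.
dp_mask = make_dropout_mask(rng, x_t.shape, rate)
m_dp_mask = make_dropout_mask(rng, m_t.shape, rate)
mismatch_independent = int(np.sum((dp_mask == 0) != (m_dp_mask == 0)))

# Case 2: one mask shared by both inputs, which is what I expected.
shared_mask = make_dropout_mask(rng, x_t.shape, rate)
x_dropped = x_t * shared_mask
m_dropped = m_t * shared_mask
mismatch_shared = int(np.sum((x_dropped == 0) != (m_dropped == 0)))

print("positions dropped in only one input (independent masks):", mismatch_independent)
print("positions dropped in only one input (shared mask):", mismatch_shared)
```

With independent masks, a value in `x_t` can be dropped while its entry in `m_t` is kept (or the other way around), which is the behaviour I am asking about.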
Thanks for the great work!