Closed
Description
Hi there, I have a question about how dp_mask (for x_t) and m_dp_mask (for m_t) are computed in your GRU-D implementation (file gru_d.py).
First, dp_mask is generated by the built-in GRUCell function get_dropout_mask_for_cell: code
Then, the dropout mask m_dp_mask for the masking vector m_t is generated by a separate call to _generate_dropout_mask: code
Because the two masks are sampled independently, dp_mask and m_dp_mask zero out different elements of the two inputs x_t and m_t. I can reproduce your results; however, I think the dropout mask should be the same for x_t and m_t, so that a feature dropped from x_t is also dropped from m_t. Could you please clarify this for me? Did I misunderstand something in the core TensorFlow implementation or in your implementation?
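To make the question concrete, here is a minimal NumPy sketch of the two behaviours I am comparing. It does not call get_dropout_mask_for_cell or _generate_dropout_mask. The helper make_dropout_mask, the dropout rate, and the toy shapes are all hypothetical stand-ins, and I am assuming the usual inverted-dropout scaling of 1 / (1 - rate).

```python
import numpy as np


def make_dropout_mask(shape, rate, rng):
    """Hypothetical stand-in for a dropout-mask generator.

    Returns 0 where a unit is dropped and 1 / (1 - rate) where it is kept
    (inverted-dropout scaling, assumed here for illustration).
    """
    keep = rng.random(shape) >= rate
    return keep.astype(np.float32) / (1.0 - rate)


rng = np.random.default_rng(0)
rate = 0.5

# Toy inputs: x_t holds the features, m_t the masking vector of the same shape.
x_t = np.ones((2, 6), dtype=np.float32)
m_t = np.ones((2, 6), dtype=np.float32)

# Behaviour described above: two independently sampled masks.
dp_mask = make_dropout_mask(x_t.shape, rate, rng)
m_dp_mask = make_dropout_mask(m_t.shape, rate, rng)
x_indep = x_t * dp_mask
m_indep = m_t * m_dp_mask

# Behaviour I expected: one mask reused for both inputs.
x_shared = x_t * dp_mask
m_shared = m_t * dp_mask

# With a shared mask, the dropped positions always coincide.
same_positions_shared = np.array_equal(x_shared == 0, m_shared == 0)
# With independent masks, they generally do not.
same_positions_indep = np.array_equal(x_indep == 0, m_indep == 0)

print("shared mask, same dropped positions:", same_positions_shared)
print("independent masks, same dropped positions:", same_positions_indep)
```

With the shared mask, a feature that is dropped from x_t has its entry in m_t dropped as well; with independent masks, the model can see m_t for a feature whose x_t was zeroed out, and vice versa.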
Thanks for the great work!