Question about GRU-D implementation #1

Closed
@ducnx

Hi there, I have a question about how dp_mask (for x_t) and m_dp_mask (for m_t) are calculated in your GRU-D implementation (file gru_d.py).

First, dp_mask is generated by the GRUCell built-in function get_dropout_mask_for_cell: code
Then, the dropout mask m_dp_mask for the masking vector m_t is generated by calling _generate_dropout_mask: code
As a result, dp_mask and m_dp_mask zero out different elements of the two inputs x_t and m_t. I can reproduce your result; however, I think the dropout masks should be the same for x_t and m_t. Could you please clarify this for me? Did I misunderstand something in the core TensorFlow implementation or in your implementation?
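
To make the distinction concrete, here is a minimal sketch in plain Python. It is not the repository's code and does not use TensorFlow; `bernoulli_mask` and `apply_mask` are hypothetical helpers, and the input values and dropout rate are made up for illustration. It contrasts two independently sampled masks (what I believe the current code does) with a single mask shared by x_t and m_t (what I expected).

```python
import random


def bernoulli_mask(n, rate, rng):
    """Hypothetical helper: inverted-dropout mask of length n.

    Each entry is 0.0 with probability `rate`, otherwise 1 / (1 - rate).
    """
    keep = 1.0 - rate
    return [0.0 if rng.random() < rate else 1.0 / keep for _ in range(n)]


def apply_mask(values, mask):
    """Hypothetical helper: element-wise product of a vector and a mask."""
    return [v * m for v, m in zip(values, mask)]


rng = random.Random(0)
rate = 0.5

# Illustrative inputs: x_t holds feature values, m_t marks observed entries.
x_t = [0.5, -1.2, 0.0, 2.3, 0.7, -0.4]
m_t = [1.0, 1.0, 0.0, 1.0, 1.0, 1.0]

# Variant A: two independently sampled masks (assumed current behaviour).
dp_mask = bernoulli_mask(len(x_t), rate, rng)
m_dp_mask = bernoulli_mask(len(m_t), rate, rng)
x_indep = apply_mask(x_t, dp_mask)
m_indep = apply_mask(m_t, m_dp_mask)

# Positions where one input is dropped but the other is kept.
mismatch = [
    i for i, (a, b) in enumerate(zip(dp_mask, m_dp_mask))
    if (a == 0.0) != (b == 0.0)
]

# Variant B: one mask shared by both inputs (what I expected).
shared = bernoulli_mask(len(x_t), rate, rng)
x_shared = apply_mask(x_t, shared)
m_shared = apply_mask(m_t, shared)

print("independent masks disagree at positions:", mismatch)
print("shared mask:", shared)
```

With independent masks, a feature value in x_t can be kept while its observation indicator in m_t is dropped (or the reverse); with a shared mask, both are always dropped together.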

Thanks for the great work!
