RNN.call should get initial state from full input spec #10845
Conversation
LGTM, thanks
This bugfix doesn't work with ConvLSTMs with initialized states. I think it's because ConvLSTM initializes its states differently than other LSTMs. Instead of taking two inputs, (input, init_state=init_state), …
Hi @mikejhuang, I have the same issue with ConvLSTM. How did you solve your problem?
@JunHyungYu To get it to work, you pass a single list containing the input and the initial states ([input, init_state_h, init_state_c]).
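A minimal sketch of this workaround (shown with `tf.keras` for illustration; the thread used standalone Keras 2.2.4, and all shapes below are made up):

```python
import numpy as np
from tensorflow.keras.layers import Input, ConvLSTM2D
from tensorflow.keras.models import Model

# Hypothetical shapes, just for illustration.
seq = Input(shape=(5, 16, 16, 3))    # (timesteps, rows, cols, channels)
init_h = Input(shape=(16, 16, 8))    # initial hidden state
init_c = Input(shape=(16, 16, 8))    # initial cell state

# Workaround from the thread: pass the states inside one input list
# instead of using the initial_state keyword argument.
out = ConvLSTM2D(filters=8, kernel_size=(3, 3), padding='same')(
    [seq, init_h, init_c])

model = Model([seq, init_h, init_c], out)
y = model.predict([np.zeros((2, 5, 16, 16, 3)),
                   np.zeros((2, 16, 16, 8)),
                   np.zeros((2, 16, 16, 8))])
print(y.shape)
```

With `padding='same'` and the default `return_sequences=False`, the output keeps the spatial shape and the filter count, i.e. `(2, 16, 16, 8)` for this dummy batch of two.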
@mikejhuang But when I train this multi-GPU model, I get the same error: InvalidArgumentError: Incompatible shapes: [8,32,224,224] vs. [16,32,224,224]. Can you train your ConvLSTM with initial_state using a multi-GPU model?
Nope, I haven't worked it out. Let me know if you find out.
…On Wed, May 1, 2019 at 2:49 AM JunHyungYu wrote:
> @mikejhuang If I use convlstm()([input, init_state_h, init_state_c]), I can build the multi-GPU model, but when I train it I get the same error. Can you train your ConvLSTM with initial_state using a multi-GPU model?
@mikejhuang Oh my god... thank you!
@mikejhuang, @JunHyungYu Have any of you managed to find a workaround for the issue? I'm struggling with it at the moment. :(
Same issue with ConvLSTM2D on the latest Keras (2.2.4) and TensorFlow 1.13.
It only happens when I use this combination:
It works when I remove the dropout on this layer, but that results in less efficient learning. :/
Summary
This PR fixes a critical bug in `RNN` reported at #9449 and #10830.

In `RNN.call`, if `initial_state` is a tensor that was returned by a Keras layer, we should get `initial_state` from the full input spec (including the training data, states, and constants) that was assembled in `RNN.__call__`, because that spec may be copied to multiple GPUs. Otherwise `RNN.call` uses the original `initial_states`, which is not sliced according to the number of GPUs.

By the way, I have also checked `CuDNNRNN`; it already takes the states from the full input spec, so it does not need to be modified. I ran the following test code on a machine with 2 GPUs, and it works well after this fix.
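The author's actual test code is not reproduced in this excerpt. As a purely illustrative sketch of the shape arithmetic behind the bug (not the author's code; shapes are scaled down from the reported [8,32,224,224] vs [16,32,224,224] error): `multi_gpu_model` slices each training batch across GPUs, but before this fix `RNN.call` kept the full-batch `initial_state`.

```python
import numpy as np

def rnn_step(sub_batch, init_state):
    """Stand-in for one per-GPU RNN step: it only checks that the
    batch dimensions of input and state agree, as TensorFlow would."""
    if sub_batch.shape[0] != init_state.shape[0]:
        raise ValueError("Incompatible shapes: %s vs %s"
                         % (list(sub_batch.shape), list(init_state.shape)))
    return sub_batch + init_state  # dummy computation

batch = np.zeros((16, 32, 4, 4))   # full batch of 16 samples
state = np.zeros((16, 32, 4, 4))   # full-batch initial state
sub_batches = np.split(batch, 2)   # multi_gpu_model: 8 samples per GPU

# Buggy behaviour: a full-batch state meets an 8-sample sub-batch.
try:
    rnn_step(sub_batches[0], state)
except ValueError as e:
    print(e)   # Incompatible shapes: [8, 32, 4, 4] vs [16, 32, 4, 4]

# Fixed behaviour: the state comes from the sliced input spec too.
sub_states = np.split(state, 2)
out = rnn_step(sub_batches[0], sub_states[0])
print(out.shape)   # (8, 32, 4, 4)
```

Taking the state from the full input spec means it goes through the same per-GPU slicing as the training data, so the batch dimensions always match.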
Related Issues
#9449
#10830
PR Overview