Attempt to infer the output shape for Deconvolution2D layer #4
base: master
Conversation
Cool. Let me know when you're happy with it and I'll merge (also note the conflicts).
It works for TF in all cases, but it fails early for Theano, and as I've checked, it fails for Theano even if I roll back to your latest commit before my changes, 0742daa (see the error log below).
Weird... the tests seem to pass here.
…gal-master Conflicts: keras/layers/convolutional.py
…er and actual kernel shape - evaluate kernel shape at runtime. BUG: the inferred output shape is a tuple of Theano variables/expressions, while Theano expects a tuple of ints - not fixed!
@@ -1052,7 +1052,15 @@ def deconv2d(x, kernel, output_shape, strides=(1, 1),
     kernel = kernel.dimshuffle((1, 0, 2, 3))
     th_border_mode = _preprocess_border_mode(border_mode)
     np_kernel = kernel.eval()
-    filter_shape = _preprocess_filter_shape(dim_ordering, filter_shape)
+    filter_shape = _preprocess_filter_shape(dim_ordering, shape(kernel))
I think there was an inconsistency here: we receive filter_shape as a parameter, then reshape the kernel itself (so its shape should change), but leave filter_shape intact and pass it on to AbstractConv2d_gradInputs.
We could probably remove filter_shape from the parameters entirely, if there is no other reason to keep it.
     for v in output_shape:
         try:
-            v = v.eval()
+            _v = v.eval()
Here I hit what is probably the same issue that @yaringal mentioned: if we infer output_shape, it becomes a tuple of Theano variables/expressions, which Theano can't accept as an input for some reason.
So I try to evaluate them, but then I get another error:
MissingInputError: An input of the graph, used to compute Shape(input_25), was not provided and not given a value. Use the Theano flag exception_verbosity='high' for more information on this error.
I'm not an expert in Theano, so I have no idea how to fix it quickly.
Now it passes the TF tests and the Theano tests where output_shape is not inferred. Still need to fix the evaluation of Theano expressions in output_shape.
@fchollet any suggestions for the Theano part?
Passing K.eval(K.shape(x)) instead of K.shape(x) doesn't help either.
@lukovkin my deconv PR has been merged into fchollet/keras. You can open a new PR in fchollet/keras if you want to merge these changes.
A proper solution would be to do the shape inference yourself, without relying on Theano. Basically, do Theano's job in Python-land.
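For a transposed convolution, that Python-land inference is straightforward per spatial dimension. A minimal sketch, assuming Keras-style 'valid'/'same' border modes; the function name `deconv_length` and its signature are illustrative, not code from this PR:

```python
def deconv_length(input_len, stride, kernel_size, border_mode):
    """Infer one spatial dimension of a transposed-convolution output
    using plain Python ints, so no symbolic evaluation is needed."""
    if border_mode == 'valid':
        # Inverse of 'valid' conv: out = (in - kernel) // stride + 1
        return (input_len - 1) * stride + kernel_size
    elif border_mode == 'same':
        # Inverse of 'same' conv: out = ceil(in / stride)
        return input_len * stride
    raise ValueError('Unknown border mode: %s' % border_mode)

print(deconv_length(3, 2, 3, 'valid'))  # -> 7
print(deconv_length(4, 2, 3, 'same'))   # -> 8
```

Since everything is an int, the result can be passed directly wherever Theano expects a tuple of ints, sidestepping the MissingInputError above.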
I've used the same approach that I used in the 1D convolution transpose implementation here: https://github.com/lukovkin/ufcnn-keras/blob/master/models/convolutional_transpose.py#L11-L23.
It was also inspired by the output shape calculation in Caffe: https://github.com/BVLC/caffe/blob/master/src/caffe/layers/deconv_layer.cpp#L14-L21 (we don't account for the dilated case for now, AFAIK).
It looks like it passes the tests, at least for TensorFlow, but I will play with it a bit more.
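One way to sanity-check a Caffe-style formula like that is the round trip: a forward 'valid' convolution applied to the inferred deconv output size should recover the original input size. A small sketch under the no-padding, no-dilation assumption; the function names are illustrative, not from the PR:

```python
def conv_length(input_len, stride, kernel_size):
    # Standard 'valid' convolution output length.
    return (input_len - kernel_size) // stride + 1

def deconv_length(input_len, stride, kernel_size):
    # Caffe-style inverse: stride * (in - 1) + kernel (no padding, no dilation).
    return stride * (input_len - 1) + kernel_size

# Round trip: conv(deconv(n)) == n for every configuration tried.
for n in range(1, 20):
    for stride in (1, 2, 3):
        for kernel in (1, 3, 5):
            out = conv_length(deconv_length(n, stride, kernel), stride, kernel)
            assert out == n, (n, stride, kernel, out)
```

The identity holds algebraically: (stride * (n - 1) + kernel - kernel) // stride + 1 = n, which is why the forward conv's shape formula can simply be inverted here.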