output shape deconv2d #3824

Closed
3 tasks done
tboquet opened this issue Sep 20, 2016 · 9 comments

@tboquet
Contributor

tboquet commented Sep 20, 2016

  • Check that you are up-to-date with the master branch of Keras. You can update with:
    pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps
  • If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with:
    pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps
  • Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).

Hi all,

I wonder whether get_output_shape_for in Deconvolution2D is correct in all cases.
For some filter shapes and border modes it differs from the true output shape. I just started using deconvolution, so my understanding of the arithmetic, of the implementations in Theano/TensorFlow, and of how shape inference works in Keras may be wrong, but the behaviour does not seem to match.
#3540 is also related, and maybe you guys @EderSantana, @lukovkin and @dolaameng can help!

With theano:

import os
os.environ["THEANO_FLAGS"] = "mode=FAST_COMPILE,device=gpu,floatX=float32"
import theano
import numpy as np
from keras.layers import Deconvolution2D
from keras.layers import Input
from keras.models import Model
# assumed location of the helper used by get_output_shape_for in Keras 1.x
from keras.utils.np_utils import conv_input_length

for f, shape, inpsize, sub, mode in [(3, 7, 8, 1, 'same'),
                                    (4, 7, 8, 1, 'same'),
                                    (4, 15, 8, 2, 'same'),
                                    (4, 14, 15, 1, 'same'),
                                    (4, 32, 15, 2, 'valid'),
                                    (5, 32, 32, 1, 'same')]:
    print('Filter size: {}, Output height/width shape: {}, Inp size: {}, Mode: {}'.format(f, shape, inpsize, mode))
    deconv_test = Deconvolution2D(10, f, f, 
                                (10, 10, shape, shape),
                                subsample=(sub, sub),
                                border_mode=mode)

    batch_shape_1 = (10, 10, inpsize, inpsize)

    inp = Input(batch_shape=batch_shape_1)
    result = deconv_test(inp)
    print('Shape inference: ', deconv_test.get_output_shape_for(batch_shape_1))
    print('Modified shape inference filter height/width: ', conv_input_length(inpsize, f, mode, sub))

    matrix = np.ones(batch_shape_1, dtype=np.float32)

    model = Model(inp, result)
    results = model.predict(matrix)
    print('True shape: ', results.shape)
    print()

Output GPU:

Filter size: 3, Output height/width shape: 7, Inp size: 8, Mode: same
Shape inference:  (10, 10, 8, 8)
True shape:  (10, 10, 7, 7)

Filter size: 4, Output height/width shape: 7, Inp size: 8, Mode: same
Shape inference:  (10, 10, 7, 7)
True shape:  (10, 10, 7, 7)

Filter size: 4, Output height/width shape: 15, Inp size: 8, Mode: same
Shape inference:  (10, 10, 14, 14)
True shape:  (10, 10, 4, 4)

Filter size: 4, Output height/width shape: 14, Inp size: 15, Mode: same
Shape inference:  (10, 10, 14, 14)
True shape:  (10, 10, 14, 14)

Filter size: 4, Output height/width shape: 32, Inp size: 15, Mode: valid
Shape inference:  (10, 10, 32, 32)
True shape:  (10, 10, 32, 32)

Filter size: 5, Output height/width shape: 32, Inp size: 32, Mode: same
Shape inference:  (10, 10, 32, 32)
True shape:  (10, 10, 32, 32)

Output CPU:

Filter size: 3, Output height/width shape: 7, Inp size: 8, Mode: same
Shape inference:  (10, 10, 8, 8)
True shape:  (10, 10, 8, 8)

Filter size: 4, Output height/width shape: 7, Inp size: 8, Mode: same
Shape inference:  (10, 10, 7, 7)
True shape:  (10, 10, 7, 7)

Filter size: 4, Output height/width shape: 15, Inp size: 8, Mode: same
Shape inference:  (10, 10, 14, 14)
True shape:  (10, 10, 4, 4)

Filter size: 4, Output height/width shape: 14, Inp size: 15, Mode: same
Shape inference:  (10, 10, 14, 14)
True shape:  (10, 10, 14, 14)

Filter size: 4, Output height/width shape: 32, Inp size: 15, Mode: valid
Shape inference:  (10, 10, 32, 32)
True shape:  (10, 10, 32, 32)

Filter size: 5, Output height/width shape: 32, Inp size: 32, Mode: same
Shape inference:  (10, 10, 32, 32)
True shape:  (10, 10, 32, 32)

With TensorFlow:

import numpy as np
from keras.layers import Deconvolution2D
from keras.layers import Input
from keras.models import Model

for f, shape, inpsize, sub, mode in [(3, 8, 8, 1, 'same'),
                                    (4, 8, 8, 1, 'same'),
                                    (4, 15, 8, 2, 'same'),
                                    (4, 15, 15, 1, 'same'),
                                    (4, 32, 15, 2, 'valid'),
                                    (5, 32, 32, 1, 'same')]:
    print('Filter size: {}, Output height/width shape: {}, Inp size: {}, Mode: {}'.format(f, shape, inpsize, mode))
    deconv_test = Deconvolution2D(10, f, f, 
                                (10, shape, shape, 10),
                                subsample=(sub, sub),
                                border_mode=mode)

    batch_shape_1 = (10, inpsize, inpsize, 10)

    inp = Input(batch_shape=batch_shape_1)
    result = deconv_test(inp)
    print('Shape inference: ', deconv_test.get_output_shape_for(batch_shape_1))

    matrix = np.ones(batch_shape_1, dtype=np.float32)

    model = Model(inp, result)
    results = model.predict(matrix)
    print('True shape: ', results.shape)
    print()

Output GPU:

Filter size: 3, Output height/width shape: 8, Inp size: 8, Mode: same
Shape inference:  (10, 8, 8, 10)
True shape:  (10, 8, 8, 10)

Filter size: 4, Output height/width shape: 8, Inp size: 8, Mode: same
Shape inference:  (10, 7, 7, 10)
True shape:  (10, 8, 8, 10)

Filter size: 4, Output height/width shape: 15, Inp size: 8, Mode: same
Shape inference:  (10, 14, 14, 10)
True shape:  (10, 15, 15, 10)

Filter size: 4, Output height/width shape: 15, Inp size: 15, Mode: same
Shape inference:  (10, 14, 14, 10)
True shape:  (10, 15, 15, 10)

Filter size: 4, Output height/width shape: 32, Inp size: 15, Mode: valid
Shape inference:  (10, 32, 32, 10)
True shape:  (10, 32, 32, 10)

Filter size: 5, Output height/width shape: 32, Inp size: 32, Mode: same
Shape inference:  (10, 32, 32, 10)
True shape:  (10, 32, 32, 10)

I don't have a Tensorflow cpu setup so if someone is willing to test this on cpu it would be interesting to compare the results.

Tx!

@EderSantana
Contributor

In my previous tests a few months ago I could only get a deep deconv net to work on TensorFlow. I do think there might be some edge cases in the shape inference, especially on the Theano side.

I have started working on getting Spatial Transformers into the main branch, so maybe somebody else could look into this in the meantime?

@tboquet
Contributor Author

tboquet commented Sep 20, 2016

Since there seem to be many corner cases, and since we have to pass an output shape anyway, is it OK to just bypass this calculation?
If the behaviour later changes so that the output shape no longer needs to be passed, shape inference could be reintroduced at that point.

@dolaameng
Contributor

dolaameng commented Sep 21, 2016

@tboquet : I myself am struggling to understand the math behind Deconv2D, so I might be wrong. But here are some of my observations.

  1. get_output_shape_for() is based on conv_input_length(), which implements a special case for subsample=1, as explained by @lukovkin in Specifying a FCN based on VGG16 #3540. For subsample > 1, it seems that the implementation does not take into account the variable a discussed at the end of the paper. That a should be a user choice, because it accounts for the multiple output sizes that are possible for a single input size after a deconvolution. But I am not sure my understanding is correct.
  2. The output_shape for Deconv2D() is not a free choice; it should be calculated from the formula given in the docstring. For example, if filter_size=3, input_size=8, stride=1 and mode='same' (which implies pad=1), that gives output_size = stride*(input_size-1) + filter_size - 2*pad = 8. So in your Theano test case f, shape, inpsize, sub, mode = (3, 7, 8, 1, 'same'), the output_shape should actually be 8 rather than 7, to make the inferred shape consistent with the actual one (see the sketch after this list). Changing the size to 8 works with both the th and tf backends in my test.
  3. The same applies to your TensorFlow test case f, shape, inpsize, sub, mode = (4, 8, 8, 1, 'same'), where shape should be 7 rather than 8. It works with the th backend but raises a "shape mismatch" error on the tf backend in both 'cpu' and 'gpu' modes. I suspect it is related to mode="same" combined with image_dim_ordering="tf", because that is where the errors were raised most of the time when I tried to use the inferred output_shape as the parameter to Deconv2D. I am not even sure the TensorFlow implementation follows the same guide as Theano. But is this even a valid test, given that filter sizes are almost always odd? Shall we make that clear in the docstring?
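
To make point 2 concrete, here is a small sketch applying the docstring formula to a few of the (filter, output shape, input size, stride, mode) tuples from the scripts above. The padding rule pad = filter_size // 2 for border_mode='same' (and pad = 0 for 'valid') is my assumption, taken from the Lasagne-style helper mentioned later in this thread, not something spelled out here.

def expected_deconv_output(input_size, filter_size, stride, border_mode):
    # Docstring formula: output_size = stride * (input_size - 1) + filter_size - 2 * pad
    # Assumed padding rule: 'same' -> filter_size // 2, 'valid' -> 0
    pad = filter_size // 2 if border_mode == 'same' else 0
    return stride * (input_size - 1) + filter_size - 2 * pad

# A few cases from the Theano script above: (filter, requested output, input, stride, mode)
for f, shape, inpsize, sub, mode in [(3, 7, 8, 1, 'same'),
                                     (4, 7, 8, 1, 'same'),
                                     (5, 32, 32, 1, 'same')]:
    consistent = expected_deconv_output(inpsize, f, sub, mode)
    print('requested: {}, consistent with the formula: {}'.format(shape, consistent))
# Prints 8, 7 and 32: only the first case requests an output_shape the formula cannot produce.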

Any thoughts?

@EderSantana
Contributor

@yaringal wrote this. Would you check it out if you have some time?

@tboquet
Contributor Author

tboquet commented Sep 21, 2016

@dolaameng weird, the code I posted works with both TensorFlow and Theano on my side.

@dolaameng
Contributor

@tboquet: sorry for not being clear enough. What I meant was that with the tf backend, the setting f, shape, inpsize, sub, mode = (4, 7, 8, 1, 'same') (as in your Theano test code) threw an error.

@tboquet
Contributor Author

tboquet commented Sep 21, 2016

Ah yep! Let's wait for @yaringal's suggestions 😃!

@yaringal
Contributor

Apologies for not responding - I'm currently writing up and won't have time to help until the end of the month...

As @dolaameng said, the output shape is not arbitrary but is calculated from the input shape and the filter size / padding / stride. From memory, Theano did not have proper documentation for the output shape calculation, and I ended up borrowing one from Lasagne (as written in the docstring for conv_input_length).

The way we ended up using this layer in practice for our project was to feed a dummy input and observe the true output of the deconv layer. We then used the shape of that true output as the shape we passed to the deconv layer. Following this, we built a deep conv-deconv autoencoder with multiple layers. A proper way to do this would be to infer the shape following the implementation in Theano, but this is quite difficult because Theano expects a scalar input for the shape rather than a symbolic variable. @lukovkin started working on this in yaringal#4.
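
For reference, here is a minimal sketch of that dummy-input workaround, written against the same Keras 1.x API and 'th' dim ordering as the scripts earlier in the thread; the filter size and shapes are only illustrative placeholders.

import numpy as np
from keras.layers import Deconvolution2D, Input
from keras.models import Model

# Start from a guessed output_shape, run a dummy batch through the layer,
# and read off the shape that the backend actually produces.
batch_shape = (10, 10, 8, 8)           # (batch, channels, rows, cols), 'th' ordering
guessed_output_shape = (10, 10, 8, 8)  # guess for filter 3, stride 1, border_mode='same'

inp = Input(batch_shape=batch_shape)
out = Deconvolution2D(10, 3, 3, guessed_output_shape,
                      subsample=(1, 1), border_mode='same')(inp)
model = Model(inp, out)

true_shape = model.predict(np.ones(batch_shape, dtype=np.float32)).shape
print('observed output shape:', true_shape)
# If true_shape differs from the guess, rebuild the layer with true_shape so that
# shape inference and the actual computation agree.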

@tboquet
Contributor Author

tboquet commented Sep 23, 2016

@yaringal thank you for your help! I ended up doing the same shape inference job by manually checking the output shapes with Theano. I'll take a look at @lukovkin's work and open another PR for the shape inference.
