AdaptiveAveragePooling: Number of ways to split should evenly divide the split dimension #2321
Comments
/cc @Susmit-A for visibility. It seems that PyTorch uses overlapping kernels when the input dimension is not divisible by the output size. I wonder whether this is achievable with pure Python ops. https://colab.research.google.com/drive/1KMSD4nMv_-P59TsbSxQksndBZ6oNyyri?usp=sharing
This was done on purpose, to prevent unexpected behaviour for different input sizes, as well as to offer flexibility in input shape. The inputs can be clipped/padded as required before being passed to the layer. Since different models perform differently depending on whether data is clipped or padded, it's better to let the user decide the appropriate action. The PyTorch example uses overlapping windows (2 and 4 are accessed twice). I haven't yet found a way to do this in TensorFlow without an additional loop, which negatively impacts performance.
Hi @Susmit-A, I'd say we should try to support the non-divisible case. If not, is there any difference between adaptive and ordinary pooling? We can always infer the strides and kernel sizes, I guess.
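To make the inferred-strides idea concrete, here is a minimal NumPy sketch (the function name and the stride/kernel formulas are my own illustration, not code from the PR): choose `stride = length // output_size` and pick the kernel size so the last window ends exactly at the input's edge. All windows then have equal size and can be stacked and pooled in parallel, at the cost of covering slightly different elements than PyTorch does.

```python
import numpy as np

def inferred_pool1d(x, output_size):
    """Equal-sized, possibly overlapping windows inferred from the shapes.

    stride = length // output_size; the kernel is sized so the last
    window ends at the input's edge. Because every window has the same
    size, they can be stacked and reduced in one vectorized op.
    """
    length = x.shape[-1]
    stride = length // output_size
    kernel = length - (output_size - 1) * stride
    windows = [x[i * stride : i * stride + kernel] for i in range(output_size)]
    return np.stack(windows).mean(axis=-1)

x = np.array([1., 2., 3., 4., 5.])
print(inferred_pool1d(x, 3))  # -> [2. 3. 4.]
```

For input length 5 and output size 3 this yields windows [1,2,3], [2,3,4], [3,4,5], i.e. [2., 3., 4.], whereas PyTorch produces [1.5, 3., 4.5]; this is exactly the one-element discrepancy discussed below.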
The way PyTorch accomplishes this is well explained here: https://discuss.pytorch.org/t/what-is-adaptiveavgpool2d/26897 The issue is the rounding, instead of floor/ceil. This gives unequal-sized pieces that can't be sliced and stacked for parallel pooling; each piece needs to be sliced and processed individually. The method mentioned by @WindQAQ does work, but the output is not the same as PyTorch's, since the kernel covers one value fewer or one value more than in PyTorch (in 1D arrays; for 2D images, it's one pixel in each dimension). If that is not an issue, I can implement it in that manner. We can also give an option to iteratively compute the pooled values to obtain the exact values that PyTorch provides, if the user requires it.
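As a reference point, the indexing rule described in the linked post can be sketched in pure NumPy (this is an illustration of PyTorch's window placement, not the TensorFlow Addons implementation): window `i` spans `[floor(i*L/out), ceil((i+1)*L/out))`, so for non-divisible sizes adjacent windows overlap and have unequal lengths.

```python
import math
import numpy as np

def adaptive_avg_pool1d(x, output_size):
    """Sketch of PyTorch-style adaptive average pooling in 1D.

    Window i covers x[floor(i*L/out) : ceil((i+1)*L/out)]. When L is not
    divisible by output_size the windows overlap and differ in length,
    so each one must be sliced and reduced individually (the loop that
    is hard to vectorize in TensorFlow).
    """
    length = x.shape[-1]
    out = np.empty(output_size)
    for i in range(output_size):
        start = (i * length) // output_size
        end = math.ceil((i + 1) * length / output_size)
        out[i] = x[start:end].mean()
    return out

x = np.array([1., 2., 3., 4., 5.])
print(adaptive_avg_pool1d(x, 3))  # windows [1,2], [2,3,4], [4,5] -> [1.5 3. 4.5]
```

Note how the middle window has length 3 while the outer two have length 2, and the values 2 and 4 are each read twice, matching the overlapping-window behaviour described above.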
For the divisible case, rounding up or down does not matter, which means the current adaptive pooling is just a variant of ordinary pooling: https://colab.research.google.com/drive/1GwoL4Sv2lRd_wp5eCtQogDnBQv7KounS?usp=sharing For the non-divisible case, I agree that it would be much slower to implement with pure Python ops. Let's first see whether the performance degradation is significant, and improve it later.
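The divisible-case equivalence is easy to verify with a small sketch (my own illustration, assuming exact divisibility): when the input length divides evenly by the output size, every window has the same size `length // output_size` with no overlap, so adaptive average pooling collapses to ordinary average pooling with kernel = stride, i.e. a reshape-and-mean.

```python
import numpy as np

def avg_pool_divisible(x, output_size):
    """Ordinary average pooling with kernel = stride = length // output_size.

    Assumes x.shape[-1] is divisible by output_size; in that case this is
    identical to adaptive average pooling, which is why rounding up vs.
    down only matters for non-divisible inputs.
    """
    k = x.shape[-1] // output_size
    return x.reshape(output_size, k).mean(axis=-1)

x = np.arange(6, dtype=np.float64)   # [0, 1, 2, 3, 4, 5]
print(avg_pool_divisible(x, 3))      # -> [0.5 2.5 4.5]
```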
I have created a PR with the updated version of adaptive pooling. However, it is an order of magnitude slower than the PyTorch version. Methods to improve performance need to be explored; specifically, the internal loop needs to be parallelized.
System information
Describe the bug
AdaptiveAveragePooling2D fails when the input dimension is not divisible by the pool size.
In contrast, the PyTorch version of adaptive pooling works well in the same case.
Code to reproduce the issue
Other info / logs