Skip to content

Add op (AvgPool) | feat (torchlib) #754

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Jul 19, 2023

Conversation

titaiwangms
Copy link
Contributor

@titaiwangms titaiwangms commented May 31, 2023

1D and 3D

Fix can save the op after opset 19, but not before it. So an xfail is created when ceil_mode=True

@titaiwangms titaiwangms added the module: torchlib Related to the torch/aten function lib in development label May 31, 2023
@titaiwangms titaiwangms marked this pull request as ready for review May 31, 2023 23:00
@titaiwangms
Copy link
Contributor Author

titaiwangms commented May 31, 2023

I am blocked by ceil_mode. It seems that the ceil_mode in ORT is not following ONNX spec:

import numpy as np
from onnxscript.function_libs.torch_lib.ops.nn import aten_avg_pool2d

x = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ]
).astype(np.float32)
kernal = [3, 3]
stride = [2, 2]

print(aten_avg_pool2d(x, kernel_size=kernal, stride=stride, ceil_mode=True))
# Tensor(array([[[[6., 5.], [8., 6.]]]], dtype=float32))

# https://github.com/onnx/onnx/blob/main/docs/Operators.md#averagepool
# The answer should be:
# np.array([[[[6, 7.5], [12, 13.5]]]]).astype(np.float32)

So the difference is that in ceil_mode, ORT tends to divide the value by kernal_size no matter the right padding exists or not.
But I don't think they would take this as repro..

@titaiwangms titaiwangms added the help wanted Extra attention is needed label May 31, 2023
@titaiwangms
Copy link
Contributor Author

I am blocked by ceil_mode. It seems that the ceil_mode in ORT is not following ONNX spec:

import numpy as np
from onnxscript.function_libs.torch_lib.ops.nn import aten_avg_pool2d

x = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ]
).astype(np.float32)
kernal = [3, 3]
stride = [2, 2]

print(aten_avg_pool2d(x, kernel_size=kernal, stride=stride, ceil_mode=True))
# Tensor(array([[[[6., 5.], [8., 6.]]]], dtype=float32))

# https://github.com/onnx/onnx/blob/main/docs/Operators.md#averagepool
# The answer should be:
# np.array([[[[6, 7.5], [12, 13.5]]]]).astype(np.float32)

So the difference is that in ceil_mode, ORT tends to divide the value by kernal_size no matter the right padding exists or not. But I don't think they would take this as repro..

Just found this issue is caused by "count_include_pad=True will pad the right side even if it's not padded". So the following code would be able to solve this:

x = np.array(
    [
        [
            [
                [1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12],
                [13, 14, 15, 16],
            ]
        ]
    ]
).astype(np.float32)

import onnxscript
from onnxscript.onnx_opset import opset18 as op

@onnxscript.script(default_opset=op)
def avg_pool(x):
    result = op.AveragePool(x, kernel_shape=[3,3], strides=[2,2], ceil_mode=True, count_include_pad=False)
    return result

print(avg_pool(x))

Although I still think this doesn't make sense.

@titaiwangms
Copy link
Contributor Author

However, the above doesn't solve the root cause which is that in ORT, the last slide of window tends to dividing the value with kernal size regardless the remain pixel+pads seems like a bug.

@titaiwangms
Copy link
Contributor Author

titaiwangms commented Jun 1, 2023

Not sure if it's a spec issue or implementation: onnx/onnx#5276

@titaiwangms
Copy link
Contributor Author

@codecov
Copy link

codecov bot commented Jul 13, 2023

Codecov Report

Merging #754 (5b5b715) into main (65d7c03) will decrease coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #754      +/-   ##
==========================================
- Coverage   76.67%   76.64%   -0.04%     
==========================================
  Files         112      112              
  Lines       13496    13508      +12     
  Branches     1365     1366       +1     
==========================================
+ Hits        10348    10353       +5     
- Misses       2809     2817       +8     
+ Partials      339      338       -1     
Impacted Files Coverage Δ
onnxscript/function_libs/torch_lib/ops/core.py 77.10% <ø> (+0.02%) ⬆️
onnxscript/function_libs/torch_lib/ops/nn.py 71.42% <100.00%> (+1.08%) ⬆️
...ipt/tests/function_libs/torch_lib/ops_test_data.py 96.78% <100.00%> (+0.01%) ⬆️

... and 1 file with indirect coverage changes

@titaiwangms titaiwangms removed the help wanted Extra attention is needed label Jul 18, 2023
@titaiwangms titaiwangms merged commit 3797447 into microsoft:main Jul 19, 2023
titaiwangms added a commit to pytorch/pytorch that referenced this pull request Jul 20, 2023
… shapes"


In #87892, to pick up the corner cases found in #71549, the PR falls back the implementation of AvgPool to the way opset 9 implementing. However, it introduces a regression on dynamic shape cases found in #101397. This PR refactors the AvgPool op with the same implementation we have in onnxscript: microsoft/onnxscript#754.

However, the corner case with `ceil_mode` remains unsolved in onnxruntime: microsoft/onnxruntime#16203. The calculuation on the last value of each dimension is different between ORT and PyTorch. But the fix can be proved in: microsoft/onnxruntime#16752, and it supports AvgPool since opset19.

[ghstack-poisoned]
titaiwangms added a commit to pytorch/pytorch that referenced this pull request Jul 20, 2023
In #87892, to pick up the corner cases found in #71549, the PR falls back the implementation of AvgPool to the way opset 9 implementing. However, it introduces a regression on dynamic shape cases found in #101397. This PR refactors the AvgPool op with the same implementation we have in onnxscript: microsoft/onnxscript#754.

However, the corner case with `ceil_mode` remains unsolved in onnxruntime: microsoft/onnxruntime#16203. The calculuation on the last value of each dimension is different between ORT and PyTorch. But the fix can be proved in: microsoft/onnxruntime#16752, and it supports AvgPool since opset19.

[ghstack-poisoned]
titaiwangms added a commit to pytorch/pytorch that referenced this pull request Jul 20, 2023
… shapes"


In #87892, to pick up the corner cases found in #71549, the PR falls back the implementation of AvgPool to the way opset 9 implementing. However, it introduces a regression on dynamic shape cases found in #101397. This PR refactors the AvgPool op with the same implementation we have in onnxscript: microsoft/onnxscript#754.

However, the corner case with `ceil_mode` remains unsolved in onnxruntime: microsoft/onnxruntime#16203. The calculuation on the last value of each dimension is different between ORT and PyTorch. But the fix can be proved in: microsoft/onnxruntime#16752, and it supports AvgPool since opset19.

[ghstack-poisoned]
titaiwangms added a commit to pytorch/pytorch that referenced this pull request Jul 20, 2023
In #87892, to pick up the corner cases found in #71549, the PR falls back the implementation of AvgPool to the way opset 9 implementing. However, it introduces a regression on dynamic shape cases found in #101397. This PR refactors the AvgPool op with the same implementation we have in onnxscript: microsoft/onnxscript#754.

However, the corner case with `ceil_mode` remains unsolved in onnxruntime: microsoft/onnxruntime#16203. The calculuation on the last value of each dimension is different between ORT and PyTorch. But the fix can be proved in: microsoft/onnxruntime#16752, and it supports AvgPool since opset19.

[ghstack-poisoned]
titaiwangms added a commit to pytorch/pytorch that referenced this pull request Jul 21, 2023
… shapes"


In #87892, to pick up the corner cases found in #71549, the PR falls back the implementation of AvgPool to the way opset 9 implementing. However, it introduces a regression on dynamic shape cases found in #101397. This PR refactors the AvgPool op with the same implementation we have in onnxscript: microsoft/onnxscript#754.

However, the corner case with `ceil_mode` remains unsolved in onnxruntime: microsoft/onnxruntime#16203. The calculuation on the last value of each dimension is different between ORT and PyTorch. But the fix can be proved in: microsoft/onnxruntime#16752, and it supports AvgPool since opset19.

[ghstack-poisoned]
titaiwangms added a commit to pytorch/pytorch that referenced this pull request Jul 21, 2023
In #87892, to pick up the corner cases found in #71549, the PR falls back the implementation of AvgPool to the way opset 9 implementing. However, it introduces a regression on dynamic shape cases found in #101397. This PR refactors the AvgPool op with the same implementation we have in onnxscript: microsoft/onnxscript#754.

However, the corner case with `ceil_mode` remains unsolved in onnxruntime: microsoft/onnxruntime#16203. The calculuation on the last value of each dimension is different between ORT and PyTorch. But the fix can be proved in: microsoft/onnxruntime#16752, and it supports AvgPool since opset19.

[ghstack-poisoned]
titaiwangms added a commit to pytorch/pytorch that referenced this pull request Jul 21, 2023
… shapes"


In #87892, to pick up the corner cases found in #71549, the PR falls back the implementation of AvgPool to the way opset 9 implementing. However, it introduces a regression on dynamic shape cases found in #101397. This PR refactors the AvgPool op with the same implementation we have in onnxscript: microsoft/onnxscript#754.

However, the corner case with `ceil_mode` remains unsolved in onnxruntime: microsoft/onnxruntime#16203. The calculuation on the last value of each dimension is different between ORT and PyTorch. But the fix can be proved in: microsoft/onnxruntime#16752, and it supports AvgPool since opset19.

[ghstack-poisoned]
titaiwangms added a commit to pytorch/pytorch that referenced this pull request Jul 21, 2023
In #87892, to pick up the corner cases found in #71549, the PR falls back the implementation of AvgPool to the way opset 9 implementing. However, it introduces a regression on dynamic shape cases found in #101397. This PR refactors the AvgPool op with the same implementation we have in onnxscript: microsoft/onnxscript#754.

However, the corner case with `ceil_mode` remains unsolved in onnxruntime: microsoft/onnxruntime#16203. The calculuation on the last value of each dimension is different between ORT and PyTorch. But the fix can be proved in: microsoft/onnxruntime#16752, and it supports AvgPool since opset19.

[ghstack-poisoned]
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Jul 21, 2023
In #87892, to pick up the corner cases found in #71549, the PR falls back the implementation of AvgPool to the way opset 9 implementing. However, it introduces a regression on dynamic shape cases found in #101397. This PR refactors the AvgPool op with the same implementation we have in onnxscript: microsoft/onnxscript#754.

However, the corner case with `count_include_pad` remains unsolved in onnxruntime: microsoft/onnxruntime#16203. The calculuation on the last value of each dimension is different between ORT and PyTorch. But the fix can be proved in: microsoft/onnxruntime#16752, and it supports AvgPool since opset19.
Pull Request resolved: #105683
Approved by: https://github.com/thiagocrepaldi
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module: torchlib Related to the torch/aten function lib in development
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants