Description
The ONNX frontend incorrectly maps the operator implementation version. Specifically, the frontend selects an operator implementation with a higher version than the model's opset number.
Steps to Reproduce
- Graph: a single ReduceMean node
- Import the model with tvm.relax.frontend.onnx.from_onnx without passing the opset explicitly; the resulting Relax module has the wrong output shape (a repro sketch is shown below).
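A minimal repro sketch, assuming onnx and tvm are installed; the shapes are illustrative (taken from the mean's input in the traces below), not the original model:

```python
import onnx
from onnx import helper, TensorProto
from tvm.relax.frontend.onnx import from_onnx

# A single ReduceMean node exported at opset 9, where `axes` is an attribute
# (it only became an input in opset 18).
node = helper.make_node("ReduceMean", ["x"], ["y"], axes=[2], keepdims=0)
graph = helper.make_graph(
    [node],
    "reduce_mean_repro",
    [helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 68, 4, 18])],
    [helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 68, 18])],
)
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 9)])
onnx.checker.check_model(model)

# The frontend should select an implementation compatible with opset 9, but it
# falls through to _impl_v18, which expects `axes` as an input, so the
# converter sees axes=None and reduces over all dimensions.
mod = from_onnx(model)
mod.show()  # R.mean(..., axis=None, keepdims=False) with output shape ()
```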
Expected behavior
The Relax mean should have axis = [2] and output shape (1, 68, 18):
. . .
lv90: R.Tensor((1, 516, 4, 18), dtype="float32") = R.concat((lv86, lv87, lv88, lv89), axis=1)
lv91: R.Tensor((1, 68, 4, 18), dtype="float32") = R.nn.conv2d(lv90, metadata["relax.expr.Constant"][30], strides=[1, 1], padding=[0, 0, 0, 0], dilation=[1, 1], groups=1, data_layout="NCHW", kernel_layout="OIHW", out_layout="NCHW", out_dtype="void")
lv92: R.Tensor((1, 68, 1, 1), dtype="float32") = R.reshape(metadata["relax.expr.Constant"][31], R.shape([1, 68, 1, 1]))
lv93: R.Tensor((1, 68, 4, 18), dtype="float32") = R.add(lv91, lv92)
gv: R.Tensor((1, 68, 18), dtype="float32") = R.mean(lv93, axis=[2], keepdims=False)
R.output(gv)
Actual behavior
The Relax mean has axis = None and output shape ():
. . .
lv90: R.Tensor((1, 516, 4, 18), dtype="float32") = R.concat((lv86, lv87, lv88, lv89), axis=1)
lv91: R.Tensor((1, 68, 4, 18), dtype="float32") = R.nn.conv2d(lv90, metadata["relax.expr.Constant"][30], strides=[1, 1], padding=[0, 0, 0, 0], dilation=[1, 1], groups=1, data_layout="NCHW", kernel_layout="OIHW", out_layout="NCHW", out_dtype="void")
lv92: R.Tensor((1, 68, 1, 1), dtype="float32") = R.reshape(metadata["relax.expr.Constant"][31], R.shape([1, 68, 1, 1]))
lv93: R.Tensor((1, 68, 4, 18), dtype="float32") = R.add(lv91, lv92)
gv: R.Tensor((), dtype="float32") = R.mean(lv93, axis=None, keepdims=False)
R.output(gv)
Debug
- In the get_converter method of OnnxOpConverter, the _impl_v mapping algorithm takes the highest version among all implementation versions <= opset. If no implementation version is <= opset, the computed index becomes negative.
- Python's negative indexing then wraps around (index -1 is the last element), so the latest implementation of the op is selected -> wrong operator implementation (see the sketch after this list).
Returning to the ReduceMean operator in ONNX:
- Currently, TVM has two implementations of ReduceMean (_impl_v13 and _impl_v18).
- With model opset = 9, the frontend picks _impl_v18, but it should use _impl_v1 (which TVM does not currently provide).
- ReduceMean version 1 and version 18 differ in their inputs and attributes: version 1 takes axes as an attribute, while version 18 takes axes as an optional input, which is why the v18 converter reads axes = None from this model. (Ref: https://onnx.ai/onnx/operators/text_diff_ReduceMean_1_18.html)
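The following is a simplified, standalone reconstruction of the selection behavior described above (an illustration only, not the exact code in OnnxOpConverter.get_converter):

```python
# Simplified reconstruction of the version-selection behavior described above.
impl_versions = [13, 18]  # ReduceMean currently has _impl_v13 and _impl_v18
opset = 9                 # the model's opset

versions = sorted(impl_versions + [opset])                      # [9, 13, 18]
idx = max(i for i, v in enumerate(versions) if v == opset) - 1  # 0 - 1 = -1
selected = versions[idx]                                        # versions[-1]

# Negative indexing wraps to the last element, so _impl_v18 is chosen even
# though no implementation version is <= 9.
print(selected)  # 18
```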
Possible fixes
- Suggestion 1: use onnx.defs.get_schema
  - Use "from onnx import defs" to look up the operator schema for the model's opset, and take the exact operator version to implement from it (for ReduceMean at opset 9 this resolves to ai.onnx v1). A sketch follows this suggestion.
  - Todo: implement all versions of each ONNX operator.
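A minimal sketch of Suggestion 1, assuming the onnx Python package is available; the helper name resolve_impl_version is hypothetical:

```python
from onnx import defs

def resolve_impl_version(op_type: str, model_opset: int, domain: str = "") -> int:
    """Return the highest schema version of `op_type` that is <= model_opset.

    get_schema resolves the schema whose since_version does not exceed the
    requested opset, so its since_version is the version the frontend should
    implement against.
    """
    schema = defs.get_schema(op_type, model_opset, domain)
    return schema.since_version

print(resolve_impl_version("ReduceMean", 9))   # 1  -> ai.onnx v1
print(resolve_impl_version("ReduceMean", 17))  # 13
print(resolve_impl_version("ReduceMean", 18))  # 18
```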
- Suggestion 2: check for an empty implementation list and raise an error
  - In the get_converter method of OnnxOpConverter, filter the implementation versions to those <= opset and take the maximum of the filtered list. If the filtered list is empty, raise an error such as "No implementation of the operator is compatible with the model's opset" (sketched below).
  - Todo: implement operator versions whose schemas differ (inputs, attributes).
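A minimal sketch of Suggestion 2; the class bodies are placeholders and the method body is an illustration, not the current TVM code:

```python
class OnnxOpConverter:
    """Base class sketch; real converters follow the _impl_v<N> convention."""

    @classmethod
    def get_converter(cls, opset):
        versions = sorted(
            int(name[len("_impl_v"):])
            for name in dir(cls)
            if name.startswith("_impl_v")
        )
        compatible = [v for v in versions if v <= opset]
        if not compatible:
            raise NotImplementedError(
                f"No implementation of {cls.__name__} is compatible with the "
                f"model's opset {opset} (available versions: {versions})"
            )
        # Highest implementation version that does not exceed the model opset.
        return getattr(cls, f"_impl_v{max(compatible)}")


class ReduceMean(OnnxOpConverter):
    @classmethod
    def _impl_v13(cls, *args): ...
    @classmethod
    def _impl_v18(cls, *args): ...


print(ReduceMean.get_converter(13))  # bound _impl_v13
try:
    ReduceMean.get_converter(9)      # no compatible implementation
except NotImplementedError as err:
    print(err)
```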
Triage
- needs-triage
- type: bug
- frontend:onnx
What do you think about this issue? I would be happy to hear different opinions from everyone.
Thank you!