[Good First Issue]: Refactor PaddlePaddle Quantization Implement Schemes like ONNX #20687

Open
@xczhai

Description

Context

The current PaddlePaddle quantization implementation differs from the ONNX one.

Same

Difference

PaddlePaddle fuses quantize_linear and dequantize_linear into FakeQuantize using a custom pass (https://github.com/openvinotoolkit/openvino/blob/master/src/frontends/paddle/src/internal/pass/transform_fakequantize.cpp),
but the ONNX FE does not.
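To make the fusion concrete, here is a minimal numeric sketch of why a quantize_linear followed by a dequantize_linear can be expressed as a single FakeQuantize. It is plain Python on scalars, not the OpenVINO or Paddle API. The function names, the int8 range, the per-tensor scale, and the use of Python's half-to-even `round()` are all assumptions made for illustration, and the FakeQuantize formula follows the commonly documented definition rather than anything stated in this issue.

```python
def quantize_linear(x, scale, zero_point=0, qmin=-128, qmax=127):
    """Round to the integer grid, shift by the zero point, then saturate."""
    q = round(x / scale) + zero_point  # Python's round() is half-to-even
    return max(qmin, min(qmax, q))


def dequantize_linear(q, scale, zero_point=0):
    """Map the integer back to the float domain."""
    return (q - zero_point) * scale


def fake_quantize(x, input_low, input_high, output_low, output_high, levels):
    """Scalar FakeQuantize: clamp to the input range, snap to one of `levels` steps."""
    if x <= min(input_low, input_high):
        return output_low
    if x > max(input_low, input_high):
        return output_high
    step = round((x - input_low) / (input_high - input_low) * (levels - 1))
    return step / (levels - 1) * (output_high - output_low) + output_low


def fq_params_from_scale(scale, zero_point=0, qmin=-128, qmax=127):
    """Hypothetical helper: derive FakeQuantize limits from scale and zero point."""
    low = (qmin - zero_point) * scale
    high = (qmax - zero_point) * scale
    return low, high, low, high, qmax - qmin + 1
```

For inputs that are not exact rounding ties, `dequantize_linear(quantize_linear(x, s), s)` and `fake_quantize(x, *fq_params_from_scale(s))` agree up to floating-point error, which is the equivalence the fusion relies on.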

Maintaining nearly identical logic in two places is hard, so the PaddlePaddle quantization needs to be refactored to follow the ONNX approach.
Also, having more patterns in the model affects transformation performance.

What needs to be done?

  1. Ignore the HALF_AWAY_FROM_ZERO round mode outright. This is purely for performance.
  2. Remove or refactor the custom pass.
    LPT (Low Precision Transformations) already does what the custom pass (https://github.com/openvinotoolkit/openvino/blob/master/src/frontends/paddle/src/internal/pass/transform_fakequantize.cpp) does, so the custom pass should be removed or refactored. Prepare the quantization pattern for LPT.
  3. Refactor the PDPD FE to reduce the number of quantization patterns if needed.
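Regarding item 1, the sketch below shows how small the numeric effect of dropping HALF_AWAY_FROM_ZERO is likely to be. It assumes the replacement behaviour is round-half-to-even, which this issue does not state. Under that assumption the two modes disagree only on exact .5 ties. The helper names are hypothetical and the code uses only Python's standard `decimal` module.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_half_away_from_zero(v):
    # decimal's ROUND_HALF_UP sends ties away from zero (2.5 -> 3, -2.5 -> -3)
    return int(Decimal(v).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def round_half_to_even(v):
    # Ties go to the nearest even integer (2.5 -> 2, 1.5 -> 2)
    return int(Decimal(v).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))


def differing_inputs(values):
    """Return the inputs on which the two rounding modes disagree."""
    return [
        v for v in values
        if round_half_away_from_zero(v) != round_half_to_even(v)
    ]
```

Calling `differing_inputs` on a mix of tie and non-tie values returns only those ties whose lower neighbour is even in magnitude, such as 0.5 and 2.5. Values like 1.5 round to 2 in both modes.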

Example Pull Requests

Please refer to #14834 for more comments and background.
Test case: #20689

Resources

Contact points

@xczhai

Ticket

104434
