
QDQ node for weight tensor of Conv2D undergoes constant folding (enabled for node using tf type=FakeQuantWithMinMaxVarsPerChannel) #1972

Open

Description

@rado82

I am running some experiments with QAT on a sample model. It looks like the QDQ node for the weight tensor of the Conv2D operation is always folded during ONNX generation.

Versions of the relevant packages:
- tensorflow: 2.8.2
- tf2onnx: 1.11.1
- tensorflow-model-optimization: 0.7.2

I am using tensorflow-model-optimization to insert the fake-quantization nodes and tf2onnx to convert the frozen graph from .pb to ONNX. The weight tensor of the Conv2D always undergoes constant folding during the tf2onnx conversion, even though the Netron visualization of the frozen graph clearly shows a FakeQuant node inserted for the weights. Roughly, the pipeline looks like the sketch below.
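For reference, a minimal sketch of the pipeline (the Colab linked under "To reproduce" is the authoritative repro). The model architecture, tensor names, opset, and file names here are illustrative assumptions on my part, not copied from the Colab.

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot
import tf2onnx
from tensorflow.python.framework.convert_to_constants import (
    convert_variables_to_constants_v2,
)

# Small model with a single Conv2D layer (illustrative shapes).
model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
])

# Apply QAT; the resulting graph contains FakeQuant nodes, including
# FakeQuantWithMinMaxVarsPerChannel on the Conv2D weights.
q_model = tfmot.quantization.keras.quantize_model(model)

# Freeze the quantize-aware model to a GraphDef.
concrete = tf.function(lambda x: q_model(x)).get_concrete_function(
    tf.TensorSpec((1, 32, 32, 3), tf.float32, name="x"))
frozen = convert_variables_to_constants_v2(concrete)

# Convert the frozen graph to ONNX. In my runs, the QDQ pair that should
# sit on the Conv2D weight tensor is constant-folded away at this step.
model_proto, _ = tf2onnx.convert.from_graph_def(
    frozen.graph.as_graph_def(),
    input_names=["x:0"],
    output_names=[t.name for t in frozen.outputs],
    opset=13,
    output_path="model.onnx",
)
```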

To reproduce:
Colab pynb Link: https://colab.research.google.com/drive/1Y_LhhWtJejv5teHgQslMPQdwebyHY1GD?usp=sharing

Netron visualization of the .pb file (fed as input to tf2onnx): [screenshot github_0]

Netron visualization of the generated ONNX: [screenshot github_1]
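Besides Netron, the folding can also be checked programmatically. A small sketch, assuming the converted file is named model.onnx as in the snippet above:

```python
import onnx

m = onnx.load("model.onnx")
# Map every tensor name to the op type of the node that produces it.
producers = {out: node.op_type for node in m.graph.node for out in node.output}

for node in m.graph.node:
    if node.op_type == "Conv":
        # With an intact QDQ pattern, the weight input (second input) is
        # produced by a DequantizeLinear node; after folding it is a plain
        # initializer instead.
        weight = node.input[1]
        print(node.name, "<-", producers.get(weight, "initializer"))
```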

Checking the previous issues here, I found this. Although tf.quantize_and_dequantize_v2 is used in that earlier issue, here I am using tensorflow-model-optimization, which relies on other TF quantization APIs (the FakeQuant* ops).
