Description
I am experimenting with QAT on a sample model. It looks like the QDQ node for the weight tensor of the Conv operation is always folded during ONNX generation.
Versions of the relevant packages are as follows:
tensorflow: 2.8.2
tf2onnx: 1.11.1
tensorflow-model-optimization: 0.7.2
I am using TF Model Optimization to apply fake-quantization nodes and tf2onnx to convert the frozen graph from pb to the ONNX representation. The weight tensor of the Conv2D always undergoes constant folding during the tf2onnx conversion, even though the visualization of the frozen graph clearly shows a FakeQuant node inserted for the weights.
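For context, here is a minimal sketch of the QAT + freezing part of the workflow. This is only an approximation of what the Colab does; the model, shapes, and file names are placeholders.

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot
from tensorflow.python.framework.convert_to_constants import (
    convert_variables_to_constants_v2,
)

# Small placeholder model; the real model is in the Colab notebook.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),
])

# tfmot inserts FakeQuant (QDQ) nodes for weights and activations.
qat_model = tfmot.quantization.keras.quantize_model(model)

# Freeze to a constant graph; at this point the FakeQuant node on the
# Conv2D weights is still visible in Netron.
full_model = tf.function(lambda x: qat_model(x))
concrete_fn = full_model.get_concrete_function(
    tf.TensorSpec([1, 32, 32, 3], tf.float32)
)
frozen_fn = convert_variables_to_constants_v2(concrete_fn)
tf.io.write_graph(frozen_fn.graph, ".", "frozen_graph.pb", as_text=False)
```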
To reproduce:
Colab pynb Link: https://colab.research.google.com/drive/1Y_LhhWtJejv5teHgQslMPQdwebyHY1GD?usp=sharing
Netron visualization of the pb file (fed as input to tf2onnx):
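The pb-to-ONNX conversion step is roughly the following (a sketch, not the exact notebook code; the input/output tensor names are placeholders):

```python
import tensorflow as tf
import tf2onnx

# Load the frozen graph produced above.
graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile("frozen_graph.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

# The QDQ (FakeQuant) node on the Conv2D weight tensor is folded away during
# this conversion, even though it is present in the frozen graph.
model_proto, _ = tf2onnx.convert.from_graph_def(
    graph_def,
    input_names=["x:0"],        # placeholder name
    output_names=["Identity:0"],  # placeholder name
    opset=13,
    output_path="model.onnx",
)
```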
Checking the previous issues here, I found this. However, that earlier issue uses tf.quantize_and_dequantize_v2, whereas here I am using TF Model Optimization, which uses other TF quantization APIs.