Describe the bug
I have taken these representations from the TensorRT QAT presentation.
Figure(1)

As shown in Figure(1) above, I added QuantizeAndDequantizeV2 nodes before the conv2d op and around the conv2d kernel in my model.
After converting it with tf2onnx, however, I cannot find the QuantizeLinear and DequantizeLinear nodes for the conv2d kernel. As shown in Figure(2) below, I was expecting tf2onnx to keep them rather than fold them into constants.
Figure(2)

Figure(3)

TensorBoard visualization of my model, showing QuantizeAndDequantizeV2 ops for both the input and the weights.
Figure(4)

Netron visualization of the ONNX model, showing QuantizeLinear and DequantizeLinear nodes only for the conv2d input.
It seems logical to fold the QuantizeLinear and DequantizeLinear nodes for the weights into a constant, as described in #1394. But as Figure(5) below shows, TensorRT requires the QuantizeLinear and DequantizeLinear nodes for both the conv2d input and the weights!
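For concreteness, here is a minimal sketch, built with onnx.helper, of the weight-side pattern I expected in the exported graph; all names, shapes, and quantization parameters are illustrative placeholders, not values from my actual model:

```python
import numpy as np
import onnx
from onnx import TensorProto, helper, numpy_helper

# Expected weight path: initializer -> QuantizeLinear -> DequantizeLinear -> Conv,
# i.e. the same Q/DQ pattern the conv2d input already gets after conversion.
weight = numpy_helper.from_array(
    np.zeros((16, 3, 3, 3), dtype=np.float32), name="conv_w")
w_scale = numpy_helper.from_array(np.array(0.02, dtype=np.float32), name="w_scale")
w_zp = numpy_helper.from_array(np.array(0, dtype=np.int8), name="w_zp")

q = helper.make_node("QuantizeLinear", ["conv_w", "w_scale", "w_zp"], ["conv_w_q"])
dq = helper.make_node("DequantizeLinear", ["conv_w_q", "w_scale", "w_zp"], ["conv_w_dq"])
conv = helper.make_node("Conv", ["x", "conv_w_dq"], ["y"], pads=[1, 1, 1, 1])

graph = helper.make_graph(
    [q, dq, conv], "qdq_weight_pattern",
    inputs=[helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 3, 32, 32])],
    outputs=[helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 16, 32, 32])],
    initializer=[weight, w_scale, w_zp])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 13)])
onnx.checker.check_model(model)
```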
Urgency
Blocked use case: TensorFlow 2.x QAT -> ONNX -> TensorRT
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
- TensorFlow Version: 2.6
- Python version: 3.8
To Reproduce
Adding QuantizeAndDequantizeV2 ops into TF 2.x graphs is not trivial, so it is difficult for me to include all of my code here; a minimal sketch follows.
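As a stand-in, here is a minimal sketch of how I attach the ops, assuming a single conv layer; the ranges, shapes, and names are illustrative rather than taken from my actual model:

```python
import tensorflow as tf

class QDQConv2D(tf.keras.layers.Layer):
    """Conv2D with QuantizeAndDequantizeV2 on both the input and the kernel."""

    def __init__(self, filters, kernel_size, **kwargs):
        super().__init__(**kwargs)
        self.filters = filters
        self.kernel_size = kernel_size

    def build(self, input_shape):
        self.kernel = self.add_weight(
            "kernel",
            shape=(*self.kernel_size, input_shape[-1], self.filters))

    def call(self, x):
        # QDQ on the activation; tf2onnx converts this one correctly.
        x = tf.quantization.quantize_and_dequantize_v2(
            x, input_min=-1.0, input_max=1.0, range_given=False)
        # QDQ on the kernel; this is the node that ends up folded into a
        # constant instead of becoming QuantizeLinear/DequantizeLinear.
        w = tf.quantization.quantize_and_dequantize_v2(
            self.kernel, input_min=-1.0, input_max=1.0, range_given=False)
        return tf.nn.conv2d(x, w, strides=1, padding="SAME")

model = tf.keras.Sequential([
    tf.keras.layers.Input((32, 32, 3)),
    QDQConv2D(16, (3, 3)),
])
model.save("qdq_model")
```

I then convert the SavedModel with something like:

```
python -m tf2onnx.convert --saved-model qdq_model --output qdq_model.onnx --opset 13
```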
