
How to deal with QuantizeLinear and DequantizeLinear nodes when doing quantization with OpenVINO/TNN/MNN? #10


Description

@xiaoxiongli

I trained an x2 model, and after that I fine-tuned it with QAT using the command below:

"python train.py --opt options/train/base7_qat.yaml --name base7_D4C28_bs16ps64_lr12-3_qat_x2 --scale 2 --bs 16 --ps 64 --lr 1e-3 --gpu_ids 1 --qat --qat_path experiment/ base7_D4C28_bs16ps64_lr12-3_x2/best_status".

Then I converted it to an ONNX model using the command below:

```
python -m tf2onnx.convert --saved-model ./experiment/base7_D4C28_bs16ps128_lr1e-3_x2_20210603/best_status --opset 13 --output ./ONNX/base7_D4C28_bs16ps128_lr1e-3_x2_20210603.onnx
```
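
As a quick sanity check on the export (my own addition, not part of the original workflow), the model can be loaded and its opset verified with the `onnx` Python package:

```python
import onnx

# Load the exported model and make sure it is structurally valid.
model = onnx.load("./ONNX/base7_D4C28_bs16ps128_lr1e-3_x2_20210603.onnx")
onnx.checker.check_model(model)

# The default-domain entry should report version 13, matching --opset 13.
print([(imp.domain, imp.version) for imp in model.opset_import])
```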

Then I opened this ONNX model in Netron:

[Screenshot: the exported ONNX graph in Netron, with the QuantizeLinear and DequantizeLinear nodes marked by a red box]

I want to quantize this ONNX model with OpenVINO/TNN/MNN. My question is: do I need to remove the QuantizeLinear and DequantizeLinear nodes in the red box first, and then quantize?

Or should I just quantize directly, and OpenVINO/TNN/MNN will remove them automatically?
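
In case stripping them by hand turns out to be necessary, below is a minimal sketch of removing matched QuantizeLinear/DequantizeLinear pairs with the `onnx` Python package. This is my own illustration, not tooling from this repo or from OpenVINO/TNN/MNN; the file names are placeholders, and it assumes every QuantizeLinear output feeds only DequantizeLinear nodes and that no graph output is produced directly by a removed pair:

```python
import onnx

model = onnx.load("model_qdq.onnx")  # placeholder input path
graph = model.graph

# Map each QuantizeLinear output tensor to the float tensor feeding it.
q_out_to_in = {n.output[0]: n.input[0]
               for n in graph.node if n.op_type == "QuantizeLinear"}

kept, rewired = [], {}
for n in graph.node:
    if n.op_type == "QuantizeLinear":
        continue  # dropped; its float input is reused directly
    if n.op_type == "DequantizeLinear" and n.input[0] in q_out_to_in:
        # Consumers of this DQ output should read the original float tensor.
        rewired[n.output[0]] = q_out_to_in[n.input[0]]
        continue
    kept.append(n)

# Point the remaining nodes past the removed Q/DQ pairs.
for n in kept:
    for i, name in enumerate(n.input):
        if name in rewired:
            n.input[i] = rewired[name]

del graph.node[:]
graph.node.extend(kept)
onnx.checker.check_model(model)
onnx.save(model, "model_fp32.onnx")  # placeholder output path
```

That said, some converters consume QDQ graphs as-is, so it seems worth trying the conversion on the unmodified model first.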

I also checked the TFLite model (generated with generate_tflite.py) converted to an ONNX model; it seems the quantized TFLite model / ONNX model also contains QuantizeLinear and DequantizeLinear nodes. Is that normal?
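
For reference, here is a small sketch (the file name is a placeholder) to list the operator types in a converted model, which makes it easy to confirm whether QuantizeLinear/DequantizeLinear nodes are present without opening Netron:

```python
import onnx
from collections import Counter

model = onnx.load("converted_from_tflite.onnx")  # placeholder path

# Tally every op type in the graph; a QDQ-style quantized model will
# show QuantizeLinear and DequantizeLinear entries in this list.
counts = Counter(node.op_type for node in model.graph.node)
for op_type, n in sorted(counts.items()):
    print(f"{op_type}: {n}")
```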
