
[TFLite] Enable int64 biases for int16 quantized operators #12042

Merged: 1 commit merged into apache:main from the int16_ops_dense_requantize branch on Nov 15, 2022

Conversation

@leandron (Contributor) commented Jul 8, 2022

This enables int64 biases for quantized fully connected, requantize and transpose convolution in TFLite networks. It builds on the existing int16 support in the TFLite frontend.

cc @areusch for reviews
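As a minimal sketch of how an int16-quantized model exercises this path, the snippet below imports a TFLite flatbuffer into Relay; the file name, input name and input shape are placeholders for illustration, not part of this PR:

```python
import tflite  # flatbuffer bindings for .tflite files (pip install tflite)
from tvm import relay

# Load an int16-quantized TFLite model from disk (placeholder path).
with open("ds_cnn_int16.tflite", "rb") as f:
    tflite_model = tflite.Model.GetRootAsModel(f.read(), 0)

# With this change, quantized fully connected, requantize and transpose
# convolution accept the int64 biases that the TFLite int16 flow produces.
mod, params = relay.frontend.from_tflite(
    tflite_model,
    shape_dict={"input": (1, 49, 10, 1)},  # placeholder input shape
    dtype_dict={"input": "int16"},
)
print(mod)
```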

@github-actions bot requested a review from areusch on July 8, 2022 16:04
@Mousius (Member) left a comment


Can we add some test cases for this please @leandron 😸

@kuladeep2706 commented

Hello @leandron,

I'm working along similar lines and have a model with conv2d_transpose; all the other ops are already supported by your previously merged commit. I've made the same changes you made for conv2d_transpose in this patch, but the dequantize layer at the end is getting int64 input, which isn't right. Am I missing something that needs to be changed?

Thanks in advance!

@areusch added and then removed the needs-triage label (PRs or issues that need to be investigated by maintainers to find the right assignees) on Oct 19, 2022
@leandron force-pushed the int16_ops_dense_requantize branch from 7eb64a3 to a940412 on November 8, 2022 16:20
@leandron (Contributor, Author) commented Nov 8, 2022

> Hello @leandron,
>
> I'm working along similar lines and have a model with conv2d_transpose; all the other ops are already supported by your previously merged commit. I've made the same changes you made for conv2d_transpose in this patch, but the dequantize layer at the end is getting int64 input, which isn't right. Am I missing something that needs to be changed?
>
> Thanks in advance!

In TFLite, as of now, biases are set by default to int64 when int16 quantisation is used.

I have this model, which was created using the default int16 flow, and it can be used to check these internal data types with e.g. Netron.
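To illustrate where those int64 biases come from, here is a sketch of the TensorFlow 16x8 quantisation flow; `model` and `rep_data` are placeholders standing in for any small Keras model and a matching representative dataset:

```python
import tensorflow as tf

# Convert a Keras model (placeholder) with the 16x8 quantisation scheme:
# int16 activations, int8 weights.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = rep_data  # placeholder generator
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.EXPERIMENTAL_TFLITE_BUILTINS_ACTIVATIONS_INT16_WEIGHTS_INT8
]
tflite_bytes = converter.convert()

# Inspecting the tensors shows that bias tensors come out as int64.
interpreter = tf.lite.Interpreter(model_content=tflite_bytes)
interpreter.allocate_tensors()
for detail in interpreter.get_tensor_details():
    print(detail["name"], detail["dtype"])
```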

@leandron force-pushed the int16_ops_dense_requantize branch 3 times, most recently from 5b67c87 to 1846d00 on November 9, 2022 10:54
@tvm-bot (Collaborator) commented Nov 9, 2022

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

This enables int64 biases for quantized fully connected, requantize
and transpose convolution in TFLite networks. It goes on top of existing
int16 support for TFLite frontend.

Add a test case using DS_CNN int16 quantized.

Change-Id: I3006ee76f5037fb6f915818358c9aada2faf40bf
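As a hedged sketch of what such an end-to-end check can look like (not necessarily the exact test added here), one can compile the imported Relay module and compare TVM's output against the TFLite interpreter on the same input; all names below are illustrative:

```python
import numpy as np
import tvm
from tvm import relay
from tvm.contrib import graph_executor

def run_tvm(mod, params, input_name, data):
    """Compile the imported Relay module for CPU and run one inference."""
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target="llvm", params=params)
    module = graph_executor.GraphModule(lib["default"](tvm.cpu(0)))
    module.set_input(input_name, data)
    module.run()
    return module.get_output(0).numpy()

# `tvm_out` and `tflite_out` would come from run_tvm() and the TFLite
# interpreter respectively; small tolerances absorb rounding differences.
# np.testing.assert_allclose(tvm_out, tflite_out, rtol=1e-5, atol=1e-5)
```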
@leandron force-pushed the int16_ops_dense_requantize branch from 1846d00 to 98568b2 on November 9, 2022 15:23
@leandron (Contributor, Author) commented Nov 9, 2022

Please have another look.

@ashutosh-arm (Contributor) left a comment


Overall looks good to me. Do you know of any links to an int16 spec similar to https://www.tensorflow.org/lite/performance/quantization_spec (which covers int8 only)?

Review thread on src/relay/qnn/op/dense.cc (resolved)
@ekalda (Contributor) left a comment


Thanks @leandron, looks good to me!

@ashutosh-arm (Contributor) left a comment


Thanks @leandron. LGTM 😄

@Mousius merged commit 034dc67 into apache:main on Nov 15, 2022
@Mousius (Member) commented Nov 15, 2022

Sorry for the delay - thanks @leandron 😸

xinetzone pushed a commit to daobook/tvm that referenced this pull request on Nov 25, 2022: [TFLite] Enable int64 biases for int16 quantized operators (apache#12042)
7 participants