[Compression] Add quantization tutorial #5454
Conversation
dependencies/recommended.txt (Outdated)
@@ -21,3 +21,4 @@ matplotlib
 git+https://github.com/microsoft/nn-Meter.git#egg=nn_meter
 sympy
 timm >= 0.5.4
+datasets == 2.10.1
Any particular reason to freeze the version?
Good suggestion, there is no need to freeze the version. I will modify it.
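For reference, a sketch of how the relaxed dependency line might look after the change; whether a lower bound is kept is an assumption, not taken from the final diff:

datasets >= 2.10.1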
docs/source/examples.rst (Outdated)
.. cardlinkitem::
   :header: Quantize Bert on Task MNLI
   :description: An end to end example for how to using NNI to quantize transformer and show the real speedup number
Can we actually show real speedup? 😂
no no no
Quantize BERT on Task GLUE
==========================

Here we show an effective transformer simulated quantization process that NNI team has tried, and users can use NNI to discover better process
Add a period at the end of the sentence: `.`
we use the BERT model and the trainer pipeline in the Transformers to do some experiments.
Capitalize: `We`
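As context for the quoted sentence, loading the BERT model with Transformers might look like the sketch below; the checkpoint name and label count are assumptions based on the MNLI example mentioned above, not the tutorial's actual code.

from transformers import BertForSequenceClassification

# MNLI is a 3-way classification task (entailment / neutral / contradiction),
# hence num_labels=3; 'bert-base-uncased' is an assumed checkpoint.
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=3)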
# .. note::
#
#     Please set ``is_trace`` to ``False`` to fine-tune the BERT model and set ``is_trace`` to ``True``
#     When you need to create a traced trainer for model quantization.
Join with a comma and lowercase: `, when`
And fine-tuning the model can also use a traced trainer.
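A minimal sketch of how the ``is_trace`` switch could be wired up, assuming a Hugging Face ``Trainer``; the helper name and its arguments are illustrative, not the tutorial's actual code.

import nni
from transformers import Trainer, TrainingArguments

def prepare_trainer(model, train_dataset, eval_dataset, is_trace=False):
    # When is_trace is True, wrap the classes with nni.trace so NNI can
    # record the init arguments and re-create the trainer during
    # quantization; a plain Trainer is enough for fine-tuning, though a
    # traced one works for fine-tuning as well.
    args_cls = nni.trace(TrainingArguments) if is_trace else TrainingArguments
    trainer_cls = nni.trace(Trainer) if is_trace else Trainer
    training_args = args_cls(output_dir='./output', num_train_epochs=3)
    return trainer_cls(model=model, args=training_args,
                       train_dataset=train_dataset, eval_dataset=eval_dataset)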
config_list = [{
    'op_types': ['Linear'],
    'op_names_re': ['bert.encoder.layer.{}'.format(i) for i in range(12)],
    'target_names': ['weight', '_output_'],
Why not quantize the input?
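For illustration, a hedged variant of the config that also quantizes inputs by adding ``'_input_'`` to ``target_names``; the extra ``quant_dtype`` field is an assumed setting, not taken from the tutorial.

config_list = [{
    'op_types': ['Linear'],
    'op_names_re': ['bert.encoder.layer.{}'.format(i) for i in range(12)],
    # '_input_' added so the activations flowing into the Linear layers
    # are fake-quantized along with the weights and outputs.
    'target_names': ['_input_', 'weight', '_output_'],
    'quant_dtype': 'int8',
}]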
if __name__ == "__main__":
    fake_quantize()
    evaluate()
Remove the `if __name__ == "__main__":` guard, this is an IPython-style script.
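After the suggested change, the tail of the script would just call the steps at top level, matching the IPython/sphinx-gallery style (assuming ``fake_quantize`` and ``evaluate`` are defined earlier in the tutorial):

# Top-level calls, executed directly when the tutorial script runs.
fake_quantize()
evaluate()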
Will review this doc after merge.