Adapt changes inc release 1.13 #9
Conversation
…-1.13 Fixed distillation bug and some UT errors
The documentation is not available anymore as the PR was closed or merged.
  # Verification that the final sparsity meets the targeted sparsity
- self.assertGreaterEqual(round(sparsity), target_sparsity * 100)
+ self.assertGreaterEqual(round(sparsity), 0.5)
Why did we replace target_sparsity * 100 with 0.5 @PenghuiCheng? Is it because the sparsity is never guaranteed to be exactly target_sparsity?
Because the target sparsity in the YAML is a per-operator sparsity, not a whole-model sparsity, we can see in the log that the sparsity of each op is greater than 0.02, but many weights, like the embedding ops, were not pruned at all. The optimizer.get_sparsity() function returns the model sparsity; the model sparsity is 0.84, so I set the threshold to 0.5.
The log is as below:
2022-08-05 17:30:41 [INFO]
    Name                                                Shape         NNZ (dense)  NNZ (sparse)  Sparsity(%)  Std   Mean       Abs-Mean
 0  distilbert.embeddings.word_embeddings.module.w...  [30522, 768]  23440896     0             0.00         0.05  -3.83e-02  0.05
 1  distilbert.embeddings.position_embeddings.modu...  [512, 768]    393216       0             0.00         0.02  -4.01e-05  0.01
 2  distilbert.transformer.layer.0.attention.q_lin...  [768, 768]    589824       0             0.00         0.04   5.97e-05  0.03
 3  distilbert.transformer.layer.0.attention.k_lin...  [768, 768]    589824       0             0.00         0.04   9.21e-06  0.03
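To illustrate the per-op vs. model-level distinction, a minimal sketch (the tensors below are hypothetical placeholders, not the model's actual weights):

import torch

# Hypothetical weights: two pruned linear layers and one unpruned embedding.
weights = {
    "linear1": torch.zeros(768, 768),      # fully pruned for illustration
    "linear2": torch.zeros(768, 768),      # fully pruned for illustration
    "embedding": torch.randn(30522, 768),  # embeddings are not pruned
}

# Per-operator sparsity: each pruned op individually meets a per-op target.
for name, w in weights.items():
    print(f"{name}: {(w == 0).float().mean().item():.2f}")

# Model-level sparsity (what optimizer.get_sparsity() reports) is diluted by
# the large unpruned embedding, so it can sit well below the per-op values.
total_zeros = sum((w == 0).sum().item() for w in weights.values())
total_params = sum(w.numel() for w in weights.values())
print(f"model sparsity: {total_zeros / total_params:.2f}")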
Thanks for the clarification
if teacher_logits is None:
    teacher_outputs = self.agent.criterion.teacher_model_forward(inputs)
    teacher_logits = self._get_logits(teacher_outputs)
elif hasattr(self.agent, "on_post_forward"):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PenghuiCheng Currently in main we use the teacher_model_forward method to compute the teacher outputs, but I agree that it makes total sense to use on_post_forward to stay compatible with neural_compressor v1.12 and under.
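For reference, a hedged sketch of the dual-path dispatch being discussed; the on_post_forward branch is left abbreviated because its exact call signature isn't shown in this diff:

def _get_teacher_logits(self, inputs, teacher_logits=None):
    # neural_compressor 1.13+: the criterion exposes teacher_model_forward.
    if teacher_logits is None and hasattr(self.agent.criterion, "teacher_model_forward"):
        teacher_outputs = self.agent.criterion.teacher_model_forward(inputs)
        teacher_logits = self._get_logits(teacher_outputs)
    elif hasattr(self.agent, "on_post_forward"):
        # neural_compressor <= 1.12: fall back to the on_post_forward callback
        # (invocation elided; its signature is not shown in this diff).
        pass
    return teacher_logits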
- teacher_logits = self._get_logits(teacher_outputs)
+ teacher_logits = inputs.pop("teacher_logits", None)
  if hasattr(self.agent, "on_after_compute_loss"):
Is it equivalent to hasattr(self.agent.criterion, "teacher_model_forward")? cc @PenghuiCheng
on_after_compute_loss is not equivalent to teacher_model_forward. In the future, we will use the on_after_compute_loss callback to compute the distillation loss. The usage is like below:
student_outputs = student_model(input)
student_loss = user_criterion(student_outputs, labels)
total_loss = agent.on_after_compute_loss(input, student_outputs, student_loss, teacher_outputs)
Here, teacher_outputs is optional.
The on_after_compute_loss callback is a new API in 1.13; its purpose is to reuse the user's criterion.
But on_after_compute_loss doesn't yet handle a tuple of student_outputs and teacher_outputs, so we didn't use it directly; I will push another commit in the future to use on_after_compute_loss directly.
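Putting the pieces together, a minimal training-step sketch against the 1.13-style callback (the agent, models, and criterion objects are assumed to exist; the function name is illustrative):

def training_step(agent, student_model, user_criterion, inputs, labels, teacher_outputs=None):
    student_outputs = student_model(inputs)
    student_loss = user_criterion(student_outputs, labels)
    # teacher_outputs is optional; the callback folds the distillation loss
    # into the loss computed by the user's own criterion.
    total_loss = agent.on_after_compute_loss(inputs, student_outputs, student_loss, teacher_outputs)
    total_loss.backward()
    return total_loss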
My question was more about whether hasattr(self.agent, "on_after_compute_loss") is equivalent to hasattr(self.agent.criterion, "teacher_model_forward"), since self.agent.criterion.teacher_model_forward is called after this condition. Also, thanks for your explanation; does it mean that the teacher_model_forward method will be deprecated? I find this method very useful as it gives us more flexibility: we can compute the teacher outputs and compute the loss ourselves with the trainer's compute_distillation_loss method.
teacher_model_forward exists in both the old and the new version of neural_compressor, so we can use it either way. Yes, if you want to use the trainer's compute_distillation_loss method, you can use the hasattr(self.agent.criterion, "teacher_model_forward") condition, but only the new version returns outputs. So you first need to check that a new version of neural_compressor is installed.
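A hedged sketch of that version check (assuming neural_compressor exposes __version__; the packaging dependency is an assumption of this example):

from packaging import version
import neural_compressor

def returns_teacher_outputs():
    # Only neural_compressor >= 1.13 returns outputs from teacher_model_forward,
    # so check the installed version before relying on its return value.
    return version.parse(neural_compressor.__version__) >= version.parse("1.13")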
Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
* Initial commit to enable OVTrainer with joint pruning, quantization and distillation via NNCF
* Review OpenVINO Q&A readme and configs
* Update README.md
* Add post init value checker to OVTrainingArguments
* Initial enabling of audio classification/wav2vec2 [tests not included] (#2)
* use nncf official branch for install since JPQD is merged
* copy ac scripts from transformers repo
* init commit for wav2vec2
* add onnx_config argument in OVTrainer for onnx export with unsupported model
* enable customized teacher kd
* add readme
* delete debugging lines
* Update openvino-dev and nncf version in setup.py
* refactor _enable_standard_onnx_export_option to _set_standard_onnx_export_option
* add tests for (movement/quantization) with distillation (#3)
* test part 1
* clean "compute_distillation_loss" in OVTrainer
* add test of OVTrainer for int8+kd / movement / movement+int8 / movement+int8+kd
* add expectedFailure mark to test of OVModelForAudioClassification
* revert unnecessary code about "OVModelForAudioClassification"
* change to a shorter train for w2v2 in readme
* revert compute_metrics change since it is not necessary
* fix task_loss non-scalar bug for kd logging
* make regex clearer in QA bert config
* Refactor compression-related logging
* Refactor OpenVINO IR generation and patch tests
* Miscellaneous refactoring
* MO IR pruning depends on scheduler stage
* Readme tweaks for all example tasks
* Minor tweak on tests
* Align setup.py for openvino-dev and nncf versions needed for JPQD
* Fix lint with Black
* Refactor OpenVINO IR generation using the Python API
* Fix via isort
* Handle IR generation error to avoid run termination
* Update QA readme
* Enable distillation on openvino's image classification example
* Minor refactoring in openvino's audio classification example
* Move openvino-dev dependency to be an extra of NNCF
* Configure IR model to accept dynamic-shaped input
* Revert _enable_standard_onnx_export_option method in OVConfig
* Update wav2vec2 configs for audio classification
* Add BERT-base/glue-sst2 example with QAT / JPQD (#4)
* copy text-classification example from transformers
* init draft for sst example
* update sst2 accuracy & training time
* Revise wav2vec2 config and audio classification readme
* Patch _enable_standard_onnx_export_option to only add the key pair to quantization config
* Set logging level to INFO in openvino/trainer.py
* Review readme of text and image classification
* Revert IR generation with static input shape for joint compression
* Add distillation and advanced optimization section in optimization_ov.mdx
* Patch tests
* Revise formatting of optimization_ov.mdx
* Limit #checkpoints saved for JPQD samples
* Handle NNCF output to text log and only print errors to stdout
* Replace hardcoded model.onnx filename with constant variable
* Fix movement sparsity config in optimization_ov.mdx
* Change _set_feature to _set_task to align with OVQuantizer
* Revert onnx_config exposure in OVTrainer, expand test coverage for joint compression variations, misc. patches
* use builtin onnx configs for wav2vec onnx export
* move teacher model argument from OVTrainingArgs to model args
* fix duplicate call of `epoch_step`
* temporary workaround for compression metrics
* test for all training cases
* temporary workaround for eval only
* cover train/eval tests
* style fix
* Move old ovtrainer tests to a new `test_training.py` file; bugfix in training loss check (#6)
* remove old tests in test_quantization since they are now in `test_training`
* bugfix in checking compression metrics during training
* keep bert examples only and misc. fixes (#7)
* temporarily keep bert examples only; remove w2v2 and swin
* move nncf_compression_config out of OVTrainingArguments
* type hint change for nncf_compression_config
* document rename of feature to task
* revert existing QAT image classification example
* delete useless code in test_quantization
* revert existing test_quantization
* misc change in compute_metric
* revert unnecessary changes
* temporary workaround for logging distill & compression loss (not using dist. reduce)
* revert set_task method
* bugfix in compression metric in qa task
* bugfix in importing tpu
* simplify pruning IR code
* clean unnecessary distillation weight attribute in trainer
* Change nncf requirement to official 2.4
* Log nncf compression statistics at the beginning of each training epoch
* Revise optimization_ov.mdx documentation
* Consolidate during-training optimization to QAT and JPQD
* Add known limitation regarding OpenVINO IR with static input shape
* fix data parallel crashes and add tests for DP/DDP (#8)
* fix "not same device" error in data parallel
* wrap teacher model with data parallel
* add sst2 tests for dp/ddp with fixes
* Add remark in optimization_ov.mdx on supported model architectures for structured pruning
* Refactor JPQD IR generation where final IR is dynamic in input shape
* Revise optimization_ov.mdx to remove static IR limitations
* revert snippet for inference with Transformers pipeline
* Remove commented code in openvino/trainer.py
* Add tests about new OV IR export - check dynamic graph and output equivalence to torch model (#9)
* draft for new export with some todos
* draft for tests
* delete onnx export debugging when errors on saving
* add back the debug info when ir export fails
* bugfix in random setting zeros in movement masks
* Add tests on OV IR reshape-ability
* Remove unused imports in openvino/trainer.py
* Refine inference pipeline with OVModel in optimization_ov.mdx
* Revise openvino extras in setup.py
---------
Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
Co-authored-by: Yujie Pan <yujie.pan@intel.com>