Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add release notes for v2.4.0 and update version #1514

Merged
merged 8 commits into from
Jan 24, 2023
Merged
Show file tree
Hide file tree
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
44 changes: 38 additions & 6 deletions ReleaseNotes.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,43 @@
# Release Notes

## Introduction
*Neural Network Compression Framework (NNCF)* is a toolset for Neural Networks model compression.
The framework organized as a Python module that can be built and used as standalone or within
samples distributed with the code. The samples demonstrate the usage of compression methods on
public models and datasets for three different use cases: Image Classification, Object Detection,
and Semantic Segmentation.
## New in Release 2.4.0
Target version updates:
- Bump target framework versions to PyTorch 1.13.1, TensorFlow 2.8.x, ONNX 1.12, ONNXRuntime 1.13.1
- Increased target HuggingFace transformers version for the integration patch to 4.23.1

Features:
- Official release of the ONNX framework support.
NNCF may now be used for post-training quantization (PTQ) on ONNX models.
Added an example script demonstrating the ONNX post-training quantization on MobileNetV2.
- Common post-training quantization API across the supported framework model formats (PyTorch, TensorFlow, ONNX, OpenVINO IR) via the `nncf.quantize(...)` function.
The parameter set of the function is the same for all frameworks - actual framework-specific implementations are being dispatched based on the type of the model object argument.
- (PyTorch) Joint Pruning, Quantization and Distillation for Transformers made possible via NNCF -
- (PyTorch, TensorFlow) Improved the adaptive compression training functionality to reduce effective training time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Preview release of OpenVINO framework support. NNCF may now be used for post-training quantization on OpenVINO models. Added an example script demonstrating the OpenVINO post-training quantization on MobileNetV2.
  • NNCF with OpenVINO backend can be installed using pip install nncf[openvino]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

- (ONNX) Post-processing nodes are now automatically excluded from quantization.
- (PyTorch - Experimental) Joint Pruning, Quantization and Distillation for Transformers enabled for certain models from HuggingFace `transformers` repo

Bugfixes:
- Fixed a division by zero if every operation is added to ignored scope
- Improved logging output, cutting down on the number of messages being output to the standard `logging.INFO` log level.
- Fixed FLOPS calculation for linear filters - this impacts existing models that were pruned with a FLOPS target.
- "chunk" and "split" ops are correctly handled during pruning.
- Linear layers may now be pruned by input and output independently.
- Matmul-like operations and subsequent arithmetic operations are now treated as a fused pattern.
- (PyTorch) Fixed a rare condition with accumulator overflow in CUDA quantization kernels, which led to CUDA runtime errors and NaN values appearing in quantized tensors and
- (PyTorch) `transformers` integration patch now allows to export to ONNX during training, and not only at the end of it.
- (PyTorch) `torch.nn.utils.weight_norm` weights are now detected correctly.
- (PyTorch) Exporting a model with sparsity or pruning no longer leads to weights in the original model object in-memory to be hard-set to 0.
- (PyTorch, TensorFlow) Various bugs and issues with compression training were fixed.
- (TensorFlow) Fixed an error with `"num_bn_adaptation_samples": 0` in config leading to a `TypeError` during quantization algo initialization.
- (ONNX) Temporary model file is no longer saved on disk.
- (ONNX) Depthwise convolutions are now quantizable in per-channel mode.

Breaking changes:
- Fused patterns will be excluded from quantization via `ignored_scopes` only if the top-most node in data flow order matches against `ignored_scopes`
- NNCF config's `"ignored_scopes"` and `"target_scopes"` are now strictly checked to be matching against at least one node in the model graph instead of silently ignoring the unmatched entries.
- Calling `setup.py` directly to install NNCF is deprecated and no longer guaranteed to work.
- Importing NNCF logger as `from nncf.common.utils.logger import logger as nncf_logger` is deprecated - use `from nncf import nncf_logger` instead.
- `pruning_rate` is renamed to `pruning_level` in pruning compression controllers.

## New in Release 2.3.0
- (ONNX) PTQ API support for ONNX.
Expand Down
2 changes: 1 addition & 1 deletion nncf/version.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
__version__ = '2.3.0'
__version__ = '2.4.0'

BKC_TORCH_VERSION = '1.13.1'
BKC_TORCHVISION_VERSION = '0.14.1'
Expand Down