Commit ff37401

3.X API installation update (#1935)
Signed-off-by: chensuyue <suyue.chen@intel.com>
1 parent 6c27c19 commit ff37401

38 files changed (+43 / -6926 lines)

.azure-pipelines/scripts/install_nc.sh

Lines changed: 0 additions & 4 deletions
```diff
@@ -10,10 +10,6 @@ elif [[ $1 = *"3x_tf"* ]]; then
     python -m pip install --no-cache-dir -r requirements_tf.txt
     python setup.py tf bdist_wheel
     pip install dist/neural_compressor*.whl --force-reinstall
-elif [[ $1 = *"3x_ort" ]]; then
-    python -m pip install --no-cache-dir -r requirements_ort.txt
-    python setup.py ort bdist_wheel
-    pip install dist/neural_compressor*.whl --force-reinstall
 else
     python -m pip install --no-cache-dir -r requirements.txt
     python setup.py bdist_wheel
```
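
For reference, the surviving branches of this CI helper can still be exercised locally. A minimal sketch, assuming the script is run from the repository root; the `3x_tf` argument comes from the hunk above, and any other value falls through to the `else` branch:

```Shell
# Build and force-reinstall the TensorFlow 3.x wheel (the surviving "3x_tf" branch)
bash .azure-pipelines/scripts/install_nc.sh 3x_tf

# Any other argument now takes the default branch (plain wheel build)
bash .azure-pipelines/scripts/install_nc.sh basic
```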

.azure-pipelines/scripts/ut/3x/coverage.3x_ort

Lines changed: 0 additions & 15 deletions
This file was deleted.

.azure-pipelines/scripts/ut/3x/run_3x_ort.sh

Lines changed: 0 additions & 35 deletions
This file was deleted.

.azure-pipelines/ut-3x-ort.yml

Lines changed: 0 additions & 109 deletions
This file was deleted.

.github/checkgroup.yml

Lines changed: 0 additions & 13 deletions
```diff
@@ -140,16 +140,3 @@ subprojects:
       - "UT-3x-Torch (Coverage Compare CollectDatafiles)"
       - "UT-3x-Torch (Unit Test 3x Torch Unit Test 3x Torch)"
       - "UT-3x-Torch (Unit Test 3x Torch baseline Unit Test 3x Torch baseline)"
-
-  - id: "Unit Tests 3x-ONNXRT workflow"
-    paths:
-      - "neural_compressor/common/**"
-      - "neural_compressor/onnxrt/**"
-      - "test/3x/onnxrt/**"
-      - "setup.py"
-      - "requirements_ort.txt"
-    checks:
-      - "UT-3x-ONNXRT"
-      - "UT-3x-ONNXRT (Coverage Compare CollectDatafiles)"
-      - "UT-3x-ONNXRT (Unit Test 3x ONNXRT Unit Test 3x ONNXRT)"
-      - "UT-3x-ONNXRT (Unit Test 3x ONNXRT baseline Unit Test 3x ONNXRT baseline)"
```

README.md

Lines changed: 7 additions & 3 deletions
````diff
@@ -19,21 +19,25 @@ Intel® Neural Compressor aims to provide popular model compression techniques s
 as well as Intel extensions such as [Intel Extension for TensorFlow](https://github.com/intel/intel-extension-for-tensorflow) and [Intel Extension for PyTorch](https://github.com/intel/intel-extension-for-pytorch).
 In particular, the tool provides the key features, typical examples, and open collaborations as below:
 
-* Support a wide range of Intel hardware such as [Intel Xeon Scalable Processors](https://www.intel.com/content/www/us/en/products/details/processors/xeon/scalable.html), [Intel Xeon CPU Max Series](https://www.intel.com/content/www/us/en/products/details/processors/xeon/max-series.html), [Intel Data Center GPU Flex Series](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/data-center-gpu/flex-series.html), and [Intel Data Center GPU Max Series](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/data-center-gpu/max-series.html) with extensive testing; support AMD CPU, ARM CPU, and NVidia GPU through ONNX Runtime with limited testing
+* Support a wide range of Intel hardware such as [Intel Gaudi AI Accelerators](https://www.intel.com/content/www/us/en/products/details/processors/ai-accelerators/gaudi-overview.html), [Intel Core Ultra Processors](https://www.intel.com/content/www/us/en/products/details/processors/core-ultra.html), [Intel Xeon Scalable Processors](https://www.intel.com/content/www/us/en/products/details/processors/xeon/scalable.html), [Intel Xeon CPU Max Series](https://www.intel.com/content/www/us/en/products/details/processors/xeon/max-series.html), [Intel Data Center GPU Flex Series](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/data-center-gpu/flex-series.html), and [Intel Data Center GPU Max Series](https://www.intel.com/content/www/us/en/products/details/discrete-gpus/data-center-gpu/max-series.html) with extensive testing;
+support AMD CPU, ARM CPU, and NVidia GPU through ONNX Runtime with limited testing; support NVidia GPU for some WOQ algorithms like AutoRound and HQQ.
 
 * Validate popular LLMs such as [LLama2](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), [Falcon](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), [GPT-J](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), [Bloom](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), [OPT](/examples/pytorch/nlp/huggingface_models/language-modeling/quantization/llm), and more than 10,000 broad models such as [Stable Diffusion](/examples/pytorch/nlp/huggingface_models/text-to-image/quantization), [BERT-Large](/examples/pytorch/nlp/huggingface_models/text-classification/quantization/ptq_static/fx), and [ResNet50](/examples/pytorch/image_recognition/torchvision_models/quantization/ptq/cpu/fx) from popular model hubs such as [Hugging Face](https://huggingface.co/), [Torch Vision](https://pytorch.org/vision/stable/index.html), and [ONNX Model Zoo](https://github.com/onnx/models#models), with automatic [accuracy-driven](/docs/source/design.md#workflow) quantization strategies
 
 * Collaborate with cloud marketplaces such as [Google Cloud Platform](https://console.cloud.google.com/marketplace/product/bitnami-launchpad/inc-tensorflow-intel?project=verdant-sensor-286207), [Amazon Web Services](https://aws.amazon.com/marketplace/pp/prodview-yjyh2xmggbmga#pdp-support), and [Azure](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/bitnami.inc-tensorflow-intel), software platforms such as [Alibaba Cloud](https://www.intel.com/content/www/us/en/developer/articles/technical/quantize-ai-by-oneapi-analytics-on-alibaba-cloud.html), [Tencent TACO](https://new.qq.com/rain/a/20221202A00B9S00) and [Microsoft Olive](https://github.com/microsoft/Olive), and open AI ecosystem such as [Hugging Face](https://huggingface.co/blog/intel), [PyTorch](https://pytorch.org/tutorials/recipes/intel_neural_compressor_for_pytorch.html), [ONNX](https://github.com/onnx/models#models), [ONNX Runtime](https://github.com/microsoft/onnxruntime), and [Lightning AI](https://github.com/Lightning-AI/lightning/blob/master/docs/source-pytorch/advanced/post_training_quantization.rst)
 
 ## What's New
+* [2024/07] From the 3.0 release, the framework extension API is recommended for quantization.
 * [2024/07] Performance optimizations and usability improvements on [client-side](https://github.com/intel/neural-compressor/blob/master/docs/3x/client_quant.md).
-* [2024/03] A new SOTA approach [AutoRound](https://github.com/intel/auto-round) Weight-Only Quantization on [Intel Gaudi2 AI accelerator](https://habana.ai/products/gaudi2/) is available for LLMs.
 
 ## Installation
 
 ### Install from pypi
 ```Shell
-pip install neural-compressor
+# Install 2.X API + Framework extension API + PyTorch dependency
+pip install neural-compressor[pt]
+# Install 2.X API + Framework extension API + TensorFlow dependency
+pip install neural-compressor[tf]
 ```
 > **Note**:
 > Further installation methods can be found under [Installation Guide](https://github.com/intel/neural-compressor/blob/master/docs/source/installation_guide.md). Check out our [FAQ](https://github.com/intel/neural-compressor/blob/master/docs/source/faq.md) for more details.
````
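
Since the updated README steers users toward the framework extension API, a minimal sketch of that flow may help. It assumes the 3.x PyTorch extension entry points (`RTNConfig`, `prepare`, `convert` from `neural_compressor.torch.quantization`), which come from the project's 3.x documentation rather than from this commit:

```python
# Minimal sketch of the 3.x PyTorch framework extension API flow.
# Assumed entry points; verify against the 3.x docs for your installed version.
import torch
from neural_compressor.torch.quantization import RTNConfig, prepare, convert

# A toy model standing in for a real network
model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU())

quant_config = RTNConfig()             # weight-only RTN quantization, default settings
model = prepare(model, quant_config)   # attach the quantization configuration
model = convert(model)                 # produce the quantized model
```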

docs/source/installation_guide.md

Lines changed: 31 additions & 26 deletions
````diff
@@ -29,28 +29,28 @@ The following prerequisites and requirements must be satisfied for a successful
 
 ### Install from Binary
 - Install from Pypi
-```Shell
-# install stable basic version from pypi
-pip install neural-compressor
-```
-```Shell
-# [Experimental] install stable basic + PyTorch framework extension API from pypi
-pip install neural-compressor[pt]
-```
-```Shell
-# [Experimental] install stable basic + TensorFlow framework extension API from pypi
-pip install neural-compressor[tf]
-```
-
-- Install from test Pypi
-```Shell
-# install nightly version
-git clone https://github.com/intel/neural-compressor.git
-cd neural-compressor
-pip install -r requirements.txt
-# install nightly basic version from pypi
-pip install -i https://test.pypi.org/simple/ neural-compressor
-```
+```Shell
+# Install 2.X API + Framework extension API + PyTorch dependency
+pip install neural-compressor[pt]
+```
+```Shell
+# Install 2.X API + Framework extension API + TensorFlow dependency
+pip install neural-compressor[tf]
+```
+```Shell
+# Install 2.X API + Framework extension API
+# With this command, the framework extension API dependencies are not installed;
+# install them separately with `pip install -r requirements_pt.txt` or `pip install -r requirements_tf.txt`.
+pip install neural-compressor
+```
+```Shell
+# Framework extension API + PyTorch dependency
+pip install neural-compressor-pt
+```
+```Shell
+# Framework extension API + TensorFlow dependency
+pip install neural-compressor-tf
+```
 
 ### Install from Source
 
@@ -76,15 +76,20 @@ The AI Kit is distributed through many common channels, including from Intel's w
 ## System Requirements
 
 ### Validated Hardware Environment
+
+#### Intel® Neural Compressor supports HPUs based on heterogeneous architecture with two compute engines (MME and TPC):
+* Intel Gaudi AI Accelerators (Gaudi2)
+
 #### Intel® Neural Compressor supports CPUs based on [Intel 64 architecture or compatible processors](https://en.wikipedia.org/wiki/X86-64):
 
-* Intel Xeon Scalable processor (formerly Skylake, Cascade Lake, Cooper Lake, Ice Lake, and Sapphire Rapids)
-* Intel Xeon CPU Max Series (formerly Sapphire Rapids HBM)
+* Intel Xeon Scalable processor (Skylake, Cascade Lake, Cooper Lake, Ice Lake, and Sapphire Rapids)
+* Intel Xeon CPU Max Series (Sapphire Rapids HBM)
+* Intel Core Ultra Processors (Meteor Lake)
 
 #### Intel® Neural Compressor supports GPUs built on Intel's Xe architecture:
 
-* Intel Data Center GPU Flex Series (formerly Arctic Sound-M)
-* Intel Data Center GPU Max Series (formerly Ponte Vecchio)
+* Intel Data Center GPU Flex Series (Arctic Sound-M)
+* Intel Data Center GPU Max Series (Ponte Vecchio)
 
 #### Intel® Neural Compressor quantized ONNX models support multiple hardware vendors through ONNX Runtime:
 
````
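
After picking one of the install paths above, a quick hypothetical sanity check may be useful (not part of this commit; the module paths assume the 3.x layout with `neural_compressor.torch` and `neural_compressor.tensorflow`):

```Shell
# Base package import and version
python -c "import neural_compressor; print(neural_compressor.__version__)"
# Framework extension API, present with the [pt] extra or neural-compressor-pt
python -c "import neural_compressor.torch"
# Framework extension API, present with the [tf] extra or neural-compressor-tf
python -c "import neural_compressor.tensorflow"
```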

neural_compressor/onnxrt/__init__.py

Lines changed: 0 additions & 56 deletions
This file was deleted.

neural_compressor/onnxrt/algorithms/__init__.py

Lines changed: 0 additions & 22 deletions
This file was deleted.

neural_compressor/onnxrt/algorithms/layer_wise/__init__.py

Lines changed: 0 additions & 17 deletions
This file was deleted.
