
Commit 43af08b

Add example of PyTorch model tensor dump with Neural Insights (#1305)
* Add example of PyTorch model tensor dump with Neural Insights
  Signed-off-by: bmyrcha <bartosz.myrcha@intel.com>

* Move instruction to correct directory
  Signed-off-by: bmyrcha <bartosz.myrcha@intel.com>

---------

Signed-off-by: bmyrcha <bartosz.myrcha@intel.com>
1 parent 87e5bbb commit 43af08b

2 files changed: +66 -0 lines changed


neural_insights/README.md

Lines changed: 3 additions & 0 deletions
@@ -114,6 +114,9 @@ When the quantization is started, the workload should appear on the Neural Insights
> Note that the above example uses dummy data, meant only to illustrate the usage of Neural Insights. For diagnosis purposes you should use a real dataset specific to your use case.

## Tensor dump examples
- [Step-by-step example of how to dump weight data for a PyTorch model with Neural Insights](docs/source/pytorch_nlp_cli_mode.md)

## Step by Step Diagnosis Example
Refer to [Step by Step Diagnosis Example with TensorFlow](https://github.com/intel/neural-compressor/tree/master/neural_insights/docs/source/tf_accuracy_debug.md) and [Step by Step Diagnosis Example with ONNXRT](https://github.com/intel/neural-compressor/tree/master/neural_insights/docs/source/onnx_accuracy_debug.md) to get started with some basic quantization accuracy diagnostic skills.

neural_insights/docs/source/pytorch_nlp_cli_mode.md

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
# Step-by-step example of how to dump weight data for a PyTorch model with Neural Insights
1. [Introduction](#introduction)
2. [Preparation](#preparation)
3. [Running the quantization](#running-the-quantization)

# Introduction
In this guide, weight data will be dumped using Neural Insights, with the PyTorch GPT-J-6B model used as an example.

# Preparation
## Source
First, you need to install Intel® Neural Compressor.
```shell
# Install Neural Compressor
git clone https://github.com/intel/neural-compressor.git
cd neural-compressor
pip install -r requirements.txt
python setup.py install

# Install Neural Insights
pip install -r neural_insights/requirements.txt
python setup.py install neural_insights
```
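As an optional sanity check (not part of the original instructions), you can verify that both packages were installed correctly before moving on:
```shell
# Optional sanity check: both packages should import without errors
python -c "import neural_compressor; import neural_insights; print('installation OK')"
```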
## Requirements
```shell
# Run from the root of the cloned neural-compressor repository
cd examples/pytorch/nlp/huggingface_models/language-modeling/quantization/ptq_static/fx
pip install -r requirements.txt
```

# Running the quantization
Before applying quantization, modify some code in the `run_clm.py` file to enable Neural Insights:
1. Set the argument `diagnosis` to `True` in `PostTrainingQuantConfig` so that Neural Insights will dump the weights of quantizable ops in this model (a sketch of how such a config is typically consumed follows these steps):
```python
conf = PostTrainingQuantConfig(
    accuracy_criterion=accuracy_criterion,  # accuracy_criterion is defined earlier in run_clm.py
    diagnosis=True,  # enables the Neural Insights tensor dump
)
```

2. Quantize the model with the following command:
```shell
python run_clm.py \
    --model_name_or_path EleutherAI/gpt-j-6B \
    --dataset_name wikitext \
    --dataset_config_name wikitext-2-raw-v1 \
    --do_train \
    --do_eval \
    --tune \
    --output_dir saved_results
```
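For orientation, here is a minimal, hedged sketch of how a `diagnosis`-enabled config is passed to Intel Neural Compressor's post-training quantization API. The toy model and calibration dataloader are placeholders for illustration only; `run_clm.py` wires up GPT-J, the wikitext dataset, and an evaluation function instead.
```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from neural_compressor import PostTrainingQuantConfig, quantization

# Toy stand-in model and calibration data (placeholders, not GPT-J).
model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU())
calib_set = TensorDataset(torch.randn(32, 8), torch.zeros(32, dtype=torch.long))
calib_loader = DataLoader(calib_set, batch_size=8)

conf = PostTrainingQuantConfig(diagnosis=True)  # diagnosis=True triggers the dump
q_model = quantization.fit(model, conf, calib_dataloader=calib_loader)
q_model.save("saved_results")  # saves the quantized model artifacts
```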
Results will be dumped into the `nc_workspace` directory with a structure similar to the following:
```
├── history.snapshot
├── input_model.pt
├── inspect_saved
│   ├── fp32
│   │   └── inspect_result.pkl
│   └── quan
│       └── inspect_result.pkl
├── model_summary.txt
└── weights_table.csv
```
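To explore the dumped data programmatically, a minimal sketch follows. It assumes the `inspect_result.pkl` files are ordinary Python pickles whose top-level structure may vary between Neural Insights versions, so adjust the path and key handling to your actual `nc_workspace` layout.
```python
import pickle

# Hedged sketch: the pickle layout may differ between versions, and
# nc_workspace may nest results under a timestamped subdirectory.
with open("nc_workspace/inspect_saved/fp32/inspect_result.pkl", "rb") as f:
    fp32_dump = pickle.load(f)

print(type(fp32_dump))  # inspect the top-level object first
if isinstance(fp32_dump, dict):
    print(list(fp32_dump)[:10])  # first few dumped entries
```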
