| 1 | +# Step-by-step example: how to dump weight tensors for a PyTorch model with Neural Insights |
| 2 | +1. [Introduction](#introduction) |
| 3 | +2. [Preparation](#preparation) |
| 4 | +3. [Running the quantization](#running-the-quantization) |
| 5 | + |
| 6 | +# Introduction |
| 7 | +This guide shows how to dump weight data with Neural Insights, using the PyTorch GPT-J-6B model as an example. |
| 8 | + |
| 9 | +# Preparation |
| 10 | +## Source |
| 11 | +First, install Intel® Neural Compressor and Neural Insights from source. |
| 12 | +```shell |
| 13 | +# Install Neural Compressor |
| 14 | +git clone https://github.com/intel/neural-compressor.git |
| 15 | +cd neural-compressor |
| 16 | +pip install -r requirements.txt |
| 17 | +python setup.py install |
| 18 | + |
| 19 | +# Install Neural Insights |
| 20 | +pip install -r neural_insights/requirements.txt |
| 21 | +python setup.py install neural_insights |
| 22 | +``` |
| 23 | + |
| 24 | +## Requirements |
| 25 | +```shell |
| 26 | +cd examples/pytorch/nlp/huggingface_models/language-modeling/quantization/ptq_static/fx |
| 27 | +pip install -r requirements.txt |
| 28 | +``` |
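Before moving on, it can help to confirm that both packages are importable from the active environment. The following is a minimal sketch, not part of the original example. It assumes the import names are `neural_compressor` and `neural_insights`, and `check_modules` is a hypothetical helper introduced here for illustration.

```python
import importlib.util


def check_modules(names):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Assumed import names for Neural Compressor and Neural Insights.
    for name, found in check_modules(["neural_compressor", "neural_insights"]).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If either module is reported as missing, re-run the corresponding installation step in the same Python environment.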
| 29 | + |
| 30 | +# Running the quantization |
| 31 | +Before applying quantization, modify the example code to enable Neural Insights: |
| 32 | +1. Set the `diagnosis` argument to `True` in `PostTrainingQuantConfig` so that Neural Insights dumps the weights of the quantizable ops in the model. |
| 33 | + |
| 34 | +```python |
| 35 | +conf = PostTrainingQuantConfig( |
| 36 | +    accuracy_criterion=accuracy_criterion, |
| 37 | +    diagnosis=True,  # dump weights of quantizable ops for Neural Insights |
| 38 | +) |
| 39 | +``` |
| 40 | +2. Quantize the model with the following command: |
| 41 | +```shell |
| 42 | +python run_clm.py \ |
| 43 | + --model_name_or_path EleutherAI/gpt-j-6B \ |
| 44 | + --dataset_name wikitext \ |
| 45 | + --dataset_config_name wikitext-2-raw-v1 \ |
| 46 | + --do_train \ |
| 47 | + --do_eval \ |
| 48 | + --tune \ |
| 49 | + --output_dir saved_results |
| 50 | +``` |
| 51 | + |
| 52 | +The results are dumped into the `nc_workspace` directory, in a structure similar to the following: |
| 53 | +``` |
| 54 | +├── history.snapshot |
| 55 | +├── input_model.pt |
| 56 | +├── inspect_saved |
| 57 | +│   ├── fp32 |
| 58 | +│   │   └── inspect_result.pkl |
| 59 | +│   └── quan |
| 60 | +│       └── inspect_result.pkl |
| 61 | +├── model_summary.txt |
| 62 | +└── weights_table.csv |
| 63 | +``` |
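To look at the dumped artifacts outside the Neural Insights UI, the files can be opened with the Python standard library. This is a sketch under assumptions: the column layout of `weights_table.csv` and the structure of the object stored in `inspect_result.pkl` are not described in this guide, so the code only reports what it finds instead of relying on specific fields. The helper names are hypothetical, and unpickling may require the libraries used when the file was written (for example NumPy) to be installed.

```python
import csv
import pickle
from pathlib import Path


def read_weights_table(csv_path):
    """Return (column_names, rows) from a CSV file; rows are dicts keyed by column."""
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        return list(reader.fieldnames or []), rows


def load_inspect_result(pkl_path):
    """Unpickle an inspect result. Only load pickle files you created yourself."""
    with open(pkl_path, "rb") as f:
        return pickle.load(f)


def summarize_workspace(workspace):
    """Print what was found under a directory laid out like the tree above."""
    workspace = Path(workspace)
    table = workspace / "weights_table.csv"
    if table.exists():
        columns, rows = read_weights_table(table)
        print(f"{table.name}: {len(rows)} rows, columns = {columns}")
    for variant in ("fp32", "quan"):
        pkl = workspace / "inspect_saved" / variant / "inspect_result.pkl"
        if pkl.exists():
            result = load_inspect_result(pkl)
            # The structure is not assumed: report the type and, for dicts, the top-level keys.
            keys = list(result) if isinstance(result, dict) else None
            print(f"{variant}: {type(result).__name__}, top-level keys = {keys}")


if __name__ == "__main__":
    # Assumed path: pass the directory that actually contains the files shown above.
    summarize_workspace("nc_workspace")
```

Comparing the `fp32` and `quan` summaries side by side is a quick way to confirm that both dumps were written before opening them in Neural Insights.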