Add vlm examples, bugfix #2012


Merged: 39 commits (Oct 18, 2024)

Commits
3eaaa57  add VLM examples (WeiweiZhang1, Sep 26, 2024)
8cc273b  bugfix, add utils (WeiweiZhang1, Sep 26, 2024)
8597597  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Sep 26, 2024)
53b4b32  fix docstring issues (WeiweiZhang1, Sep 26, 2024)
f915e49  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Sep 26, 2024)
c5127a3  bugfix (WeiweiZhang1, Sep 26, 2024)
e3a28e6  resolve bugs (WeiweiZhang1, Sep 26, 2024)
6b4c2ff  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Sep 26, 2024)
ecd5410  refine examples (WeiweiZhang1, Sep 26, 2024)
9212575  fix scan issue (WeiweiZhang1, Sep 26, 2024)
533afd0  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Sep 26, 2024)
21e1dbb  refine shell (WeiweiZhang1, Sep 26, 2024)
64d1b3e  refine scripts & requirements (WeiweiZhang1, Sep 29, 2024)
dc368b2  typofix (WeiweiZhang1, Sep 29, 2024)
22082df  refine docs (WeiweiZhang1, Sep 30, 2024)
4045b36  Merge branch 'master' into add_vlm_examples (WeiweiZhang1, Oct 8, 2024)
a3b381d  set attn_implementation for Phi3-vision (WeiweiZhang1, Oct 8, 2024)
b827c11  refine phi3 example (WeiweiZhang1, Oct 8, 2024)
995c914  Merge branch 'add_vlm_examples' of https://github.com/intel/neural-co… (WeiweiZhang1, Oct 8, 2024)
8767ffc  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 8, 2024)
afbfcfa  fix code coverage (WeiweiZhang1, Oct 8, 2024)
5aa584c  fix code coverage (WeiweiZhang1, Oct 8, 2024)
6b8cc73  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 8, 2024)
9555321  update config (XuehaoSun, Oct 9, 2024)
5dcb9bd  refine shells, docs and example. enable qwen2-vl quantization (WeiweiZhang1, Oct 15, 2024)
ce514db  Merge branch 'master' into add_vlm_examples (WeiweiZhang1, Oct 15, 2024)
335f29e  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 15, 2024)
b3bea7f  fix ci (WeiweiZhang1, Oct 15, 2024)
33d49e1  fix EOF error (XuehaoSun, Oct 17, 2024)
414811a  update qwen dir (XuehaoSun, Oct 17, 2024)
3630267  refine shell, add llama3.2 inference to doc (WeiweiZhang1, Oct 17, 2024)
e75719e  refine shell, add llama3.2 inference to doc (WeiweiZhang1, Oct 17, 2024)
4190934  bugfix (WeiweiZhang1, Oct 17, 2024)
dd9a4be  [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot], Oct 17, 2024)
d4cd3bd  bugfix (WeiweiZhang1, Oct 18, 2024)
d9193cf  bugfix (WeiweiZhang1, Oct 18, 2024)
d4b9f52  refine eval shell (WeiweiZhang1, Oct 18, 2024)
dade1f6  fix eval device issue (WeiweiZhang1, Oct 18, 2024)
749812c  refine eval dtype (WeiweiZhang1, Oct 18, 2024)
418 changes: 221 additions & 197 deletions examples/.config/model_params_pytorch_3x.json

@@ -0,0 +1,82 @@

Step-by-Step
============
This document provides step-by-step instructions for running [VLM quantization for LLaVA](https://huggingface.co/liuhaotian/llava-v1.5-7b) using AutoRound Quantization.

# Run Quantization on Multimodal Models

In this example, we introduce a straightforward way to quantize popular multimodal models such as LLaVA.

Please note that the quantized LLaVA model currently only supports inference with the **auto_round** format.

## Install
If you are not using Linux, do NOT proceed; see the instructions for [macOS](https://github.com/haotian-liu/LLaVA/blob/main/docs/macOS.md) and [Windows](https://github.com/haotian-liu/LLaVA/blob/main/docs/Windows.md).

1. Clone this repository and navigate to the LLaVA folder
```shell
git clone https://github.com/haotian-liu/LLaVA.git
cd LLaVA
```

2. Install the package
```shell
pip install --upgrade pip # enable PEP 660 support
pip install -e .
```

## Download the Calibration/Evaluation Data

Our calibration process resembles the official visual instruction tuning process, in order to align with the official implementation of [LLaVA](https://github.com/haotian-liu/LLaVA/tree/main?tab=readme-ov-file#visual-instruction-tuning).

Please download the annotation of the final instruction tuning data mixture, [llava_v1_5_mix665k.json](https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K/blob/main/llava_v1_5_mix665k.json), and the images from the constituent datasets:

COCO: [train2017](http://images.cocodataset.org/zips/train2017.zip), and unzip the image folder to any directory you desire.

Please refer to [llava_eval_datasets](https://github.com/haotian-liu/LLaVA/blob/main/docs/Evaluation.md#scripts) to download the TextVQA dataset for evaluation.
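
For convenience, the downloads above can be scripted roughly as follows (a minimal sketch; the `data/` directory is a placeholder, adjust paths to your setup):

```bash
# Annotation of the calibration mixture (the resolve/ path serves the raw file on Hugging Face)
mkdir -p data
wget -P data https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K/resolve/main/llava_v1_5_mix665k.json

# COCO train2017 images; unzip to any directory you prefer
wget http://images.cocodataset.org/zips/train2017.zip
unzip train2017.zip -d data/coco
```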

<br />

## Run Examples
Enter the examples folder and install the requirements:

```bash
pip install -r requirements.txt
```

- **Default Settings:**
```bash
CUDA_VISIBLE_DEVICES=0 python3 main.py --model_name liuhaotian/llava-v1.5-7b --bits 4 --group_size 128 --quantize
```
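
Other bit widths and group sizes can be explored through the same flags (a sketch reusing only the options shown above; the full option list lives in this example's main.py):

```bash
# Hypothetical variation: 4-bit weights with a finer group size of 32
CUDA_VISIBLE_DEVICES=0 python3 main.py --model_name liuhaotian/llava-v1.5-7b --bits 4 --group_size 32 --quantize
```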

## Results
We use the [COCO 2017](https://cocodataset.org/) and [LLaVA-Instruct-150K](https://huggingface.co/datasets/liuhaotian/LLaVA-Instruct-150K) datasets for quantization calibration, and the TextVQA dataset for evaluation. When the vision components are not involved in quantization, the accuracy loss stays within 1%. The results for the fake-quantized LLaVA-7b are as follows:
| Model | Config | Precision | Hyperparameter | Accuracy% | Relative drop |
| :----: | :----: | :----: | :----: | :----: | :----: |
| liuhaotian/llava-v1.5-7b | - | FP16 | - | 58.21 | - |
| liuhaotian/llava-v1.5-7b | W4G128 | FP16 | with vision | 56.39 | -3.13% |
| liuhaotian/llava-v1.5-7b | W4G128 | FP16 | w/o vision | 58.08 | -0.22% |
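
For reference, the relative drop is computed against the FP16 baseline: for the with-vision W4G128 run, (56.39 - 58.21) / 58.21 ≈ -3.13%.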


## Known Issues
* Hugging Face format models are not supported yet, e.g. llava-1.5-7b-hf.
* Setting seqlen to 2048 does not work yet.


## Environment

PyTorch 1.8 or higher is required.
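
A quick way to check the installed version (a one-line sketch):

```bash
python3 -c "import torch; print(torch.__version__)"
```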


## Reference
If you find SignRound useful for your research, please cite our paper:
```bibtex
@article{cheng2023optimize,
title={Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs},
author={Cheng, Wenhua and Zhang, Weiwei and Shen, Haihao and Cai, Yiyang and He, Xin and Lv, Kaokao},
journal={arXiv preprint arXiv:2309.05516},
year={2023}
}
```


