fix woq example and update document for v1.19.0 #2097

Merged · 6 commits · Dec 25, 2024
Changes from all commits
4 changes: 2 additions & 2 deletions .azure-pipelines/template/docker-template.yml
@@ -74,7 +74,7 @@ steps:

 - ${{ if eq(parameters.imageSource, 'pull') }}:
   - script: |
-      docker pull vault.habana.ai/gaudi-docker/1.18.0/ubuntu22.04/habanalabs/pytorch-installer-2.4.0:latest
+      docker pull vault.habana.ai/gaudi-docker/1.19.0/ubuntu24.04/habanalabs/pytorch-installer-2.5.1:latest
     displayName: "Pull habana docker image"

 - script: |
@@ -95,7 +95,7 @@ steps:
       else
         docker run -dit --disable-content-trust --privileged --name=${{ parameters.containerName }} --shm-size="2g" \
           --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host \
-          -v ${BUILD_SOURCESDIRECTORY}:/neural-compressor vault.habana.ai/gaudi-docker/1.18.0/ubuntu22.04/habanalabs/pytorch-installer-2.4.0:latest
+          -v ${BUILD_SOURCESDIRECTORY}:/neural-compressor vault.habana.ai/gaudi-docker/1.19.0/ubuntu24.04/habanalabs/pytorch-installer-2.5.1:latest
       fi
       echo "Show the container list after docker run ... "
       docker ps -a
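For anyone reproducing the CI environment, a quick sanity check of the updated tag might look like this (a sketch, assuming access to vault.habana.ai and the Habana container runtime; torch 2.5.1 is implied by the image name):

```bash
# Pull the v1.19.0 image and confirm the bundled PyTorch version (expected: 2.5.1).
docker pull vault.habana.ai/gaudi-docker/1.19.0/ubuntu24.04/habanalabs/pytorch-installer-2.5.1:latest
docker run --rm --runtime=habana \
    vault.habana.ai/gaudi-docker/1.19.0/ubuntu24.04/habanalabs/pytorch-installer-2.5.1:latest \
    python3 -c "import torch; print(torch.__version__)"
```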
2 changes: 1 addition & 1 deletion README.md
@@ -78,7 +78,7 @@ Following example code demonstrates FP8 Quantization, it is supported by Intel G
 To try on Intel Gaudi2, docker image with Gaudi Software Stack is recommended, please refer to following script for environment setup. More details can be found in [Gaudi Guide](https://docs.habana.ai/en/latest/Installation_Guide/Bare_Metal_Fresh_OS.html#launch-docker-image-that-was-built).
 ```bash
 # Run a container with an interactive shell
-docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host vault.habana.ai/gaudi-docker/1.18.0/ubuntu22.04/habanalabs/pytorch-installer-2.4.0:latest
+docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host vault.habana.ai/gaudi-docker/1.19.0/ubuntu24.04/habanalabs/pytorch-installer-2.5.1:latest
 ```
 Run the example:
 ```python
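The Python example under "Run the example:" is cut off in this diff view. For orientation, a minimal sketch of the FP8 flow the README demonstrates, assuming Neural Compressor's 3.x PyTorch API (FP8Config, prepare, convert) on a machine with the Gaudi stack; the resnet18 model and random calibration batch are placeholders:

```python
import torch
import torchvision.models as models

from neural_compressor.torch.quantization import FP8Config, convert, prepare

model = models.resnet18()

# E4M3 is the usual FP8 format on Gaudi; prepare() inserts observers for calibration.
qconfig = FP8Config(fp8_config="E4M3")
model = prepare(model, qconfig)

# Calibrate with a few representative batches (random data here, for illustration only).
with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))

# Replace observed modules with their FP8 counterparts.
model = convert(model)
```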
(file name not shown in diff view)
@@ -11,6 +11,5 @@ lm_eval==0.4.3
 peft
 numba
 tbb
-# TODO: (Yi) SW-208079 replace auto-round with the released version
-auto-round-hpu @ git+https://github.com/intel/auto-round.git@hpu_only_pkg
+auto-round @ git+https://github.com/intel/auto-round.git@v0.4.2
 optimum-habana==1.14.1
(file name not shown in diff view)
@@ -14,6 +14,7 @@ function init_params {
   batch_size=16
   tuned_checkpoint=saved_results
   task=lambada_openai
+  incbench_cmd="incbench --num_cores_per_instance 4"
   echo ${max_eval_samples}
   for var in "$@"
   do
@@ -104,6 +105,7 @@ function run_benchmark {
     elif [ "${topology}" = "opt_125m_woq_autoround_int4_hpu" ]; then
         model_name_or_path="facebook/opt-125m"
         extra_cmd=$extra_cmd" --woq_algo AutoRound"
+        incbench_cmd="incbench --num_instances 1"
     elif [ "${topology}" = "opt_125m_woq_autotune_int4" ]; then
         model_name_or_path="facebook/opt-125m"
     fi
@@ -116,7 +118,7 @@
             --batch_size ${batch_size} \
             ${extra_cmd} ${mode_cmd}
     elif [[ ${mode} == "performance" ]]; then
-        incbench --num_cores_per_instance 4 run_clm_no_trainer.py \
+        ${incbench_cmd} run_clm_no_trainer.py \
             --model ${model_name_or_path} \
             --batch_size ${batch_size} \
             --output_dir ${tuned_checkpoint} \
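With this change, performance runs go through ${incbench_cmd}: four cores per instance by default, a single instance for the HPU AutoRound topology (presumably because per-core multi-instance benchmarking does not map onto HPU execution). A hypothetical invocation, with flag syntax assumed from the script's init_params loop:

```bash
# Default topologies benchmark with: incbench --num_cores_per_instance 4
bash run_benchmark.sh --topology=opt_125m_woq_autotune_int4 --mode=performance

# The HPU AutoRound topology now benchmarks with: incbench --num_instances 1
bash run_benchmark.sh --topology=opt_125m_woq_autoround_int4_hpu --mode=performance
```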
(file name not shown in diff view)
@@ -270,8 +270,9 @@ def get_user_model():
         torchscript = True
     if args.woq_algo == "AutoRound" and is_habana_framework_installed():
         print("Quantizing model with AutoRound on HPU")
-        check_torch_compile_with_hpu_backend()
-        set_envs_for_torch_compile_with_hpu_backend()
+        if args.quantize:
+            check_torch_compile_with_hpu_backend()
+            set_envs_for_torch_compile_with_hpu_backend()
     user_model = AutoModelForCausalLM.from_pretrained(
         args.model,
         trust_remote_code=args.trust_remote_code,
@@ -570,7 +571,7 @@ def run_fn_for_gptq(model, dataloader_for_calibration, *args):


 if is_hpex_available():
-    from habana_frameworks.torch.hpu import wrap_in_hpu_graph
+    from habana_frameworks.torch.hpu.graphs import wrap_in_hpu_graph
     user_model = user_model.to(torch.bfloat16)
     wrap_in_hpu_graph(user_model, max_graphs=10)
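Two separate fixes land in this file: the torch.compile environment checks now run only when args.quantize is set, so benchmark-only invocations skip them, and wrap_in_hpu_graph is imported from habana_frameworks.torch.hpu.graphs. For code that has to run on both the old and new software stacks, a guarded import is one option (a sketch; it assumes the root-level re-export still works on pre-1.19 releases, as the old code relied on):

```python
try:
    # Canonical location, used by this PR (works on 1.19.0).
    from habana_frameworks.torch.hpu.graphs import wrap_in_hpu_graph
except ImportError:
    # Older releases exposed it from the package root.
    from habana_frameworks.torch.hpu import wrap_in_hpu_graph
```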
2 changes: 2 additions & 0 deletions neural_compressor/evaluation/lm_eval/models/huggingface.py
@@ -969,6 +969,8 @@ def _model_call(self, inps, attn_mask=None, labels=None):
             output = output.logits
         if self.pad_to_buckets and padding_length != 0:  # use buckets to pad inputs
             output = output[:, :-padding_length, :]
+        if "hpu" in output.device.type:  # make sure return fp32 tensor for HPU, TODO: root cause
+            output = output.to(torch.float32)
         return output

     def _model_generate(self, context, max_length, stop, **generation_kwargs):
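The new branch casts HPU outputs to fp32 before handing them back to the evaluation harness. The in-code comment leaves the root cause as a TODO; a plausible motivation is that bf16 logits perturb downstream log-likelihood math. A synthetic illustration of that drift (not harness code, just the numeric effect):

```python
import torch

# Random bf16 "logits": compare log-softmax computed in bf16 vs. fp32.
logits_bf16 = torch.randn(1, 8).to(torch.bfloat16)
ll_bf16 = torch.log_softmax(logits_bf16, dim=-1).to(torch.float32)
ll_fp32 = torch.log_softmax(logits_bf16.to(torch.float32), dim=-1)
print((ll_bf16 - ll_fp32).abs().max())  # typically nonzero: bf16 softmax loses precision
```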
2 changes: 1 addition & 1 deletion test/3x/torch/requirements.txt
@@ -1,5 +1,5 @@
 auto_round
-deepspeed @ git+https://github.com/HabanaAI/DeepSpeed.git@1.18.0
+deepspeed @ git+https://github.com/HabanaAI/DeepSpeed.git@1.19.0
 expecttest
 intel_extension_for_pytorch
 numpy