Describe the bug
Hello, I'm trying to use ipex on an N305, but it throws an error when I use it. My environment is fairly complex, so I'm sharing my case for analysis in the hope that it helps make ipex more robust, and I would appreciate any advice or a solution.
Environment
I run PyTorch in a Docker container created from the python:3.9.18 image on Ubuntu 20.04, and this Ubuntu is installed in a PVE VM.
PVE VM > Ubuntu 20.04 > Docker container from the python image [Python 3.9, PyTorch 2.10, ipex 2.1.0]
I also configured GPU passthrough to the VM:
# command on ubuntu
lspci | grep VGA
00:02.0 VGA compatible controller: Device 1234:1111 (rev 02)
00:10.0 VGA compatible controller: Intel Corporation Device 46d0
Problem
I installed ipex following the official Getting Started guide:
# on container
pip install intel_extension_for_pytorch
and modified the function that loads the model:
#!/usr/bin/env python
# coding=utf-8
## From: https://github.com/THUDM/ChatGLM-6B
import torch
import os
##### import ipex
import intel_extension_for_pytorch as ipex
##### import ipex
from typing import Dict, Union, Optional
from torch.nn import Module
from transformers import AutoModel, AutoTokenizer
from .chat import do_chat, do_chat_stream


def init_chatglm(model_path: str, running_device: str, gpus: int):
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    if running_device.upper() == "GPU":
        model = load_model_on_gpus(model_path, gpus)
    else:
        model = AutoModel.from_pretrained(model_path, trust_remote_code=True)
        model = model.float()
    model.eval()
    ##### follow as manual
    model = ipex.optimize(model)
    ##### follow as manual
    model.do_chat = do_chat
    model.do_chat_stream = do_chat_stream
    return tokenizer, model


def auto_configure_device_map(num_gpus: int) -> Dict[str, int]:
    num_trans_layers = 28
    per_gpu_layers = 30 / num_gpus
    device_map = {'transformer.word_embeddings': 0,
                  'transformer.final_layernorm': 0, 'lm_head': 0}
    used = 2
    gpu_target = 0
    for i in range(num_trans_layers):
        if used >= per_gpu_layers:
            gpu_target += 1
            used = 0
        assert gpu_target < num_gpus
        device_map[f'transformer.layers.{i}'] = gpu_target
        used += 1
    return device_map


def load_model_on_gpus(checkpoint_path: Union[str, os.PathLike], num_gpus: int = 2,
                       device_map: Optional[Dict[str, int]] = None, **kwargs) -> Module:
    if num_gpus < 2 and device_map is None:
        model = AutoModel.from_pretrained(
            checkpoint_path, trust_remote_code=True, **kwargs).half().cuda()
    else:
        if num_gpus > torch.cuda.device_count():
            raise Exception(f"need {num_gpus} GPU, but only has {torch.cuda.device_count()}")
        from accelerate import dispatch_model
        model = AutoModel.from_pretrained(
            checkpoint_path, trust_remote_code=True, **kwargs).half()
        if device_map is None:
            device_map = auto_configure_device_map(num_gpus)
        model = dispatch_model(model, device_map=device_map)
        print(f"Device Map: {model.hf_device_map}\n")
    return model
The full code is based on https://github.com/ninehills/chatglm-openai-api
The code above is in chatglm/chatglm.py
When I run the command
python main.py --device=cpu    # or --device=xpu
it crashes with:
root@faed2ef52605:/app# python main.py --device=xpu
> Load config and arguments...
Config file: config.toml
Language Model: chatglm-6b-int4
Embeddings Model:
Device: xpu
GPUs: 1
Port: 8080
Tunneling:
Config:
{'models': {'llm': {'chatglm-6b': {'type': 'chatglm', 'path': 'THUDM/chatglm-6b'}, 'chatglm-6b-int8': {'type': 'chatglm', 'path': 'THUDM/chatglm-6b-int8'}, 'chatglm-6b-int4': {'type': 'chatglm', 'path': '/app/model/chatglm2-6b-int4'}, 'chatglm2-6b': {'type': 'chatglm', 'path': 'THUDM/chatglm2-6b'}, 'chatglm2-6b-int8': {'type': 'chatglm', 'path': 'THUDM/chatglm2-6b-int8'}, 'chatglm2-6b-int4': {'type': 'chatglm', 'path': 'THUDM/chatglm2-6b-int4'}, 'phoenix-inst-chat-7b': {'type': 'phoenix', 'path': 'FreedomIntelligence/phoenix-inst-chat-7b'}, 'phoenix-inst-chat-7b-int4': {'type': 'phoenix', 'path': 'FreedomIntelligence/phoenix-inst-chat-7b-int4'}}, 'embeddings': {'text2vec-large-chinese': {'type': 'default', 'path': 'GanymedeNil/text2vec-large-chinese'}}}, 'auth': {'tokens': ['token1']}}
> Start LLM model chatglm-6b-int4
>> Use chatglm llm model /app/model/chatglm2-6b-int4
Illegal instruction (core dumped)
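Because the process dies with "Illegal instruction" before Python can print a traceback, the standard-library faulthandler module may help show which import or call triggers the crash. The following is only a minimal sketch, assuming the crash happens while a package is being imported; the helper name try_import and the list of modules are illustrative and not part of the original code.

```python
import faulthandler
import importlib
import sys


def try_import(name: str) -> bool:
    """Import a module by name and report whether it succeeded."""
    print(f"importing {name} ...", flush=True)
    try:
        importlib.import_module(name)
    except Exception as exc:  # report any Python-level failure and continue
        print(f"  not importable: {exc}", flush=True)
        return False
    print("  ok", flush=True)
    return True


def main() -> None:
    # Dump the Python traceback to stderr on fatal signals such as SIGILL.
    faulthandler.enable(file=sys.stderr)
    # Assumed order of suspects; adjust to match the real application.
    for name in ("torch", "intel_extension_for_pytorch", "transformers"):
        try_import(name)


if __name__ == "__main__":
    main()
```

The last "importing ..." line printed before the crash, together with the traceback from faulthandler, should point at the failing step. The same handler can also be enabled without code changes via `python -X faulthandler main.py --device=cpu`.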
Versions
When I ran collect_env.py, it crashed as well:
root@faed2ef52605:/app# python env.py
Illegal instruction (core dumped)
Maybe my environment is not compatible with ipex; the environment collection script also crashed.
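Since collect_env.py crashes too, a check that uses only the standard library may help narrow things down. "Illegal instruction" generally means the CPU was asked to execute an instruction it does not support, and inside a VM the virtual CPU model chosen in the hypervisor can hide extensions such as AVX2 from the guest. The sketch below reads the CPU flags the guest actually sees from /proc/cpuinfo. The list of flags to look for is an assumption for illustration, not an official ipex requirement list, and the function names are hypothetical.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the flags of the first CPU listed in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text: str,
                  wanted=("avx", "avx2", "fma", "f16c")) -> list[str]:
    """List the wanted flags (an assumed set) that the CPU does not report."""
    flags = parse_cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in flags]


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        missing = missing_flags(cpuinfo.read_text())
        print("missing flags:", missing if missing else "none")
    else:
        print("/proc/cpuinfo not found; this check is Linux-only")
```

If flags are reported missing inside the container but the host CPU supports them, the VM's CPU type setting in PVE would be one place to look.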