
Illegal instruction (core dumped) on Intel N305 #450

Open
@joebnb

Description


Describe the bug

Hello, I'm trying to use IPEX on an Intel N305, but it throws an error at runtime. My environment is complex, so I'll share my setup for analysis, in the hope that it helps make IPEX more robust, and that someone can offer advice or a solution for my case.

Environment

I run PyTorch in a Docker container based on the python:3.9.18 image, on Ubuntu 20.04, and that Ubuntu is itself installed in a Proxmox VE (PVE) VM.

PVE VM > Ubuntu 20.04 > Docker from the python:3.9 image (PyTorch 2.1.0, IPEX 2.1.0)

and I have configured GPU passthrough to the VM:

# command on ubuntu
lspci | grep VGA
00:02.0 VGA compatible controller: Device 1234:1111 (rev 02)
00:10.0 VGA compatible controller: Intel Corporation Device 46d0
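
For reference, "Illegal instruction" normally means the process executed a CPU instruction the (virtual) CPU does not advertise. The prebuilt IPEX CPU binaries need at least AVX2, and PVE's default vCPU model (kvm64) masks AVX/AVX2 from the guest, so it is worth checking which vector ISAs the VM actually exposes. A minimal sketch using only the standard library, so it runs even where importing torch crashes:

# check which x86 vector ISAs the guest CPU exposes; if "avx2"
# prints NO, setting the PVE VM CPU type to "host" should pass
# the real N305 flags through to the guest
with open("/proc/cpuinfo") as f:
    flags = next(line for line in f if line.startswith("flags")).split()

for isa in ("sse4_2", "avx", "avx2", "avx512f"):
    print(isa, "yes" if isa in flags else "NO")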

Problem

When I install IPEX following the official Getting Started guide:

# on container
pip install intel_extension_for_pytorch
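
Note that the plain PyPI package above is the CPU-only build; the xpu (GPU) build is distributed from Intel's own wheel index, so --device=xpu needs that flavor. Assuming the import itself succeeds, a quick way to see which flavor is installed is the version suffix:

# IPEX version strings carry a "+cpu" or "+xpu" suffix
import torch
import intel_extension_for_pytorch as ipex

print("torch:", torch.__version__)  # e.g. 2.1.0+cpu
print("ipex :", ipex.__version__)   # e.g. 2.1.0+cpu or 2.1.10+xpu
# torch.xpu is only registered by the xpu build
print("xpu available:", hasattr(torch, "xpu") and torch.xpu.is_available())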

and modify the model-loading function:

#!/usr/bin/env python
# coding=utf-8
## From: https://github.com/THUDM/ChatGLM-6B
import torch
import os
##### import ipex
import intel_extension_for_pytorch as ipex
##### import ipex
from typing import Dict, Union, Optional

from torch.nn import Module
from transformers import AutoModel, AutoTokenizer

from .chat import do_chat, do_chat_stream

def init_chatglm(model_path: str, running_device: str, gpus: int):
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

    if running_device.upper() == "GPU":
        model = load_model_on_gpus(model_path, gpus)
    else:
        model = AutoModel.from_pretrained(model_path, trust_remote_code=True)
        model = model.float()

    model.eval()
##### following the manual
    model = ipex.optimize(model)
##### following the manual
    model.do_chat = do_chat
    model.do_chat_stream = do_chat_stream
    return tokenizer, model


def auto_configure_device_map(num_gpus: int) -> Dict[str, int]:
    num_trans_layers = 28
    per_gpu_layers = 30 / num_gpus

    device_map = {'transformer.word_embeddings': 0,
                  'transformer.final_layernorm': 0, 'lm_head': 0}

    used = 2
    gpu_target = 0
    for i in range(num_trans_layers):
        if used >= per_gpu_layers:
            gpu_target += 1
            used = 0
        assert gpu_target < num_gpus
        device_map[f'transformer.layers.{i}'] = gpu_target
        used += 1

    return device_map


def load_model_on_gpus(checkpoint_path: Union[str, os.PathLike], num_gpus: int = 2,
                       device_map: Optional[Dict[str, int]] = None, **kwargs) -> Module:
    if num_gpus < 2 and device_map is None:
        model = AutoModel.from_pretrained(
            checkpoint_path, trust_remote_code=True, **kwargs).half().cuda()
    else:
        if num_gpus > torch.cuda.device_count():
            raise Exception(f"need {num_gpus} GPU, but only has {torch.cuda.device_count()}")

        from accelerate import dispatch_model

        model = AutoModel.from_pretrained(
            checkpoint_path, trust_remote_code=True, **kwargs).half()

        if device_map is None:
            device_map = auto_configure_device_map(num_gpus)

        model = dispatch_model(model, device_map=device_map)
        print(f"Device Map: {model.hf_device_map}\n")

    return model

The full code is adapted from https://github.com/ninehills/chatglm-openai-api; the code above is chatglm/chatglm.py.
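
As an aside, ipex.optimize(model) on a CPU-resident model optimizes for CPU. For running on the iGPU with the xpu build, the IPEX getting-started examples move the model to the "xpu" device first. A minimal sketch under that assumption (model path taken from the config below; dtype choice is illustrative):

import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModel

model_path = "/app/model/chatglm2-6b-int4"  # path from the config
model = AutoModel.from_pretrained(model_path, trust_remote_code=True)
model = model.eval().to("xpu")              # move to the iGPU first
model = ipex.optimize(model, dtype=torch.float16)  # fp16 is typical on xpu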
When I run the command

python main.py --device=cpu  # or --device=xpu

it throws:

root@faed2ef52605:/app# python main.py --device=xpu
> Load config and arguments...
Config file: config.toml
Language Model: chatglm-6b-int4
Embeddings Model:
Device: xpu
GPUs: 1
Port: 8080
Tunneling:
Config:
{'models': {'llm': {'chatglm-6b': {'type': 'chatglm', 'path': 'THUDM/chatglm-6b'}, 'chatglm-6b-int8': {'type': 'chatglm', 'path': 'THUDM/chatglm-6b-int8'}, 'chatglm-6b-int4': {'type': 'chatglm', 'path': '/app/model/chatglm2-6b-int4'}, 'chatglm2-6b': {'type': 'chatglm', 'path': 'THUDM/chatglm2-6b'}, 'chatglm2-6b-int8': {'type': 'chatglm', 'path': 'THUDM/chatglm2-6b-int8'}, 'chatglm2-6b-int4': {'type': 'chatglm', 'path': 'THUDM/chatglm2-6b-int4'}, 'phoenix-inst-chat-7b': {'type': 'phoenix', 'path': 'FreedomIntelligence/phoenix-inst-chat-7b'}, 'phoenix-inst-chat-7b-int4': {'type': 'phoenix', 'path': 'FreedomIntelligence/phoenix-inst-chat-7b-int4'}}, 'embeddings': {'text2vec-large-chinese': {'type': 'default', 'path': 'GanymedeNil/text2vec-large-chinese'}}}, 'auth': {'tokens': ['token1']}}
> Start LLM model chatglm-6b-int4
>> Use chatglm llm model /app/model/chatglm2-6b-int4
Illegal instruction (core dumped)

Versions

When I run collect_env.py, it also crashes:

root@faed2ef52605:/app# python env.py
Illegal instruction (core dumped)

Maybe my environment is not suitable for IPEX; even the collect-env script crashes, presumably because it also imports torch.
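
Since collect_env dies the same way, the crash happens at import time, not in model code. A small sketch to isolate which import raises the SIGILL; each import runs in its own subprocess so one crash does not kill the probe:

import subprocess
import sys

# a return code of -4 means the child died from signal 4 (SIGILL)
for mod in ("numpy", "torch", "intel_extension_for_pytorch"):
    r = subprocess.run([sys.executable, "-c", "import " + mod])
    status = "ok" if r.returncode == 0 else "crashed (return code %d)" % r.returncode
    print(mod, "->", status)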
