Illegal instruction (core dumped) on Intel N305

### Describe the bug

Hello, I'm trying to use IPEX on an Intel N305, but it throws an error when I run it. My environment is complicated, so I'm sharing my case for analysis in the hope that it helps make IPEX more robust. I'd appreciate any advice or a solution.

## Environment
I run PyTorch in a Docker container created from the python:3.9.18 image on Ubuntu 20.04, and that Ubuntu system is installed in a PVE VM.
```
PVE VM> UBUNTU 20.04> Docker From Python[Python:3.9,Pytorch 2.10,ipex 2.1.0]
```
I also configured GPU passthrough to the VM:
```shell
# command on ubuntu
lspci | grep VGA
00:02.0 VGA compatible controller: Device 1234:1111 (rev 02)
00:10.0 VGA compatible controller: Intel Corporation Device 46d0
```
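The GPU is passed through to the VM, but the code runs one level further down, inside a container, so it may be worth checking whether the container can see any GPU device nodes. Below is a minimal sketch, assuming the Intel GPU driver exposes its nodes under `/dev/dri` (typical on Linux). The helper name `list_dri_nodes` is hypothetical.

```python
import os
from typing import List


def list_dri_nodes(dri_dir: str = "/dev/dri") -> List[str]:
    """Return the sorted entries of dri_dir, or an empty list if it does not exist.

    Assumption: GPU nodes such as card0 or renderD128 appear in this directory
    when a GPU is visible to the current (container) environment.
    """
    try:
        return sorted(os.listdir(dri_dir))
    except FileNotFoundError:
        return []


if __name__ == "__main__":
    nodes = list_dri_nodes()
    if nodes:
        print("DRI nodes visible here:", ", ".join(nodes))
    else:
        print("No /dev/dri entries visible in this environment")
```

If the list is empty inside the container but not on the Ubuntu guest, the device was probably not mapped into the container (for example with Docker's `--device` option).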

## Problem
I installed IPEX following the official getting-started guide:
```shell
# on container
pip install intel_extension_for_pytorch
```
and modified the function that loads the model:
```python
#!/usr/bin/env python
# coding=utf-8
## From: https://github.com/THUDM/ChatGLM-6B
import torch
import os
# Added: import IPEX
import intel_extension_for_pytorch as ipex
from typing import Dict, Union, Optional

from torch.nn import Module
from transformers import AutoModel, AutoTokenizer

from .chat import do_chat, do_chat_stream

def init_chatglm(model_path: str, running_device: str, gpus: int):
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

    if running_device.upper() == "GPU":
        model = load_model_on_gpus(model_path, gpus)
    else:
        model = AutoModel.from_pretrained(model_path, trust_remote_code=True)
        model = model.float()

    model.eval()
    # Added: optimize the model with IPEX, following the manual
    model = ipex.optimize(model)
    model.do_chat = do_chat
    model.do_chat_stream = do_chat_stream
    return tokenizer, model


def auto_configure_device_map(num_gpus: int) -> Dict[str, int]:
    num_trans_layers = 28
    per_gpu_layers = 30 / num_gpus

    device_map = {'transformer.word_embeddings': 0,
                  'transformer.final_layernorm': 0, 'lm_head': 0}

    used = 2
    gpu_target = 0
    for i in range(num_trans_layers):
        if used >= per_gpu_layers:
            gpu_target += 1
            used = 0
        assert gpu_target < num_gpus
        device_map[f'transformer.layers.{i}'] = gpu_target
        used += 1

    return device_map


def load_model_on_gpus(checkpoint_path: Union[str, os.PathLike], num_gpus: int = 2,
                       device_map: Optional[Dict[str, int]] = None, **kwargs) -> Module:
    if num_gpus < 2 and device_map is None:
        model = AutoModel.from_pretrained(
            checkpoint_path, trust_remote_code=True, **kwargs).half().cuda()
    else:
        if num_gpus > torch.cuda.device_count():
            raise Exception(f"need {num_gpus} GPU, but only has {torch.cuda.device_count()}")

        from accelerate import dispatch_model

        model = AutoModel.from_pretrained(
            checkpoint_path, trust_remote_code=True, **kwargs).half()

        if device_map is None:
            device_map = auto_configure_device_map(num_gpus)

        model = dispatch_model(model, device_map=device_map)
        print(f"Device Map: {model.hf_device_map}\n")

    return model
```
The full code is based on https://github.com/ninehills/chatglm-openai-api; the code above is `chatglm/chatglm.py`.
When I run the command
```shell
python main.py --device=cpu  # or --device=xpu
```
it crashes with:
```shell
root@faed2ef52605:/app# python main.py --device=xpu
> Load config and arguments...
Config file: config.toml
Language Model: chatglm-6b-int4
Embeddings Model:
Device: xpu
GPUs: 1
Port: 8080
Tunneling:
Config:
{'models': {'llm': {'chatglm-6b': {'type': 'chatglm', 'path': 'THUDM/chatglm-6b'}, 'chatglm-6b-int8': {'type': 'chatglm', 'path': 'THUDM/chatglm-6b-int8'}, 'chatglm-6b-int4': {'type': 'chatglm', 'path': '/app/model/chatglm2-6b-int4'}, 'chatglm2-6b': {'type': 'chatglm', 'path': 'THUDM/chatglm2-6b'}, 'chatglm2-6b-int8': {'type': 'chatglm', 'path': 'THUDM/chatglm2-6b-int8'}, 'chatglm2-6b-int4': {'type': 'chatglm', 'path': 'THUDM/chatglm2-6b-int4'}, 'phoenix-inst-chat-7b': {'type': 'phoenix', 'path': 'FreedomIntelligence/phoenix-inst-chat-7b'}, 'phoenix-inst-chat-7b-int4': {'type': 'phoenix', 'path': 'FreedomIntelligence/phoenix-inst-chat-7b-int4'}}, 'embeddings': {'text2vec-large-chinese': {'type': 'default', 'path': 'GanymedeNil/text2vec-large-chinese'}}}, 'auth': {'tokens': ['token1']}}
> Start LLM model chatglm-6b-int4
>> Use chatglm llm model /app/model/chatglm2-6b-int4
Illegal instruction (core dumped)
```
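The log above does not show which import or call triggers the crash. One way to narrow it down is to import each package in a separate child interpreter and look at its exit status. This is a minimal sketch rather than a definitive diagnosis: the helper names `probe_import` and `describe` are hypothetical, and the module list is only a guess at what `main.py` ends up loading.

```python
import signal
import subprocess
import sys


def probe_import(module_name: str) -> int:
    """Import module_name in a child interpreter and return its exit code.

    The child runs with -X faulthandler so that a fatal signal such as
    SIGILL also dumps the Python traceback to stderr.
    """
    result = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    return result.returncode


def describe(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        try:
            return f"killed by {signal.Signals(-returncode).name}"
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"failed with exit code {returncode}"


if __name__ == "__main__":
    # Assumption: these are the packages involved; adjust as needed.
    for name in ("torch", "intel_extension_for_pytorch", "transformers"):
        print(f"{name}: {describe(probe_import(name))}")
```

A module reported as `killed by SIGILL` is the one executing the unsupported instruction. Running the original command as `python -X faulthandler main.py --device=xpu` should likewise print the Python stack at the moment of the crash.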

### Versions

Running collect_env.py also crashes:
```
root@faed2ef52605:/app# python env.py
Illegal instruction (core dumped)
```

Maybe my environment is not compatible with IPEX; the environment collection script also crashed.
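Because `collect_env.py` itself crashes, the CPU features visible inside the container can be read without importing PyTorch at all. In a VM, an illegal instruction can occur when the guest CPU model does not expose an instruction-set extension that a binary was built for. The sketch below only reads `/proc/cpuinfo`. The flag list in `FLAGS_TO_CHECK` is an illustrative assumption, not IPEX's documented requirement, and the function names are hypothetical.

```python
from pathlib import Path
from typing import Iterable, List, Set

# Assumption: an illustrative set of x86 extensions, not an official IPEX list.
FLAGS_TO_CHECK = ("sse4_2", "avx", "avx2", "fma", "f16c")


def parse_cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the CPU flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(available: Set[str],
                  wanted: Iterable[str] = FLAGS_TO_CHECK) -> List[str]:
    """Return the wanted flags that are absent, in the order given."""
    return [flag for flag in wanted if flag not in available]


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        absent = missing_flags(parse_cpu_flags(cpuinfo.read_text()))
        print("Missing flags:", ", ".join(absent) if absent else "none")
    else:
        print("/proc/cpuinfo not found (not a Linux environment?)")
```

Comparing this output inside the container with the same check on the PVE host would show whether the VM's CPU type hides features that the physical N305 provides.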
