
Native Intel IPEX-LLM Support #7190

Closed

Description

@iamhumanipromise

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [X] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • [X] I carefully followed the README.md.
  • [X] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [X] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Feature Description

I found the closed issue below, where someone manually implemented IPEX-LLM (it is not clear how). However, I am looking forward to native IPEX-LLM support for Intel Xe iGPUs and Intel Arc dGPUs on Windows and Linux.

#7042

TL;DR: IPEX-LLM now provides a C++ interface, which can be used as a backend for running llama.cpp on Intel GPUs. Incorporating this interface into llama.cpp would allow it to leverage the optimized performance of IPEX-LLM.

Motivation

Intel Xe graphics launched in 2020; the Flex and Max data-center cards and the Arc consumer cards for laptop and desktop launched in 2022. That is a lot of devices in production and circulation.

Native support would "permit" llama.cpp users to run on their integrated Xe GPUs, dedicated Arc GPUs, and data-center Flex and Max cards on BOTH Windows and Linux (without a confusing manual build).

Possible Implementation

The implementation of native Intel IPEX-LLM support would be something like... Integrate --> Test --> Document --> Release.

  1. Integration with IPEX: Since IPEX-LLM is built on top of Intel Extension for PyTorch (IPEX), the first step would be to ensure seamless integration with IPEX. This would involve linking the llama.cpp build system with the IPEX library and ensuring that all dependencies are correctly managed. Here are links for using llama.cpp with Intel GPUs:

Full manual/guide: https://ipex-llm.readthedocs.io/en/latest/doc/LLM/Quickstart/llama_cpp_quickstart.html
Full verified model list: https://ipex-llm.readthedocs.io/en/latest/#verified-models
GitHub: https://github.com/intel-analytics/ipex-llm
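
As a side note, before any of this, a quick way to confirm the oneAPI runtime can actually see an Intel GPU is something like the following. This is purely a sketch of my own (not from the guides above), assuming the oneAPI Base Toolkit is installed and its environment script (e.g. setvars.sh on Linux) has been sourced; compile with `icpx -fsycl`. If no Intel GPU shows up here, the llama.cpp side has no chance of using it:

```cpp
// List every SYCL device the oneAPI runtime can see.
// Sketch only: assumes the oneAPI Base Toolkit is installed and its
// environment script has been sourced.
// Compile with: icpx -fsycl check_devices.cpp -o check_devices
#include <sycl/sycl.hpp>
#include <iostream>

int main() {
    for (const auto &platform : sycl::platform::get_platforms()) {
        std::cout << "Platform: "
                  << platform.get_info<sycl::info::platform::name>() << "\n";
        for (const auto &device : platform.get_devices()) {
            std::cout << "  Device: "
                      << device.get_info<sycl::info::device::name>()
                      << (device.is_gpu() ? "  [GPU]" : "") << "\n";
        }
    }
    return 0;
}
```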

The "owners" of this process will be the devs and engineers here; in this Github (simple nerds such as myself do not have the expertise to tackle something like this... even locally)

For example, from the documentation it looks like the flow would be: create a new conda environment --> set up the environment --> configure the oneAPI variables --> update CMakeLists.txt or the Makefile with paths to the IPEX-LLM library and headers --> then map llama.cpp functionality to the IPEX APIs (which Intel has already done).
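
On that last "map llama.cpp functionality to the IPEX APIs" step, one plausible shape (purely my sketch; neither llama.cpp nor IPEX-LLM ships anything like this today) would be a small backend header in the style of the existing SYCL backend, with the build system then linking against the IPEX-LLM C++ library. The header name and both function names below are hypothetical:

```cpp
// ggml-ipex-llm.h -- HYPOTHETICAL sketch only; this header does not exist in
// llama.cpp today. Modeled loosely on the pattern of the existing Intel SYCL
// backend header, assuming the IPEX-LLM C++ interface can sit behind a
// ggml_backend_t like the other GPU backends do.
#pragma once

#include "ggml.h"
#include "ggml-backend.h"

#ifdef __cplusplus
extern "C" {
#endif

// Hypothetical: create a ggml backend that dispatches tensor ops to IPEX-LLM
// on the given Intel GPU (Xe iGPU or Arc/Flex/Max dGPU).
GGML_API ggml_backend_t ggml_backend_ipex_llm_init(int device);

// Hypothetical: number of Intel GPUs the IPEX-LLM runtime can see.
GGML_API int ggml_backend_ipex_llm_get_device_count(void);

#ifdef __cplusplus
}
#endif
```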

  2. Testing Across Platforms: Ensuring that the implementation works across different versions of Windows and Linux is crucial. This includes testing on various Intel iGPUs and Arc dGPUs to guarantee broad compatibility. This effort would involve the community here, various Discords and subreddits, and perhaps "roping in" as many laptop/desktop Xe iGPU and dGPU users as possible -- so that means gamers, too.

The "owners" of this step would be wide-ranging overall.

  3. Documentation and Examples: Someone would have to "own" updating the documentation to guide users on how to enable and use the new IPEX-LLM support. Providing examples and quickstart guides would help significantly; ultimately, independent users will be on their own, and GUI and TUI/CLI frontends will need to update their own documentation. (A rough sketch of what a user-facing example could look like follows this list.)

  4. Release: After all of this has been done, it goes forward to launch. Woot woot.
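
On the documentation point above, here is a rough sketch of the kind of user-facing example the docs might show, using llama.cpp's existing C API as it stands today (function and field names are from the current llama.h and may change; the model path is just a placeholder). With a GPU backend in place, users would mainly just set n_gpu_layers as they already do for other backends:

```cpp
// Sketch of a user-facing "offload to the Intel GPU" example, using llama.cpp's
// existing C API (names as of the current llama.h; subject to change).
#include "llama.h"
#include <cstdio>

int main() {
    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    mparams.n_gpu_layers = 99;  // offload as many layers as fit on the GPU

    // Placeholder path -- any GGUF model the user has downloaded.
    llama_model *model = llama_load_model_from_file("model.gguf", mparams);
    if (model == NULL) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    llama_context_params cparams = llama_context_default_params();
    llama_context *ctx = llama_new_context_with_model(model, cparams);

    // ... run tokenization / decoding here ...

    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```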

I'm sure there are many, many steps I am missing here. Just wanted to "kick off" the process.
