Description
Feature Description
Hi community, following the discussion in #3965, we plan to contribute a native SYCL backend to llama.cpp.
Motivation
Intel Arc series GPUs provide substantial VRAM capacity and memory bandwidth, which the current OpenCL backend cannot fully utilize, especially for LLM inference. We expect a significant performance improvement from a native SYCL backend.
Possible Implementation
Native Kernels
We will implement the key GGML operators in SYCL, similar to the approach used for the Metal and Vulkan backends. The steps are as follows:
- add a new backend with host-to-device (h2d) and device-to-host (d2h) data transfers
- FP32 and FP16 GEMM via oneMKL with the DPC++ interface (see the first sketch after this list)
- native SYCL kernels for de-quantization (see the second sketch below)
- native SYCL kernels for the remaining operators
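A minimal sketch of the first two steps, not the actual backend code: it assumes the oneAPI Base Toolkit and a SYCL-visible GPU, and shows queue creation, h2d/d2h copies with USM, and an FP32 GEMM through oneMKL's DPC++ interface. All sizes and names here are illustrative only.

```cpp
#include <sycl/sycl.hpp>
#include <oneapi/mkl.hpp>
#include <vector>

int main() {
    sycl::queue q{sycl::gpu_selector_v};

    const int m = 4, n = 4, k = 4;
    std::vector<float> a(m * k, 1.0f), b(k * n, 1.0f), c(m * n, 0.0f);

    // Device allocations and h2d copies (USM).
    float *da = sycl::malloc_device<float>(m * k, q);
    float *db = sycl::malloc_device<float>(k * n, q);
    float *dc = sycl::malloc_device<float>(m * n, q);
    q.memcpy(da, a.data(), m * k * sizeof(float)).wait();
    q.memcpy(db, b.data(), k * n * sizeof(float)).wait();

    // C = alpha * A * B + beta * C via oneMKL's SYCL GEMM.
    oneapi::mkl::blas::column_major::gemm(
        q, oneapi::mkl::transpose::nontrans, oneapi::mkl::transpose::nontrans,
        m, n, k, 1.0f, da, m, db, k, 0.0f, dc, m).wait();

    // d2h copy of the result.
    q.memcpy(c.data(), dc, m * n * sizeof(float)).wait();

    sycl::free(da, q); sycl::free(db, q); sycl::free(dc, q);
}
```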
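And a sketch of a native SYCL de-quantization kernel over a toy 4-bit block format (one FP32 scale per 32 packed 4-bit values). The `block_q4_toy` layout is hypothetical and only illustrates the `parallel_for` structure; the real kernels would mirror GGML's quantization formats (e.g., Q4_0).

```cpp
#include <sycl/sycl.hpp>
#include <cstdint>
#include <cstddef>

// Hypothetical block layout, for illustration only.
struct block_q4_toy {
    float   d;       // per-block scale
    uint8_t qs[16];  // 32 packed 4-bit values
};

void dequantize(sycl::queue &q, const block_q4_toy *blocks,
                float *out, size_t nblocks) {
    // One work-item per block; unpack both nibbles and apply the scale.
    q.parallel_for(sycl::range<1>{nblocks}, [=](sycl::id<1> i) {
        const block_q4_toy &b = blocks[i];
        for (int j = 0; j < 16; ++j) {
            const int lo = (b.qs[j] & 0x0F) - 8;  // low nibble
            const int hi = (b.qs[j] >> 4)   - 8;  // high nibble
            out[i * 32 + j * 2 + 0] = lo * b.d;
            out[i * 32 + j * 2 + 1] = hi * b.d;
        }
    }).wait();
}
```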
Note:
Since llama.cpp has been evolving rapidly and new features will likely land in the CUDA backend first, we plan to use SYCLomatic to help migrate code from CUDA to SYCL, roughly as illustrated below.
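A rough, hand-simplified illustration of the CUDA-to-SYCL mapping that SYCLomatic automates (actual tool output routes through its dpct helper headers; the kernel and names here are made up for illustration):

```cpp
// CUDA source:
//
//   __global__ void scale(float *x, float s, int n) {
//       int i = blockIdx.x * blockDim.x + threadIdx.x;
//       if (i < n) x[i] *= s;
//   }
//   scale<<<(n + 255) / 256, 256>>>(x, s, n);
//
// maps to a SYCL nd_range kernel:
#include <sycl/sycl.hpp>

void scale(sycl::queue &q, float *x, float s, int n) {
    const int block = 256;                    // blockDim.x
    const int grid  = (n + block - 1) / block; // gridDim.x
    q.parallel_for(
        sycl::nd_range<1>{sycl::range<1>(grid * block), sycl::range<1>(block)},
        [=](sycl::nd_item<1> it) {
            const int i = it.get_global_id(0);
            if (i < n) x[i] *= s;
        }).wait();
}
```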
As a next stage, we plan to introduce a template-based library such as XeTLA, as mentioned in #3965; in this proposal, we will focus on native SYCL support.
Summary
We have started working on native SYCL kernels and on enabling a SYCL backend in llama.cpp for Intel GPUs. Please feel free to drop a note. Thanks.