
XPU use stock pytorch instead of Intel Extension for PyTorch#7877

Open
delock wants to merge 17 commits intomasterfrom
gma/xpu_use_stock_pytorch

Conversation


@delock delock commented Feb 27, 2026

With Intel Extension for PyTorch (IPEX) retiring, the XPU device is supported natively by PyTorch 2.8+, so the dependency on Intel Extension for PyTorch is no longer needed.

This PR removes the IPEX dependency, adapts to the builder protocol in PyTorch for XPU, and updates documents and tests accordingly.

Note that after this update, DeepSpeed will no longer work with the previous PyTorch+IPEX stack on XPU devices. Users are advised to upgrade to the latest PyTorch to get the latest XPU features.

This PR also removes InferenceBuilder, because the kernels InferenceBuilder needs are only supported through Intel Extension for PyTorch.

Replace all intel_extension_for_pytorch (IPEX) imports and APIs with
stock PyTorch equivalents for XPU accelerator detection, device
operations, and JIT compilation of SYCL kernels.

Key changes:
- Use torch.xpu.is_available() for XPU detection instead of ipex._C._has_xpu()
- Replace DPCPPExtension/DpcppBuildExtension with SyclExtension/BuildExtension
- Replace IPEX's cpp_extension.load() with torch.utils.cpp_extension.load()
- Set CXX=icpx for JIT compilation of .cpp/.dp.cpp SYCL source files
- Add _sycl_env_paths() to resolve SYCL ABI mismatch between system
  oneAPI compiler and Python environment's libsycl.so
- Replace ipex.h includes with c10/xpu/XPUStream.h in C++ sources
- Use torch.nn.functional.scaled_dot_product_attention for flash attention
- Keep IPEX dependency only in inference.py for pre-compiled kernels
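
The first point above can be sketched as below. This is a hypothetical helper, not DeepSpeed's actual accelerator-detection code; the importlib guard only keeps the sketch runnable on machines without PyTorch installed:

```python
import importlib.util

def detect_xpu() -> bool:
    """Report XPU availability via stock PyTorch.

    Replaces the old IPEX probe (ipex._C._has_xpu()) with
    torch.xpu.is_available(), which is available in PyTorch 2.8+.
    """
    if importlib.util.find_spec("torch") is None:
        return False  # PyTorch not installed at all
    import torch
    xpu = getattr(torch, "xpu", None)  # absent on very old PyTorch builds
    return xpu is not None and xpu.is_available()
```

`detect_xpu()` returns False on hosts without an Intel GPU, so a caller can fall through to CUDA or CPU selection.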

Signed-off-by: Ma, Guokai <guokai.ma@intel.com>
…b mismatch

Replace AOT compilation (spir64_gen + device-specific targets like pvc/bmg)
with pure JIT compilation (spir64) for portability across Intel GPU
architectures.

Add a wrapper header csrc/xpu/includes/sycl/feature_test.hpp that undefs
SYCL_EXT_ONEAPI_BFLOAT16_MATH_FUNCTIONS after including the real header.
This prevents c10::BFloat16 from using sycl::ext::oneapi::bfloat16
intrinsics, which would emit __devicelib_ConvertBF16ToFINTEL calls that
fail at JIT time when the icpx compiler version (2025.3) is newer than
the SYCL runtime shipped with PyTorch (2025.1). The bitwise fallback
path is functionally identical.
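
The "bitwise fallback path" mentioned above can be illustrated with a minimal Python sketch: bfloat16 shares float32's sign bit and 8-bit exponent and only shortens the mantissa, so conversion amounts to keeping the top 16 bits of the float32 representation. (The real c10::BFloat16 code rounds to nearest-even; plain truncation is shown here for clarity.)

```python
import struct

def f32_to_bf16_bits(x: float) -> int:
    # Narrowing: drop the low 16 mantissa bits of the float32 encoding.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16

def bf16_bits_to_f32(b: int) -> float:
    # Widening back is exact: pad the mantissa with zero bits.
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]
```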

Signed-off-by: Ma, Guokai <guokai.ma@intel.com>
The feature_test.hpp wrapper that #undef'd SYCL_EXT_ONEAPI_BFLOAT16_MATH_FUNCTIONS
is no longer needed: PyTorch 2.10+xpu ships intel-sycl-rt 2025.3.1 which matches
the system icpx 2025.3.1, eliminating the compiler/runtime version mismatch that
caused unresolved __devicelib_ConvertBF16ToFINTEL symbols.

Update the Intel XPU section of the accelerator setup guide to reflect the switch
from IPEX to stock PyTorch, document the icpx version matching requirement, and
add Arc Pro B60 to the verified GPU list.
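
A rough sketch of the icpx setup for JIT builds follows. The helper name below is illustrative, not the PR's actual _sycl_env_paths() implementation, which additionally resolves the Python environment's libsycl.so to avoid ABI mismatch with the system oneAPI compiler:

```python
import os

def sycl_build_env(base_env=None):
    """Prepare an environment for torch.utils.cpp_extension.load()
    so that SYCL (.cpp / .dp.cpp) sources are compiled with icpx.

    Illustrative sketch only; real code must also ensure the icpx
    version matches the SYCL runtime shipped with PyTorch.
    """
    env = dict(base_env if base_env is not None else os.environ)
    env["CXX"] = "icpx"  # DPC++ compiler required for SYCL kernels
    return env
```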

Signed-off-by: Ma, Guokai <guokai.ma@intel.com>
…d dependency

Signed-off-by: Ma, Guokai <guokai.ma@intel.com>
…Torch

Signed-off-by: Ma, Guokai <guokai.ma@intel.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2d30d045f3


@delock delock force-pushed the gma/xpu_use_stock_pytorch branch from 2d30d04 to e0954c6 on February 27, 2026 06:32

delock commented Feb 27, 2026
