
XPU use stock pytorch instead of Intel Extension for PyTorch#7877

Open
delock wants to merge 17 commits intomasterfrom
gma/xpu_use_stock_pytorch

Conversation


@delock delock commented Feb 27, 2026

With Intel Extension for PyTorch (IPEX) retiring, the XPU device is supported natively by PyTorch 2.8+, so the dependency on Intel Extension for PyTorch is no longer needed.

This PR removes the IPEX dependency, adapts to the builder protocol in PyTorch for XPU, and updates documents and tests accordingly.

Note that after this update, DeepSpeed will no longer work with the previous PyTorch+IPEX stack on XPU devices. Users are advised to upgrade to the latest PyTorch to get the latest XPU features.

This PR also removes InferenceBuilder, because the kernels InferenceBuilder needs are only supported through Intel Extension for PyTorch.

Replace all intel_extension_for_pytorch (IPEX) imports and APIs with
stock PyTorch equivalents for XPU accelerator detection, device
operations, and JIT compilation of SYCL kernels.

Key changes:
- Use torch.xpu.is_available() for XPU detection instead of ipex._C._has_xpu()
- Replace DPCPPExtension/DpcppBuildExtension with SyclExtension/BuildExtension
- Replace IPEX's cpp_extension.load() with torch.utils.cpp_extension.load()
- Set CXX=icpx for JIT compilation of .cpp/.dp.cpp SYCL source files
- Add _sycl_env_paths() to resolve SYCL ABI mismatch between system
  oneAPI compiler and Python environment's libsycl.so
- Replace ipex.h includes with c10/xpu/XPUStream.h in C++ sources
- Use torch.nn.functional.scaled_dot_product_attention for flash attention
- Keep IPEX dependency only in inference.py for pre-compiled kernels
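
The first point above can be sketched as below. This is a hypothetical helper, not DeepSpeed's actual accelerator-detection code; the importlib guard only keeps the sketch runnable on machines without PyTorch installed:

```python
import importlib.util

def detect_xpu() -> bool:
    """Report XPU availability via stock PyTorch.

    Replaces the old IPEX probe (ipex._C._has_xpu()) with
    torch.xpu.is_available(), which is available in PyTorch 2.8+.
    """
    if importlib.util.find_spec("torch") is None:
        return False  # PyTorch not installed at all
    import torch
    xpu = getattr(torch, "xpu", None)  # absent on very old PyTorch builds
    return xpu is not None and xpu.is_available()
```

`detect_xpu()` returns False on hosts without an Intel GPU, so a caller can fall through to CUDA or CPU selection.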

Signed-off-by: Ma, Guokai <guokai.ma@intel.com>
…b mismatch

Replace AOT compilation (spir64_gen + device-specific targets like pvc/bmg)
with pure JIT compilation (spir64) for portability across Intel GPU
architectures.

Add a wrapper header csrc/xpu/includes/sycl/feature_test.hpp that undefs
SYCL_EXT_ONEAPI_BFLOAT16_MATH_FUNCTIONS after including the real header.
This prevents c10::BFloat16 from using sycl::ext::oneapi::bfloat16
intrinsics, which would emit __devicelib_ConvertBF16ToFINTEL calls that
fail at JIT time when the icpx compiler version (2025.3) is newer than
the SYCL runtime shipped with PyTorch (2025.1). The bitwise fallback
path is functionally identical.
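
The "bitwise fallback path" mentioned above can be illustrated with a minimal Python sketch: bfloat16 shares float32's sign bit and 8-bit exponent and only shortens the mantissa, so conversion amounts to keeping the top 16 bits of the float32 representation. (The real c10::BFloat16 code rounds to nearest-even; plain truncation is shown here for clarity.)

```python
import struct

def f32_to_bf16_bits(x: float) -> int:
    # Narrowing: drop the low 16 mantissa bits of the float32 encoding.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16

def bf16_bits_to_f32(b: int) -> float:
    # Widening back is exact: pad the mantissa with zero bits.
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]
```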

Signed-off-by: Ma, Guokai <guokai.ma@intel.com>
The feature_test.hpp wrapper that #undef'd SYCL_EXT_ONEAPI_BFLOAT16_MATH_FUNCTIONS
is no longer needed: PyTorch 2.10+xpu ships intel-sycl-rt 2025.3.1 which matches
the system icpx 2025.3.1, eliminating the compiler/runtime version mismatch that
caused unresolved __devicelib_ConvertBF16ToFINTEL symbols.

Update the Intel XPU section of the accelerator setup guide to reflect the switch
from IPEX to stock PyTorch, document the icpx version matching requirement, and
add Arc Pro B60 to the verified GPU list.
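
A rough sketch of the icpx setup for JIT builds follows. The helper name below is illustrative, not the PR's actual _sycl_env_paths() implementation, which additionally resolves the Python environment's libsycl.so to avoid ABI mismatch with the system oneAPI compiler:

```python
import os

def sycl_build_env(base_env=None):
    """Prepare an environment for torch.utils.cpp_extension.load()
    so that SYCL (.cpp / .dp.cpp) sources are compiled with icpx.

    Illustrative sketch only; real code must also ensure the icpx
    version matches the SYCL runtime shipped with PyTorch.
    """
    env = dict(base_env if base_env is not None else os.environ)
    env["CXX"] = "icpx"  # DPC++ compiler required for SYCL kernels
    return env
```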

Signed-off-by: Ma, Guokai <guokai.ma@intel.com>
…d dependency

Signed-off-by: Ma, Guokai <guokai.ma@intel.com>
…Torch

Signed-off-by: Ma, Guokai <guokai.ma@intel.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2d30d045f3


@delock delock force-pushed the gma/xpu_use_stock_pytorch branch from 2d30d04 to e0954c6 on February 27, 2026 06:32

delock commented Feb 27, 2026
