Intel GPU #66

ali-vaziri · 2023-10-04T16:10:35Z

Hi,

I ran the diffusion3D example (attached file) on one Intel GPU using oneAPI.jl oneArrays and did not use any of the native ImplicitGlobalGrid functions, since their signatures only accept CUDA/AMD arrays.

For nt=100000, I get ZE_RESULT_ERROR_DEVICE_LOST (device hung, reset, was removed, or driver update occurred). For nt=100, it works, but the results are not correct.

Any idea what could possibly go wrong?

Thanks!

P.S. oneAPI.versioninfo()

Binary dependencies:

NEO: 23.17.26241+0
libigc: 1.0.13822+0
gmmlib: 22.3.0+0
SPIRV_LLVM_Translator_unified: 0.3.0+0
SPIRV_Tools: 2023.2.0+0

Toolchain:

Julia: 1.9.3
LLVM: 14.0.6

1 driver:

00000000-0000-0000-178a-f44f01036794 (v1.3.26516, API v1.3.0)

2 devices:

Intel(R) Data Center GPU Max 1100
Intel(R) Data Center GPU Max 1100

luraess · 2023-10-05T05:50:05Z

Hi @ali-vaziri, it seems that the issues you are encountering are not related to ImplicitGlobalGrid (IGG), but rather to oneAPI.jl itself, since, as you state, you're not using any features from IGG, but mostly broadcasting over oneArrays.

I would recommend you remove the IGG-related calls in your example, and further reduce it until you figure out which part causes oneAPI to error. Then this should rather be reported to https://github.com/JuliaGPU/oneAPI.jl.
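A reduced reproducer along these lines might look like the following. This is only a sketch, not the attached diffusion3D example: it assumes a 1D problem in Float32 and a working oneAPI.jl installation, and the names `diffusion_step!` and `run_steps` are made up for illustration. It runs the same broadcast-only update on a CPU `Array` and on a `oneArray` and prints the largest difference between the two, which can help tell a wrong-result problem in the GPU stack apart from a problem in the example itself.

```julia
using oneAPI  # assumption: a functional Intel GPU and oneAPI.jl are available

# One explicit 1D diffusion step written with broadcasting only, so the same
# function works for a CPU Array and for a oneArray. Boundary cells are left
# untouched (they keep their initial values).
function diffusion_step!(T2, T, lam, dt, dx)
    @views T2[2:end-1] .= T[2:end-1] .+ dt .* lam .*
        (T[1:end-2] .- 2 .* T[2:end-1] .+ T[3:end]) ./ dx^2
    return T2
end

# Run nt steps starting from T0, swapping the two buffers after every step.
function run_steps(T0, nt, lam, dt, dx)
    T, T2 = copy(T0), copy(T0)
    for _ in 1:nt
        diffusion_step!(T2, T, lam, dt, dx)
        T, T2 = T2, T
    end
    return T
end

nx, nt  = 64, 100
lam, dx = 1.0f0, 1.0f0
dt      = dx^2 / lam / 4.1f0   # below the 1D explicit stability limit dx^2 / (2 * lam)

T0_cpu = zeros(Float32, nx)
T0_cpu[nx ÷ 2] = 1.0f0         # single spike as initial condition
T0_gpu = oneArray(T0_cpu)      # copy the initial condition to the Intel GPU

T_cpu = run_steps(T0_cpu, nt, lam, dt, dx)
T_gpu = run_steps(T0_gpu, nt, lam, dt, dx)

# Largest absolute difference between the CPU and GPU results.
println(maximum(abs.(Array(T_gpu) .- T_cpu)))
```

If the difference is already large here, or the device is lost when `nt` is increased, the script is small enough to attach to a oneAPI.jl issue; if it behaves, terms from the original example can be added back one at a time.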

For IGG to work with oneAPI.jl, we would indeed need to add some features to it in a similar fashion to CUDA and AMDGPU.

pengtu · 2023-10-10T03:14:46Z

@luraess, Can you give us a general idea what it takes to support oneAPI.jl in IGG?

kballeda · 2023-10-10T04:21:09Z

@luraess, when we tried to install IGG, we encountered the following errors. Based on the log, IGG may need to be updated to allow CUDA 5.0.0. Can you make a release after bumping the supported CUDA version?

julia> import Pkg;Pkg.add("ImplicitGlobalGrid")
Updating registry at ~/.julia/registries/General.toml
Resolving package versions...
ERROR: Unsatisfiable requirements detected for package CUDA [052768ef]:
CUDA [052768ef] log:
├─possible versions are: 0.1.0-5.0.0 or uninstalled
├─restricted by julia compatibility requirements to versions: [2.3.0, 2.5.0-5.0.0] or uninstalled
├─restricted by compatibility requirements with ImplicitGlobalGrid [4d7a3746] to versions: [1.0.0-1.3.3, 3.1.0-4.4.1], leaving only versions: 3.1.0-4.4.1
│ └─ImplicitGlobalGrid [4d7a3746] log:
│ ├─possible versions are: 0.11.0-0.13.0 or uninstalled
│ └─restricted to versions * by an explicit requirement, leaving only versions: 0.11.0-0.13.0
└─restricted by compatibility requirements with GPUArrays [0c68f7d7] to versions: 5.0.0 or uninstalled — no versions left
└─GPUArrays [0c68f7d7] log:
├─possible versions are: 0.3.0-9.0.0 or uninstalled
└─restricted to versions 9 by oneAPI [8f75cd03], leaving only versions: 9.0.0
└─oneAPI [8f75cd03] log:
├─possible versions are: 1.4.0 or uninstalled
└─oneAPI [8f75cd03] is fixed to version 1.4.0

luraess · 2023-10-10T05:55:37Z

> @luraess when we tried to import IGG following errors are encountered, based on the log IGG may need to be updated to use CUDA 5.0.0. Can you make a release after bumping up the CUDA support version?

We will work on upgrading CUDA compatibility to the latest version. Thanks.

omlins · 2024-10-28T14:38:00Z

> @luraess, Can you give us a general idea what it takes to support oneAPI.jl in IGG?

I can see that this message has not been answered so far. It should be fairly straightforward, because it requires almost only adding code (not modifying existing code), analogous to what we have for CUDA.jl. Concretely, it means adding an extension for oneAPI.jl, as we have one for CUDA.jl: https://github.com/eth-cscs/ImplicitGlobalGrid.jl/tree/master/src/CUDAExt

kballeda · 2024-10-29T16:22:47Z

> > @luraess, Can you give us a general idea what it takes to support oneAPI.jl in IGG?
>
> I can see that this message has not been answered so far. It will be pretty straightforward because it requires almost only to add some code (not modify) and analogue as we have it for CUDA.jl. Concretely, it means to add an extension for oneAPI.jl, as we have one for CUDA.jl: https://github.com/eth-cscs/ImplicitGlobalGrid.jl/tree/master/src/CUDAExt

Thanks for the update. I have created a PR containing changes to support oneAPI.jl in IGG (#98); it still needs to be validated. It would be great if you could share the steps to test the CUDA flow.

luraess · 2024-10-29T17:27:26Z

> share the steps to test CUDA flow

If you want to test on a backend other than the CPU, you can do so by running the tests on a machine where the backend of interest is functional. There is not yet a fully automated way to test the parallel features (mostly the update_halo tests); the best approach is to run these tests by launching the test scripts with mpirun or an equivalent launcher.

kballeda mentioned this issue Oct 29, 2024

support oneAPIExt (#66) #98

Draft
