[SYCL] Print device aspects in "sycl-ls --verbose" #8433

Merged
merged 2 commits into from
May 2, 2023

@aelovikov-intel aelovikov-intel commented Feb 22, 2023

This also adds some error handling so that queries for aspects that are not fully implemented return false.

@aelovikov-intel
Contributor Author

Looks like the CI is failing because Unified Runtime doesn't support required APIs for this change.

@smaslov-intel , @igchor , @jandres742 , what would be the best way to proceed with a change like this?

@smaslov-intel
Contributor

> Looks like the CI is failing because Unified Runtime doesn't support required APIs for this change.
>
> @smaslov-intel , @igchor , @jandres742 , what would be the best way to proceed with a change like this?

What are the failing tests that are run with UR? and why?

@jandres742
Contributor

jandres742 commented Feb 23, 2023

@smaslov-intel : it seems CI is now automatically testing UR. Of course, this is going to fail until we have the adapters running. This is sycl-ls with UR L0:

  [ext_oneapi_unified_runtime:gpu:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) UHD Graphics [0x9bca] 1.3 [1.3.24595]
  [ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) UHD Graphics [0x9bca] 1.3 [1.3.24595]
  
  Platforms: 4
  Platform [#1]:
      Version  : OpenCL 3.0 
      Name     : Intel(R) OpenCL HD Graphics
      Vendor   : Intel(R) Corporation
      Devices  : 1
          Device [#0]:
          Type       : gpu
          Version    : 3.0
          Name       : Intel(R) UHD Graphics [0x9bca]
          Vendor     : Intel(R) Corporation
  die: Unified Runtime: functionality is not supported

@aelovikov-intel
Contributor Author

> What are the failing tests that are run with UR? and why?

It doesn't reach the tests - fails at the sycl-ls --verbose command because device.has(aspect) goes through unimplemented APIs in the UR (I assume). In https://github.com/intel/llvm/actions/runs/4247720228/jobs/7387076851, line 544:

die: Unified Runtime: functionality is not supported

That said, I've just realized that a similar thing happens with CUDA/HIP plugins as well...

@jandres742
Contributor

>> What are the failing tests that are run with UR? and why?
>
> It doesn't reach the tests - fails at the sycl-ls --verbose command because device.has(aspect) goes through unimplemented APIs in the UR (I assume). In https://github.com/intel/llvm/actions/runs/4247720228/jobs/7387076851, line 544:
>
> die: Unified Runtime: functionality is not supported
>
> That said, I've just realized that a similar thing happens with CUDA/HIP plugins as well...

right, all the UR adapters are being tested. @smaslov-intel : were we planning on doing that?

@smaslov-intel
Contributor

> fails at the sycl-ls --verbose command

What triggers that command? Can we use device filter and disable UR for that path?
Perhaps the best resolution, though, is to support whatever is missing for device.has(aspect) in UR (even if it conservatively returns false for now).

std::cout << Prepend << "Aspects :";
#define __SYCL_ASPECT(ASPECT, ID) \
try { \
if (Device.has(aspect::ASPECT)) \
Contributor

you can also filter out UR programmatically here
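
The two ideas in this review thread (wrapping each `Device.has(...)` call in `try`/`catch`, and filtering a backend out programmatically) can be sketched together. This is only an illustration under stated assumptions: `Aspect`, `MockDevice`, `MOCK_ASPECT_LIST`, `PRINT_ASPECT` and `listAspects` are hypothetical stand-ins so that the snippet builds without a SYCL toolchain, and the backend name string is made up. Note also that a `catch` only helps when the backend reports the failure as an exception; the logs in this thread show the process aborting instead, which a handler like this would not intercept.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for sycl::aspect and sycl::device, so the sketch
// compiles without a SYCL toolchain.
enum class Aspect { gpu, fp16, fp64, image };

// X-macro listing every aspect once, similar in spirit to __SYCL_ASPECT.
#define MOCK_ASPECT_LIST(X) X(gpu) X(fp16) X(fp64) X(image)

struct MockDevice {
  std::string Backend;
  // Pretend the "image" query is unimplemented and reports that by throwing.
  bool has(Aspect A) const {
    if (A == Aspect::image)
      throw std::runtime_error("functionality is not supported");
    return A != Aspect::fp64;
  }
};

// Builds the text that would follow "Aspects :" for one device.
std::string listAspects(const MockDevice &Device) {
  // Programmatic filter: skip the aspect queries for one backend entirely.
  if (Device.Backend == "unified_runtime")
    return " (aspect queries skipped)";

  std::ostringstream Out;
#define PRINT_ASPECT(NAME)                                                     \
  try {                                                                        \
    if (Device.has(Aspect::NAME))                                              \
      Out << " " << #NAME;                                                     \
  } catch (const std::exception &) {                                           \
    /* A throwing query is treated as "aspect not supported". */               \
  }
  MOCK_ASPECT_LIST(PRINT_ASPECT)
#undef PRINT_ASPECT
  return Out.str();
}
```

Keeping the filter at the top of the function means the per-aspect loop stays unchanged once the filtered backend is ready to be queried.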

@aelovikov-intel
Contributor Author

> What triggers that command?

Currently the CI task itself runs it in order to produce an entry in the logs.

> Can we use device filter and disable UR for that path?

I don't think we can meaningfully do that without ugly workarounds.

@jandres742
Contributor

>> fails at the sycl-ls --verbose command
>
> What triggers that command? Can we use device filter and disable UR for that path? Perhaps the best resolution, though, is to support whatever is missing for device.has(aspect) in UR (even if it conservatively returns false for now).

That's kind of what I was trying to do in #8402. I am looking at adding stubs that return an error for unimplemented functionality for now. But the question there would be: are we testing only sycl-ls with UR, or the whole suite?

I don't think we are ready to have UR validated on premerge. At most, only UR L0 once stable.

@aelovikov-intel
Contributor Author

> are we testing only sycl-ls with UR

I think the answer to this is "yes" (AFAIK, of course), we are only using sycl-ls without narrowing down the list of available devices (via ONEAPI_DEVICE_SELECTOR).

@jandres742
Contributor

>> are we testing only sycl-ls with UR
>
> I think the answer to this is "yes" (AFAIK, of course), we are only using sycl-ls without narrowing down the list of available devices (via ONEAPI_DEVICE_SELECTOR).

Right, I just confirmed on my local machine that sycl-ls automatically runs over all available backends, including UR, so the suggestion from @smaslov-intel of filtering out UR in the code seems good. In one of the upcoming patches for UR L0 we could add some basic non-fatal info here, and then we could remove the filter for that one at least.

@smaslov-intel
Contributor

>>> fails at the sycl-ls --verbose command
>>
>> What triggers that command? Can we use device filter and disable UR for that path? Perhaps the best resolution, though, is to support whatever is missing for device.has(aspect) in UR (even if it conservatively returns false for now).
>
> That's kind of what I was trying to do in #8402. I am looking at adding stubs that return an error for unimplemented functionality for now. But the question there would be: are we testing only sycl-ls with UR, or the whole suite?
>
> I don't think we are ready to have UR validated on premerge. At most, only UR L0 once stable.

For UR "testing": currently only "sycl-ls" which runs for all backends and this single LIT test that runs with UR specifically: https://github.com/intel/llvm-test-suite/blob/intel/SYCL/Plugin/sycl-ls-unified-runtime.cpp

We will be adding more UR testing as it is getting ready for more :)

For that, I prefer that we try to support what's missing in UR for running "sycl-ls" rather than trying to temporarily disable UR elsewhere.

@jandres742
Contributor

@smaslov-intel . Agree.

I cloned @aelovikov-intel's branch but cannot reproduce the error on an Intel Max device running Linux.

@aelovikov-intel were you able to reproduce it locally? Do you have a backtrace or a SYCL_PI_TRACE=2 log to see what is happening?

$ SYCL_PI_TRACE=1 LD_LIBRARY_PATH=~/llvm/lib/ ~/llvm/bin/sycl-ls --verbose
SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_opencl.so [ PluginVersion: 12.23.1 ]
SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_unified_runtime.so [ PluginVersion: 12.23.1 ]
SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_level_zero.so [ PluginVersion: 12.23.1 ]
[opencl:gpu:0] Intel(R) OpenCL HD Graphics, Intel(R) Data Center GPU Max 1550 3.0 [23.08.25733]
[ext_oneapi_unified_runtime:gpu:0] Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Data Center GPU Max 1550 1.3 [1.3.25733]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Data Center GPU Max 1550 1.3 [1.3.25733]

Platforms: 3
Platform [#1]:
    Version  : OpenCL 3.0
    Name     : Intel(R) OpenCL HD Graphics
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#0]:
        Type       : gpu
        Version    : 3.0
        Name       : Intel(R) Data Center GPU Max 1550
        Vendor     : Intel(R) Corporation
        Driver     : 23.08.25733
        Aspects    : gpu fp16 fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations atomic64 ext_oneapi_srgb ext_intel_device_id
Platform [#2]:
    Version  : 1.3
    Name     : Intel(R) oneAPI Unified Runtime over Level-Zero
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#0]:
        Type       : gpu
        Version    : 1.3
        Name       : Intel(R) Data Center GPU Max 1550
        Vendor     : Intel(R) Corporation
        Driver     : 1.3.25733
        Aspects    : gpu fp16 fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_gpu_eu_count ext_intel_gpu_eu_simd_width ext_intel_gpu_slices ext_intel_gpu_subslices_per_slice ext_intel_gpu_eu_count_per_subslice atomic64 ext_intel_device_info_uuid ext_intel_gpu_hw_threads_per_eu ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width
Platform [#3]:
    Version  : 1.3
    Name     : Intel(R) Level-Zero
    Vendor   : Intel(R) Corporation
    Devices  : 1
        Device [#0]:
        Type       : gpu
        Version    : 1.3
        Name       : Intel(R) Data Center GPU Max 1550
        Vendor     : Intel(R) Corporation
        Driver     : 1.3.25733
        Aspects    : gpu fp16 fp64 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations ext_intel_gpu_eu_count ext_intel_gpu_eu_simd_width ext_intel_gpu_slices ext_intel_gpu_subslices_per_slice ext_intel_gpu_eu_count_per_subslice atomic64 ext_intel_device_info_uuid ext_intel_gpu_hw_threads_per_eu ext_intel_device_id ext_intel_memory_clock_rate ext_intel_memory_bus_width
SYCL_PI_TRACE[all]: Requested device_type: info::device_type::automatic
SYCL_PI_TRACE[all]: Requested device_type: info::device_type::automatic
SYCL_PI_TRACE[all]: Requested device_type: info::device_type::automatic
SYCL_PI_TRACE[all]: Selected device: -> final score = 550
SYCL_PI_TRACE[all]:   platform: Intel(R) Level-Zero
SYCL_PI_TRACE[all]:   device: Intel(R) Data Center GPU Max 1550
default_selector()      : gpu, Intel(R) Level-Zero, Intel(R) Data Center GPU Max 1550 1.3 [1.3.25733]
SYCL_PI_TRACE[all]: Requested device_type: info::device_type::accelerator
SYCL_PI_TRACE[all]: Requested device_type: info::device_type::accelerator
SYCL_PI_TRACE[all]: Requested device_type: info::device_type::accelerator
accelerator_selector()  : No device of requested type available. -1 (PI_ERRO...
SYCL_PI_TRACE[all]: Requested device_type: info::device_type::cpu
SYCL_PI_TRACE[all]: Requested device_type: info::device_type::cpu
SYCL_PI_TRACE[all]: Requested device_type: info::device_type::cpu
cpu_selector()          : No device of requested type available. -1 (PI_ERRO...
SYCL_PI_TRACE[all]: Requested device_type: info::device_type::gpu
SYCL_PI_TRACE[all]: Requested device_type: info::device_type::gpu
SYCL_PI_TRACE[all]: Requested device_type: info::device_type::gpu
SYCL_PI_TRACE[all]: Selected device: -> final score = 1050
SYCL_PI_TRACE[all]:   platform: Intel(R) Level-Zero
SYCL_PI_TRACE[all]:   device: Intel(R) Data Center GPU Max 1550
gpu_selector()          : gpu, Intel(R) Level-Zero, Intel(R) Data Center GPU Max 1550 1.3 [1.3.25733]
SYCL_PI_TRACE[all]: Selected device: -> final score = 1
SYCL_PI_TRACE[all]:   platform: Intel(R) Level-Zero
SYCL_PI_TRACE[all]:   device: Intel(R) Data Center GPU Max 1550
custom_selector(gpu)    : gpu, Intel(R) Level-Zero, Intel(R) Data Center GPU Max 1550 1.3 [1.3.25733]
custom_selector(cpu)    : No device of requested type available. -1 (PI_ERRO...
custom_selector(acc)    : No device of requested type available. -1 (PI_ERRO...

@al42and
Contributor

al42and commented Feb 23, 2023

Crashes with CUDA PI for me:

$ ONEAPI_DEVICE_SELECTOR=cuda:gpu SYCL_PI_TRACE=1 sycl-ls --verbose
Warning: ONEAPI_DEVICE_SELECTOR environment variable is set to cuda:gpu.
To see the correct device id, please unset ONEAPI_DEVICE_SELECTOR.

SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_cuda.so [ PluginVersion: 12.23.1 ]
[ext_oneapi_cuda:gpu:0] NVIDIA CUDA BACKEND, NVIDIA GeForce RTX 3060 0.0 [CUDA 12.0]

Platforms: 1
Platform [#1]:
    Version  : CUDA 12.0
    Name     : NVIDIA CUDA BACKEND
    Vendor   : NVIDIA Corporation
    Devices  : 1
        Device [#0]:
        Type       : gpu
        Version    : 0.0
        Name       : NVIDIA GeForce RTX 3060
        Vendor     : NVIDIA Corporation
        Driver     : CUDA 12.0
        Aspects    : gpu fp16 fp64pi_print: Images are not fully supported by the CUDA BE, their support is disabled by default. Their partial support can be activated by setting SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT environment variable at runtime.
 online_compiler online_linker queue_profiling usm_device_allocations usm_host_allocations usm_shared_allocations usm_atomic_host_allocations usm_atomic_shared_allocations atomic64 ext_intel_device_info_uuidpi_die: Unknown parameter 65575 passed to cuda_piDeviceGetInfo

terminate called without an active exception
Aborted (core dumped)

Log with SYCL_PI_TRACE=2:
sycl_ls_verbose_pi_trace_2.log

Also, note the slightly garbled output about SYCL_PI_CUDA_ENABLE_IMAGE_SUPPORT.

@jandres742
Contributor

jandres742 commented Feb 24, 2023

thanks @al42and . So in the log we are failing in:

ext_intel_device_info_uuid---> piDeviceGetInfo(
	<unknown> : 0x55f5a1d31120
	<unknown> : 65575
	<unknown> : 4
	<unknown> : 0x7ffd782591d8
	<nullptr>
pi_die: Unknown parameter 65575 passed to cuda_piDeviceGetInfo

and for the L0 adapter we have it:

https://github.com/aelovikov-intel/llvm/blob/4ac84791365dd8f0328bf36ee2a92310029e3a72/sycl/plugins/unified_runtime/ur/adapters/level_zero/ur_level_zero.cpp#L566

  case ZER_DEVICE_INFO_UUID: {
    // Intel extension for device UUID. This returns the UUID as
    // std::array<std::byte, 16>. For details about this extension,
    // see sycl/doc/extensions/supported/sycl_ext_intel_device_info.md.
    const auto &UUID = Device->ZeDeviceProperties->uuid.id;
    return ReturnValue(UUID, sizeof(UUID));
  }

So maybe the error is that this is not implemented for adapters other than L0? Should we just filter out the others and run sycl-ls with UR L0 for now? @smaslov-intel ?

@smaslov-intel
Contributor

> So maybe the error is that this is not implemented for adapters other than L0? Should we just filter out the others and run sycl-ls with UR L0 for now? @smaslov-intel ?

Right. The application should check whether the "uuid" query (extension) is supported:
https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/supported/sycl_ext_intel_device_info.md#device-uuid
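
A minimal sketch of that check-before-query pattern follows. In real SYCL code the guard would be `device.has(sycl::aspect::ext_intel_device_info_uuid)` before reading the UUID through the extension's device info query; here `Aspect`, `Uuid`, `MockDevice` and `queryUuid` are hypothetical stand-ins so the example compiles with a plain C++17 compiler, and the mock throws where the CUDA plugin in the log above died.

```cpp
#include <array>
#include <cassert>
#include <optional>
#include <stdexcept>

// Hypothetical stand-ins; not the real SYCL types.
enum class Aspect { ext_intel_device_info_uuid };
using Uuid = std::array<unsigned char, 16>;

struct MockDevice {
  bool SupportsUuid;
  Uuid Id;

  bool has(Aspect) const { return SupportsUuid; }

  // Mimics a backend that rejects the query when it is not implemented.
  Uuid uuid() const {
    if (!SupportsUuid)
      throw std::runtime_error("unknown parameter passed to device info query");
    return Id;
  }
};

// Ask for the aspect first; only issue the UUID query when it is reported.
std::optional<Uuid> queryUuid(const MockDevice &Device) {
  if (!Device.has(Aspect::ext_intel_device_info_uuid))
    return std::nullopt;
  return Device.uuid();
}
```

Returning `std::optional` makes the "not supported" case explicit for the caller instead of relying on the query failing.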

@abagusetty
Contributor

>         Driver     : 1.3.25733

@jandres742 I am trying to figure out what the driver version printed for the L0 and/or UR plugin (Driver : 1.3.25733) corresponds to. The L0 loader and headers downloaded by DPC++/LLVM by default correspond to driver 1.3.25242. I am a bit confused about whether this is the L0 spec version or an L0 release version. This doesn't seem like the compute-runtime version. Can you help with this?

@jandres742
Contributor

>>         Driver     : 1.3.25733
>
> @jandres742 I am trying to figure out what the driver version printed for the L0 and/or UR plugin (Driver : 1.3.25733) corresponds to. The L0 loader and headers downloaded by DPC++/LLVM by default correspond to driver 1.3.25242. I am a bit confused about whether this is the L0 spec version or an L0 release version. This doesn't seem like the compute-runtime version. Can you help with this?

@abagusetty it is actually the latter, the compute-driver version. The latest stable public version is https://github.com/intel/compute-runtime/releases/tag/22.53.25242.13 = 25242. You are running a newer version (which is OK, it is just newer).
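
To make the comparison concrete, here is a small sketch that pulls the build number out of both kinds of version string. It assumes, based only on the two examples in this thread (`1.3.25733` and the release tag `22.53.25242.13`), that the build number is the third dot-separated field; `splitOnDots` and `buildNumber` are hypothetical helper names, not part of any driver tooling.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Splits "a.b.c" into {"a", "b", "c"}.
std::vector<std::string> splitOnDots(const std::string &Text) {
  std::vector<std::string> Fields;
  std::istringstream Stream(Text);
  std::string Field;
  while (std::getline(Stream, Field, '.'))
    Fields.push_back(Field);
  return Fields;
}

// Assumption: the build number is the third dot-separated field, as in
// "1.3.25733" (sycl-ls output) and "22.53.25242.13" (release tag).
std::optional<int> buildNumber(const std::string &Version) {
  const std::vector<std::string> Fields = splitOnDots(Version);
  if (Fields.size() < 3)
    return std::nullopt;
  try {
    return std::stoi(Fields[2]);
  } catch (const std::exception &) {
    return std::nullopt; // Third field was not a number.
  }
}
```

With these inputs, `buildNumber("1.3.25733")` is larger than `buildNumber("22.53.25242.13")`, which matches the remark that the locally installed driver is newer than the latest public release.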

againull pushed a commit that referenced this pull request Apr 19, 2023
Adds `PI_DEVICE_INFO_IMAGE_SRGB` to the switch case in
`piDeviceGetInfo`.
It's a required fix for #8433 .
Before, `sycl-ls --verbose` in #8433 terminated with an error since the
default case was invoked (`pi::die`)
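
The failure mode described in this commit message (an unhandled parameter falling through to a fatal `default`) can be illustrated with a mock. Everything below is a hypothetical sketch: `DeviceInfo`, `Result`, `mockDeviceGetInfo` and `hasConservatively` are invented names, the values returned per case are made up, and the `default` branch returns an error code rather than calling `pi::die`, which is the "conservatively return false" alternative discussed earlier in the thread and not what the referenced commit does.

```cpp
#include <cassert>

// Hypothetical query identifiers and result codes, not the real PI enums.
enum class DeviceInfo { Fp64, ImageSrgb, Uuid };
enum class Result { Success, UnsupportedQuery };

// Mock of a plugin-side device info switch.
Result mockDeviceGetInfo(DeviceInfo Param, bool *Value) {
  switch (Param) {
  case DeviceInfo::Fp64:
    *Value = true;
    return Result::Success;
  case DeviceInfo::ImageSrgb:
    // Explicit case, analogous to adding PI_DEVICE_INFO_IMAGE_SRGB: the
    // query now gets an answer instead of reaching the default branch.
    *Value = false;
    return Result::Success;
  default:
    // Report the gap to the caller instead of terminating the process.
    return Result::UnsupportedQuery;
  }
}

// Caller-side wrapper: any failed query is treated as "not supported".
bool hasConservatively(DeviceInfo Param) {
  bool Value = false;
  if (mockDeviceGetInfo(Param, &Value) != Result::Success)
    return false;
  return Value;
}
```

In this sketch a tool that lists every aspect keeps running when it hits a query the backend does not know about, at the cost of possibly under-reporting what the device supports.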
@aelovikov-intel aelovikov-intel marked this pull request as draft April 27, 2023 18:12
@aelovikov-intel aelovikov-intel marked this pull request as ready for review April 28, 2023 16:23
@steffenlarsen steffenlarsen left a comment

Awesome! LGTM!

@steffenlarsen steffenlarsen merged commit e32ab17 into intel:sycl May 2, 2023
@aelovikov-intel aelovikov-intel deleted the sycl-ls-aspects branch May 10, 2023 22:24