Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support UR program creation from multiple device binaries #2147

Open
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

againull
Copy link
Contributor

@againull againull commented Sep 28, 2024

To support multi-device AOT scenario in SYCL we need an ability to create UR program from multiple device binaries.

Changes in this PR:

  • Modify the core function urProgramCreateWithBinary to support program creation from multiple device binaries.
  • Add methods to ur_program to get/set per-device data like L0 module handle, build log etc. Otherwise any change in the structure of the class requires multiple changes in all UR functions which work with ur_program, in addition to this it allows to handle interop case (which implementation is an exception right now) in a single place. Also make State and some other info per-device because it is possible that UR program is associated with multiple devices and urProgramBuildExp is called multiple times for subsets, and we have to know the state per-device.

Corresponding intel/llvm PR with E2E tests: intel/llvm#15546
Also outlined minimal changes just to uplift UR tag: intel/llvm#15637

@againull againull force-pushed the ur/draft/multi_device_create_with_binary branch 2 times, most recently from 89e8d9d to 47fafe3 Compare September 30, 2024 05:35
@againull againull marked this pull request as ready for review September 30, 2024 06:02
@againull againull requested review from a team as code owners September 30, 2024 06:02
@aarongreig
Copy link
Contributor

There's an ongoing effort to remove this extension by making it the default behaviour for the affected entry points #1195 as it was a workaround for an issue discovered late in the last release cycle. Since it seems like there's a compelling use case I think this should be a change to urProgramCreateWithBinary, maybe with a device info flag for adapters report whether they support creating binaries for more than one device at a time

@againull againull force-pushed the ur/draft/multi_device_create_with_binary branch 2 times, most recently from 2b1cf6e to 584869e Compare October 1, 2024 07:25
@againull
Copy link
Contributor Author

againull commented Oct 1, 2024

There's an ongoing effort to remove this extension by making it the default behaviour for the affected entry points #1195 as it was a workaround for an issue discovered late in the last release cycle. Since it seems like there's a compelling use case I think this should be a change to urProgramCreateWithBinary, maybe with a device info flag for adapters report whether they support creating binaries for more than one device at a time

Thank you, I've changed the PR to modify the core urProgramCreateWithBinary instead.

@againull againull force-pushed the ur/draft/multi_device_create_with_binary branch 3 times, most recently from 550df8a to 44a2f02 Compare October 1, 2024 15:40

UR_CHECK_ERROR(
createProgram(hContext, hDevice, size, pBinary, pProperties, phProgram));
if (numDevices == 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should add this to the yml so it gets generated and used by all adapters, I think it should be an INVALID_SIZE return:

diff --git a/scripts/core/program.yml b/scripts/core/program.yml
index 3fe2632c..2a0bc14a 100644
--- a/scripts/core/program.yml
+++ b/scripts/core/program.yml
@@ -158,6 +158,7 @@ returns:
         - "`NULL != pProperties && pProperties->count > 0 && NULL == pProperties->pMetadatas`"
     - $X_RESULT_ERROR_INVALID_SIZE:
         - "`NULL != pProperties && NULL != pProperties->pMetadatas && pProperties->count == 0`"
+        - "`numDevices == 0`"
     - $X_RESULT_ERROR_INVALID_NATIVE_BINARY:
         - "If any binary in `ppBinaries` isn't a valid binary for the corresponding device in `phDevices.`"
 --- #--------------------------------------------------------------------------

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed.

@kbenzie
Copy link
Contributor

kbenzie commented Oct 2, 2024

@againull looks like there are conformance tests which still need updated. Some conformance tests are only enabled when the UR_DPCXX CMake variable points at clang++ from an intel/llvm build as they depend on generating program binaries.

@againull
Copy link
Contributor Author

againull commented Oct 2, 2024

@againull looks like there are conformance tests which still need updated. Some conformance tests are only enabled when the UR_DPCXX CMake variable points at clang++ from an intel/llvm build as they depend on generating program binaries.

Oh, I see, thanks, I will check and fix that.

@againull
Copy link
Contributor Author

againull commented Oct 8, 2024

@againull looks like there are conformance tests which still need updated. Some conformance tests are only enabled when the UR_DPCXX CMake variable points at clang++ from an intel/llvm build as they depend on generating program binaries.

Fixed remaining failures.

@kbenzie kbenzie added loader Loader related feature/bug conformance Conformance test suite issues. specification Changes or additions to the specification level-zero L0 adapter specific issues cuda CUDA adapter specific issues hip HIP adapter specific issues opencl OpenCL adapter specific issues native-cpu Native CPU adapter specific issues labels Oct 8, 2024
@againull
Copy link
Contributor Author

Sorry, I was formatting using .clang-format in unified-runtime and latest clang-format, it turns out cppformat target needs to be used together with clang-format-15 to get the correct formatting, fixed that.

Other than that only e2e tasks are failing because at least these changes intel/llvm#15637 are required in intel/llvm to use new function signature.

@againull againull force-pushed the ur/draft/multi_device_create_with_binary branch from a6c5a64 to b44bc43 Compare October 10, 2024 22:24
@againull
Copy link
Contributor Author

Can I ask UR team to take a look at this PR please. Most of the changes are related to the L0 adapter, so it would be nice if @pbalcer or @nrspruit could provide a review.

Copy link
Contributor

@aarongreig aarongreig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spec, loader, CTS and CL adapter changes LGTM. It would be nice to have a test that creates a program with multiple devices and binaries but that's going to require a bit of CTS work to accommodate so I've added it to #1197 for future work.

@nrspruit
Copy link
Contributor

nrspruit commented Oct 11, 2024

So far, everything looks good to me, however, @againull can you create a devops branch to kick off testing with multi devices in intel/llvm? Since this is specifically for multi device, then running on PVC for the e2e tests would be ideal. By default I don't think the pre-commit runs on PVC or DG2 or anything other than GEN12 for the most part.

@againull
Copy link
Contributor Author

So far, everything looks good to me, however, @againull can you create a devops branch to kick off testing with multi devices in intel/llvm? Since this is specifically for multi device, then running on PVC for the e2e tests would be ideal. By default I don't think the pre-commit runs on PVC or DG2 or anything other than GEN12 for the most part.

Ok, sure, I will invoke such testing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
conformance Conformance test suite issues. cuda CUDA adapter specific issues hip HIP adapter specific issues level-zero L0 adapter specific issues loader Loader related feature/bug native-cpu Native CPU adapter specific issues opencl OpenCL adapter specific issues specification Changes or additions to the specification
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants