Skip to content

[SYCL][New offload model] Add SYCL E2E tests for --offload-new-driver option and fix failing tests #14730

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Jul 29, 2024

Conversation

asudarsa
Copy link
Contributor

@asudarsa asudarsa commented Jul 23, 2024

This PR is used to add a set of SYCL E2E tests with --offload-new-driver enabled.
Some failures related to offloading to 3rd party hardware have been resolved.

@asudarsa asudarsa requested a review from a team as a code owner July 23, 2024 18:54
@asudarsa asudarsa marked this pull request as draft July 23, 2024 18:54
@asudarsa asudarsa force-pushed the add_e2e_test_for_new_offload_model branch from 1954f0f to 20e81e2 Compare July 24, 2024 18:41
@asudarsa asudarsa changed the title [SYCL][New offload model] Add SYCL E2E tests for --offload-new-driver option [SYCL][New offload model] Add SYCL E2E tests for --offload-new-driver option and fix failing tests Jul 25, 2024
@asudarsa asudarsa marked this pull request as ready for review July 25, 2024 05:19
@asudarsa asudarsa requested a review from a team as a code owner July 25, 2024 05:19
@asudarsa
Copy link
Contributor Author

Hi @sarnex @maksimsab and @mdtoguchi

Can you please take a look? I have aligned new offloading model flow with old offloading flow for 3rd party hardware.

Thanks

@asudarsa
Copy link
Contributor Author

Hi @Naghasan,

Can you please take a look?

Thanks

Copy link
Contributor

@sarnex sarnex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

initial review, thanks!

CmdArgs.push_back(*TempFileOrErr);
CmdArgs.push_back(InputFile);
if (Error Err = executeCommands(*PtxasPath, CmdArgs))
return std::move(Err);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: if we are returning, do we need to std::move?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that std::move has been used wherever we are returning Err. I followed the trend.

@@ -1272,6 +1313,8 @@ Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args) {
if (!Triple.isNVPTX())
CmdArgs.push_back("-Wl,--no-undefined");

if (IsSYCLKind && Triple.isNVPTX())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we know why this is SYCL specific and not NVPTX specific?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I am trying to match the behavior of new offloading model with that found in the new offloading model (for SYCL). This particular flow (getting the assembly file from the clang call and then explicitly calling ptxas inside the linker-wrapper tool) matches with what is found in the old offloading model, but does not match the community flow. Hence the SYCL specificity. This surely warrants revisiting at a later time.

if (!ClangOutputOrErr)
return ClangOutputOrErr.takeError();
if (Triple.isNVPTX()) {
auto VirtualArch =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move the NVPTX code inside the if into a separate function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason for this change?

Thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought for the loop was getting a bit complicated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I revisited the code. I agree that the loop is getting a bit complicated. But I am not able to come up with a 'logical' split to move the NVPTX case into a new function.

@@ -0,0 +1,23 @@
#include "split-per-source.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need a copyright header?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file was lifted from here: ./sycl/test-e2e/DeviceCodeSplit/Inputs/split-per-source-second-file.cpp where I do not find a copyright header.

@@ -0,0 +1,6 @@
// REQUIRES: opencl-aot, cpu
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we also need any-device-is-cpu i think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not aware of this setting. But I can add that here. Again this test was lifted from ./sycl/test-e2e/DeviceCodeSplit/aot-cpu.cpp and I added --offload-new-driver

//
//===---------------------------------------------------------------------===//

// REQUIRES: opencl-aot, cpu
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need any-device-is-cpu here too i think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above comments. Thanks

// Test with `--offload-new-driver`
// RUN: %{build} --offload-new-driver -c -o %t.kernel.o -DINIT_KERNEL -DCALC_KERNEL
// RUN: %{build} --offload-new-driver -c -o %t.main.o -DMAIN_APP
// RUN: %clangxx -fsycl -fsycl-targets=%{sycl_triple} --offload-new-driver %t.kernel.o %t.main.o -o %t.fat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you might be able to use %{build} here

// RUN: --offload-new-driver -fsycl-dead-args-optimization
// RUN: %{run} %t.out
//
// XFAIL: hip_nvidia
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have a tracker for this XFAIL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure. I see the PR marking this as XFAIL here: intel/llvm-test-suite#525 But there is no related 'issue' for this.


// Test with `--offload-new-driver`
// RUN: %clangxx -fsycl -fsycl-device-code-split=per_source -fsycl-targets=spir64_x86_64 -I %S/Inputs -o %t.out %S/split-per-source-main.cpp %S/Inputs/split-per-source-second-file.cpp \
// RUN: -fsycl-dead-args-optimization --offload-new-driver
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-fsycl-dead-args-optimization seems to be enabled by default, so do we really need to pass it everywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have lifted this file again from ./sycl/test-e2e/DeviceCodeSplit/aot-cpu.cpp. May be we can have a cleanup PR later on?

@@ -11065,7 +11065,7 @@ void LinkerWrapper::ConstructJob(Compilation &C, const JobAction &JA,
ArgStringList CmdArgs;

// Pass the CUDA path to the linker wrapper tool.
for (Action::OffloadKind Kind : {Action::OFK_Cuda, Action::OFK_OpenMP}) {
for (Action::OffloadKind Kind : {Action::OFK_Cuda, Action::OFK_OpenMP, Action::OFK_SYCL}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a test we can update to verify that --cuda-path is being passed from the driver?

Copy link
Contributor Author

@asudarsa asudarsa Jul 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Coming soon!!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in the latest commit

@asudarsa
Copy link
Contributor Author

Protex IP scan reports are based on 'already' existing tests. It is unclear to me about the action items.

@asudarsa asudarsa force-pushed the add_e2e_test_for_new_offload_model branch from a5fcb73 to 8fdc107 Compare July 25, 2024 18:26
@asudarsa
Copy link
Contributor Author

Hi @sarnex

Thanks so much for the feedback. A quick summary of my responses to your comments about the 'newly' added tests.
All newly added tests have been 'derived' from existing tests with only the addition of '--offload-new-driver' option to use the new offloading model. I agree that the comments about issues in the test need to be addressed. Since this PR did not introduce these issues, I propose that we resolve them in a subsequent PR. I can set up a tracker for that. Hope the resolution is agreeable.

Thanks
Sincerely

@sarnex
Copy link
Contributor

sarnex commented Jul 25, 2024

Okay with me!

@asudarsa asudarsa requested review from sarnex and mdtoguchi July 25, 2024 18:34
Copy link
Contributor

@sarnex sarnex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm as incremental change, hopefully we can address the cleanup tasks soon

@asudarsa asudarsa force-pushed the add_e2e_test_for_new_offload_model branch from 47c3185 to 97fbc98 Compare July 26, 2024 03:41
Copy link
Contributor

@maksimsab maksimsab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

asudarsa added 9 commits July 28, 2024 09:25
… option

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
…ckend compilation

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@asudarsa asudarsa force-pushed the add_e2e_test_for_new_offload_model branch from 2b6e88d to 151909a Compare July 28, 2024 16:25
asudarsa added 3 commits July 29, 2024 03:07
This reverts commit 151909a.
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@maarquitos14
Copy link
Contributor

@intel/llvm-gatekeepers precommit failure is a known-issue due to hwloc not being available in RHEL. Can we get this merged?

@dm-vodopyanov dm-vodopyanov merged commit 1de77c3 into intel:sycl Jul 29, 2024
14 of 15 checks passed
AlexeySachkov pushed a commit to AlexeySachkov/llvm that referenced this pull request Nov 26, 2024
… option and fix failing tests (intel#14730)

This PR is used to add a set of SYCL E2E tests with --offload-new-driver
enabled.
Some failures related to offloading to 3rd party hardware have been
resolved.

---------

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants