-
Notifications
You must be signed in to change notification settings - Fork 790
[SYCL][New offload model] Add SYCL E2E tests for --offload-new-driver option and fix failing tests #14730
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][New offload model] Add SYCL E2E tests for --offload-new-driver option and fix failing tests #14730
Conversation
1954f0f
to
20e81e2
Compare
Hi @sarnex @maksimsab and @mdtoguchi Can you please take a look? I have aligned new offloading model flow with old offloading flow for 3rd party hardware. Thanks |
Hi @Naghasan, Can you please take a look? Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initial review, thanks!
CmdArgs.push_back(*TempFileOrErr); | ||
CmdArgs.push_back(InputFile); | ||
if (Error Err = executeCommands(*PtxasPath, CmdArgs)) | ||
return std::move(Err); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: if we are returning, do we need to std::move
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that std::move has been used wherever we are returning Err. I followed the trend.
@@ -1272,6 +1313,8 @@ Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args) { | |||
if (!Triple.isNVPTX()) | |||
CmdArgs.push_back("-Wl,--no-undefined"); | |||
|
|||
if (IsSYCLKind && Triple.isNVPTX()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we know why this is SYCL specific and not NVPTX specific?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, I am trying to match the behavior of new offloading model with that found in the new offloading model (for SYCL). This particular flow (getting the assembly file from the clang call and then explicitly calling ptxas inside the linker-wrapper tool) matches with what is found in the old offloading model, but does not match the community flow. Hence the SYCL specificity. This surely warrants revisiting at a later time.
if (!ClangOutputOrErr) | ||
return ClangOutputOrErr.takeError(); | ||
if (Triple.isNVPTX()) { | ||
auto VirtualArch = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move the NVPTX code inside the if into a separate function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a specific reason for this change?
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought for the loop was getting a bit complicated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I revisited the code. I agree that the loop is getting a bit complicated. But I am not able to come up with a 'logical' split to move the NVPTX case into a new function.
@@ -0,0 +1,23 @@ | |||
#include "split-per-source.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need a copyright header?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file was lifted from here: ./sycl/test-e2e/DeviceCodeSplit/Inputs/split-per-source-second-file.cpp where I do not find a copyright header.
@@ -0,0 +1,6 @@ | |||
// REQUIRES: opencl-aot, cpu |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we also need any-device-is-cpu
i think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not aware of this setting. But I can add that here. Again this test was lifted from ./sycl/test-e2e/DeviceCodeSplit/aot-cpu.cpp and I added --offload-new-driver
// | ||
//===---------------------------------------------------------------------===// | ||
|
||
// REQUIRES: opencl-aot, cpu |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need any-device-is-cpu
here too i think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above comments. Thanks
// Test with `--offload-new-driver` | ||
// RUN: %{build} --offload-new-driver -c -o %t.kernel.o -DINIT_KERNEL -DCALC_KERNEL | ||
// RUN: %{build} --offload-new-driver -c -o %t.main.o -DMAIN_APP | ||
// RUN: %clangxx -fsycl -fsycl-targets=%{sycl_triple} --offload-new-driver %t.kernel.o %t.main.o -o %t.fat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you might be able to use %{build}
here
// RUN: --offload-new-driver -fsycl-dead-args-optimization | ||
// RUN: %{run} %t.out | ||
// | ||
// XFAIL: hip_nvidia |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have a tracker for this XFAIL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure. I see the PR marking this as XFAIL here: intel/llvm-test-suite#525 But there is no related 'issue' for this.
|
||
// Test with `--offload-new-driver` | ||
// RUN: %clangxx -fsycl -fsycl-device-code-split=per_source -fsycl-targets=spir64_x86_64 -I %S/Inputs -o %t.out %S/split-per-source-main.cpp %S/Inputs/split-per-source-second-file.cpp \ | ||
// RUN: -fsycl-dead-args-optimization --offload-new-driver |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-fsycl-dead-args-optimization
seems to be enabled by default, so do we really need to pass it everywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have lifted this file again from ./sycl/test-e2e/DeviceCodeSplit/aot-cpu.cpp. May be we can have a cleanup PR later on?
@@ -11065,7 +11065,7 @@ void LinkerWrapper::ConstructJob(Compilation &C, const JobAction &JA, | |||
ArgStringList CmdArgs; | |||
|
|||
// Pass the CUDA path to the linker wrapper tool. | |||
for (Action::OffloadKind Kind : {Action::OFK_Cuda, Action::OFK_OpenMP}) { | |||
for (Action::OffloadKind Kind : {Action::OFK_Cuda, Action::OFK_OpenMP, Action::OFK_SYCL}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a test we can update to verify that --cuda-path
is being passed from the driver?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Coming soon!!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in the latest commit
Protex IP scan reports are based on 'already' existing tests. It is unclear to me about the action items. |
a5fcb73
to
8fdc107
Compare
Hi @sarnex Thanks so much for the feedback. A quick summary of my responses to your comments about the 'newly' added tests. Thanks |
Okay with me! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm as incremental change, hopefully we can address the cleanup tasks soon
47c3185
to
97fbc98
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
… option Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
…ckend compilation Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
2b6e88d
to
151909a
Compare
This reverts commit 151909a.
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
@intel/llvm-gatekeepers precommit failure is a known-issue due to |
… option and fix failing tests (intel#14730) This PR is used to add a set of SYCL E2E tests with --offload-new-driver enabled. Some failures related to offloading to 3rd party hardware have been resolved. --------- Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
This PR is used to add a set of SYCL E2E tests with --offload-new-driver enabled.
Some failures related to offloading to 3rd party hardware have been resolved.