[SYCL] Match explicit offload arch for AMD and NVIDIA #7028

Merged
merged 2 commits into from
Oct 18, 2022

jchlanda
Contributor

Fixes: #6792

When multiple SYCL targets are specified, make sure that each offload arch is matched with the correct target. Normally this is fixed up later (in the call to SYCLActionBuilder::withBoundArchForToolChain), but when creating libraries we can end up in a broken state, because the code relies on the ordering of the GPU map.
See the phases of the following clang invocation:
clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda,amdgcn-amd-amdhsa -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx908 -Xsycl-target-backend=nvptx64-nvidia-cuda --offload-arch=sm_86 -c in.cpp -o out.o -ccc-print-phases

         +- 0: input, "/dash_c_multiple_targets.cpp", c++, (device-sycl, gfx908)
      +- 1: preprocessor, {0}, c++-cpp-output, (device-sycl, gfx908)
   +- 2: compiler, {1}, ir, (device-sycl, gfx908)
+- 3: offload, "device-sycl (nvptx64-nvidia-cuda:gfx908)" {2}, ir
|        +- 4: input, "/dash_c_multiple_targets.cpp", c++, (device-sycl, sm_86)
|     +- 5: preprocessor, {4}, c++-cpp-output, (device-sycl, sm_86)
|  +- 6: compiler, {5}, ir, (device-sycl, sm_86)
|- 7: offload, "device-sycl (amdgcn-amd-amdhsa:sm_86)" {6}, ir
|                 +- 8: input, "/dash_c_multiple_targets.cpp", c++, (host-sycl)
|              +- 9: append-footer, {8}, c++, (host-sycl)
|           +- 10: preprocessor, {9}, c++-cpp-output, (host-sycl)
|        +- 11: offload, "host-sycl (x86_64-unknown-linux-gnu)" {10}, "device-sycl (amdgcn-amd-amdhsa:sm_86)" {6}, c++-cpp-output
|     +- 12: compiler, {11}, ir, (host-sycl)
|  +- 13: backend, {12}, assembler, (host-sycl)
|- 14: assembler, {13}, object, (host-sycl)
15: clang-offload-bundler, {3, 7, 14}, object, (host-sycl)

Here the offload archs end up mismatched: gfx908 is bound to nvptx64-nvidia-cuda (phase 3) and sm_86 to amdgcn-amd-amdhsa (phase 7).
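To illustrate the idea of matching by content rather than by position, here is a minimal, self-contained sketch. It is not the code from this patch: `archBelongsToTriple` and `matchArchs` are hypothetical helpers, and the sketch assumes the usual naming conventions (`sm_NN` for NVIDIA archs, `gfxNNN` for AMD archs).

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not part of the clang driver): decide whether an
// offload arch name belongs to the backend named by a target triple.
// Assumes "sm_*" archs are NVIDIA and "gfx*" archs are AMD.
bool archBelongsToTriple(const std::string &Triple, const std::string &Arch) {
  if (Triple.rfind("nvptx64", 0) == 0) // triple starts with "nvptx64"
    return Arch.rfind("sm_", 0) == 0;  // arch starts with "sm_"
  if (Triple.rfind("amdgcn", 0) == 0)  // triple starts with "amdgcn"
    return Arch.rfind("gfx", 0) == 0;  // arch starts with "gfx"
  return false;
}

// Hypothetical helper: bind each triple to the first arch that matches it,
// regardless of the order in which triples and archs were listed.
std::map<std::string, std::string>
matchArchs(const std::vector<std::string> &Triples,
           const std::vector<std::string> &Archs) {
  std::map<std::string, std::string> Bound;
  for (const std::string &T : Triples) {
    for (const std::string &A : Archs) {
      if (archBelongsToTriple(T, A)) {
        Bound[T] = A;
        break;
      }
    }
  }
  return Bound;
}
```

With the triples and archs from the invocation above, passed in the same order as on the command line, this sketch binds nvptx64-nvidia-cuda to sm_86 and amdgcn-amd-amdhsa to gfx908.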

@jchlanda jchlanda requested a review from a team as a code owner October 12, 2022 11:24
@jchlanda
Contributor Author

@mdtoguchi I don't have access to an Intel GPU, but I suspect that the creation of Devicename (https://github.com/jchlanda/llvm/blob/jakub/dash_c_arch_fix/clang/lib/Driver/Driver.cpp#L5801) could suffer from the same problem.

@mdtoguchi
Contributor

> @mdtoguchi I don't have access to an Intel GPU, but I suspect that the creation of Devicename (https://github.com/jchlanda/llvm/blob/jakub/dash_c_arch_fix/clang/lib/Driver/Driver.cpp#L5801) could suffer from the same problem.

Thanks for the pointer - I'll take a look.

@jchlanda
Contributor Author

My bad, sorry. Added now.

@jchlanda jchlanda requested a review from mdtoguchi October 17, 2022 08:51
Contributor

@mdtoguchi mdtoguchi left a comment

LGTM

@bader bader merged commit 4189858 into intel:sycl Oct 18, 2022
Successfully merging this pull request may close these issues.

-Xsycl-target-backend=a-b-c passes flags to all backends