[Driver][SYCL] Add support for large device code via a linker script #6584
Add -fsycl-huge-device-code as a driver option. When enabled, the driver generates a linker script that directs the linker to place device code later in the binary, making it less likely that a reference has to cover a distance larger than a PC32 relocation (a signed 32-bit offset, roughly ±2GB) can span. For example, if a 3GB __clang_offload_bundle__ section is placed between .text and .rodata, a PC32 relocation from .text to .rodata is not possible. With this option enabled, the device code is placed after both sections rather than between them.
LGTM, thanks!
There is 1 failing LIT test in "SYCL / Linux / OCL x64 LLVM Test Suite" which appears unrelated. Others have mentioned that it should be fixed by intel/llvm-test-suite#1040.
Hi @intel/dpcpp-doc-reviewers -- could you please take a look at the
@intel/llvm-gatekeepers, I think this PR is ready in terms of reviews. However, it had one unrelated LIT failure in "SYCL / Linux / OCL x64 LLVM Test Suite" which should be fixed by intel/llvm-test-suite#1040. Can the PR be merged with testing in this state, or is it recommended that I re-run it?
Previously approved by intel/dpcpp-doc-reviewers and only minor amendments since. Merging this.
Add -fsycl-link-huge-device-code as a driver option. When enabled, the driver
generates a linker script which directs the linker to place device code
later in the binary, which makes it less likely to create a distance
larger than that which a PC32 relocation can span.
For example, if a 3GB __clang_offload_bundle__ section is placed between
.text and .rodata, a PC32 relocation from .text to .rodata is not
possible. With this option enabled, the device code will be placed after
both rather than between them.
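
The range argument above can be checked with a little arithmetic. The following is a minimal sketch, not the driver's implementation: it models sections as contiguous byte ranges and treats a PC32 relocation as a signed 32-bit displacement. The helper names (`layout`, `pc32_reachable`, `text_reaches_rodata`) and the .text and .rodata sizes are hypothetical, chosen only for illustration; only the 3GB bundle size comes from the example in the description.

```python
# A PC32 relocation stores a signed 32-bit displacement.
PC32_MIN = -(2**31)
PC32_MAX = 2**31 - 1

MIB = 2**20
GIB = 2**30

# Hypothetical section sizes; only the 3GB bundle is from the PR text.
SIZES = {
    ".text": 16 * MIB,
    ".rodata": 4 * MIB,
    "__clang_offload_bundle__": 3 * GIB,
}


def layout(order):
    """Assign consecutive start addresses to sections in the given order."""
    addrs = {}
    cursor = 0
    for name in order:
        addrs[name] = cursor
        cursor += SIZES[name]
    return addrs


def pc32_reachable(source, target):
    """True if the displacement from source to target fits in 32 signed bits."""
    return PC32_MIN <= target - source <= PC32_MAX


def text_reaches_rodata(order):
    """Worst case: a reference from the start of .text to the end of .rodata."""
    addrs = layout(order)
    rodata_end = addrs[".rodata"] + SIZES[".rodata"]
    return pc32_reachable(addrs[".text"], rodata_end)


# Device code between .text and .rodata (the problematic case).
bundle_between = [".text", "__clang_offload_bundle__", ".rodata"]
# Device code placed after both (the effect of the new option).
bundle_last = [".text", ".rodata", "__clang_offload_bundle__"]

print(text_reaches_rodata(bundle_between))
print(text_reaches_rodata(bundle_last))
```

With the bundle in the middle, .rodata starts more than 3GB past .text, beyond the roughly 2GB reach of a signed 32-bit offset. With the bundle last, .text and .rodata stay adjacent and the displacement fits easily.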