Skip to content

[SYCL] Always do internalization in sycl-post-link #14976

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

AlexeySachkov
Copy link
Contributor

We already had Internalize pass being launched as part of a module cleanup phase in sycl-post-link, but it was only invoked when shared libraries/dynamic linking is enabled.

However, there are other features that need this internalization and one (and the only, at least for now) is virtual functions. When virtual functions are used in a program it could be necessary to dynamically link several device images containing SYCL kernels together.

The problem with that is that there are ITT instrumentation functions added to modules that have external linkage - they cause multiple definitions error when we try to link two kernels together, because both of them are instrumented.

This problem is solved by running Internalize pass uncoditionally, but the criteria of what we can internalize depends on whether we enable shared libraries/dynamic linking support:

  • in regular flow, we can internalize any symbol that is not considered to be an entry point, because all device images are self-contained and there are no dependencies between them.
  • when dynamic linking is enabled, we should not internalize functions that can be imported/exported, even if they are not considered as module entry points.

Tests were updated where necessary to expect or ignore linkage changes of some functions. InvokeSIMD pass had to be updated: ESIMD is handled through two-level device code split, meaning two cleanup phases where each of them could mistakenly drop some compiler-generated functions unless they are properly marked in LLVM IR.

We already had `Internalize` pass being launched as part of a module
cleanup phase in `sycl-post-link`, but it was only invoked when shared
libraries/dynamic linking is enabled.

However, there are other features that need this internalization and one
(and the only, at least for now) is virtual functions. When virtual
functions are used in a program it could be necessary to dynamically
link several device images containing SYCL kernels together.

The problem with that is that there are ITT instrumentation functions
added to modules that have external linkage - they cause multiple
definitions error when we try to link two kernels together, because both
of them are instrumented.

This problem is solved by running `Internalize` pass uncoditionally, but
the criteria of what we can internalize depends on whether we enable
shared libraries/dynamic linking support:
- in regular flow, we can internalize any symbol that is not considered
  to be an entry point, because all device images are self-contained and
  there are no dependencies between them.
- when dynamic linking is enabled, we should not internalize functions
  that can be imported/exported, even if they are not considered as
  module entry points.

Tests were updated where necessary to expect or ignore linkage changes
of some functions. InvokeSIMD pass had to be updated: ESIMD is handled
through two-level device code split, meaning two cleanup phases where
each of them could mistakenly drop some compiler-generated functions
unless they are properly marked in LLVM IR.
@AlexeySachkov AlexeySachkov marked this pull request as ready for review August 7, 2024 09:55
@AlexeySachkov AlexeySachkov requested review from a team as code owners August 7, 2024 09:55
@sarnex
Copy link
Contributor

sarnex commented Aug 7, 2024

Looks like the E2E test failures are related. I'll take a look once those are fixed :)

@AlexeySachkov
Copy link
Contributor Author

Looks like the E2E test failures are related. I'll take a look once those are fixed :)

The problem is that I can't reproduce them locally :)

@sarnex
Copy link
Contributor

sarnex commented Aug 7, 2024

I'll try quickly.

@AlexeySachkov
Copy link
Contributor Author

I'll try quickly.

We debugged this offline, turned out that we incorrectly call sycl-post-link in new offloading model with thinLTO. The fix was submitted in #14991, so those failures are expected to be gone after I "rebase" the PR.

Copy link
Contributor

@sarnex sarnex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests passing now! lgtm minus one borderline irrelevant comment!

@AlexeySachkov AlexeySachkov merged commit 2137ff0 into intel:sycl Aug 10, 2024
13 checks passed
@AlexeySachkov AlexeySachkov deleted the private/asachkov/always-run-internalize branch October 9, 2024 08:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants