Skip to content

[SYCL] Store the kernel object size in the integration header #5862

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Mar 31, 2022

Conversation

premanandrao
Copy link
Contributor

Sometimes when the host compiler is different from the device compiler,
it is possible that the size of the kernel object on the host will differ from the
size of the kernel object on the device. One such case is when constexpr
variables are used; compilers differ on whether they need to be captured and
this causes a discrepancy in the size of the kernel object and leads to failures
at runtime.

This change adds an interface to the integration header which returns the size
of the kernel object as computed on the device. The expectation is that the
library will assert the size of a kernel when compiling the host to detect
potential mismatches.

@premanandrao premanandrao requested a review from a team as a code owner March 22, 2022 22:05
@@ -335,7 +335,8 @@ class SYCLIntegrationHeader {
/// Signals that subsequent parameter descriptor additions will go to
/// the kernel with given name. Starts new kernel invocation descriptor.
void startKernel(const FunctionDecl *SyclKernel, QualType KernelNameType,
SourceLocation Loc, bool IsESIMD, bool IsUnnamedKernel);
SourceLocation Loc, bool IsESIMD, bool IsUnnamedKernel,
unsigned long ObjSize);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsigned long is likely the wrong type for these. This is 32 bits on windows. Note that the CharUnits underlying type is int64_t.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't appear to be here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, git error on my part.

@@ -4777,6 +4782,10 @@ void SYCLIntegrationHeader::emit(raw_ostream &O) {
O << " return 0;\n";
O << "#endif\n";
O << " }\n";
O << " // Returns the size of the kernel object in bytes.\n";
O << " __SYCL_DLL_LOCAL\n";
O << " static constexpr int64_t getKernelSizeof() { return "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was ok with getKernelSize (commented this was an old suggestion), but let @Fznamznon choose which she prefers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is int64_t going to be ok here? I dont know what amount of 'includes' we have in the header (since int64_t is a typedef).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was ok with getKernelSize (commented this was an old suggestion), but let @Fznamznon choose which she prefers.

Okay, will wait for @Fznamznon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is int64_t going to be ok here? I dont know what amount of 'includes' we have in the header (since int64_t is a typedef).

Good point. I will check on both Linux and Windows.

Copy link
Contributor

@Fznamznon Fznamznon Mar 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, will wait for @Fznamznon.

It LGTM now.

Copy link
Contributor

@Fznamznon Fznamznon Mar 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LIT tests show that Windows is not ok with that, see buildbot/sycl-win-x64-pr job result . We have a test that compiles integration header.
check-sycl tests are also not ok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you'll have to make sure that int64_t is in the 'typedefs' section. You can look up the type in ASTContext to see if it is the same as LongLongTy or LongTy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the name: I changed it to getKernelSize. getKernelSizeof felt a little weird.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the name: I changed it to getKernelSize. getKernelSizeof felt a little weird.

SGTM, I don't disagree.

@@ -4782,9 +4782,12 @@ void SYCLIntegrationHeader::emit(raw_ostream &O) {
O << " return 0;\n";
O << "#endif\n";
O << " }\n";
StringRef ReturnType =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be useful to just put the typedef at the top of the int-header with the nullptr_t typedef (and a couple of others?) so that you can just use it? WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! I see what you mean now. Sure, I can do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually don't see where we do that, but I could have sworn Jeff was working on that at one point? IF we ended up doing some of those typedefs that is perhaps my suggestion, otherwise this LGTM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought you meant add them anew, but my recollection of Jeff's attempt is that he had to abandon it.
I will leave it as is for now.

Copy link
Contributor

@Fznamznon Fznamznon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@premanandrao
Copy link
Contributor Author

Thanks for the reviews.
@bader, the failure in OCL Gen9 suite (an unexpected pass) appears to be unrelated to my change. Other PRs (#5879 e.g.) show the same failure.

@premanandrao
Copy link
Contributor Author

@bader or @intel/llvm-gatekeepers, this has been ready for a merge for a while now.

According to @Fznamznon's comments in #5899, the failure is being addressed in: https://github.com/intel/llvm-test-suite/pull/958/files

@bader bader merged commit ef90e6a into intel:sycl Mar 31, 2022
alexbatashev added a commit to alexbatashev/llvm that referenced this pull request Apr 2, 2022
* sycl: (3343 commits)
  [SYCL][L0] Disable round-robin submissions to multiple CCSs (intel#5945)
  [SYCL][CUDA] Don't link pi_cuda against libsycl (intel#5908)
  [CI] Disable -Werror by default (intel#5889)
  [BuildBot] Uplift CPU/FPGAEMU RT version to 2022.13.3.0.16 (intel#5883)
  [SYCL][CUDA][libclc] Add support for atomic fp exchange and compare exchange (intel#5937)
  [SYCL] Fix device code outlining for static local variables (intel#5915)
  [SYCL][NFC] Refactor plugin CMakeLists.txt (intel#5799)
  [SPIR-V][Doc] Add JointMatrixWorkItemLengthINTEL instruction to joint matrix extension (intel#5781)
  [SYCL] Expand device_global map and make initialization order agnostic (intel#5902)
  [SYCL][CUDA] Add IPSCCP pass to O0 by default (intel#5900)
  [ESIMD] Disable ABI changes warnings in host compiler. (intel#5931)
  [SYCL] Make properties constructor constexpr (intel#5928)
  [NFC][SYCL] Fix static analysis warning (intel#5933)
  [CODEOWNERS][NFC] Assign code owners for CI scripts (intel#5873)
  [SYCL] Store the kernel object size in the integration header (intel#5862)
  [SYCL][ESIMD] Change esimd-verifier logic for detecting valid SYCL calls (intel#5914)
  [SYCL][CUDA][DOC] GettingStartedGuide.md to recommend cuda 11.6 (intel#5917)
  [SYCL][L0] Move command list cache usage under mutex (intel#5874)
  [SYCL][FPGA] Prepare future implementation of experimental pipe properties (intel#5886)
  [CI] Roll back intel driver to the latest version (intel#5925)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants