-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL] Store the kernel object size in the integration header #5862
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
clang/include/clang/Sema/Sema.h
Outdated
@@ -335,7 +335,8 @@ class SYCLIntegrationHeader { | |||
/// Signals that subsequent parameter descriptor additions will go to | |||
/// the kernel with given name. Starts new kernel invocation descriptor. | |||
void startKernel(const FunctionDecl *SyclKernel, QualType KernelNameType, | |||
SourceLocation Loc, bool IsESIMD, bool IsUnnamedKernel); | |||
SourceLocation Loc, bool IsESIMD, bool IsUnnamedKernel, | |||
unsigned long ObjSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unsigned long
is likely the wrong type for these. This is 32 bits on windows. Note that the CharUnits underlying type is int64_t
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't appear to be here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, git error on my part.
clang/lib/Sema/SemaSYCL.cpp
Outdated
@@ -4777,6 +4782,10 @@ void SYCLIntegrationHeader::emit(raw_ostream &O) { | |||
O << " return 0;\n"; | |||
O << "#endif\n"; | |||
O << " }\n"; | |||
O << " // Returns the size of the kernel object in bytes.\n"; | |||
O << " __SYCL_DLL_LOCAL\n"; | |||
O << " static constexpr int64_t getKernelSizeof() { return " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was ok with getKernelSize
(commented this was an old suggestion), but let @Fznamznon choose which she prefers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, is int64_t going to be ok here? I dont know what amount of 'includes' we have in the header (since int64_t is a typedef).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was ok with
getKernelSize
(commented this was an old suggestion), but let @Fznamznon choose which she prefers.
Okay, will wait for @Fznamznon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, is int64_t going to be ok here? I dont know what amount of 'includes' we have in the header (since int64_t is a typedef).
Good point. I will check on both Linux and Windows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, will wait for @Fznamznon.
It LGTM now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LIT tests show that Windows is not ok with that, see buildbot/sycl-win-x64-pr job result . We have a test that compiles integration header.
check-sycl tests are also not ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you'll have to make sure that int64_t is in the 'typedefs' section. You can look up the type in ASTContext to see if it is the same as LongLongTy or LongTy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the name: I changed it to getKernelSize. getKernelSizeof felt a little weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the name: I changed it to getKernelSize. getKernelSizeof felt a little weird.
SGTM, I don't disagree.
@@ -4782,9 +4782,12 @@ void SYCLIntegrationHeader::emit(raw_ostream &O) { | |||
O << " return 0;\n"; | |||
O << "#endif\n"; | |||
O << " }\n"; | |||
StringRef ReturnType = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be useful to just put the typedef at the top of the int-header with the nullptr_t
typedef (and a couple of others?) so that you can just use it? WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! I see what you mean now. Sure, I can do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually don't see where we do that, but I could have sworn Jeff was working on that at one point? IF we ended up doing some of those typedefs that is perhaps my suggestion, otherwise this LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought you meant add them anew, but my recollection of Jeff's attempt is that he had to abandon it.
I will leave it as is for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
@bader or @intel/llvm-gatekeepers, this has been ready for a merge for a while now. According to @Fznamznon's comments in #5899, the failure is being addressed in: https://github.com/intel/llvm-test-suite/pull/958/files |
* sycl: (3343 commits) [SYCL][L0] Disable round-robin submissions to multiple CCSs (intel#5945) [SYCL][CUDA] Don't link pi_cuda against libsycl (intel#5908) [CI] Disable -Werror by default (intel#5889) [BuildBot] Uplift CPU/FPGAEMU RT version to 2022.13.3.0.16 (intel#5883) [SYCL][CUDA][libclc] Add support for atomic fp exchange and compare exchange (intel#5937) [SYCL] Fix device code outlining for static local variables (intel#5915) [SYCL][NFC] Refactor plugin CMakeLists.txt (intel#5799) [SPIR-V][Doc] Add JointMatrixWorkItemLengthINTEL instruction to joint matrix extension (intel#5781) [SYCL] Expand device_global map and make initialization order agnostic (intel#5902) [SYCL][CUDA] Add IPSCCP pass to O0 by default (intel#5900) [ESIMD] Disable ABI changes warnings in host compiler. (intel#5931) [SYCL] Make properties constructor constexpr (intel#5928) [NFC][SYCL] Fix static analysis warning (intel#5933) [CODEOWNERS][NFC] Assign code owners for CI scripts (intel#5873) [SYCL] Store the kernel object size in the integration header (intel#5862) [SYCL][ESIMD] Change esimd-verifier logic for detecting valid SYCL calls (intel#5914) [SYCL][CUDA][DOC] GettingStartedGuide.md to recommend cuda 11.6 (intel#5917) [SYCL][L0] Move command list cache usage under mutex (intel#5874) [SYCL][FPGA] Prepare future implementation of experimental pipe properties (intel#5886) [CI] Roll back intel driver to the latest version (intel#5925) ...
Sometimes when the host compiler is different from the device compiler,
it is possible that the size of the kernel object on the host will differ from the
size of the kernel object on the device. One such case is when constexpr
variables are used; compilers differ on whether they need to be captured and
this causes a discrepancy in the size of the kernel object and leads to failures
at runtime.
This change adds an interface to the integration header which returns the size
of the kernel object as computed on the device. The expectation is that the
library will assert the size of a kernel when compiling the host to detect
potential mismatches.