-
Notifications
You must be signed in to change notification settings - Fork 787
[ESIMD] Re-work loads from globals in sycl-post-link #4718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
1) The re-work in the lowering of loads from globals was required because the previous implementation did not allow handling the loads from scalar globals. 2) The previous implementation generated duplicated vector loads for each of users. Fixed it here. 3) Added lowering for __spirv_BuiltInSubgroupLocalInvocationId(), which must always return 0 for ESIMD. Signed-off-by: Vyacheslav N Klochkov <vyacheslav.n.klochkov@intel.com>
It looks like I have broken something in this patch at the very last minute before uploading to re-view. I'll fix it shortly. |
Signed-off-by: Vyacheslav N Klochkov <vyacheslav.n.klochkov@intel.com>
Per our discussion, the new approach when {{@llvm.genx.XXX}} is generated once in the place where a load from __spirv_XXX was, rather than at every __spirv_XXX use (as it was before) looks a bit questionable. Adding @kychendev. |
…f GenX calls This commit also implements loads from SubgroupSize and SubgroupMaxSize SPIRV globals. Signed-off-by: Vyacheslav N Klochkov <vyacheslav.n.klochkov@intel.com>
Signed-off-by: Vyacheslav N Klochkov <vyacheslav.n.klochkov@intel.com>
llvm/lib/SYCLLowerIR/LowerESIMD.cpp
Outdated
assert(isa<ConstantInt>(IndexV) && | ||
"Expected a const index in extract element instruction"); | ||
uint64_t IndexValue = cast<ConstantInt>(IndexV)->getZExtValue(); | ||
assert(IndexValue <= 2 && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to rename 3 to MAX_DIMS, 2 - to MAX_DIMS-1 in such context. It is not always clear why 3 or 2 is used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will change.
llvm/lib/SYCLLowerIR/LowerESIMD.cpp
Outdated
// uint32_t __spirv_BuiltIn SubgroupId; | ||
|
||
// Translate the loads from _scalar_ SPIRV globals in the next block. | ||
// Such globals require the replacement of the load only because the users |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the main distinguishing factor from vector intrinsics is that these scalar intrinsics always generate a constant, that's why there is no need to materialize genx intrinsic calls at every use. Otherwise there are no problems with such materialization.
I suggest to adjust the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, changed. Will upload additional commit soon.
llvm/lib/SYCLLowerIR/LowerESIMD.cpp
Outdated
"Extract element index should be a constant"); | ||
// Only loads from _vector_ SPIRV globals reach here. Replace their users now. | ||
for (User *LU : LI->users()) { | ||
ExtractElementInst *EEI = dyn_cast<ExtractElementInst>(LU); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
simple cast can be used. It is also better, because it will crash if LU is not ExtractElementInst even in release builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you mean by 'simple' cast? The cast needs to be dyn_cast as it works everywhere in llvm when instruction types are casted.
Regarding 'assert', it was copied from the previous sources, I did not change it. Do you recommend replacing it with 'llvm::report_fatal_error' ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you use EEI = dyn_cast
followed by assert(EEI)
, I suggest replace these two with EEI = cast<ExtractElementInst>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I forgot that cast<> included the check/assert. Thank you. Fixed in d5dbf7b
Signed-off-by: Vyacheslav N Klochkov <vyacheslav.n.klochkov@intel.com>
Signed-off-by: Vyacheslav N Klochkov <vyacheslav.n.klochkov@intel.com>
The single test failure in Jenkins is unrelated to the changes. Merging. @bader, @vladimirlaz - FYI [2021-10-09T03:22:44.035Z] $ "env" "SYCL_DEVICE_FILTER=opencl:gpu,host" "W:\jenkins-dir\workspace\SYCL_CI\intel\Win\Test_Suite\build\SYCL\Basic\image\Output\image_max_size.cpp.tmp.out" |
Please, fix the issue to pass pre-commit. I also noticed that this change breaks the build with disabled assertions - https://github.com/intel/llvm/runs/3845316243. |
I uploaded the fix: #4733 |
I'm pretty sure this issue wasn't caused by this patch, so I think the owner of the code should investigate. So option 2 is more reasonable. |
The re-work in the lowering of loads from globals was required
because the previous implementation did not allow handling
the loads from scalar globals.
Added lowering for __spirv_BuiltInSubgroupLocalInvocationId(),
which must always return 0 for ESIMD.
Added lowering for __spirv_BuiltInSubgroupSize() and __spirv_BuiltInSubgroupMaxSize()
which must always return 1 for ESIMD.
The corresponding LIT test: intel/llvm-test-suite#503
Signed-off-by: Vyacheslav N Klochkov vyacheslav.n.klochkov@intel.com