-
Notifications
You must be signed in to change notification settings - Fork 769
[SYCL] Fix compiler crash. #12324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL] Fix compiler crash. #12324
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter. |
clang/lib/CodeGen/CGBuiltin.cpp
Outdated
@@ -22872,6 +22872,10 @@ llvm::CallInst *CodeGenFunction::EmitFPBuiltinIndirectCall( | |||
// only if it has an fpbuiltin intrinsic. | |||
unsigned BuiltinID = getCurrentBuiltinID(); | |||
Name = CGM.getContext().BuiltinInfo.getName(BuiltinID); | |||
if (!FD->getNameInfo().getName().isIdentifier() || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned in Teams, this doesn't look like the right fix to me. I think the problem is that we somehow got here when the call wasn't an indirect call. I think it would be better to check that in EmitCall before EmitFPBuiltinIndirectCall is called.
clang/lib/CodeGen/CGCall.cpp
Outdated
@@ -5692,7 +5692,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, | |||
!getLangOpts().FPAccuracyVal.empty()) { | |||
const auto *FD = dyn_cast_if_present<FunctionDecl>(TargetDecl); | |||
if (FD) { | |||
CI = EmitFPBuiltinIndirectCall(IRFuncTy, IRCallArgs, CalleePtr, FD); | |||
CI = EmitFPBuiltinofFD(IRFuncTy, IRCallArgs, CalleePtr, FD); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to add a check here to see if this is a function that needs fpbuiltin support?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am checking this in the function EmitFPBuiltinofFD. If the function FD has not BuiltinID but is one of the function listed in EmitFPBuiltinofFD (~ line# 22861), an attribute is generated for it. Is that what you mean, or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just thought that if you checked it here you wouldn't need to call EmitPDBuiltinofFD at all in many cases.
// RUN: %clangxx -%fsycl-host-only -c -ffp-accuracy=high \ | ||
// RUN: -faltmathlib=SVMLAltMathLibrary -fno-math-errno %s | ||
|
||
// RUN: %clangxx -fsycl %s -isystem %sycl_include/sycl -isystem %sycl_include \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need extra -isystem
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
// CHECK-LABEL: define {{.*}}spir_kernel void @{{.*}}Kernel2 | ||
// CHECK: tail call double @llvm.fpbuiltin.log.f64(double %2) #[[ATTR_HIGH:[0-9]+]] | ||
// CHECK: tail call double @llvm.fpbuiltin.exp.f64(double %3) #[[ATTR_HIGH]] | ||
// CHECK: tail call double @llvm.fpbuiltin.cos.f64(double %4) #[[ATTR_HIGH]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you trying to verify device or host LLVM IR? If the former, then it should go into https://github.com/intel/llvm/tree/sycl/sycl/test/check_device_code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the test to test/check_device_code.
clang/lib/CodeGen/CGCall.cpp
Outdated
@@ -5031,6 +5031,23 @@ static unsigned getMaxVectorWidth(const llvm::Type *Ty) { | |||
return MaxVectorWidth; | |||
} | |||
|
|||
static bool shouldCreateFPBuiltinForFD(const FunctionDecl *FD, StringRef Name) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is close to what I was thinking, but if getBuiltinID() returns a non-zero value, it still might be a non-fp-related builtin, and if we have an accuracy map, the name might not be in the map. So we could still be calling EmitFPBuiltinofFD() for a lot of functions that don't need it.
On the other hand, I see from even just this limited implementation that there's necessarily going to be duplication of the list of handled functions and builtins between this check and the EmitFPBuiltinofFD() implementation, so maybe it would be better to rename it something like MaybeEmitFPBuiltinofFD() and skip the check I had requested.
This seems like a lot of work for the calls that don't need this, but I don't have an idea to avoid it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's the dilemma! There will be duplication. I will remove the check and rename the function unless I come up with something else. But at this point I have looked at it from every corner!
@@ -0,0 +1,48 @@ | |||
// RUN: %clangxx -fsycl %s -isystem %sycl_include/sycl \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isystem
is still here. Is that intentional? If so, why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry. Removed one and not the other. Done now.
deviceQueue.submit([&](handler &cgh) { | ||
accessor in_vals{in, cgh, read_only}; | ||
accessor out_vals{out, cgh, write_only}; | ||
cgh.single_task<class Kernel1>([=]() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need main/queue/submit here at all. Just use
#include <sycl/sycl.hpp>
SYCL_EXTERNAL auto foo(double x) {
using namespace sycl;
return cos(exp(log(x)));
}
@@ -0,0 +1,47 @@ | |||
// RUN: %clangxx -fsycl %s -ffp-accuracy=high -fno-math-errno \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You also need -fsycl-device-only
to avoid host compilation. I'd also recommend to structure RUN lines somewhat like this:
// RUN: %clangxx -fsycl -fsycl-device-only %s -fno-math-errno -ffp-accuracy-high \
// RUN: -S -emit-llvm -o - | FileCheck %s
// RUN: %clangxx -fsycl -fsycl-device-only %s -fno-math-errno -ffp-accuracy-high \
// RUN: -ffp-accuracy=low:exp \
// RUN: -S -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-LOW-EXP
to ensure that common options appear the same and differences are easy to spot.
Maybe even use:
// DEFINE: %{common_opts} = -fsycl -fsycl-device-only -fno-math-errno -ffp-accuracy-high -S -emit-llvm -o -
// RUN: %clangxx %{common_opts} | FileCheck %s
// RUN: %clangxx %{common_opts} -ffp-accuracy=low:exp | FileCheck %s --check-prefix=CHECK-LOW-EXP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aelovikov-intel Thanks. Modified per your 2nd suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for the test under sycl/test
@intel/dpcpp-cfe-reviewers can you please take a look? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar with floating point semantics and so am not really sure how to review the functionality. @premanandrao @andykaylor can you take a look?
I am okay with the FE changes. Thanks for the ping @elizabethandrews; I thought I had approved it yesterday, but guess I didn't. |
@intel/llvm-gatekeepers Can this be merged in please? Thanks. |
This isn't ready. There are several questions unresolved. Please check with @andykaylor for resolution. Then I'd be happy to merge |
Elizabeth has asked Prem to review who has given his approval. @elizabethandrews can you please review? Thanks. |
Sorry. Managed to edit your comment rather than responding. There's also this big comment:
That's a valid and serious question about the fundamental nature of the fix. I see you've fixed up many other comments, but the above needs to be corrected. |
I looked though the history of our conversation with @andykaylor. The reason we can't do what Andy is proposing is that we would have to create additional llvm::builtin for some of the math functions such as llvm.acos and that would diverge from community. At any case, let's see what Andy has to say about it. Thanks. |
If that's the case it may be better to fix this upstream? |
This is not an option that's used upstream. |
Yes, I should have marked that as resolved. Zahira and I have talked about this, and I'm satisfied with the current implementation. |
Thanks for the clarifications @andykaylor @zahiraam merged |
The compiler was crashing when the user requested fp-accuracy for the functions in a call of the form f1(f2(f3 ...), where f1, f2 and f3 were fpbuiltin but the innermost function didn't have an fpbuiltin. The current builtinID was used instead of getting the builtinID from the current function. that created a crash in the compiler.
This patch fixes the issue and renames the function EmitFPBuiltinIndirectCall to MaybeEmitFPBuiltinofFD .