Skip to content

[SYCL] Fix compiler crash. #12324

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Feb 1, 2024
Merged

[SYCL] Fix compiler crash. #12324

merged 8 commits into from
Feb 1, 2024

Conversation

zahiraam
Copy link
Contributor

@zahiraam zahiraam commented Jan 8, 2024

The compiler was crashing when the user requested fp-accuracy for the functions in a call of the form f1(f2(f3 ...), where f1, f2 and f3 were fpbuiltin but the innermost function didn't have an fpbuiltin. The current builtinID was used instead of getting the builtinID from the current function. that created a crash in the compiler.
This patch fixes the issue and renames the function EmitFPBuiltinIndirectCall to MaybeEmitFPBuiltinofFD .

Copy link
Contributor

github-actions bot commented Jan 8, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@@ -22872,6 +22872,10 @@ llvm::CallInst *CodeGenFunction::EmitFPBuiltinIndirectCall(
// only if it has an fpbuiltin intrinsic.
unsigned BuiltinID = getCurrentBuiltinID();
Name = CGM.getContext().BuiltinInfo.getName(BuiltinID);
if (!FD->getNameInfo().getName().isIdentifier() ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in Teams, this doesn't look like the right fix to me. I think the problem is that we somehow got here when the call wasn't an indirect call. I think it would be better to check that in EmitCall before EmitFPBuiltinIndirectCall is called.

@@ -5692,7 +5692,7 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
!getLangOpts().FPAccuracyVal.empty()) {
const auto *FD = dyn_cast_if_present<FunctionDecl>(TargetDecl);
if (FD) {
CI = EmitFPBuiltinIndirectCall(IRFuncTy, IRCallArgs, CalleePtr, FD);
CI = EmitFPBuiltinofFD(IRFuncTy, IRCallArgs, CalleePtr, FD);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to add a check here to see if this is a function that needs fpbuiltin support?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am checking this in the function EmitFPBuiltinofFD. If the function FD has not BuiltinID but is one of the function listed in EmitFPBuiltinofFD (~ line# 22861), an attribute is generated for it. Is that what you mean, or am I missing something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just thought that if you checked it here you wouldn't need to call EmitPDBuiltinofFD at all in many cases.

@zahiraam zahiraam requested a review from andykaylor January 19, 2024 21:03
@zahiraam zahiraam marked this pull request as ready for review January 19, 2024 21:04
@zahiraam zahiraam requested review from a team as code owners January 19, 2024 21:04
// RUN: %clangxx -%fsycl-host-only -c -ffp-accuracy=high \
// RUN: -faltmathlib=SVMLAltMathLibrary -fno-math-errno %s

// RUN: %clangxx -fsycl %s -isystem %sycl_include/sycl -isystem %sycl_include \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need extra -isystem here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

Comment on lines 46 to 49
// CHECK-LABEL: define {{.*}}spir_kernel void @{{.*}}Kernel2
// CHECK: tail call double @llvm.fpbuiltin.log.f64(double %2) #[[ATTR_HIGH:[0-9]+]]
// CHECK: tail call double @llvm.fpbuiltin.exp.f64(double %3) #[[ATTR_HIGH]]
// CHECK: tail call double @llvm.fpbuiltin.cos.f64(double %4) #[[ATTR_HIGH]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you trying to verify device or host LLVM IR? If the former, then it should go into https://github.com/intel/llvm/tree/sycl/sycl/test/check_device_code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the test to test/check_device_code.

@@ -5031,6 +5031,23 @@ static unsigned getMaxVectorWidth(const llvm::Type *Ty) {
return MaxVectorWidth;
}

static bool shouldCreateFPBuiltinForFD(const FunctionDecl *FD, StringRef Name) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is close to what I was thinking, but if getBuiltinID() returns a non-zero value, it still might be a non-fp-related builtin, and if we have an accuracy map, the name might not be in the map. So we could still be calling EmitFPBuiltinofFD() for a lot of functions that don't need it.

On the other hand, I see from even just this limited implementation that there's necessarily going to be duplication of the list of handled functions and builtins between this check and the EmitFPBuiltinofFD() implementation, so maybe it would be better to rename it something like MaybeEmitFPBuiltinofFD() and skip the check I had requested.

This seems like a lot of work for the calls that don't need this, but I don't have an idea to avoid it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's the dilemma! There will be duplication. I will remove the check and rename the function unless I come up with something else. But at this point I have looked at it from every corner!

@@ -0,0 +1,48 @@
// RUN: %clangxx -fsycl %s -isystem %sycl_include/sycl \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isystem is still here. Is that intentional? If so, why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. Removed one and not the other. Done now.

deviceQueue.submit([&](handler &cgh) {
accessor in_vals{in, cgh, read_only};
accessor out_vals{out, cgh, write_only};
cgh.single_task<class Kernel1>([=]() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need main/queue/submit here at all. Just use

#include <sycl/sycl.hpp>
SYCL_EXTERNAL auto foo(double x) {
  using namespace sycl;
  return cos(exp(log(x)));
}

@@ -0,0 +1,47 @@
// RUN: %clangxx -fsycl %s -ffp-accuracy=high -fno-math-errno \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You also need -fsycl-device-only to avoid host compilation. I'd also recommend to structure RUN lines somewhat like this:

// RUN: %clangxx -fsycl -fsycl-device-only %s -fno-math-errno -ffp-accuracy-high \
// RUN:  -S -emit-llvm -o - | FileCheck %s
// RUN: %clangxx -fsycl -fsycl-device-only %s -fno-math-errno -ffp-accuracy-high \
// RUN: -ffp-accuracy=low:exp \
// RUN:  -S -emit-llvm -o - | FileCheck %s --check-prefix=CHECK-LOW-EXP

to ensure that common options appear the same and differences are easy to spot.

Maybe even use:

// DEFINE: %{common_opts} = -fsycl -fsycl-device-only -fno-math-errno -ffp-accuracy-high -S -emit-llvm -o -
// RUN: %clangxx %{common_opts} | FileCheck %s
// RUN: %clangxx %{common_opts} -ffp-accuracy=low:exp | FileCheck %s --check-prefix=CHECK-LOW-EXP

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aelovikov-intel Thanks. Modified per your 2nd suggestion.

Copy link
Contributor

@aelovikov-intel aelovikov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for the test under sycl/test

@zahiraam
Copy link
Contributor Author

@intel/dpcpp-cfe-reviewers can you please take a look? Thanks.

Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with floating point semantics and so am not really sure how to review the functionality. @premanandrao @andykaylor can you take a look?

@premanandrao
Copy link
Contributor

I'm not familiar with floating point semantics and so am not really sure how to review the functionality. @premanandrao @andykaylor can you take a look?

I am okay with the FE changes.

Thanks for the ping @elizabethandrews; I thought I had approved it yesterday, but guess I didn't.

@zahiraam
Copy link
Contributor Author

zahiraam commented Feb 1, 2024

@intel/llvm-gatekeepers Can this be merged in please? Thanks.

@ldrumm
Copy link
Contributor

ldrumm commented Feb 1, 2024

@intel/llvm-gatekeepers Can this be merged in please? Thanks.

This isn't ready. There are several questions unresolved. Please check with @andykaylor for resolution. Then I'd be happy to merge

@zahiraam
Copy link
Contributor Author

zahiraam commented Feb 1, 2024

Elizabeth has asked Prem to review who has given his approval. @elizabethandrews can you please review? Thanks.
Andy asked to rename the function (which was done) after many discussions offline. @andykaylor can you please review?

@ldrumm
Copy link
Contributor

ldrumm commented Feb 1, 2024

Sorry. Managed to edit your comment rather than responding.
What I meant to say as a response, rather than an edit

There's also this big comment:

As I mentioned in Teams, this doesn't look like the right fix to me. I think the problem is that we somehow got here when the call wasn't an indirect call. I think it would be better to check that in EmitCall before EmitFPBuiltinIndirectCall is called.

That's a valid and serious question about the fundamental nature of the fix. I see you've fixed up many other comments, but the above needs to be corrected.

@zahiraam
Copy link
Contributor Author

zahiraam commented Feb 1, 2024

Sorry. Managed to edit your comment rather than responding. What I meant to say as a response, rather than an edit

There's also this big comment:

As I mentioned in Teams, this doesn't look like the right fix to me. I think the problem is that we somehow got here when the call wasn't an indirect call. I think it would be better to check that in EmitCall before EmitFPBuiltinIndirectCall is called.

That's a valid and serious question about the fundamental nature of the fix. I see you've fixed up many other comments, but the above needs to be corrected.

I looked though the history of our conversation with @andykaylor. The reason we can't do what Andy is proposing is that we would have to create additional llvm::builtin for some of the math functions such as llvm.acos and that would diverge from community. At any case, let's see what Andy has to say about it. Thanks.

@ldrumm
Copy link
Contributor

ldrumm commented Feb 1, 2024

If that's the case it may be better to fix this upstream?

@zahiraam
Copy link
Contributor Author

zahiraam commented Feb 1, 2024

If that's the case it may be better to fix this upstream?

This is not an option that's used upstream.

@andykaylor
Copy link
Contributor

Sorry. Managed to edit your comment rather than responding. What I meant to say as a response, rather than an edit
There's also this big comment:

As I mentioned in Teams, this doesn't look like the right fix to me. I think the problem is that we somehow got here when the call wasn't an indirect call. I think it would be better to check that in EmitCall before EmitFPBuiltinIndirectCall is called.

That's a valid and serious question about the fundamental nature of the fix. I see you've fixed up many other comments, but the above needs to be corrected.

I looked though the history of our conversation with @andykaylor. The reason we can't do what Andy is proposing is that we would have to create additional llvm::builtin for some of the math functions such as llvm.acos and that would diverge from community. At any case, let's see what Andy has to say about it. Thanks.

Yes, I should have marked that as resolved. Zahira and I have talked about this, and I'm satisfied with the current implementation.

@ldrumm
Copy link
Contributor

ldrumm commented Feb 1, 2024

Thanks for the clarifications @andykaylor @zahiraam

merged

@ldrumm ldrumm merged commit 4fdcb58 into intel:sycl Feb 1, 2024
@zahiraam zahiraam deleted the FixAssertion branch August 12, 2024 20:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants