Skip to content

[SYCL] Allow _Bitint of size greater than 128 bits when -fintelfpga i… #6152

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 2, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions clang/include/clang/Basic/LangOptions.def
Original file line number Diff line number Diff line change
Expand Up @@ -270,6 +270,7 @@ LANGOPT(GPUExcludeWrongSideOverloads, 1, 0, "always exclude wrong side overloads

LANGOPT(SYCLIsDevice , 1, 0, "Generate code for SYCL device")
LANGOPT(SYCLIsHost , 1, 0, "SYCL host compilation")
LANGOPT(IntelFPGA , 1, 0, "Perform ahead-of-time compilation for FPGA")
LANGOPT(SYCLAllowFuncPtr , 1, 0, "Allow function pointers in SYCL device code")
LANGOPT(SYCLStdLayoutKernelParams, 1, 0, "Enable standard layout requirement for SYCL kernel parameters")
LANGOPT(SYCLUnnamedLambda , 1, 0, "Allow unnamed lambda SYCL kernels")
Expand Down
3 changes: 2 additions & 1 deletion clang/include/clang/Driver/Options.td
Original file line number Diff line number Diff line change
Expand Up @@ -2738,7 +2738,8 @@ defm strict_vtable_pointers : BoolFOption<"strict-vtable-pointers",
NegFlag<SetFalse>>;
def fstrict_overflow : Flag<["-"], "fstrict-overflow">, Group<f_Group>;
def fintelfpga : Flag<["-"], "fintelfpga">, Group<f_Group>,
Flags<[CC1Option, CoreOption]>, HelpText<"Perform ahead of time compilation for FPGA">;
Flags<[CC1Option, CoreOption]>, MarshallingInfoFlag<LangOpts<"IntelFPGA">>,
HelpText<"Perform ahead-of-time compilation for FPGA">;
def fsycl_device_only : Flag<["-"], "fsycl-device-only">, Flags<[CoreOption]>,
HelpText<"Compile SYCL kernels for device">;
def fsycl_targets_EQ : CommaJoined<["-"], "fsycl-targets=">, Flags<[NoXarchOption, CC1Option, CoreOption]>,
Expand Down
5 changes: 4 additions & 1 deletion clang/lib/Driver/ToolChains/Clang.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -4943,8 +4943,11 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
HasFPGA = true;
break;
}
if (HasFPGA)
if (HasFPGA) {
CmdArgs.push_back("-fsycl-disable-range-rounding");
// Pass -fintelfpga to both the host and device SYCL compilations if set.
CmdArgs.push_back("-fintelfpga");
}

// Add any options that are needed specific to SYCL offload while
// performing the host side compilation.
Expand Down
2 changes: 1 addition & 1 deletion clang/lib/Sema/SemaType.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2306,7 +2306,7 @@ QualType Sema::BuildBitIntType(bool IsUnsigned, Expr *BitWidth,
}

const TargetInfo &TI = getASTContext().getTargetInfo();
if (NumBits > TI.getMaxBitIntWidth()) {
if (NumBits > TI.getMaxBitIntWidth() && !Context.getLangOpts().IntelFPGA) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't there be some code handling the larger size for this target? Is just removing the error enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is enough. The actual handling is done by the backend or via library calls.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this is not enough. We need to make sure we don't emit anything too large for the LLVM backend.

IMO, the RIGHT place for this is probably to get the proper target-info for the FPGA to change its answer to 'getMaxBitIntWidth' based on the FPGA status to the LLVM max value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this is not enough. We need to make sure we don't emit anything too large for the LLVM backend.

IMO, the RIGHT place for this is probably to get the proper target-info for the FPGA to change its answer to 'getMaxBitIntWidth' based on the FPGA status to the LLVM max value.

If we go with this way in any following patch, 2048 is a reasonable upper-bound limit for the FPGA target.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify: My concern is this: https://llvm.org/doxygen/classllvm_1_1IntegerType.html#af13295ff815293f6c5708ebdbae1338da1ccefdf8a7414a6829f888e5071e0379

if there is a reasonable max for the FPGA target, we should definitely do that too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks wrong to me -- LLVM has a max bit width size that we must ensure the user stays within the bounds of. Passing the intel FPGA flag won't make a bit of difference in that case, it'll just crash in the backend.

Diag(Loc, diag::err_bit_int_max_size)
<< IsUnsigned << static_cast<uint64_t>(TI.getMaxBitIntWidth());
return QualType();
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
// RUN: %clang_cc1 -no-opaque-pointers -fsycl-is-host -fintelfpga -triple x86_64 -emit-llvm %s -o - | FileCheck %s

// This test checks that we generate appropriate code for division
// operations of _BitInts of size greater than 128 bits, since it
// is allowed when -fintelfpga is enabled.

// CHECK: define{{.*}} void @_Z3fooDB211_S_(i211* {{.*}} sret(i211) align 8 %agg.result, i211* {{.*}} byval(i211) align 8 %[[ARG1:[0-9]+]], i211* {{.*}} byval(i211) align 8 %[[ARG2:[0-9]+]])
signed _BitInt(211) foo(signed _BitInt(211) a, signed _BitInt(211) b) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good test, now try it with _BitInt(123456). Do the results still seem reasonable? (Basically, we should be testing the edge cases much more heavily than we seem to be doing.)

// CHECK: %[[VAR_A:a]].addr = alloca i211, align 8
// CHECK: %[[VAR_B:b]].addr = alloca i211, align 8
// CHECK: %[[VAR_A]] = load i211, i211* %[[ARG1]], align 8
// CHECK: %[[VAR_B]] = load i211, i211* %[[ARG2]], align 8
// CHECK: store i211 %[[VAR_A]], i211* %[[VAR_A]].addr, align 8
// CHECK: store i211 %[[VAR_B]], i211* %[[VAR_B]].addr, align 8
// CHECK: %[[TEMP1:[0-9]+]] = load i211, i211* %[[VAR_A]].addr, align 8
// CHECK: %[[TEMP2:[0-9]+]] = load i211, i211* %[[VAR_B]].addr, align 8
// CHECK: %div = sdiv i211 %[[TEMP1]], %[[TEMP2]]
// CHECK: store i211 %div, i211* %agg.result, align 8
// CHECK: %[[RES:[0-9+]]] = load i211, i211* %agg.result, align 8
// CHECK: store i211 %[[RES]], i211* %agg.result, align 8
// CHECK: ret void
return a / b;
}
26 changes: 26 additions & 0 deletions clang/test/CodeGenSYCL/no-opaque-ptrs-sycl-intelfpga-bitint.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
// RUN: %clang_cc1 -no-opaque-pointers -fsycl-is-device -fintelfpga -triple spir64_fpga -IInputs -emit-llvm %s -o - | FileCheck %s

// This test checks that we generate appropriate code for division
// operations of _BitInts of size greater than 128 bits, since it
// is allowed when -fintelfpga is enabled.

#include "Inputs/sycl.hpp"

// CHECK: define{{.*}} void @_Z3fooDB211_S_(i211 addrspace(4)* {{.*}} sret(i211) align 8 %agg.result, i211* {{.*}} byval(i211) align 8 %[[ARG1:[0-9]+]], i211* {{.*}} byval(i211) align 8 %[[ARG2:[0-9]+]])
signed _BitInt(211) foo(signed _BitInt(211) a, signed _BitInt(211) b) {
// CHECK: %[[VAR_A:a]] = load i211, i211* %[[ARG1]], align 8
// CHECK: %[[VAR_B:b]] = load i211, i211* %[[ARG2]], align 8
// CHECK: %[[RES:div]] = sdiv i211 %[[VAR_A]], %[[VAR_B]]
// CHECK: store i211 %[[RES]], i211 addrspace(4)* %agg.result, align 8
// CHECK: ret void
return a / b;
}

int main() {
sycl::handler h;
auto lambda = []() {
_BitInt(211) a, b = 3, c = 4;
a = foo(b, c);
};
h.single_task(lambda);
}
23 changes: 23 additions & 0 deletions clang/test/CodeGenSYCL/sycl-host-intelfpga-bitint.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
// RUN: %clang_cc1 -opaque-pointers -fsycl-is-host -fintelfpga -triple x86_64 -emit-llvm %s -o - | FileCheck %s

// This test checks that we generate appropriate code for division
// operations of _BitInts of size greater than 128 bits, since it
// is allowed when -fintelfpga is enabled.

// CHECK: define{{.*}} void @_Z3fooDB211_S_(ptr {{.*}} sret(i211) align 8 %agg.result, ptr {{.*}} byval(i211) align 8 %[[ARG1:[0-9]+]], ptr {{.*}} byval(i211) align 8 %[[ARG2:[0-9]+]])
signed _BitInt(211) foo(signed _BitInt(211) a, signed _BitInt(211) b) {
// CHECK: %[[VAR_A:a]].addr = alloca i211, align 8
// CHECK: %[[VAR_B:b]].addr = alloca i211, align 8
// CHECK: %[[VAR_A]] = load i211, ptr %[[ARG1]], align 8
// CHECK: %[[VAR_B]] = load i211, ptr %[[ARG2]], align 8
// CHECK: store i211 %[[VAR_A]], ptr %[[VAR_A]].addr, align 8
// CHECK: store i211 %[[VAR_B]], ptr %[[VAR_B]].addr, align 8
// CHECK: %[[TEMP1:[0-9]+]] = load i211, ptr %[[VAR_A]].addr, align 8
// CHECK: %[[TEMP2:[0-9]+]] = load i211, ptr %[[VAR_B]].addr, align 8
// CHECK: %div = sdiv i211 %[[TEMP1]], %[[TEMP2]]
// CHECK: store i211 %div, ptr %agg.result, align 8
// CHECK: %[[RES:[0-9+]]] = load i211, ptr %agg.result, align 8
// CHECK: store i211 %[[RES]], ptr %agg.result, align 8
// CHECK: ret void
return a / b;
}
26 changes: 26 additions & 0 deletions clang/test/CodeGenSYCL/sycl-intelfpga-bitint.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
// RUN: %clang_cc1 -opaque-pointers -fsycl-is-device -fintelfpga -triple spir64_fpga -IInputs -emit-llvm %s -o - | FileCheck %s

// This test checks that we generate appropriate code for division
// operations of _BitInts of size greater than 128 bits, since it
// is allowed when -fintelfpga is enabled.

#include "Inputs/sycl.hpp"

// CHECK: define{{.*}} void @_Z3fooDB211_S_(ptr addrspace(4) {{.*}} sret(i211) align 8 %agg.result, ptr {{.*}} byval(i211) align 8 %[[ARG1:[0-9]+]], ptr {{.*}} byval(i211) align 8 %[[ARG2:[0-9]+]])
signed _BitInt(211) foo(signed _BitInt(211) a, signed _BitInt(211) b) {
// CHECK: %[[VAR_A:a]] = load i211, ptr %[[ARG1]], align 8
// CHECK: %[[VAR_B:b]] = load i211, ptr %[[ARG2]], align 8
// CHECK: %[[RES:div]] = sdiv i211 %[[VAR_A]], %[[VAR_B]]
// CHECK: store i211 %[[RES]], ptr addrspace(4) %agg.result, align 8
// CHECK: ret void
return a / b;
}

int main() {
sycl::handler h;
auto lambda = []() {
_BitInt(211) a, b = 3, c = 4;
a = foo(b, c);
};
h.single_task(lambda);
}
6 changes: 6 additions & 0 deletions clang/test/Driver/sycl-offload-intelfpga.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,12 @@
// RUN: | FileCheck -check-prefix=CHK-TOOLS-INTELFPGA-G0 %s
// CHK-TOOLS-INTELFPGA-G0-NOT: clang{{.*}} "-debug-info-kind=constructor"

/// -fintelfpga passes it to host and device cc1 compilations
// RUN: %clangxx -### -fsycl -fintelfpga %s 2>&1 \
// RUN: | FileCheck -check-prefix=CHK-HOST-DEVICE %s
// CHK-HOST-DEVICE: clang{{.*}} "-cc1"{{.*}} "-fsycl-is-device"{{.*}} "-fintelfpga"
// CHK-HOST-DEVICE: clang{{.*}} "-cc1"{{.*}} "-fintelfpga"{{.*}} "-fsycl-is-host"

/// FPGA target implies -fsycl-disable-range-rounding
// RUN: %clangxx -### -target x86_64-unknown-linux-gnu -fsycl -fintelfpga %s 2>&1 \
// RUN: | FileCheck -check-prefix=CHK-RANGE-ROUNDING %s
Expand Down
16 changes: 16 additions & 0 deletions clang/test/SemaSYCL/sycl-intelfpga.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
// RUN: %clang_cc1 -fsycl-is-device -fintelfpga -verify=device-intelfpga -fsyntax-only %s
// RUN: %clang_cc1 -fsycl-is-host -fintelfpga -verify=host-intelfpga -fsyntax-only %s
// RUN: %clang_cc1 -fsycl-is-device -verify=device -fsyntax-only %s
// RUN: %clang_cc1 -fsycl-is-host -verify=host -fsyntax-only %s

// Tests that we do not issue errors for _Bitints of size greater than 128
// when -fintelfpga is enabled. The backend is expected to be able to handle
// this. When -fintelfpga is not passed, we continue to diagnose.

// device-intelfpga-no-diagnostics
// host-intelfpga-no-diagnostics
// device-error@+2 3{{signed _BitInt of bit sizes greater than 128 not supported}}
// host-error@+1 3{{signed _BitInt of bit sizes greater than 128 not supported}}
signed _BitInt(211) foo(signed _BitInt(211) a, signed _BitInt(211) b) {
return a / b;
}