Skip to content

[SYCL] Add intel::kernel_args_restrict attribute #744

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions clang/include/clang/Basic/Attr.td
Original file line number Diff line number Diff line change
Expand Up @@ -1091,6 +1091,12 @@ def SYCLDeviceIndirectlyCallable : InheritableAttr {
let LangOpts = [SYCLIsDevice];
let Documentation = [SYCLDeviceIndirectlyCallableDocs];
}
def SYCLIntelKernelArgsRestrict : InheritableAttr {
let Spellings = [ CXX11<"intel", "kernel_args_restrict"> ];
let Subjects = SubjectList<[Function], ErrorDiag>;
let LangOpts = [ SYCLIsDevice, SYCLIsHost ];
let Documentation = [ SYCLIntelKernelArgsRestrictDocs ];
}

def C11NoReturn : InheritableAttr {
let Spellings = [Keyword<"_Noreturn">];
Expand Down
37 changes: 37 additions & 0 deletions clang/include/clang/Basic/AttrDocs.td
Original file line number Diff line number Diff line change
Expand Up @@ -1835,6 +1835,43 @@ loads).
}];
}

def SYCLIntelKernelArgsRestrictDocs : Documentation {
let Category = DocCatVariable;
let Heading = "kernel_args_restrict";
let Content = [{
The attribute ``intel::kernel_args_restrict`` is legal on device functions, and
can be ignored on non-device functions. When applied to a function, lambda, or
function call operator (of a function object), the attribute is a hint to the
compiler equivalent to specifying the C99 restrict attribute on all pointer
arguments or the pointer member of any accessors, which are a function argument,
lambda capture, or function object member, of the callable to which the
attribute was applied. This effect is equivalent to annotating restrict on
**all** kernel pointer arguments in an OpenCL or SPIR-V kernel.

If ``intel::kernel_args_restrict`` is applied to a function called from a device
kernel, the attribute is ignored and it is not propagated to a kernel.

The attribute forms an unchecked assertion, in that implementations
do not need to check/confirm the pre-condition in any way. If a user applies
``intel::_kernel_args_restrict`` to a kernel, but there is in fact aliasing
between kernel pointer arguments at runtime, the behavior is undefined.

The attribute-token ``intel::kernel_args_restrict`` shall appear at most once in
each attribute-list and no attribute-argument-clause shall be present. The
attribute may be applied to the function-type in a function declaration. The
first declaration of a function shall specify the
``intel::kernel_args_restrict`` attribute if any declaration of that function
specifies the ``intel::kernel_args_restrict`` attribute. If a function is
declared with the ``intel::kernel_args_restrict`` attribute in one translation
unit and the same function is declared without the
``intel::kernel_args_restrict`` attribute in another translation unit, the
program is ill-formed and no diagnostic is required.

The ``intel::kernel_args_restrict`` attribute has an effect when applied to a
function, and no effect otherwise.
}];
}

def SYCLIntelFPGAIVDepAttrDocs : Documentation {
let Category = DocCatVariable;
let Heading = "ivdep";
Expand Down
7 changes: 7 additions & 0 deletions clang/include/clang/Basic/AttributeCommonInfo.h
Original file line number Diff line number Diff line change
Expand Up @@ -148,6 +148,13 @@ class AttributeCommonInfo {
return SyntaxUsed == AS_CXX11 || isAlignasAttribute();
}

bool isAllowedOnLambdas() const {
// FIXME: Eventually we want to do a list here populated via tablegen. But
// we want C++ attributes to be permissible on Lambdas, and get propagated
// to the call operator declaration.
return getParsedKind() == AT_SYCLIntelKernelArgsRestrict;
}

bool isC2xAttribute() const { return SyntaxUsed == AS_C2x; }

bool isKeywordAttribute() const {
Expand Down
5 changes: 4 additions & 1 deletion clang/lib/CodeGen/CGCall.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2412,7 +2412,10 @@ void CodeGenFunction::EmitFunctionProlog(const CGFunctionInfo &FI,
}
}

if (Arg->getType().isRestrictQualified())
if (Arg->getType().isRestrictQualified() ||
(CurCodeDecl &&
CurCodeDecl->hasAttr<SYCLIntelKernelArgsRestrictAttr>() &&
Arg->getType()->isPointerType()))
AI->addAttr(llvm::Attribute::NoAlias);

// LLVM expects swifterror parameters to be used in very restricted
Expand Down
13 changes: 12 additions & 1 deletion clang/lib/Sema/SemaDeclAttr.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -6752,6 +6752,13 @@ static void handleMSAllocatorAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
// Top Level Sema Entry Points
//===----------------------------------------------------------------------===//

static bool IsDeclLambdaCallOperator(Decl *D) {
if (const auto *MD = dyn_cast<CXXMethodDecl>(D))
return MD->getParent()->isLambda() &&
MD->getOverloadedOperator() == OverloadedOperatorKind::OO_Call;
return false;
}

/// ProcessDeclAttribute - Apply the specific attribute to the specified decl if
/// the attribute applies to decls. If the attribute is a type attribute, just
/// silently ignore it if a GNU attribute.
Expand All @@ -6763,7 +6770,8 @@ static void ProcessDeclAttribute(Sema &S, Scope *scope, Decl *D,

// Ignore C++11 attributes on declarator chunks: they appertain to the type
// instead.
if (AL.isCXX11Attribute() && !IncludeCXX11Attributes)
if (AL.isCXX11Attribute() && !IncludeCXX11Attributes &&
(!IsDeclLambdaCallOperator(D) || !AL.isAllowedOnLambdas()))
return;

// Unknown attributes are automatically warned on. Target-specific attributes
Expand Down Expand Up @@ -7516,6 +7524,9 @@ static void ProcessDeclAttribute(Sema &S, Scope *scope, Decl *D,
case ParsedAttr::AT_RenderScriptKernel:
handleSimpleAttribute<RenderScriptKernelAttr>(S, D, AL);
break;
case ParsedAttr::AT_SYCLIntelKernelArgsRestrict:
handleSimpleAttribute<SYCLIntelKernelArgsRestrictAttr>(S, D, AL);
break;
// XRay attributes.
case ParsedAttr::AT_XRayInstrument:
handleSimpleAttribute<XRayInstrumentAttr>(S, D, AL);
Expand Down
26 changes: 22 additions & 4 deletions clang/lib/Sema/SemaSYCL.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -411,12 +411,14 @@ class MarkDeviceFunction : public RecursiveASTVisitor<MarkDeviceFunction> {
// Attributes applied to SYCLKernel are also included
void CollectPossibleKernelAttributes(FunctionDecl *SYCLKernel,
llvm::SmallPtrSet<Attr *, 4> &Attrs) {
typedef std::pair<FunctionDecl *, FunctionDecl *> ChildParentPair;
llvm::SmallPtrSet<FunctionDecl *, 16> Visited;
llvm::SmallVector<FunctionDecl *, 16> WorkList;
WorkList.push_back(SYCLKernel);
llvm::SmallVector<ChildParentPair, 16> WorkList;
WorkList.push_back({SYCLKernel, nullptr});

while (!WorkList.empty()) {
FunctionDecl *FD = WorkList.back();
FunctionDecl *FD = WorkList.back().first;
FunctionDecl *ParentFD = WorkList.back().second;
WorkList.pop_back();
if (!Visited.insert(FD).second)
continue; // We've already seen this Decl
Expand All @@ -425,6 +427,18 @@ class MarkDeviceFunction : public RecursiveASTVisitor<MarkDeviceFunction> {
Attrs.insert(A);
else if (auto *A = FD->getAttr<ReqdWorkGroupSizeAttr>())
Attrs.insert(A);
else if (auto *A = FD->getAttr<SYCLIntelKernelArgsRestrictAttr>()) {
// Allow the intel::kernel_args_restrict only on the lambda (function
// object) function, that is called directly from a kernel (i.e. the one
// passed to the parallel_for function). Emit a warning and ignore all
// other cases.
if (ParentFD == SYCLKernel) {
Attrs.insert(A);
} else {
SemaRef.Diag(A->getLocation(), diag::warn_attribute_ignored) << A;
FD->dropAttr<SYCLIntelKernelArgsRestrictAttr>();
}
}

// TODO: vec_len_hint should be handled here

Expand All @@ -436,7 +450,7 @@ class MarkDeviceFunction : public RecursiveASTVisitor<MarkDeviceFunction> {
if (auto *Callee = dyn_cast<FunctionDecl>(CI->getDecl())) {
Callee = Callee->getCanonicalDecl();
if (!Visited.count(Callee))
WorkList.push_back(Callee);
WorkList.push_back({Callee, FD});
}
}
}
Expand Down Expand Up @@ -1296,6 +1310,10 @@ void Sema::MarkDevice(void) {
}
break;
}
case attr::Kind::SYCLIntelKernelArgsRestrict: {
SYCLKernel->addAttr(A);
break;
}
// TODO: vec_len_hint should be handled here
default:
// Seeing this means that CollectPossibleKernelAttributes was
Expand Down
8 changes: 7 additions & 1 deletion clang/lib/Sema/SemaType.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -208,6 +208,11 @@ namespace {
return chunkIndex == declarator.getNumTypeObjects();
}

bool isProcessingLambdaExpr() const {
return declarator.isFunctionDeclarator() &&
declarator.getContext() == DeclaratorContext::LambdaExprContext;
}

unsigned getCurrentChunkIndex() const {
return chunkIndex;
}
Expand Down Expand Up @@ -7582,7 +7587,8 @@ static void processTypeAttrs(TypeProcessingState &state, QualType &type,
switch (attr.getKind()) {
default:
// A C++11 attribute on a declarator chunk must appertain to a type.
if (attr.isCXX11Attribute() && TAL == TAL_DeclChunk) {
if (attr.isCXX11Attribute() && TAL == TAL_DeclChunk &&
(!state.isProcessingLambdaExpr() || !attr.isAllowedOnLambdas())) {
state.getSema().Diag(attr.getLoc(), diag::err_attribute_not_type_attr)
<< attr;
attr.setUsedAsTypeAttr();
Expand Down
68 changes: 68 additions & 0 deletions clang/test/CodeGenSYCL/intel-restrict.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
// RUN: %clang %s -S -emit-llvm --sycl -o - | FileCheck %s

#include "CL/sycl.hpp"

constexpr auto sycl_read_write = cl::sycl::access::mode::read_write;
constexpr auto sycl_global_buffer = cl::sycl::access::target::global_buffer;

template <typename Acc1Ty, typename Acc2Ty>
struct foostr {
Acc1Ty A;
Acc2Ty B;
foostr(Acc1Ty A, Acc2Ty B): A(A), B(B) {}
[[intel::kernel_args_restrict]]
void operator()() {
A[0] = B[0];
}
};

int foo(int X) {
int A[] = { 42 };
int B[] = { 0 };
{
cl::sycl::queue Q;
cl::sycl::buffer<int, 1> BufA(A, 1);
cl::sycl::buffer<int, 1> BufB(B, 1);

// CHECK: define {{.*}} spir_kernel {{.*}}kernel_norestrict{{.*}}(i32 addrspace(1)* %{{.*}} i32 addrspace(1)* %{{.*}}

Q.submit([&](cl::sycl::handler& cgh) {
auto AccA = BufA.get_access<sycl_read_write, sycl_global_buffer>(cgh);
auto AccB = BufB.get_access<sycl_read_write, sycl_global_buffer>(cgh);
cgh.single_task<class kernel_norestrict>(
[=]() {
AccB[0] = AccA[0];
});
});

// CHECK: define {{.*}} spir_kernel {{.*}}kernel_restrict{{.*}}(i32 addrspace(1)* noalias %{{.*}} i32 addrspace(1)* noalias %{{.*}}
Q.submit([&](cl::sycl::handler& cgh) {
auto AccA = BufA.get_access<sycl_read_write, sycl_global_buffer>(cgh);
auto AccB = BufB.get_access<sycl_read_write, sycl_global_buffer>(cgh);
cgh.single_task<class kernel_restrict>(
[=]() [[intel::kernel_args_restrict]] {
AccB[0] = AccA[0];
});
});

// CHECK: define {{.*}} spir_kernel {{.*}}kernel_restrict_struct{{.*}}(i32 addrspace(1)* noalias %{{.*}} i32 addrspace(1)* noalias %{{.*}}
Q.submit([&](cl::sycl::handler& cgh) {
auto AccA = BufA.get_access<sycl_read_write, sycl_global_buffer>(cgh);
auto AccB = BufB.get_access<sycl_read_write, sycl_global_buffer>(cgh);
foostr<decltype(AccA), decltype(AccB)> f(AccA, AccB);
cgh.single_task<class kernel_restrict_struct>(f);
});

// CHECK: define {{.*}} spir_kernel {{.*}}kernel_restrict_other_params{{.*}}(i32 addrspace(1)* noalias %{{.*}} i32 addrspace(1)* noalias %{{.*}}, i32 %_arg_9)
int num = 42;
Q.submit([&](cl::sycl::handler& cgh) {
auto AccA = BufA.get_access<sycl_read_write, sycl_global_buffer>(cgh);
auto AccB = BufB.get_access<sycl_read_write, sycl_global_buffer>(cgh);
cgh.single_task<class kernel_restrict_other_params>(
[=]() [[intel::kernel_args_restrict]] {
AccB[0] = AccA[0] = num;
});
});
}
return B[0];
}
Original file line number Diff line number Diff line change
Expand Up @@ -131,6 +131,7 @@
// CHECK-NEXT: ReturnsTwice (SubjectMatchRule_function)
// CHECK-NEXT: SYCLDevice (SubjectMatchRule_function)
// CHECK-NEXT: SYCLDeviceIndirectlyCallable (SubjectMatchRule_function)
// CHECK-NEXT: SYCLIntelKernelArgsRestrict (SubjectMatchRule_function)
// CHECK-NEXT: SYCLKernel (SubjectMatchRule_function)
// CHECK-NEXT: ScopedLockable (SubjectMatchRule_record)
// CHECK-NEXT: Section (SubjectMatchRule_function, SubjectMatchRule_variable_is_global, SubjectMatchRule_objc_method, SubjectMatchRule_objc_property)
Expand Down
35 changes: 35 additions & 0 deletions clang/test/SemaSYCL/intel-restrict.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
// RUN: %clang %s -fsyntax-only --sycl -DCHECKDIAG -Xclang -verify
// RUN: %clang %s -fsyntax-only -Xclang -ast-dump --sycl | FileCheck %s

[[intel::kernel_args_restrict]] // expected-warning{{'kernel_args_restrict' attribute ignored}}
void func_ignore() {}

struct FuncObj {
[[intel::kernel_args_restrict]]
void operator()() {}
};

template <typename name, typename Func>
__attribute__((sycl_kernel)) void kernel(Func kernelFunc) {
kernelFunc();
#ifdef CHECKDIAG
[[intel::kernel_args_restrict]] int invalid = 42; // expected-error{{'kernel_args_restrict' attribute only applies to functions}}
#endif
}

int main() {
// CHECK-LABEL: FunctionDecl {{.*}} _ZTSZ4mainE12test_kernel1
// CHECK: SYCLIntelKernelArgsRestrictAttr
kernel<class test_kernel1>(
FuncObj());

// CHECK-LABEL: FunctionDecl {{.*}} _ZTSZ4mainE12test_kernel2
// CHECK: SYCLIntelKernelArgsRestrictAttr
kernel<class test_kernel2>(
[]() [[intel::kernel_args_restrict]] {});

// CHECK-LABEL: FunctionDecl {{.*}} _ZTSZ4mainE12test_kernel3
// CHECK-NOT: SYCLIntelKernelArgsRestrictAttr
kernel<class test_kernel3>(
[]() {func_ignore();});
}