Skip to content

[CUDA][HIP] fix virtual dtor host/device attr #128926

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 3, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 20 additions & 0 deletions clang/docs/HIPSupport.rst
Original file line number Diff line number Diff line change
Expand Up @@ -286,6 +286,26 @@ Example Usage
basePtr->virtualFunction(); // Allowed since obj is constructed in device code
}

Host and Device Attributes of Default Destructors
===================================================

If a default destructor does not have explicit host or device attributes,
clang infers these attributes based on the destructors of its data members
and base classes. If any conflicts are detected among these destructors,
clang diagnoses the issue. Otherwise, clang adds an implicit host or device
attribute according to whether the data members's and base classes's
destructors can execute on the host or device side.

For explicit template classes with virtual destructors, which must be emitted,
the inference adopts a conservative approach. In this case, implicit host or
device attributes from member and base class destructors are ignored. This
precaution is necessary because, although a constexpr destructor carries
implicit host or device attributes, a constexpr function may call a
non-constexpr function, which is by default a host function.

Users can override the inferred host and device attributes of default
destructors by adding explicit host and device attributes to them.

C++ Standard Parallelism Offload Support: Compiler And Runtime
==============================================================

Expand Down
8 changes: 6 additions & 2 deletions clang/lib/Sema/SemaCUDA.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -437,7 +437,9 @@ bool SemaCUDA::inferTargetForImplicitSpecialMember(CXXRecordDecl *ClassDecl,
if (!SMOR.getMethod())
continue;

CUDAFunctionTarget BaseMethodTarget = IdentifyTarget(SMOR.getMethod());
CUDAFunctionTarget BaseMethodTarget =
IdentifyTarget(SMOR.getMethod(), IsExpVDtor);

if (!InferredTarget) {
InferredTarget = BaseMethodTarget;
} else {
Expand Down Expand Up @@ -481,7 +483,9 @@ bool SemaCUDA::inferTargetForImplicitSpecialMember(CXXRecordDecl *ClassDecl,
if (!SMOR.getMethod())
continue;

CUDAFunctionTarget FieldMethodTarget = IdentifyTarget(SMOR.getMethod());
CUDAFunctionTarget FieldMethodTarget =
IdentifyTarget(SMOR.getMethod(), IsExpVDtor);

if (!InferredTarget) {
InferredTarget = FieldMethodTarget;
} else {
Expand Down
18 changes: 10 additions & 8 deletions clang/test/SemaCUDA/dtor.cu
Original file line number Diff line number Diff line change
Expand Up @@ -32,22 +32,24 @@ public:
template class B<float>;
}

// The implicit host/device attrs of virtual dtor B<float>::~B() is inferred to
// have implicit device attr since dtors of its members and parent classes can
// be executed on device. This causes a diagnostic since B<float>::~B() must
// be emitted, and it eventually causes host_fun() called on device side.
// The implicit host/device attrs of virtual dtor ~B() should be
// conservatively inferred, where constexpr member dtor's should
// not be considered device since they may call host functions.
// Therefore B<float>::~B() should not have implicit device attr.
// However C<float>::~C() should have implicit device attr since
// it is trivial.
namespace ExplicitInstantiationDtorNoAttr {
void host_fun() // dev-note {{'host_fun' declared here}}
void host_fun()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand it correctly that the code is still expected to error out as a deferred diag at codegen phase?

If so, it would be great to mention that in the comment, and, maybe, add a codegen case if it's not done already in the deferred diags part of the patch we split off.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no. deferred diag happens at the end of sema.

What this PR does is that for virtual dtor of explicitly instantiated template class, clang will not treat constexpr dtors as executable on device, therefore if there is a constexpr member dtor, the virtual dtor is host only. Then it won't be emitted on device side.

The reason is that users have no control of this virtual dtor since it must be emitted, and users may not have control of the class definition, so unless we are sure this dtor can be executed on device, let's do not infer it as device. This makes sure the code that works as C++ continue to work in CUDA/HIP.

{}

template <unsigned>
constexpr void hd_fun() {
host_fun(); // dev-error{{reference to __host__ function 'host_fun' in __host__ __device__ function}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason I can not reproduce any of the errors in this test on godbolt, even with older clang: https://godbolt.org/z/4fMh5jxKd

What am I missing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the old clang, there is no deferred diag for dtors, and the compilation stops at device assembly, so you won't see the link error. For the trunk clang, it is not new enough (-v shows abe1ecf). I think it may be updated once daily.

host_fun();
}

struct A {
constexpr ~A() { // dev-note {{called by '~B'}}
hd_fun<8>(); // dev-note {{called by '~A'}}
constexpr ~A() {
hd_fun<8>();
}
};

Expand Down
Loading