Skip to content

[LIBCLC] Generate FunctionDecl to remangle substituted entries #7220

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Dec 6, 2022

Conversation

jchlanda
Copy link
Contributor

@jchlanda jchlanda commented Oct 28, 2022

Retire the by hand mangling of substituted function in favor of constructing corresponding FunctionDecls and letting clang mangle it for us. This ensures that we never diverge from clang's mangling idiosyncrasies and bugfixes would likely be in the AST creation, not mangling itself.

The tool has been rewritten as a ClangTool implementing FrontendAction.

Additionally, we provide the option to perform a test run, in which no substitutions are made, while checking that remangled name does not diverge from the original mangled name. The tests are added to check-libclc target.

The diffs are a bit ugly, it might be easier to inspect the file mode, basically the change lives entirely in Remangler class, and if you follow Remangler::remangle it will lead you through what it does.

Fixes: #6505

@jchlanda jchlanda requested a review from a team as a code owner October 28, 2022 13:59
@jchlanda jchlanda requested a review from romanovvlad October 28, 2022 13:59
@jchlanda
Copy link
Contributor Author

@steffenlarsen you might find it interesting :)

@romanovvlad
Copy link
Contributor

@bader Could you please help with review?

@Naghasan
Copy link
Contributor

Can I suggest someone from @intel/dpcpp-cfe-reviewers help for this ? This PR is mainly playing with the clang tooling so they would be the most relevant.

@bader
Copy link
Contributor

bader commented Nov 15, 2022

@jchlanda, could you resolve merge conflicts, please?

@jchlanda jchlanda force-pushed the jakub/remangling_action branch from 0ab248c to 3d8d1b2 Compare November 15, 2022 09:58
@jchlanda
Copy link
Contributor Author

@jchlanda, could you resolve merge conflicts, please?

Done.

@jchlanda
Copy link
Contributor Author

Fixes: #6505

@bader
Copy link
Contributor

bader commented Nov 16, 2022

Can I suggest someone from @intel/dpcpp-cfe-reviewers help for this ? This PR is mainly playing with the clang tooling so they would be the most relevant.

@intel/dpcpp-cfe-reviewers, ping.

@bader
Copy link
Contributor

bader commented Nov 16, 2022

Fixes: #6505

If you add this line to the PR description, GitHub will automatically link the PR with the referenced issue.

@smanna12
Copy link
Contributor

smanna12 commented Nov 16, 2022

Can I suggest someone from @intel/dpcpp-cfe-reviewers help for this ? This PR is mainly playing with the clang tooling so they would be the most relevant.

@intel/dpcpp-cfe-reviewers, ping.

I am not very familiar with the ClangTool. @premanandrao, @AaronBallman, could you please take a look at?

Copy link
Contributor

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have much expertise in this area, so a lot of my comments are more superficial.

Does existing test coverage suffice to test these changes?

Comment on lines 335 to 348
// Fix up the template types in the original FD's arg tys and return ty.
auto AreQTsEqual = [&](clang::QualType &LHS, clang::QualType &RHS) -> bool {
return (LHS.getBaseTypeIdentifier() && RHS.getBaseTypeIdentifier() &&
LHS.getBaseTypeIdentifier()->isStr(
RHS.getBaseTypeIdentifier()->getName())) ||
LHS == RHS;
};
unsigned NumReplaced = 0;
unsigned Idx = 0;
for (auto &TemplateArgQT : TemplateArgTys) {
if (AreQTsEqual(TemplateArgQT, RetTy)) {
RetTy = TemplateTypeParamTys[Idx];
goto Found;
}
for (unsigned i = 0; i < ArgTys.size(); ++i) {
if (AreQTsEqual(ArgTys[i], TemplateArgQT)) {
ArgTys[i] = TemplateTypeParamTys[Idx];
goto Found;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you basically trying to instantiate the function here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeap, making sure that the template is fully specialized.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooofda, I think you might need to use the usual template instantiation machinery to do this (you may have multiple levels of instantiation to perform for a type like foo<T<U>> (think: template template parameters). CC @erichkeane

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of a tricky issue.
In principle you are right; the following, fictional, overload of image read would not be handled correctly:

template <typename T> class woof{};
template <>
int4 __spirv_ImageRead<int4 , ocl_image3d_ro, woof<int4>>(ocl_image3d_ro, woof<int4>) {}

However, this patch:

  • does not deviate from the current status, the same failure could be observed in the mainline,
  • we only need to be able to handle functions defined in libclc, all of which have to follow spir-v interface,
  • I am concerned that if we wanted to make it complete, we would end up re-implementing huge chunks of clang for something that is a closed scope task,
  • finally, remangler will hard fail should the interface changed.

With thte patch, the above sample would result in:

Unhandled type : woof<int vector[4]>
UNREACHABLE executed at /home/jakub/Projects/oneAPI/jchlanda/llvm/src/llvm/frontendaction/LibclcRemangler.cpp:479!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooof, there are quite a lot of situations that'll get out of hand here, its not handling constraint overloading, template-template params, etc, but I don't have a good hold on it. BUT if the libclc owner is equally unconcerned as you about these, I've got no problems obviously.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@steffenlarsen would you be able to weigh in on this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use of templates in the current builtins are luckily limited and hopefully they stay this way. It would be nice to see handling of these improve in the future (hopefully in a way where it can again reuse some existing logic) but at least this doesn't seem to worsen the handling of these outliers compared to mainline.

Note that the builtins libclc attempts to mimic need to be handled correctly by the SPIR-V Translator too, so to my knowledge the subset of possible instantiations we support of the templated builtins are actually fairly limited too.

assert(false && "unhandled type name: decltype(auto)");
else if (Name == "std::nullptr_t")
Res = ContextAST->NullPtrTy;
else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if you don't know the type you assume it's an enumeration type? That seems dangerous. For example, this is missing a type for _BitInt, so that would turn into an enum here. I think the else case should be an assertion to help spot missing types.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this matches what the previous code did.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added an assert for _BitInt, the problem here is that we match on strings printed from itanium_demangle::Node and having a generic else close that matches an invalid state is not possible (or hard to get?). FWIW handling a valid qualified type as an enum (falling to catch all else as it is now) would result in a mis-mangled name and would be spotted by the tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't that presume we're testing every possible builtin type? (That sounds like it would get out of sync pretty quickly as we do add new builtin types, though somewhat uncommonly.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is similar to #7220 (comment), we only need to be able to handle what spir-v can throw at the remangler.

BlockMeta *NewMeta = reinterpret_cast<BlockMeta *>(std::malloc(NBytes));
if (NewMeta == nullptr)
std::terminate();
BlockList->Next = new (NewMeta) BlockMeta{BlockList->Next, 0};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AaronBallman, does this statement have sequencing issues or is it well sequenced?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so I am glad I was not the only person who looked at that code and went "is that right?"! :-D

There's no sequencing problem there because of: https://eel.is/c++draft/expr.ass#1

assert(false && "unhandled type name: decltype(auto)");
else if (Name == "std::nullptr_t")
Res = ContextAST->NullPtrTy;
else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this matches what the previous code did.


bool remangleFunction(Function &Func, llvm::Module *M) {
if (!Func.getName().startswith("_Z"))
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this return false instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, the bool here is used as a success/failure notion, and I'd say that a state where we don't need to remangle and can early exit from this function matches the success.

@premanandrao
Copy link
Contributor

I am not very familiar with the ClangTool. @premanandrao, @AaronBallman, could you please take a look at?

Sorry @smanna12, me neither.

@jchlanda
Copy link
Contributor Author

I don't have much expertise in this area, so a lot of my comments are more superficial.

Does existing test coverage suffice to test these changes?

As it stands now we do not test remangler and discover the failures at runtime with ptxas reporting unresolved function calls (for example: #6505); this PR brings testing capability to the remangler, by providing a -t (test run) option; when used, the remangler will not perform any substitutions and assert that the remangled name matches the input one; this is wrapped against check-sycl target here: https://github.com/intel/llvm/pull/7220/files#diff-023be2cbe5badf0f927377ab972855d2e09923a36976921fd8c1efabb307edeaR57

@jchlanda jchlanda marked this pull request as draft November 18, 2022 08:48
@jchlanda jchlanda marked this pull request as ready for review November 18, 2022 08:49
@AaronBallman
Copy link
Contributor

I don't have much expertise in this area, so a lot of my comments are more superficial.
Does existing test coverage suffice to test these changes?

As it stands now we do not test remangler and discover the failures at runtime with ptxas reporting unresolved function calls (for example: #6505); this PR brings testing capability to the remangler, by providing a -t (test run) option; when used, the remangler will not perform any substitutions and assert that the remangled name matches the input one; this is wrapped against check-sycl target here: https://github.com/intel/llvm/pull/7220/files#diff-023be2cbe5badf0f927377ab972855d2e09923a36976921fd8c1efabb307edeaR57

Thank you for the explanation! Are there existing tests for -t that this will start to exercise, or should we add a case to this patch to exercise those changes?

@jchlanda
Copy link
Contributor Author

jchlanda commented Nov 22, 2022

I don't have much expertise in this area, so a lot of my comments are more superficial.
Does existing test coverage suffice to test these changes?

As it stands now we do not test remangler and discover the failures at runtime with ptxas reporting unresolved function calls (for example: #6505); this PR brings testing capability to the remangler, by providing a -t (test run) option; when used, the remangler will not perform any substitutions and assert that the remangled name matches the input one; this is wrapped against check-sycl target here: https://github.com/intel/llvm/pull/7220/files#diff-023be2cbe5badf0f927377ab972855d2e09923a36976921fd8c1efabb307edeaR57

Thank you for the explanation! Are there existing tests for -t that this will start to exercise, or should we add a case to this patch to exercise those changes?

I'm not sure if I follow exactly, but

@AaronBallman
Copy link
Contributor

I don't have much expertise in this area, so a lot of my comments are more superficial.
Does existing test coverage suffice to test these changes?

As it stands now we do not test remangler and discover the failures at runtime with ptxas reporting unresolved function calls (for example: #6505); this PR brings testing capability to the remangler, by providing a -t (test run) option; when used, the remangler will not perform any substitutions and assert that the remangled name matches the input one; this is wrapped against check-sycl target here: https://github.com/intel/llvm/pull/7220/files#diff-023be2cbe5badf0f927377ab972855d2e09923a36976921fd8c1efabb307edeaR57

Thank you for the explanation! Are there existing tests for -t that this will start to exercise, or should we add a case to this patch to exercise those changes?

I'm not sure if I follow exactly, but

I don't know about libclc, but at least in LLVM and Clang, adding untested code is generally not something we do. But I see now that the CMake changes also include an execution with -t (https://github.com/intel/llvm/pull/7220/files#diff-023be2cbe5badf0f927377ab972855d2e09923a36976921fd8c1efabb307edeaR63) so long as LIBCLC_GENERATE_REMANGLED_VARIANTS is defined. Is that variable on by default or get enabled by a CI run so if the flag was broken (perhaps in the future), at least one test would break as well?

@jchlanda
Copy link
Contributor Author

Yeap, it is being build by default for the CI runs (see the buildbot script here: https://github.com/jchlanda/llvm/blob/jakub/remangling_action/buildbot/configure.py#L127), as well as for both CUDA and HIP.

@AaronBallman
Copy link
Contributor

Yeap, it is being build by default for the CI runs (see the buildbot script here: https://github.com/jchlanda/llvm/blob/jakub/remangling_action/buildbot/configure.py#L127), as well as for both CUDA and HIP.

Fantastic, that alleviates my concerns about the testing!

@jchlanda
Copy link
Contributor Author

Is there anything else I could do for this PR?

@jchlanda jchlanda force-pushed the jakub/remangling_action branch from 8e95ddc to e3bb16e Compare December 5, 2022 16:13
@jchlanda
Copy link
Contributor Author

jchlanda commented Dec 6, 2022

The failures are not related (assert tests in HIP) are we OK to merge this @intel/llvm-gatekeepers

@steffenlarsen
Copy link
Contributor

@premanandrao & @erichkeane & @AaronBallman - Have your concerns been addressed?

@premanandrao
Copy link
Contributor

@premanandrao & @erichkeane & @AaronBallman - Have your concerns been addressed?

My concerns were all addressed.

@pvchupin pvchupin merged commit 114aeba into intel:sycl Dec 6, 2022
@pvchupin
Copy link
Contributor

pvchupin commented Dec 7, 2022

@pvchupin
Copy link
Contributor

pvchupin commented Dec 7, 2022

@jchlanda, I'm going to revert it in a moment. Let me know if you have any other fix ready.

pvchupin pushed a commit that referenced this pull request Dec 7, 2022
pvchupin pushed a commit that referenced this pull request Dec 7, 2022
steffenlarsen added a commit that referenced this pull request Dec 9, 2022
@pvchupin I've fixed the issues you mentioned in
#7220 (comment) (the
last two commits):
* shared libs build: e0f3a52
* windows mangling: b88627f

The rest is as per #7220

Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
Co-authored-by: premanandrao <premanand.m.rao@intel.com>
Co-authored-by: Steffen Larsen <steffen.larsen@intel.com>
steffenlarsen pushed a commit that referenced this pull request Dec 14, 2022
The diffs are quite hard to follow, but in an essence this patch brings:
* a new entry, implementing a generic address space for multiple
`__CLC_NVVM_ATOMIC_XYZ_IMPL`, where `XYZ` stands for `CAS`, `INDEC`,
`LOAD`, `MAX`, `MIN`, `STORE` and `SUB`,
* fixes the name of mangled function that the IMPL uses,
* the rest is just formatting to 80 chars.

This patch supersedes: #5849 but it
requires the fixes to the remangler from:
#7220

Fixes: #7658
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SYCL-CTS test_vector_swizzles doesn't compile for NVPTX target.
10 participants