Skip to content

LLVM and SPIRV-LLVM-Translator pulldown (WW02-03) #5276

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1,004 commits into from
Jan 14, 2022

Conversation

vmaksimo
Copy link
Contributor

@vmaksimo vmaksimo commented Jan 10, 2022

topperc and others added 30 commits January 6, 2022 09:24
Many codegen pass require this pass with useful triple info. Legacy pass manager need to
add a TargetLibraryInfo with the module info before run passes. Or the TargetLibraryInfo
will be initialized too conservative.

Reviewed By: pengfei, aeubanks

Differential Revision: https://reviews.llvm.org/D115850
Extra definitions are placed in the generated source file for each op class. The substitution `$cppClass` is replaced by the op's C++ class name.

This is useful when declaring but not defining methods in TableGen base classes:

```
class BaseOp<string mnemonic>
    : Op<MyDialect, mnemonic, [DeclareOpInterfaceMethods<SomeInterface>] {
  let extraClassDeclaration = [{
    // ZOp is declared at at the bottom of the file and is incomplete here
    ZOp getParent();
  }];
  let extraClassDefinition = [{
    int $cppClass::someInterfaceMethod() {
      return someUtilityFunction(*this);
    }
    ZOp $cppClass::getParent() {
      return dyn_cast<ZOp>(this->getParentOp());
    }
  }];
}
```

Certain things may prevent defining these functions inline, in the declaration. In this example, `ZOp` in the same dialect is incomplete at the function declaration because ops classes are declared in alphabetical order. Alternatively, functions may be too big to be desired as inlined, or they may require dependencies that create cyclic includes, or they may be calling a templated utility function that one may not want to expose in a header. If the functions are not inlined, then inheriting from the base class N times means that each function will need to be defined N times. With `extraClassDefinitions`, they only need to be defined once.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D115783
Add 5 simple folders
* bitcast(x : T0, T0) -> x
* addrcast(x : T0, T0) -> x
* bitcast(bitcast(x : T0, T1), T0) -> x
* addrcast(addrcast(x : T0, T1), T0) -> x
* gep %x:T, 0 -> %x:T

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D116715
Let each format of inst have two tests for it like other MxCMP
testcases.
The current help for `frame variable` is somewhat long. Its length, combined
with the few aliases (`var`, `v`, and `vo`) can make the output of `apropos`
redundant and noisy.

This separates out the details into a separate long help.

Differential Revision: https://reviews.llvm.org/D116708
We don't need to restrict operations on ExecutorAddrDiff as carefully as we do
for ExecutorAddr.
…ress.

ExecutorAddr is the preferred representation for executor process addresses now.
Add a convenience for appending constructed string values.

Differential Revision: https://reviews.llvm.org/D116682
glibc versions < 2.26 use different names for the fields.
However the layout is unchanged, so using the offset should be a
portable way to address this issue across platforms.

Fixes: llvm/llvm-project#53014

Patch By: paulkirth

Differential Revision: https://reviews.llvm.org/D116695
There was a limitation in legality that in the original inner loop latch,
no instruction was allowed between the induction variable increment
and the branch instruction. This is because we used to split the
inner latch at the induction variable increment instruction. Since
now we have split at the inner latch branch instruction and have
properly duplicated instructions over to the split block, we remove
this limitation.

Please refer to the test case updates to see how we now interchange
loops where instructions exist between the induction variable increment
and the branch instruction.

Reviewed By: bmahjour

Differential Revision: https://reviews.llvm.org/D115238
llvm/test/Bindings/Go is quite flaky in the past few months and nobody fixes it.

See

* https://lists.llvm.org/pipermail/llvm-dev/2021-December/154353.html "Suggestions on debugging pre-merge test failure that looks irrelevant."
* llvm/llvm-project#53017

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D116698
In function `DeviceTy::getTargetPointer`, `Entry` could be `nullptr` because of
zero length array section. We need to check if it is a valid iterator before
using it.

Reviewed By: ronlieb

Differential Revision: https://reviews.llvm.org/D116716
Since the analysis is described to be suitable for a forward
data-flow analysis, maintaining the worklist as a queue mimics
RPO ordering of block visits, thus reaching the fixpoint earlier.

Differential Revision: https://reviews.llvm.org/D116393
Improves llvm-dwarfdump output and for simplified template names roundtripping.
Important for DWARFv5 debug info which might contain type units in the
debug_info section, which made summarize-types fairly ineffective/lost
amongst the noise of CUs being dumped.
Use the TUIndex in a DWP file if present, otherwise (in .o, .dwo, and
non-split linked executables) cache a DenseMap for lookup of type units.
Patch that removed the use of this variable was  reverted in
8ade3d4

This reverts commit 3988a06.
The function that optimally inserts the exec mask
restore operations by combining the blocks currently
visits the lowered END_CF pseudos in the forward
direction as it iterates the setvector in the order
the entries are inserted in it.

Due to the absence of BranchFolding at -O0, the
irregularly placed BBs cause the forward traversal
to incorrectly place two unconditional branches in
certain BBs while combining them, especially when
an intervening block later gets optimized away in
subsequent iterations.

It is avoided by reverse iterating the setvector.
The blocks at the bottom of a function will get
optimized first before processing those at the top.

Fixes: SWDEV-315215

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D116273
Breaks Asan on Fuchsia's and ubsan with gcc.

This reverts commit 685c94c.
…th fixes.

This re-applies 133f86e, which was reverted in
c5965a4 while I investigated bot failures.

The original failure contained an arithmetic conversion think-o (on line 419 of
EHFrameSupport.cpp) that could cause failures on 32-bit platforms. The issue
should be fixed in this patch.
llvm/llvm-project#52979

Though SpaceAfterCStyleCast is set to true, clang-format 13 does not add a space after (void *) here:

```
```

This patch addresses that

Fixes: #52979

Reviewed By: curdeius, HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D116592
Track all GlobalObjects that reference a given comdat, which allows
determining whether a function in a comdat is dead without scanning
the whole module.

In particular, this makes filterDeadComdatFunctions() have complexity
O(#DeadFunctions) rather than O(#SymbolsInModule), which addresses
half of the compile-time issue exposed by D115545.

Differential Revision: https://reviews.llvm.org/D115864
There are no duplicates among the include files, and all the
source files are wrapped in architecture ifdefs, so there's no harm
in including all of them, always.

This fixes builds if TARGET_TRIPLE is set to something else than the
build architecture.

This also allows building for multiple architectures at once by
setting CMAKE_OSX_ARCHITECTURES.

Differential Revision: https://reviews.llvm.org/D116625
…ints

This implements the clang side of D116531. The elementtype
attribute is added for all indirect constraints (*) and tests are
updated accordingly.

Differential Revision: https://reviews.llvm.org/D116666
No need to keep track of equivalent extract_slice / insert_slice tensors during bufferization. Just emit a copy, it will fold away.

Note: The analysis still keeps track of equivalent tensors to make the correct inplace bufferization decisions.

Differential Revision: https://reviews.llvm.org/D116684
@vmaksimo vmaksimo added the disable-lint Skip linter check step and proceed with build jobs label Jan 10, 2022
Previously we relied on the structure type (which represented matrix
type) mangling to obtain its layout. Now, when DPCPP magling scheme is
aligned with C++, thus the structure lost their usual mangling. Yet we
want to preserve the desired information within the matrix type, coming
from DPCPP headers. To achive this the 'Matrix structure' was changed
form:
template <typename T, int R, int C, int L, int S>
struct __spirv_JointMatrixINTEL;
to
template <typename T, int R, int C, int L, int S>
struct __spirv_JointMatrixINTEL {
  T (*Value)[R][C][L + 1][S + 1];
};

so it's no longer an opaque structure and now it look like this in
LLVM IR:
%struct.__spirv_JointMatrixINTEL = type { [42 x [6 x [2 x [1 x i32]]]]* }

Here we encode the number of Rows, Cols, Layout and Scope as sizes of an
array (which element type is the same as the base matrix's type), which
is a bit odd, but it's probably the best way we can preserve the information now
without having the matrix type itself in LLVM.

Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@74759ef
The assertions are actually not required here, since all these
functions are being called from visitCallInst, which has the
appropriate check. But for the code consistency lets do not
trust functions' inputs.

Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@39cae09
@vmaksimo vmaksimo changed the title LLVM and SPIRV-LLVM-Translator pulldown (WW03) LLVM and SPIRV-LLVM-Translator pulldown (WW02-03) Jan 10, 2022
@vmaksimo
Copy link
Contributor Author

llvm-spirv failures are expected to be fixed by #5221

MrSidims and others added 2 commits January 10, 2022 22:37
Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com>

Original commit:
KhronosGroup/SPIRV-LLVM-Translator@f6f7114
err_typecheck_zero_array_size now accepts an argument, the error message
has changed.
Remove asserts from SemaSYCL since deferred diagnostics happen after it.
@vmaksimo
Copy link
Contributor Author

/summary:run

Last argument of `removeAttributesAtIndex()` was changed from `AttrBuilder` to
a newly added class `AttributeMask`.
dpcpp_staging branch contains the merded fix while master branch doesn't.
LowerWGScope now produces a different IR, staying the same semantically.
Changed the expected IR lines for getelementptr instructions.
@vmaksimo
Copy link
Contributor Author

/summary:run

@vmaksimo
Copy link
Contributor Author

Hi, @Naghasan! Some time ago you've created a test that checks there is no declaration of the function.

// Autogenerated by gen-libclc-test.py
// RUN: %clang -emit-llvm -S -o - %s | FileCheck %s
#include <spirv/spirv_types.h>
// CHECK-NOT: declare {{.*}} @_Z
// CHECK-NOT: call {{[^ ]*}} bitcast
__attribute__((overloadable)) void
test___spirv_GroupWaitEvents(__clc_uint32_t args_0, __clc_int32_t args_1,
__clc_event_t *args_2) {
__spirv_GroupWaitEvents(args_0, args_1, args_2);
}

Now it started to fail like that (full log here https://github.com/intel/llvm/runs/4776465360?check_suite_focus=true):

/__w/llvm/llvm/src/libclc/test/binding/core/GroupWaitEvents.cl:16:15: 

error: CHECK-NOT: excluded string found in input 
// CHECK-NOT: declare {{.*}} @_Z 
 ^ 

<stdin>:16:1: note: found here 
declare hidden void @_Z23__spirv_GroupWaitEventsjiPU3AS59ocl_event(i32, i32, %opencl.event_t.0...

Do you have any idea if the declare hidden has the same meaning as not generating a declaration at all? Or is it expected to get some "empty" IR, and we've caught a regression?
Could you please help to understand what the test should check? Thank you!

@Naghasan
Copy link
Contributor

Hi, @Naghasan! Some time ago you've created a test that checks there is no declaration of the function.

Do you have any idea if the declare hidden has the same meaning as not generating a declaration at all? Or is it expected to get some "empty" IR, and we've caught a regression? Could you please help to understand what the test should check? Thank you!

Hello,

The test is checking that you don't have a declare which basically means there is a mangling mismatch between the function called and the one in libclc.

Did your test run included this 29047bc ?

@vmaksimo
Copy link
Contributor Author

The test is checking that you don't have a declare which basically means there is a mangling mismatch between the function called and the one in libclc.

Did your test run included this 29047bc ?

Oh, I haven't seen this change - I believe it's missing in this PR. Thanks for letting me know, aligning with the recent sycl branch should help!

@vmaksimo
Copy link
Contributor Author

/summary:run

@vmaksimo vmaksimo marked this pull request as ready for review January 13, 2022 16:50
@vmaksimo vmaksimo requested review from bader and a team as code owners January 13, 2022 16:50
@vmaksimo
Copy link
Contributor Author

/merge

@bb-sycl
Copy link
Contributor

bb-sycl commented Jan 14, 2022

Fri 14 Jan 2022 08:28:18 AM UTC --- Start to merge the commit into sycl branch. It will take several minutes.

@bb-sycl
Copy link
Contributor

bb-sycl commented Jan 14, 2022

Fri 14 Jan 2022 08:31:37 AM UTC --- Merge the branch in this PR to base automatically. Will close the PR later.

@bb-sycl bb-sycl merged commit fcd5a98 into intel:sycl Jan 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disable-lint Skip linter check step and proceed with build jobs
Projects
None yet
Development

Successfully merging this pull request may close these issues.