LLVM and SPIRV-LLVM-Translator pulldown (WW23) #3852

vmaksimo · 2021-05-31T10:26:41Z

LLVM: llvm/llvm-project@caf86d2
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@485c8c7

cxx20_iterator_traits.compile.pass.cpp actually depends on implementation details of libc++, which is not great; but I just left a comment and moved on.

…cpp. NFCI.

- Currently, the host CPU information is not as easily available on z/OS as on other platforms.
- This information is stored in the Communications Vector Table (https://www.ibm.com/docs/en/zos/2.2.0?topic=information-cvt-mapping).

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D102793

Currently, BPF only contains three relocations:
- R_BPF_NONE for no relocation
- R_BPF_64_64 for LD_imm64 and normal 64-bit data relocation
- R_BPF_64_32 for call insn and normal 32-bit data relocation

Also, .BTF and .BTF.ext sections contain symbols in allocated program and data sections. These two sections reserve 32-bit space to hold the offset relative to the symbol's section. When LLVM JIT is used, the LLVM ExecutionEngine RuntimeDyld may attempt to resolve relocations for .BTF and .BTF.ext, which we want to prevent. So we used R_BPF_NONE for such relocations.

This all works fine until we try to link multiple objects:
- R_BPF_64_64 handling of LD_imm64 vs. normal 64-bit data is different, so lld target->relocate() needs more context to do a correct job.
- The same holds for R_BPF_64_32. More context is needed for lld target->relocate() to differentiate call insn vs. normal 32-bit data relocation.
- Since relocations in .BTF and .BTF.ext are set to R_BPF_NONE, they will not be relocated properly when multiple .BTF/.BTF.ext sections are merged by lld.

This patch intends to address this issue by adding additional relocation kinds:
- R_BPF_64_ABS64 for normal 64-bit data relocation
- R_BPF_64_ABS32 for normal 32-bit data relocation
- R_BPF_64_NODYLD32 for .BTF and .BTF.ext style relocations

The old R_BPF_64_{64,32} semantics:
- R_BPF_64_64 for LD_imm64 relocation
- R_BPF_64_32 for call insn relocation

The existing R_BPF_64_64/R_BPF_64_32 mapping to numeric values is maintained. They are the most common use cases for bpf programs and we want to maintain backward compatibility as much as possible.

ExecutionEngine RuntimeDyld BPF relocations are adjusted as well. R_BPF_64_{ABS64,ABS32} relocations will be resolved properly and other relocations will be ignored. Two tests are added for RuntimeDyld. Not handling R_BPF_64_NODYLD32 in RuntimeDyldELF.cpp will result in a "Relocation type not implemented yet!" fatal error.

FK_SecRel_4 usages in BPFAsmBackend.cpp and BPFELFObjectWriter.cpp are removed as they are not triggered in the BPF backend. The BPF backend used FK_SecRel_8 for LD_imm64 instruction operands.

Differential Revision: https://reviews.llvm.org/D102712
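The split described above can be summarized in a small model. This is only an illustrative sketch: the relocation names come from the commit message, but the enum values are arbitrary placeholders (not the real ELF numbers), and `runtime_dyld_action` is a hypothetical helper, not code from RuntimeDyldELF.cpp.

```python
from enum import Enum, auto


class BPFReloc(Enum):
    # Names are taken from the commit message. The values assigned by auto()
    # are placeholders and do NOT match the real ELF relocation numbers.
    R_BPF_NONE = auto()
    R_BPF_64_64 = auto()
    R_BPF_64_ABS64 = auto()
    R_BPF_64_ABS32 = auto()
    R_BPF_64_NODYLD32 = auto()
    R_BPF_64_32 = auto()


# What each kind is used for after the patch, as stated in the message.
PURPOSE = {
    BPFReloc.R_BPF_NONE: "no relocation",
    BPFReloc.R_BPF_64_64: "LD_imm64 relocation",
    BPFReloc.R_BPF_64_32: "call insn relocation",
    BPFReloc.R_BPF_64_ABS64: "normal 64-bit data relocation",
    BPFReloc.R_BPF_64_ABS32: "normal 32-bit data relocation",
    BPFReloc.R_BPF_64_NODYLD32: ".BTF and .BTF.ext style relocation",
}

# Per the message, RuntimeDyld resolves only the ABS kinds and ignores the rest.
RESOLVED_BY_RUNTIME_DYLD = {BPFReloc.R_BPF_64_ABS64, BPFReloc.R_BPF_64_ABS32}


def runtime_dyld_action(reloc):
    """Hypothetical helper: what the JIT loader does with a relocation kind."""
    return "resolve" if reloc in RESOLVED_BY_RUNTIME_DYLD else "ignore"
```

For example, `runtime_dyld_action(BPFReloc.R_BPF_64_NODYLD32)` yields `"ignore"`, which matches the intent that .BTF and .BTF.ext relocations are left alone by the JIT while remaining visible to a static linker.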

Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D103057

Said function had a few shortfalls:
- didn't set an abort message on Android
- was logged on several lines
- didn't provide extra information like the size requested if OOM'ing

This improves the function to address those points.

Differential Revision: https://reviews.llvm.org/D103034

…to intptr_t A test in ir.c casts a void* to an integer type to print its address. This cast is currently done with the datatype `long`, however, which is only guaranteed to be equal to the pointer width on LP64 systems. Other platforms may use a length not equal to the pointer width. 64-bit Windows, as an example, uses 32 bits for `long`, which does not match the 64-bit pointers. This also results in a clang warning due to `-Wvoid-pointer-to-int-cast`. Technically speaking, since the test only passes the value 42, it does not cause any issues, but it'd be nice to fix the warning at least. Differential Revision: https://reviews.llvm.org/D103085
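The width mismatch can be modelled without a Windows machine. The sketch below is an assumption-laden illustration, not the test from ir.c: it uses `ctypes.c_int32` to stand in for a 32-bit `long` and `ctypes.c_int64` to stand in for `intptr_t` on a 64-bit target, and `cast_pointer_value` is a hypothetical helper name.

```python
import ctypes


def cast_pointer_value(address, int_type):
    """Model a C cast of a pointer value to a fixed-width signed integer type.

    ctypes integer types do no overflow checking, so the value is silently
    truncated to the type's width, much like a narrowing cast in C.
    """
    return int_type(address).value


# Stand-ins (assumptions for illustration): a 32-bit `long` as on 64-bit
# Windows, and a 64-bit `intptr_t`.
MODEL_LONG_32 = ctypes.c_int32
MODEL_INTPTR_64 = ctypes.c_int64

small = 42                 # the value the test actually passes
wide = 0x7FF612345678      # a made-up address that needs more than 32 bits

print(cast_pointer_value(small, MODEL_LONG_32))        # small values survive
print(hex(cast_pointer_value(wide, MODEL_LONG_32)))    # upper bits are lost
print(hex(cast_pointer_value(wide, MODEL_INTPTR_64)))  # full address kept
```

This is why the test happens to pass today: 42 fits in 32 bits, so the truncation is invisible even though the cast is the wrong width.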

All users of the builder should set an insert point before using the builder. There should be no need for using InsertPointGuard here.

…s on AVX1 Determined from llvm-mca analysis, AVX1 capable targets have a higher throughput for VPBLENDVB and shuffle ops, making it cheaper to perform shift+shuffle/select shift patterns.

Match what's documented in the Intel AOM: the XMM variant of PSHUFB requires BOTH ports, but this was being incorrectly modelled as EITHER port. Now that we can use in-order models in llvm-mca, the atom model is a good "worst case scenario" analysis for x86.

We are using TOCEntry symbols like `LC..0` in TOC loads. This is hard to read, at least requiring an additional step to figure out the loaded symbols. We should print out the name in comments. Reviewed By: #powerpc, shchenz Differential Revision: https://reviews.llvm.org/D102949

Removed some of the older raw "MLIRized" versions that are no longer needed now that the sparse runtime support library can focus on the proper sparse tensor types rather than the opaque pointer approach of the past. This avoids legacy... Reviewed By: penpornk Differential Revision: https://reviews.llvm.org/D102960

All callers pass "false" for the Equality parameter. Kill the dead code, and update the function block comment.

The parseInputFile function returns an empty unique_ptr to signal an error, like when the input file doesn't exist, or is malformed. In this case, the tool should exit immediately rather than segfault by dereferencing the unique_ptr later. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D102891
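The same "check the empty result before using it" pattern can be sketched in a language-neutral way. Everything below is hypothetical: `parse_input_file` and `run_tool` are illustrative stand-ins, with `None` playing the role of the empty unique_ptr, and they are not the tool's actual functions.

```python
import sys
from typing import Optional


def parse_input_file(path: str) -> Optional[dict]:
    """Hypothetical parser: returns None to signal an error."""
    try:
        with open(path) as f:
            text = f.read()
    except OSError:
        return None          # missing or unreadable file
    if not text.strip():
        return None          # treat an empty file as malformed
    return {"path": path, "text": text}


def run_tool(path: str) -> int:
    """Return a process-style exit code instead of using a failed parse."""
    module = parse_input_file(path)
    if module is None:
        # Bail out immediately; touching `module` past this point would be
        # the analogue of dereferencing the empty unique_ptr.
        print(f"error: could not parse '{path}'", file=sys.stderr)
        return 1
    print(f"parsed {len(module['text'])} characters from {module['path']}")
    return 0
```

Calling `run_tool` on a path that does not exist returns 1 and reports the error rather than failing later with a less helpful crash.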

Stylistic changes only. 1) Don't pass a parameter just to do an early exit. 2) Use a name which matches actual behavior.

This reverts commit 0bebda1. Causing "Invalid record" errors.

The 2nd test is based on the fuzzer example in post-commit comments of D101191 - https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34661 The 1st test shows that we don't deal with this symmetrically. We should be able to reduce both examples (possibly in instsimplify instead of instcombine).

…banks This function can change regbank for registers which already have a selected bank. Depending on the instruction where these registers were used it can cause instruction selection to fail.

A recent fix for problems with ENTRY statement handling didn't get the case of a procedure dummy argument on an ENTRY statement in an executable part right; the code presumed that those dummy arguments would be objects, not entities that might be objects or procedures. Fix. Differential Revision: https://reviews.llvm.org/D103098

llvm-profgen uses a profile-summary-based cold threshold to merge and trim cold context profiles. This is to strike a good balance between profile size and performance. We've been using 99.9% as the cutoff to save profile size without affecting performance. This change switches to using 99.9% instead of 99.9999% as the default cold threshold cutoff for llvm-profgen. The redundant switch csprof-cold-thres is also removed and tests are cleaned up. Differential Revision: https://reviews.llvm.org/D103071
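To see why the cutoff percentage matters, here is a rough model of a summary-based threshold. It is a sketch under assumptions, not llvm-profgen's implementation: `cold_count_threshold` is a hypothetical helper, the cutoff is expressed in parts per million to avoid floating-point rounding, and the sample counts are invented.

```python
def cold_count_threshold(counts, cutoff_ppm):
    """Return the smallest count needed to cover `cutoff_ppm` of the total.

    Counts are visited from hottest to coldest; the count at which the
    running sum first reaches the cutoff share of the total is the threshold.
    """
    total = sum(counts)
    running = 0
    for count in sorted(counts, reverse=True):
        running += count
        if running * 1_000_000 >= cutoff_ppm * total:
            return count
    return 0


# Invented sample counts with a total of 1000.
samples = [900, 90, 9, 1]

print(cold_count_threshold(samples, 999_000))  # 99.9% cutoff
print(cold_count_threshold(samples, 999_999))  # 99.9999% cutoff
```

In this toy data the 99.9% cutoff yields a higher threshold than the 99.9999% cutoff, so more low-count contexts fall under it and can be merged or trimmed, which is the size saving the message refers to.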

Update the paragraph on generic / indexed_generic to reflect the unification of these operations. Differential Revision: https://reviews.llvm.org/D102775

Make sure that if SCUDO_DEBUG=1 in tests, then we have the same in the scudo library itself. Reviewed By: cryptoad, hctim Differential Revision: https://reviews.llvm.org/D103061

Cast of signed types to u64 breaks comparison. Also remove double () around operands. Reviewed By: cryptoad, hctim Differential Revision: https://reviews.llvm.org/D103060
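The failure mode is easy to reproduce in a model. The sketch below assumes a CHECK-style comparison that converts both operands to an unsigned 64-bit value first; `as_u64`, `check_lt_with_cast`, and `check_lt_plain` are hypothetical names for illustration, not scudo's macros.

```python
U64_MASK = (1 << 64) - 1


def as_u64(value):
    """Model a C-style cast of a signed integer to an unsigned 64-bit type."""
    return value & U64_MASK


def check_lt_with_cast(a, b):
    """Compare after casting both sides to u64 (the problematic variant)."""
    return as_u64(a) < as_u64(b)


def check_lt_plain(a, b):
    """Compare the operands in their own (signed) domain."""
    return a < b


# -1 < 1 is true for signed values, but -1 becomes 2**64 - 1 after the cast.
print(check_lt_plain(-1, 1))       # True
print(check_lt_with_cast(-1, 1))   # False
```

A negative operand wraps to a huge unsigned value, so a check that should pass fails (or, for the opposite comparison, a check that should fail passes).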

…ly infinite loops into finite ones Nowadays LLVM does not assume that all loops are finite, so if we want to produce a finite loop from a potentially-infinite one, we must ensure that the original loop is known to be a finite one. For this transform, it only matters for arithmetic right-shifts. For them, either the function or the loop must be known to be `mustprogress`, or the original value being shifted must be known to be non-negative (because iff the sign bit was set, it will never become zero, but will become `-1` in the "end"). It would be really good for alive2 to actually complain about this, but it currently does not: AliveToolkit/alive2#726
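The sign-bit argument can be checked with a small experiment. This is an illustrative sketch, not the transform itself: Python's `>>` on integers behaves like an arithmetic right shift, and `ashr_iterations_until_zero` is a hypothetical helper with an arbitrary iteration limit standing in for "never terminates".

```python
def ashr_iterations_until_zero(x, limit=128):
    """Count arithmetic right-shifts by 1 until x becomes 0.

    Returns None if x has not reached 0 after `limit` shifts, which is how
    this sketch represents a loop that would spin forever.
    """
    count = 0
    while x != 0:
        if count == limit:
            return None
        x >>= 1  # arithmetic shift: the sign is preserved
        count += 1
    return count


print(ashr_iterations_until_zero(8))    # 8 -> 4 -> 2 -> 1 -> 0
print(ashr_iterations_until_zero(-8))   # -8 -> -4 -> -2 -> -1 -> -1 -> ...
```

A non-negative start value always reaches zero, while a negative one gets stuck at `-1`, so replacing such a loop with a finite computation is only sound when the loop is already known to terminate.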

Differential Revision: https://reviews.llvm.org/D103104

Now that we can fold some transposes into multiplies (CM: A * B^t and RM: A^t * B), we want to move them around to create the optimal expressions:
* fold away double transposes while still using them to assert the shape
* sink transposes hoping they cancel out
* lift transposes when both operands are transposed

This also modifies the matrix remarks to include the number of exposed transposes (i.e. transposes that we couldn't fold into a multiply). The adjustment to the test remarks-inlining is a bit subtle: I am changing the double transpose to a single transpose so that we don't remove it completely. More importantly, this changes some of the total instruction counts, most notably stores, because we can no longer use a vector store.

Differential Revision: https://reviews.llvm.org/D102733
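Two of these rewrites rest on standard matrix identities, which can be checked numerically. The sketch below is plain list-of-lists arithmetic with made-up matrices; it assumes that "lift" refers to the identity A^t * B^t = (B * A)^t and does not reproduce the pass's actual logic.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


A = [[1, 2], [3, 4], [5, 6]]   # 3x2, made-up values
B = [[1, 0, 2], [0, 1, 3]]     # 2x3, made-up values

# Double transposes cancel: (A^t)^t == A.
double_transpose_cancels = transpose(transpose(A)) == A

# Two transposed operands can be replaced by one transpose of the product:
# A^t * B^t == (B * A)^t.
lhs = matmul(transpose(A), transpose(B))
rhs = transpose(matmul(B, A))

print(double_transpose_cancels, lhs == rhs)
```

After lifting, only a single transpose remains on the result, which then has a chance of being sunk further or cancelled against another transpose.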

This patch is the third in a series of patches fixing markdown links and references inside the mlir documentation. This patch addresses all broken references to other markdown files and sections inside the Tutorials folder. Differential Revision: https://reviews.llvm.org/D103017

…alue The semantics of select with undefined/poison condition are not explicitly stated in the LangRef, but this matches comments in the code and Alive2 appears to concur: https://alive2.llvm.org/ce/z/KXytmd We can find this pattern after demanded elements transforms. As noted in D101191, fuzzers are finding infinite loops because we may not account for this pattern in other passes.

…(5/n) This revision refactors and simplifies the pattern detection logic: thanks to SSA value properties, we can actually look at all the uses of a given value and avoid having to pattern-match specific chains of operations. A bufferization pattern for subtensor is added and specific inplaceability analysis is implemented for the simple case of subtensor. More advanced use cases will follow. Differential revision: https://reviews.llvm.org/D102512

WG14 adopted N2645 and WG21 EWG has accepted P2334 in principle (still subject to full EWG vote + CWG review + plenary vote), which add support for #elifdef as shorthand for #elif defined and #elifndef as shorthand for #elif !defined. This patch adds support for the new preprocessor directives.

For uniform ReplicateRecipes, only the first lane should be used, so sinking them would mean we have to compute the value of the first lane multiple times. Also, at the moment, sinking them causes a crash because the value of the first lane is re-used by all users. Reported post-commit for D100258.

The vector calling convention dictates that when the vector argument registers are exhausted, GPRs are used to pass the address via the stack. When the GPRs themselves are exhausted, at best we would previously crash with an assertion, and at worst we'd generate incorrect code. This patch addresses this issue by passing fixed-length vectors via the stack with their full fixed-length size and aligned to their element type size. Since the calling convention lowering can't yet handle scalable vector types, this patch adds a fatal error to make it clear that we are lacking in this regard. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D102422

DAGCombine's `mergeStoresOfConstantsOrVecElts` optimization is told whether it's to use vector types and also whether it's to issue a truncating store. However, the truncating store code path assumes a scalar integer `ConstantSDNode`, and when using vector types it creates either a `BUILD_VECTOR` or `CONCAT_VECTORS` to store, neither of which is a constant. The `riscv64` target is able to expose a crash here because it switches on both code paths at the same time. The `f32` is stored as `i32`, which must be promoted to `i64`, necessitating a truncating store. It also decides later that it prefers a vector store of `v2f32`. While vector truncating stores are legal, this combine is not able to emit them. We also don't have a test case. This patch adds an assert to catch this case more gracefully, and updates one of the function's callers to turn off the use of truncating stores when preferring vectors. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103173

We were accidentally leaning on code in lowerLoad which expands extending loads which should be removed.

…pulldown

…v_pulldown

* Fix for ParseChecksum function

This patch fixes a failure caused by passing ParseChecksum a StringRef with non-zero length but empty data. Original commit: KhronosGroup/SPIRV-LLVM-Translator@b768773

Some FPGA devices and toolchains can support customizable levels or implementation of pipeline parallelism when mapping a SPIR-V module to hardware. Through pipeline parallelism, multiple invocations of a kernel or function can execute concurrently. This extension adds decorations to request that a kernel or function support invocations at a specified initiation interval, that multiple invocations are forbidden from executing concurrently, or that the kernel or function is limited to a maximum number of concurrent invocations. Spec: https://github.com/KhronosGroup/SPIRV-Registry/blob/38a7e56b9b7feab7fa8b3bea08770f7232b76ed2/extensions/INTEL/SPV_INTEL_fpga_invocation_pipelining_attributes.asciidoc Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com> Original commit: KhronosGroup/SPIRV-LLVM-Translator@e303a5f

This extension adds a type-declaration instruction OpTypeTokenINTEL, which is an analog of LLVM's type token. This type specifies IDs of entry points and exit points of certain IL constructs, for example it can be useful for exception handling representation in IL. Spec: intel#3788 Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com> Original commit: KhronosGroup/SPIRV-LLVM-Translator@091c366

When a global variable contains a constant pointer to a function, and that global variable is referenced from the pointed-to function, translation crashed because the initializer for the global variable was translated before the OpVariable was mapped to an LLVM value. While translating the initializer in such a case, the original variable was encountered a second time, but it had not yet been mapped to a proper LLVM value. Original commit: KhronosGroup/SPIRV-LLVM-Translator@485c8c7
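The ordering problem can be shown with a toy translator. This is a sketch of the general "register the mapping before recursing" idea only; `translate_global`, `translate_function`, and the dictionary-based data model are hypothetical and do not reflect the translator's real classes.

```python
def translate_global(name, global_inits, function_refs, mapped):
    """Translate a global whose initializer may point at a function."""
    if name in mapped:
        return mapped[name]
    value = {"name": name, "init": None}
    # Register the mapping BEFORE visiting the initializer, so a reference
    # back to this global from the pointed-to function finds it.
    mapped[name] = value
    init_fn = global_inits[name]
    if init_fn is not None:
        value["init"] = translate_function(init_fn, global_inits,
                                           function_refs, mapped)
    return value


def translate_function(name, global_inits, function_refs, mapped):
    """Translate a function and every global it references."""
    key = "fn:" + name
    if key in mapped:
        return mapped[key]
    fn = {"name": name, "refs": []}
    mapped[key] = fn
    for global_name in function_refs[name]:
        fn["refs"].append(translate_global(global_name, global_inits,
                                           function_refs, mapped))
    return fn


# G is initialized with a pointer to F, and F refers back to G.
mapped = {}
g = translate_global("G", {"G": "F"}, {"F": ["G"]}, mapped)
print(g["init"]["refs"][0] is g)
```

If the mapping were recorded only after the initializer had been translated, the second encounter with `G` would find no entry, which corresponds to the crash described above.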

vmaksimo · 2021-05-31T10:27:01Z

/summary:run

Breaks check-llvm on non-linux, see comments on https://reviews.llvm.org/D85085 This reverts commit caae570 and follow-up commit 1546c52.

vmaksimo · 2021-06-02T10:57:37Z

/summary:run

Quuxplusone and others added 30 commits May 25, 2021 11:12

[libc++] [test] Format some C++20 iterator_traits tests. NFCI.

148c19a

cxx20_iterator_traits.compile.pass.cpp actually depends on implementation details of libc++, which is not great; but I just left a comment and moved on.

[libc++] [test] Make iter_difference_t.pass.cpp into a .compile.pass.…

bb523cc

…cpp. NFCI.

[SystemZ] Return true from preferZeroCompareBranch().

e77cb4a

Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D103057

[VectorCombine] Remove unneeded InsertPointGuard (NFCI).

8e83ff5

All users of the builder should set an insert point before using the builder. There should be no need for using InsertPointGuard here.

[CostModel][X86] Improve accuracy of 256-bit non-uniform vector shift…

def6269

…s on AVX1 Determined from llvm-mca analysis, AVX1 capable targets have a higher throughput for VPBLENDVB and shuffle ops, making it cheaper to perform shift+shuffle/select shift patterns.

[libc++] Try to fix the oss-fuzz failure

d95a4b9

[SCEV] Remove unused parameter from computeBECount [NFC]

a47b2d4

All callers pass "false" for the Equality parameter. Kill the dead code, and update the function block comment.

[SCEV] Cleanup doesIVOverflowOnX checks [NFC]

aabca2d

Stylistic changes only. 1) Don't pass a parameter just to do an early exit. 2) Use a name which matches actual behavior.

Revert "[OpaquePtr] Make atomicrmw work with opaque pointers"

0bbb502

This reverts commit 0bebda1. Causing "Invalid record" errors.

[AMDGPU][GlobalISel] Stop foldInsertEltToCmpSelect from changing reg …

18c5444

…banks This function can change regbank for registers which already have a selected bank. Depending on the instruction where these registers were used it can cause instruction selection to fail.

[mlir][linalg] Update Linalg.md (NFC).

6779fcb

Update the paragraph on generic / indexed_generic to reflect the unification of these operations. Differential Revision: https://reviews.llvm.org/D102775

[Hexagon] Improve argument packing in vector shuffle selection

e7c839b

[scudo] Consistent setting of SCUDO_DEBUG

6a84d37

Make sure that if SCUDO_DEBUG=1 in tests, then we have the same in the scudo library itself. Reviewed By: cryptoad, hctim Differential Revision: https://reviews.llvm.org/D103061

[scudo] Fix CHECK implementation

8e30b55

Cast of signed types to u64 breaks comparison. Also remove double () around operands. Reviewed By: cryptoad, hctim Differential Revision: https://reviews.llvm.org/D103060

[mlir] Add an optional distributionTypes attribute to TiledLoopOp.

2ea6e13

Differential Revision: https://reviews.llvm.org/D103104

nicolasvasilache and others added 16 commits May 27, 2021 12:48

Fix -Wswitch warning; NFC

ce276b7

AMDGPU/GlobalISel: Remove redundant parameter from function

8a203ac

AMDGPU/GlobalISel: Lower constant-32-bit zextload/sextload consistently

ee35900

We were accidentally leaning on code in lowerLoad which expands extending loads which should be removed.

Speculatively fix a -Woverloaded-virtual diagnostic; NFC

758f51c

Speculatively fix this harder and with improved spelling capabilities.

caf86d2

Merge remote-tracking branch 'otcshare_llvm/sycl-web' into llvmspirv_…

62359ea

…pulldown

Merge commit 'caf86d2959d5e900ed29af5e0ae2be23e3d299c5' into llvmspir…

fcc9629

…v_pulldown

Fix for ParseChecksum function

9a47bbf

* Fix for ParseChecksum function

This patch fixes a failure caused by passing ParseChecksum a StringRef with non-zero length but empty data. Original commit: KhronosGroup/SPIRV-LLVM-Translator@b768773

bader mentioned this pull request Jun 1, 2021

[SYCL] Fix self build on windows. #3845

Merged

Revert "Emit correct location lists with basic block sections."

fd85b9d

Breaks check-llvm on non-linux, see comments on https://reviews.llvm.org/D85085 This reverts commit caae570 and follow-up commit 1546c52.

vmaksimo marked this pull request as ready for review June 2, 2021 16:50

vmaksimo requested review from AaronBallman, AGindinson, AlexeySachkov, AlexeySotkin, bader, elizabethandrews, mdtoguchi and premanandrao as code owners June 2, 2021 16:50

vladimirlaz merged commit 541e697 into intel:sycl Jun 3, 2021
