LLVM and SPIRV-LLVM-Translator pulldown (WW23) #3852
Merged
cxx20_iterator_traits.compile.pass.cpp actually depends on implementation details of libc++, which is not great; but I just left a comment and moved on.
- Currently, the host CPU information is not as easily available on z/OS as on other platforms.
- This information is stored in the Communications Vector Table (https://www.ibm.com/docs/en/zos/2.2.0?topic=information-cvt-mapping).

Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D102793
Currently, BPF only contains three relocations:
  R_BPF_NONE   for no relocation
  R_BPF_64_64  for LD_imm64 and normal 64-bit data relocation
  R_BPF_64_32  for call insn and normal 32-bit data relocation

Also, the .BTF and .BTF.ext sections contain symbols in allocated program and data sections. These two sections reserve 32-bit space to hold the offset relative to the symbol's section. When LLVM JIT is used, the LLVM ExecutionEngine RuntimeDyld may attempt to resolve relocations for .BTF and .BTF.ext, which we want to prevent. So we used R_BPF_NONE for such relocations.

This all works fine until we try to link multiple objects.
  . R_BPF_64_64 handling of LD_imm64 vs. normal 64-bit data is different, so lld target->relocate() needs more context to do a correct job.
  . The same holds for R_BPF_64_32. More context is needed for lld target->relocate() to differentiate call insn vs. normal 32-bit data relocation.
  . Since relocations in .BTF and .BTF.ext are set to R_BPF_NONE, they will not be relocated properly when multiple .BTF/.BTF.ext sections are merged by lld.

This patch intends to address this issue by adding additional relocation kinds:
  R_BPF_64_ABS64     for normal 64-bit data relocation
  R_BPF_64_ABS32     for normal 32-bit data relocation
  R_BPF_64_NODYLD32  for .BTF and .BTF.ext style relocations.
The old R_BPF_64_{64,32} semantics:
  R_BPF_64_64  for LD_imm64 relocation
  R_BPF_64_32  for call insn relocation

The existing R_BPF_64_64/R_BPF_64_32 mapping to numeric values is maintained. They are the most common use cases for bpf programs and we want to maintain backward compatibility as much as possible.

ExecutionEngine RuntimeDyld BPF relocations are adjusted as well. R_BPF_64_{ABS64,ABS32} relocations will be resolved properly and other relocations will be ignored. Two tests are added for RuntimeDyld. Not handling R_BPF_64_NODYLD32 in RuntimeDyldELF.cpp will result in a "Relocation type not implemented yet!" fatal error.

FK_SecRel_4 usages in BPFAsmBackend.cpp and BPFELFObjectWriter.cpp are removed as they are not triggered in BPF backend. BPF backend used FK_SecRel_8 for LD_imm64 instruction operands. Differential Revision: https://reviews.llvm.org/D102712
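The JIT-side rule described above (resolve the two absolute data relocations, ignore the rest) can be sketched roughly as follows. This is a minimal illustration, not the code in RuntimeDyldELF.cpp: the `BpfReloc` enum and `applyForJit` are hypothetical names, the enumerators carry no real ELF numeric values, and addend handling is left out.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the relocation kinds named in the message.
// The enumerator values are NOT the real ELF relocation numbers.
enum class BpfReloc {
  None,      // R_BPF_NONE
  LdImm64,   // R_BPF_64_64
  AbsData64, // R_BPF_64_ABS64
  AbsData32, // R_BPF_64_ABS32
  NoDyld32,  // R_BPF_64_NODYLD32
  Call32     // R_BPF_64_32
};

// Hypothetical helper: writes `value` at `target` for the two absolute data
// relocations and skips every other kind. Returns true if bytes were written.
bool applyForJit(BpfReloc kind, std::uint8_t *target, std::uint64_t value) {
  switch (kind) {
  case BpfReloc::AbsData64:
    std::memcpy(target, &value, sizeof(value));
    return true;
  case BpfReloc::AbsData32: {
    // Only the low 32 bits are stored for a 32-bit data relocation.
    const auto low = static_cast<std::uint32_t>(value);
    std::memcpy(target, &low, sizeof(low));
    return true;
  }
  case BpfReloc::None:
  case BpfReloc::LdImm64:
  case BpfReloc::NoDyld32:
  case BpfReloc::Call32:
    return false; // ignored rather than treated as a fatal error
  }
  return false;
}
```

Listing the ignored kinds explicitly, instead of using a `default` label, lets the compiler warn when a new kind is added without a decision about how the JIT should treat it.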
Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D103057
Said function had a few shortfalls:
- didn't set an abort message on Android
- was logged on several lines
- didn't provide extra information like the size requested if OOM'ing

This improves the function to address those points.

Differential Revision: https://reviews.llvm.org/D103034
…to intptr_t A test in ir.c casts a void* to an integer type to print its address. This cast is currently done with the datatype `long`, however, which is only guaranteed to be equal to the pointer width on LP64 systems. Other platforms may use a length not equal to the pointer width. 64-bit Windows, for example, uses a 32-bit `long`, which does not match its 64-bit pointers. This also results in a clang warning due to `-Wvoid-pointer-to-int-cast`. Technically speaking, since the test only passes the value 42, it does not cause any issues, but it'd be nice to fix the warning at least. Differential Revision: https://reviews.llvm.org/D103085
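As a rough illustration of the portable form of that cast, the sketch below converts a pointer through `std::intptr_t` and prints it with the matching `PRIdPTR` conversion. The test in question is C; this is a C++ rendering of the same idea, and `addressAsInteger` and `formatAddress` are hypothetical helper names.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Converts a pointer to an integer type that is wide enough to hold it,
// unlike `long`, which is only 32 bits on 64-bit Windows.
std::intptr_t addressAsInteger(const void *ptr) {
  return reinterpret_cast<std::intptr_t>(ptr);
}

// Formats the address in decimal. PRIdPTR expands to the printf length
// modifier and conversion that match std::intptr_t on the current platform.
std::string formatAddress(const void *ptr) {
  char buffer[32];
  std::snprintf(buffer, sizeof(buffer), "%" PRIdPTR, addressAsInteger(ptr));
  return buffer;
}
```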
All users of the builder should set an insert point before using the builder. There should be no need for using InsertPointGuard here.
…s on AVX1 Determined from llvm-mca analysis, AVX1 capable targets have a higher throughput for VPBLENDVB and shuffle ops, making it cheaper to perform shift+shuffle/select shift patterns.
Match what's documented in the Intel AOM: the XMM variant of PSHUFB requires BOTH ports, but this was being incorrectly modelled as EITHER port. Now that we can use in-order models in llvm-mca, the atom model is a good "worst case scenario" analysis for x86.
We are using TOCEntry symbols like `LC..0` in TOC loads. This is hard to read, since it requires at least an additional step to figure out the loaded symbols. We should print out the name in comments. Reviewed By: #powerpc, shchenz Differential Revision: https://reviews.llvm.org/D102949
Removed some of the older raw "MLIRized" versions that are no longer needed now that the sparse runtime support library can focus on the proper sparse tensor types rather than the opaque pointer approach of the past. This avoids legacy... Reviewed By: penpornk Differential Revision: https://reviews.llvm.org/D102960
All callers pass "false" for the Equality parameter. Kill the dead code, and update the function block comment.
The parseInputFile function returns an empty unique_ptr to signal an error, like when the input file doesn't exist, or is malformed. In this case, the tool should exit immediately rather than segfault by dereferencing the unique_ptr later. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D102891
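The early-exit pattern described above might look roughly like this. It is a sketch under assumptions: `ParsedInput`, `parseInput`, and `runTool` are hypothetical stand-ins, not the tool's actual types or functions.

```cpp
#include <iostream>
#include <memory>
#include <string>

// Hypothetical parsed-input type; the real tool's type is not shown here.
struct ParsedInput {
  std::string contents;
};

// Hypothetical parser that follows the same convention as the message:
// an empty unique_ptr signals that parsing failed.
std::unique_ptr<ParsedInput> parseInput(const std::string &text) {
  if (text.empty())
    return nullptr;
  return std::make_unique<ParsedInput>(ParsedInput{text});
}

// Returns a process exit code. The null check happens before any dereference,
// so a failed parse ends in a clean error exit instead of a segfault.
int runTool(const std::string &text) {
  std::unique_ptr<ParsedInput> input = parseInput(text);
  if (!input) {
    std::cerr << "error: could not parse input\n";
    return 1;
  }
  return input->contents.empty() ? 1 : 0;
}
```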
Stylistic changes only. 1) Don't pass a parameter just to do an early exit. 2) Use a name which matches actual behavior.
This reverts commit 0bebda1. Causing "Invalid record" errors.
The 2nd test is based on the fuzzer example in post-commit comments of D101191 - https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34661 The 1st test shows that we don't deal with this symmetrically. We should be able to reduce both examples (possibly in instsimplify instead of instcombine).
…banks This function can change regbank for registers which already have a selected bank. Depending on the instruction where these registers were used it can cause instruction selection to fail.
A recent fix for problems with ENTRY statement handling didn't get the case of a procedure dummy argument on an ENTRY statement in an executable part right; the code presumed that those dummy arguments would be objects, not entities that might be objects or procedures. Fix. Differential Revision: https://reviews.llvm.org/D103098
llvm-profgen uses a profile-summary-based cold threshold to merge and trim cold context profiles. This is to strike a good balance between profile size and performance. We've been using 99.9% as the cutoff to save profile size without affecting performance. This change switches to using 99.9% instead of 99.9999% as the default cold threshold cutoff for llvm-profgen. The redundant switch csprof-cold-thres is also removed and tests are cleaned up. Differential Revision: https://reviews.llvm.org/D103071
Update the paragraph on generic / indexed_generic to reflect the unification of these operations. Differential Revision: https://reviews.llvm.org/D102775
Make sure that if SCUDO_DEBUG=1 is set in tests, the same setting is used in the scudo library itself. Reviewed By: cryptoad, hctim Differential Revision: https://reviews.llvm.org/D103061
Cast of signed types to u64 breaks comparison. Also remove double () around operands. Reviewed By: cryptoad, hctim Differential Revision: https://reviews.llvm.org/D103060
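A small sketch of why converting signed operands to an unsigned 64-bit type breaks ordering comparisons: a negative value wraps to a very large unsigned one. The function names are hypothetical and only illustrate the effect; they are not the scudo check macros themselves.

```cpp
#include <cstdint>

// Compares after converting both operands to an unsigned 64-bit type, which
// is the pattern the message describes as broken for signed inputs.
bool lessAfterU64Cast(std::int64_t lhs, std::int64_t rhs) {
  return static_cast<std::uint64_t>(lhs) < static_cast<std::uint64_t>(rhs);
}

// Compares the operands in their own (signed) type.
bool lessInOwnType(std::int64_t lhs, std::int64_t rhs) { return lhs < rhs; }
```

With these definitions, `lessInOwnType(-1, 1)` is true, while `lessAfterU64Cast(-1, 1)` is false because -1 converts to the largest 64-bit unsigned value.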
…ly infinite loops into finite ones Nowadays LLVM does not assume that all loops are finite, so if we want to produce a finite loop from a potentially-infinite one, we must ensure that the original loop is known to be a finite one. For this transform, it only matters for arithmetic right-shifts. For them, either the function or the loop must be known to be `mustprogress`, or the original value being shifted must be known to be non-negative (because iff the sign bit was set, it will never become zero, but will become `-1` in the "end"). It would be really good for alive2 to actually complain about this, but it currently does not: AliveToolkit/alive2#726
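To make the sign-bit argument concrete, the sketch below counts how many arithmetic right shifts it takes for a value to reach zero and gives up after a caller-chosen limit. `shiftsUntilZero` is a hypothetical helper, and it assumes `>>` on a negative signed value is an arithmetic shift (guaranteed from C++20 on, and the behaviour of mainstream compilers before that).

```cpp
#include <cstdint>

// Returns the number of right shifts needed for `value` to become zero, or -1
// if it is still non-zero after `limit` shifts. The limit stands in for the
// fact that the loop on a negative input would otherwise never terminate.
int shiftsUntilZero(std::int32_t value, int limit) {
  for (int steps = 0; steps <= limit; ++steps) {
    if (value == 0)
      return steps;
    value >>= 1; // assumed arithmetic: the sign bit is shifted back in
  }
  return -1;
}
```

A non-negative input such as 8 reaches zero after a handful of shifts, whereas a negative input settles at `-1` and stays there, which is why the transform needs either `mustprogress` or a known non-negative starting value.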
Now that we can fold some transposes into multiplies (CM: A * B^t and RM: A^t * B), we want to move them around to create the optimal expressions:
* fold away double transposes while still using them to assert the shape
* sink transposes hoping they cancel out
* lift transposes when both operands are transposed

This also modifies the matrix remarks to include the number of exposed transposes (i.e. transposes that we couldn't fold into a multiply).

The adjustment to the test remarks-inlining is a bit subtle: I am changing the double transpose to a single transpose so that we don't remove it completely. More importantly, this changes some of the total instruction counts, most notably stores, because we can no longer use a vector store.

Differential Revision: https://reviews.llvm.org/D102733
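The algebraic identities these rewrites rely on can be checked on small matrices. The sketch below is not the pass itself: it uses a plain nested-vector `Matrix` alias and hypothetical `transpose` and `multiply` helpers to show that a double transpose is the identity and that A^t * B^t equals (B * A)^t, which is what lets two transposes be lifted into one.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Returns the transpose of a rectangular matrix.
Matrix transpose(const Matrix &m) {
  if (m.empty())
    return {};
  Matrix result(m[0].size(), std::vector<int>(m.size()));
  for (std::size_t r = 0; r < m.size(); ++r)
    for (std::size_t c = 0; c < m[r].size(); ++c)
      result[c][r] = m[r][c];
  return result;
}

// Returns a * b. Assumes b is non-empty and that a has as many columns as b
// has rows; no shape checking is done in this sketch.
Matrix multiply(const Matrix &a, const Matrix &b) {
  Matrix result(a.size(), std::vector<int>(b[0].size(), 0));
  for (std::size_t i = 0; i < a.size(); ++i)
    for (std::size_t k = 0; k < b.size(); ++k)
      for (std::size_t j = 0; j < b[0].size(); ++j)
        result[i][j] += a[i][k] * b[k][j];
  return result;
}
```

For a 2x3 matrix `A` and a 3x2 matrix `B`, `multiply(transpose(A), transpose(B))` and `transpose(multiply(B, A))` produce the same 3x3 result, so the second form needs only one transpose.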
This patch is the third in a series of patches fixing markdown links and references inside the mlir documentation. This patch addresses all broken references to other markdown files and sections inside the Tutorials folder. Differential Revision: https://reviews.llvm.org/D103017
…alue The semantics of select with undefined/poison condition are not explicitly stated in the LangRef, but this matches comments in the code and Alive2 appears to concur: https://alive2.llvm.org/ce/z/KXytmd We can find this pattern after demanded elements transforms. As noted in D101191, fuzzers are finding infinite loops because we may not account for this pattern in other passes.
…(5/n) This revision refactors and simplifies the pattern detection logic: thanks to SSA value properties, we can actually look at all the uses of a given value and avoid having to pattern-match specific chains of operations. A bufferization pattern for subtensor is added and specific inplaceability analysis is implemented for the simple case of subtensor. More advanced use cases will follow. Differential revision: https://reviews.llvm.org/D102512
WG14 adopted N2645 and WG21 EWG has accepted P2334 in principle (still subject to full EWG vote + CWG review + plenary vote), which add support for #elifdef as shorthand for #elif defined and #elifndef as shorthand for #elif !defined. This patch adds support for the new preprocessor directives.
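For illustration, the snippet below spells out the long forms that the new directives abbreviate, with the shorthand noted in comments so that the example still compiles on toolchains without the new support. The macro names and `selectedPath` are hypothetical.

```cpp
// Hypothetical configuration macro, defined here only for the demonstration.
#define USE_FAST_PATH

#ifdef USE_REFERENCE_PATH
constexpr int selectedPath = 0;
#elif defined(USE_FAST_PATH) // shorthand with the new directive: #elifdef USE_FAST_PATH
constexpr int selectedPath = 1;
#elif !defined(USE_SLOW_PATH) // shorthand with the new directive: #elifndef USE_SLOW_PATH
constexpr int selectedPath = 2;
#else
constexpr int selectedPath = 3;
#endif
```

Because `USE_REFERENCE_PATH` is undefined and `USE_FAST_PATH` is defined, the second branch is the one that is compiled.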
For uniform ReplicateRecipes, only the first lane should be used, so sinking them would mean we have to compute the value of the first lane multiple times. Also, at the moment, sinking them causes a crash because the value of the first lane is re-used by all users. Reported post-commit for D100258.
The vector calling convention dictates that when the vector argument registers are exhausted, GPRs are used to pass the address via the stack. When the GPRs themselves are exhausted, at best we would previously crash with an assertion, and at worst we'd generate incorrect code. This patch addresses this issue by passing fixed-length vectors via the stack with their full fixed-length size and aligned to their element type size. Since the calling convention lowering can't yet handle scalable vector types, this patch adds a fatal error to make it clear that we are lacking in this regard. Reviewed By: HsiangKai Differential Revision: https://reviews.llvm.org/D102422
DAGCombine's `mergeStoresOfConstantsOrVecElts` optimization is told whether it's to use vector types and also whether it's to issue a truncating store. However, the truncating store code path assumes a scalar integer `ConstantSDNode`, and when using vector types it creates either a `BUILD_VECTOR` or `CONCAT_VECTORS` to store: neither of which is a constant. The `riscv64` target is able to expose a crash here because it switches on both code paths at the same time. The `f32` is stored as `i32` which must be promoted to `i64`, necessitating a truncating store. It also decides later that it prefers a vector store of `v2f32`. While vector truncating stores are legal, this combine is not able to emit them. We also don't have a test case. This patch adds an assert to catch this case more gracefully, and updates one of the function's callers to turn off the use of truncating stores when preferring vectors. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103173
We were accidentally leaning on code in lowerLoad which expands extending loads which should be removed.
* Fix for ParseChecksum function This patch fixes a failure caused by passing ParseChecksum a StringRef argument with non-zero length but empty data. Original commit: KhronosGroup/SPIRV-LLVM-Translator@b768773
Some FPGA devices and toolchains can support customizable levels or implementation of pipeline parallelism when mapping a SPIR-V module to hardware. Through pipeline parallelism, multiple invocations of a kernel or function can execute concurrently. This extension adds decorations to request that a kernel or function support invocations at a specified initiation interval, that multiple invocations are forbidden from executing concurrently, or that the kernel or function is limited to a maximum number of concurrent invocations. Spec: https://github.com/KhronosGroup/SPIRV-Registry/blob/38a7e56b9b7feab7fa8b3bea08770f7232b76ed2/extensions/INTEL/SPV_INTEL_fpga_invocation_pipelining_attributes.asciidoc Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com> Original commit: KhronosGroup/SPIRV-LLVM-Translator@e303a5f
This extension adds a type-declaration instruction OpTypeTokenINTEL, which is an analog of LLVM's type token. This type specifies IDs of entry points and exit points of certain IL constructs, for example it can be useful for exception handling representation in IL. Spec: intel#3788 Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com> Original commit: KhronosGroup/SPIRV-LLVM-Translator@091c366
When a global variable contains a constant function pointer to a function and that global variable is referenced from the pointed-to function, translation crashed because the initializer for the global variable was translated before the OpVariable was mapped to an LLVM value. This happened because, during translation of the initializer in such a case, the original variable was encountered a second time, but it had not yet been mapped to a proper LLVM value. Original commit: KhronosGroup/SPIRV-LLVM-Translator@485c8c7
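The ordering issue can be sketched with a toy translator that records an entity's mapping before it walks what that entity references, so that a cyclic reference finds the existing mapping on the second visit. This is only an illustration of the general technique: `Translator`, `refs`, `mapped`, and `translate` are hypothetical and do not reflect the translator's real data structures.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical module model: entity i references the entities in refs[i],
// e.g. a global whose initializer points at a function whose body uses
// that same global.
struct Translator {
  std::vector<std::vector<std::size_t>> refs;
  std::unordered_map<std::size_t, int> mapped; // entity -> translated handle
  int created = 0;

  int translate(std::size_t id) {
    auto found = mapped.find(id);
    if (found != mapped.end())
      return found->second; // second visit: reuse the existing mapping
    int handle = created++;
    mapped.emplace(id, handle); // record the mapping first ...
    for (std::size_t ref : refs[id])
      translate(ref); // ... then translate what the entity references
    return handle;
  }
};
```

If the mapping were recorded only after the loop, the cycle between the two entities would be entered again before any mapping existed, which mirrors the failure described above.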
/summary:run |
Breaks check-llvm on non-linux, see comments on https://reviews.llvm.org/D85085 This reverts commit caae570 and follow-up commit 1546c52.
/summary:run |
LLVM: llvm/llvm-project@caf86d2
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@485c8c7