forked from llvm/llvm-project
[pull] main from llvm:main #56
Merged
I believe that this code doesn't care whether the offsets are known to be inbounds a priori. For the same reason the change is not testable, as the SCEV based fallback code will look through non-inbounds offsets anyway. So make it clear that there is no special inbounds requirement here.
I'm exploring marking microsoft/STL's std::expected as [[nodiscard]], which affects all functions returning std::expected, including its own monadic member functions. As usual, libc++'s test suite contains calls to these member functions to make sure they compile, but it's discarding the returns. I'm adding void casts to silence the [[nodiscard]] warnings without altering what the test is covering.
…huffle operand. (#119354) foldShuffleOfShuffles already handles "shuffle (shuffle x, undef), (shuffle y, undef)" patterns, this patch relaxes the requirement so it can handle cases where only a single operand is a shuffle (and the other can be any other value and will be kept in place). Fixes #86068
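As a rough sketch of the single-operand case, the mask arithmetic for folding `shuffle (shuffle x, undef), y` into `shuffle x, y` could look like the code below. This is not the `foldShuffleOfShuffles` implementation: `mergeMasks` is a hypothetical helper, and it assumes all vectors have the same element count and that masks follow the usual `shufflevector` convention.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mask convention: for N-element operands, index i < N selects lane i of the
// first operand, N <= i < 2N selects lane i - N of the second operand, and -1
// marks an undefined lane.
//
// Given  inner = shuffle(x, undef, innerMask)
//        outer = shuffle(inner, y, outerMask)
// return a mask M such that outer == shuffle(x, y, M).
std::vector<int> mergeMasks(const std::vector<int>& innerMask,
                            const std::vector<int>& outerMask) {
  assert(innerMask.size() == outerMask.size());
  const int n = static_cast<int>(innerMask.size());
  std::vector<int> merged;
  merged.reserve(outerMask.size());
  for (int m : outerMask) {
    assert(m < 2 * n);
    if (m < 0) {
      merged.push_back(-1);  // undefined stays undefined
    } else if (m < n) {
      // Lane comes from the inner shuffle: look through it to x. Lanes the
      // inner shuffle took from its undef operand become undefined.
      const int src = innerMask[static_cast<std::size_t>(m)];
      merged.push_back(src >= 0 && src < n ? src : -1);
    } else {
      merged.push_back(m);  // lane of y, kept in place
    }
  }
  return merged;
}
```

For example, with an inner mask `{3, 2, 1, 0}` that reverses `x` and an outer mask `{0, 4, 1, 5}`, the merged mask is `{3, 4, 2, 5}`.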
) This patch adds the following intrinsics:

* 8-bit floating-point convert to half-precision and BFloat16.

      // Variants are also available for: _bf16
      svfloat16_t svcvt1_f16[_mf8]_fpm(svmfloat8_t zn, fpm_t fpm);
      svfloat16_t svcvt2_f16[_mf8]_fpm(svmfloat8_t zn, fpm_t fpm);

* 8-bit floating-point convert to half-precision and BFloat16 (top).

      // Variants are also available for: _bf16
      svfloat16_t svcvtlt1_f16[_mf8]_fpm(svmfloat8_t zn, fpm_t fpm);
      svfloat16_t svcvtlt2_f16[_mf8]_fpm(svmfloat8_t zn, fpm_t fpm);
This patch supplements the fix introduced by PR #119319.
hlfir.elemental codegen optimizes out the final as_expr copy for temps local to its body, but sometimes clean-up may have been emitted for this temp, and the code did not handle that. This caused #118922 and #113843. Only elide the copy if the as_expr is the last op.
#119366) …igibilityType This is the function we use to diagnose invalid types, so use it for those checks as well. NFC.
Originally added in #112035 cc @sunfishcode
When expanding a load into two loads, use nuw for the add that computes the offset from the base of the second load, because the original load doesn't straddle the address space. It turns out there's already a dedicated helper function for doing this, `getObjectPtrOffset`. This is in target-independent code; however, in practice it only seems to affect WebAssembly code, because WebAssembly load and store instructions' constant offsets don't perform wrapping, so constant folding often depends on the nuw flag being present. This was noticed in the development of #119204.
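The reasoning above can be illustrated with a small model of a 32-bit address space. This is only a sketch: the function names are hypothetical, and the model assumes a wasm32-style access whose effective address is the base plus the constant offset, computed without wrapping.

```cpp
#include <cassert>
#include <cstdint>

// 32-bit add as an ordinary IR add without flags would compute it (wraps).
inline std::uint32_t wrappingAdd(std::uint32_t base, std::uint32_t off) {
  return base + off;
}

// Effective address of an access with a constant offset folded into the
// instruction: computed in a wider type, so it never wraps.
inline std::uint64_t effectiveAddress(std::uint32_t base, std::uint32_t off) {
  return std::uint64_t{base} + off;
}

// Folding (base + off) into the constant offset is only sound when the
// wrapping add and the non-wrapping address agree, i.e. the add is nuw.
inline bool foldIsSound(std::uint32_t base, std::uint32_t off) {
  return effectiveAddress(base, off) == wrappingAdd(base, off);
}

// Address of the second half when an access of 2 * halfBytes bytes is split.
// Assumption: the original access does not run past the end of the address
// space, so base + halfBytes cannot wrap.
inline std::uint32_t secondHalfAddress(std::uint32_t base,
                                       std::uint32_t halfBytes) {
  assert(halfBytes > 0 && base <= UINT32_MAX - (2 * halfBytes - 1));
  return wrappingAdd(base, halfBytes);
}
```

Any `base` accepted by `secondHalfAddress` also satisfies `foldIsSound(base, halfBytes)`, which is the property the nuw flag communicates to later folding.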
…19320) This patch limits the permission of pre-commit libc pipelines. It also adds detailed comments to help future modifications.
Sven has been a long-standing contributor to OpenCL support in Clang and LLVM, consistently delivering high-quality commits and thorough code reviews. His deep expertise and proven track record demonstrate his commitment to advancing the project and maintaining its standards. I strongly believe Sven would excel as an OpenCL maintainer for Clang, ensuring the continued growth and reliability of OpenCL within the LLVM ecosystem. Unfortunately, due to other commitments I am stepping down from my duty as an OpenCL maintainer. Co-authored-by: Anastasia Stulova <astulova@nvidia.com>
Unfortunately, due to other commitments I am no longer able to host this community meeting. Co-authored-by: Anastasia Stulova <astulova@nvidia.com>
Including the frozen C++03 headers results in a lot of formatting changes in the main headers, so this splits these changes into a separate commit instead. This is part of https://discourse.llvm.org/t/rfc-freezing-c-03-headers-in-libc.
We've been having issues with cancelled CI jobs due to Docker-in-Docker failures (spurious). I tried addressing that by adding a new flavor of the restarter action, but that clearly hasn't worked out since we see a lot of these spurious failures. This PR is an attempt to fix this by changing the mainline CI restarter used by libc++, not just the workflow job I added for testing. I have to check this in to test it because this workflow uses the version that's on the main branch.
…g a vector [NFC] (#119134)
…info [NFC] (#119135) The `poison` values are used to substitute debug information of values moved from the original header into the preheader that are no longer available in the former.
They have been out for over 10 days, which only causes confusion when looking at CI results.
…8189) This patch fixes a const-qualification on the return type of a method of `limited_allocator`, which is widely used for testing allocator-aware containers.
…115713)
* Avoid unnecessary truncation of comparison results in vecreduce_xor
* Optimize generated code for vecreduce_and and vecreduce_or by comparing against 0.0 to check if all/any of the values are set

Alive2 proof of vecreduce_and and vecreduce_or transformation: https://alive2.llvm.org/ce/z/SRfPtw
Add getMainOp and getAltOp. Use `InstructionsState &` instead of `const InstructionsState &`. Use `!S.isAltShuffle()` instead of `S.MainOp == S.AltOp`.
This PR implements the ability to emit the PPC version for both assembly and object files on AIX.
… the VSX FMA mutation pass. (#116071) This patch fixes #116061. The root cause of the assertion is that the FMA mutation pass does not update the subranges of the live interval for the defined register of the modified instruction. The patch recalculates the live interval of the defined register of xvmaddmdp in the VSX FMA mutation pass.
Follow on from #111008.
…patterns of legalized types (#119363) In cases where the base/sub vector type in an insert_subvector pattern legalize to the same width through splitting, we can assume that the shuffle becomes free as the legalized vectors will not overlap. Note this isn't true if the vectors have been widened during legalization (e.g. v2f32 insertion into v4f32 would legalize to v4f32 into v4f32). Noticed while working on adding processShuffleMasks handling for SK_PermuteTwoSrc.
This patch simplifies the code in two different ways:
* When SVE is available, return `cntd` directly to avoid the need for bitfield insert.
* When SME is available, check the PSTATE.SM bit of `SVCR` directly rather than calling `__arm_sme_state`.
Enable `-fstack-clash-protection` for RISCV and stack probe for function prologues. We probe the stack by creating a loop that allocates and probes the stack in ProbeSize chunks. We emit an unrolled probe loop for small allocations and emit a variable length probe loop for bigger ones.
Adds an initial CMake cache definition that is similar to what we use in one of our production buildbots. The goal is to consolidate the configurations and make them accessible. This cache file is a first step in preparing for full pipeline testing once the new bot comes online.
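The chunked probing described above can be modeled roughly as below. This is a simplified sketch, not the RISC-V prologue code: `probeOffsets` is a hypothetical helper, and it assumes that one probe is issued after each full ProbeSize chunk and that the remaining tail smaller than ProbeSize is allocated without an extra probe.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the distances below the incoming stack pointer at which a probe
// (a store that touches the page) is issued while allocating allocSize bytes
// in probeSize chunks.
std::vector<std::uint64_t> probeOffsets(std::uint64_t allocSize,
                                        std::uint64_t probeSize) {
  assert(probeSize > 0);
  std::vector<std::uint64_t> probes;
  std::uint64_t allocated = 0;
  // Allocate one chunk at a time and probe after each chunk, so the stack
  // pointer never moves more than probeSize past the last touched address.
  while (allocSize - allocated >= probeSize) {
    allocated += probeSize;
    probes.push_back(allocated);
  }
  // Assumption of this model: the tail (allocSize - allocated) needs no probe.
  return probes;
}
```

Whether the loop is emitted unrolled or as a variable length loop affects only the shape of the generated code; in this model the sequence of probed offsets is the same either way.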
…2216) This adds support for these instructions.
Like we do in ExprConstant.cpp.
…p to strings.h (#118899) docgen relies on the convention that we have a file foo.cpp in libc/src/\<header\>/. Because the above functions weren't in libc/src/strings/ but rather libc/src/string/, docgen could not find that we had implemented these. Rather than add special carve outs to docgen, let's fix up our sources for these 7 functions to stick with the existing conventions the rest of the codebase follows. Link: #118860 Fixes: #118875
So that docgen can find our implementations. Fixes: #119272
so that docgen can find our definitions. Also eliminate the enums. POSIX is careful to call these "symbolic constants" rather than specifying whether they are preprocessor macro defines or not. Enums are useful for expressing mutual exclusion when the enum values are in distinct enums, which can improve type safety. Our enum values weren't using that pattern though; they were all in one big anonymous enum. Link: https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/pthread.h.html Fixes: #88997
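To illustrate the type-safety point, the sketch below contrasts one big anonymous enum with distinct enums. All names (`DEMO_*`, `CancelState`, `DetachState`, and the two functions) are hypothetical and are not the libc definitions; the sketch is written in C++ for brevity.

```cpp
#include <cassert>
#include <type_traits>

// One big anonymous enum: unrelated constants share a single type, so nothing
// stops a caller from passing a cancel constant where a detach state is meant.
enum {
  DEMO_CANCEL_ENABLE,
  DEMO_CANCEL_DISABLE,
  DEMO_CREATE_JOINABLE,
  DEMO_CREATE_DETACHED
};

// Distinct enums: each group of mutually exclusive values gets its own type.
enum class CancelState { Enable, Disable };
enum class DetachState { Joinable, Detached };

// Accepts any int, including constants from the wrong group.
inline bool isDetachedUntyped(int state) {
  return state == DEMO_CREATE_DETACHED;
}

// Accepts only DetachState values; passing a CancelState does not compile.
inline bool isDetachedTyped(DetachState state) {
  return state == DetachState::Detached;
}

static_assert(!std::is_convertible<CancelState, DetachState>::value,
              "distinct enums do not convert into each other");
```

With a single anonymous enum, the constants offer no more checking than plain macro defines would, which is consistent with the decision to drop the enums.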
…more places We've decided to use vadd as the user instruction where possible for simplicity in this file. This patch uses vadd as user in more places.
…opt (#119372) Adds a new mlir-opt test-only pass, -test-vulkan-runner-pipeline, which runs a set of passes needed for mlir-vulkan-runner, and removes them from the runner. The tests are changed to invoke mlir-opt with this flag before invoking the runner. The passes moved are ones concerned with lowering of the device code prior to serialization to SPIR-V. This is an incremental step towards moving the entire pipeline to mlir-opt, to align with other runners (see #73457).
To temporarily address the exchange2 perf regression reported in #118556, I disabled the inlining by default and put it under the engineering option `-flang-simplify-hlfir-sum`.
Clang [defaults to aligning `__int128_t` to 16 bytes], while LLVM `datalayout` strings [default to aligning `i128` to 8 bytes]. Wasm is currently using the defaults for both, so it's inconsistent. Fix this by adding `-i128:128` to Wasm's `datalayout` string so that it aligns `i128` to 16 bytes too. This is similar to [dbad963](dbad963) for SPARC. This fixes rust-lang/rust#133991; see that issue for further discussion.

[defaults to aligning `__int128_t` to 16 bytes]: https://github.com/llvm/llvm-project/blob/f8b4182f076f8fe55f9d5f617b5a25008a77b22f/clang/lib/Basic/TargetInfo.cpp#L77
[default to aligning `i128` to 8 bytes]: https://llvm.org/docs/LangRef.html#langref-datalayout
A linkonce_odr definition can be omitted in LTO compilation if `canBeOmittedFromSymbolTable()` is true in all bitcode files. Test more linkage merge scenarios. The lo_and_wo symbol tests #111341.
Update root file in DWARF file/line table as soon as we see the first "#line" directive. This was moved from "enabledGenDwarfForAssembly", which is called right before we emit DWARF information. But if the file is empty or contains expressions that don't need DWARF, it is never called, leaving an original root file and not the file in the "#line" directive. Add a test checking for this case. Fixes: #119020
Add tests for the concatenation of boolean vectors bitcast to integers - similar to the MOVMSK pattern.
This PR adds the default options below. The new options default to true and do not change the original lowering behavior of the pack and unpack ops.
- lowerPadLikeWithInsertSlice to packOp (with default = true)
- lowerUnpadLikeWithExtractSlice to unPackOp (with default = true)

The motivation of the PR is finer-grained control of the lowering of pack and unpack ops. This is useful in particular when we want to guarantee that there's no additional insertslice and extractslice that interfere with tiling. With the original lowering pipeline, packOp and unPackOp may be lowered to insertslice and extractslice when the high dimensions are unit dimensions and no transpose is involved. Under such circumstances, such insert and extract slice ops will block producer/consumer fusion tile + fuse transforms. With this PR, we will be able to disable such a lowering path and allow consumer fusion to go through as expected.
Compared to the python version, this also does type checking and error handling, so it's slightly longer, however, it's still comfortably under 500 lines.
…/attributor-flatscratchinit.ll`
MRI is already available where this is instantiated.
When checking for poison elements in the matched node, the register number must be considered when clearing the corresponding mask element. Fixes #119393
Add missing conversion.
See Commits and Changes for more details.
Created by pull[bot] (v2.0.0-alpha.1)
Can you help keep this open source service alive? 💖 Please sponsor : )