LLVM and LLVM-SPIRV-Translator pulldown #1192
Merged: vladimirlaz merged 882 commits into intel:sycl from vladimirlaz:private/vlazarev/llvmspirv_pulldown on Feb 26, 2020
I've noticed that it is inconvenient to create YAMLs from binaries (using obj2yaml) that are later meant to serve as test cases for obj2yaml (after applying yaml2obj). One problem, for example, is that obj2yaml emits a "DynamicSymbols:" key instead of .dynsym. It also does not create .dynstr. When a YAML document without explicitly defined .dynsym/.dynstr is given to yaml2obj, we have two issues:

1) These sections are placed after non-allocatable sections (I've fixed this in D74756).
2) They have VA == 0. The user needs to write descriptions for such sections manually to set a VA.

This patch addresses (2). I suggest letting yaml2obj assign virtual addresses by itself. This makes the output binary much closer to a "normal" ELF. (It is still possible to use "Address: 0x0" for a section to get the original behavior if needed.)

Differential revision: https://reviews.llvm.org/D74764
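The address assignment described above can be pictured with a small Python model. This is only a sketch under assumptions: the `Section` class, the `assign_addresses` function and the `start` parameter are hypothetical names, and the rule used here (place each allocatable section after the previous one, rounded up to its alignment, unless an explicit address is given) is a simplification, not the actual yaml2obj code.

```python
from dataclasses import dataclass
from typing import Optional

SHF_ALLOC = 0x2  # ELF section flag: the section occupies memory at run time


@dataclass
class Section:
    name: str
    size: int
    flags: int = 0
    align: int = 1
    address: Optional[int] = None  # explicit "Address:" from the YAML, if any


def assign_addresses(sections, start=0):
    """Hypothetical layout pass: returns {section name: virtual address}."""
    result = {}
    next_va = start
    for sec in sections:
        if sec.address is not None:
            va = sec.address  # an explicit address always wins
        elif sec.flags & SHF_ALLOC:
            align = max(sec.align, 1)
            va = (next_va + align - 1) // align * align  # round up to alignment
        else:
            va = 0  # non-allocatable sections keep VA == 0
        result[sec.name] = va
        if sec.flags & SHF_ALLOC:
            next_va = va + sec.size
    return result
```

In this model, writing an explicit `address=0` for a section reproduces the old VA == 0 result for that section.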
D74764 (https://reviews.llvm.org/rG31f2ad9c368d47721508cbd0d120d626f9041715) changed the behavior of yaml2obj: it now assigns virtual addresses to allocatable sections. SymbolFile/Breakpad/symtab.test started to fail after this change (http://lab.llvm.org:8011/builders/lldb-x86_64-debian/builds/5520/steps/test/logs/stdio):

Command Output (stderr):
--
/home/worker/lldb-x86_64-debian/lldb-x86_64-debian/llvm-project/lldb/test/Shell/SymbolFile/Breakpad/symtab.test:6:10: error: CHECK: expected string not found in input
# CHECK: Symtab, file = {{.*}}symtab.out, num_symbols = 5:
         ^
<stdin>:15:1: note: scanning from here
Symtab, file = /home/worker/lldb-x86_64-debian/lldb-x86_64-debian/build/tools/lldb/test/SymbolFile/Breakpad/Output/symtab.out, num_symbols = 6:
^
<stdin>:15:99: note: possible intended match here
Symtab, file = /home/worker/lldb-x86_64-debian/lldb-x86_64-debian/build/tools/lldb/test/SymbolFile/Breakpad/Output/symtab.out, num_symbols = 6:

For now I've updated basic-elf.yaml so that it produces the same layout as before D74764. It seems Breakpad/symtab.test should be updated.
With this change, --shuffle-sections=seed produces the same result on every host. Reviewed By: grimar, MaskRay Differential Revision: https://reviews.llvm.org/D74971
The diagnostic added in D72231 also shows a diagnostic when casting to a _Bool. This is unwanted. This patch removes the diagnostic for _Bool types. Differential Revision: https://reviews.llvm.org/D74860
This patch adds new errors and error checking to the ObjectLinkingLayer to catch cases where a compiled or loaded object either: (1) Contains definitions not covered by its responsibility set, or (2) Is missing definitions that are covered by its responsibility set. Prior to this patch, providing the correct set of definitions was treated as an API contract requirement. However, this requires that the client be confident in the correctness of the whole compiler / object-cache pipeline, and it results in difficult-to-debug assertions upon failure. Treating this as a recoverable error results in clearer diagnostics. The performance overhead of this check is one comparison of densemap keys (symbol string pointers) per linking object, which is minimal. If this overhead ever becomes a problem we can add the check under a flag that can be turned off if the client fully trusts the rest of the pipeline.
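As a rough illustration of the two error cases, the check amounts to comparing two sets of symbol names. The Python sketch below is not the ORC implementation: `check_definitions` and its message strings are hypothetical, and plain Python sets stand in for the densemap keys mentioned above.

```python
def check_definitions(responsibility, definitions):
    """Compare what an object defines with what it is responsible for.

    Returns a list of error strings; an empty list means the sets match.
    """
    responsibility = set(responsibility)
    definitions = set(definitions)
    errors = []
    # Case (1): definitions not covered by the responsibility set.
    extra = sorted(definitions - responsibility)
    if extra:
        errors.append("unexpected definitions: " + ", ".join(extra))
    # Case (2): responsibilities the object fails to define.
    missing = sorted(responsibility - definitions)
    if missing:
        errors.append("missing definitions: " + ", ".join(missing))
    return errors
```

Returning the errors as values, rather than asserting, mirrors the idea of treating a mismatch as a recoverable condition that the caller can report.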
…unctions. The GenericLLVMIRPlatformSupport class runs a transform on all LLVM IR added to the LLJIT instance to replace instances of llvm.global_ctors with a specially named function that runs the corresponding static initializers (see GlobalCtorDtorScraper in lib/ExecutionEngine/Orc/LLJIT.cpp). This patch updates the GenericIRPlatform class to check for this specially named function in other materialization units that are added to the JIT and, if found, add the function to the initializer work queue. Doing this allows object files that were compiled from IR and cached to be reloaded in subsequent JIT sessions without their initializers being skipped. To enable testing, this patch also updates the lli tool's -jit-kind=orc-lazy mode to respect the -enable-cache-manager and -object-cache-dir options, and modifies the CompileOnDemandLayer to rename extracted submodules to include a hash of the names of their symbol definitions. This allows a simple object caching scheme based on module names (which was already implemented in lli) to work with the lazy JIT.
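The renaming scheme can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the function name `submodule_cache_name`, the use of SHA-256, the 16-character truncation and the `.` separator are not taken from the patch, which only states that the name includes a hash of the names of the symbol definitions.

```python
import hashlib


def submodule_cache_name(module_name, defined_symbols):
    """Derive a stable name for an extracted submodule (hypothetical scheme)."""
    # Sort so that the name does not depend on the order of definitions.
    joined = "\0".join(sorted(defined_symbols))
    digest = hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]
    return f"{module_name}.{digest}"
```

Because the name depends only on the module name and the set of defined symbols, a cache keyed on module names can find the same object again in a later session.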
This is similar to using movd which we do for sse2 targets. I've added a DAG combine for VEXTRACT_STORE to use SimplifyDemandedVectorElts to clean up some artifacts from type legalization.
…XT_LOAD with a 64 bit memory size on SSE1 targets. We can use MOVLPS which will load 64 bits, but we need a v4f32 result type. We already have isel patterns for this. The code here is a little hacky. We can probably improve it with more isel patterns.
…fyDemandedVectorElts that are called on an operand of N. If a simplification occurs the operand will be added to the worklist. But since the demanded mask was based on N, we need to make sure we revisit N in case there are more simplifications to be done. Returning SDValue(N, 0) as we do, only tells DAG combine that something changed, but that won't make it add anything to the worklist. Found while playing around with using VEXTRACT_STORE in more cases. But I guess this doesn't affect any of our existing tests.
The extra available vector types on sse2 cause us to produce different code.
Summary: Old: 500ms always. New: rebuild time, up to 500ms. Fixes clangd/clangd#275 Reviewers: hokein Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D73949
…f token-after-cursor fails. This reverts commit 6af1ad2.
…cursor if token-after-cursor fails." This reverts commit a2ce807. Buildbot failures on GCC due to SelectionTree not being copyable, and instantiating vector<Selection> in the tweak-handling in ClangdServer.
Add a map from BasicBlocks to overlap intervals. For partial writes, we can keep track of those in IOLs. We only add candidates that are valid for eliminations. Reviewers: dmgreen, bryant, asbirlea, Tyker Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D73757
A question about this behavior came up on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2020-February/139003.html ...and as part of backend improvements in D73978. We decided not to implement a more general change that would have folded any FP binop with nearly arbitrary constant + undef operand to undef because that is not theoretically correct (even if it is practically correct). This is the SDAG-equivalent to the IR change in D74713.
…heck." This version fixes a buildbot failure caused by picking the wrong insert point for XORs. We cannot pick the XOR binary operator as insert point, as it is not guaranteed that both input operands for the overflow intrinsic are defined before it. This reverts the revert commit c7fc0e5.
Changed after 7769030.
…f token-after-cursor fails. This reverts commit b4b9706. Now avoiding expected<vector<selection>> in favor of expected<vector<unique_ptr<selection>>>
In order to build the Linux kernel, the back chain must be supported with packed-stack. The back chain is then stored topmost in the register save area. Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D74506
…with sse1. Still a little room for improvement by using movlps to store to the stack temporary needed to move data out of the xmm register after the load.
…mIntrinsicSDNode as we usually do. Leave the gather/scatter subclasses, but make them inherit from MemIntrinsicSDNode and delete their constructor and destructor. This way we can still have the getIndex, getMask, etc. convenience functions.
Move the implementation of __libcpp_thread_poll_with_backoff and __libcpp_timed_backoff_policy::operator() out of the _LIBCPP_HAS_THREAD_API_PTHREAD block. None of the code in these methods is pthreads specific. Also add "inline _LIBCPP_INLINE_VISIBILITY" to __libcpp_timed_backoff_policy::operator(), to avoid errors due to multiple definitions of the operator. Contrary to __libcpp_thread_poll_with_backoff (which is a template function), this is a normal non-templated method. Differential Revision: https://reviews.llvm.org/D75102
A goto label uses the 'l' constraint; skipping it can cause unexpected warnings.
…Helper() Summary: Future patches will make use of TTI to perform cost-model-driven `SCEVExpander::isHighCostExpansionHelper()` This is a fully NFC patch to make things reviewable. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, zzheng, javed.absar, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73704
…CEVExpander::isHighCostExpansionHelper() Summary: In future patches `SCEVExpander::isHighCostExpansionHelper()` will respect the budget allocated by performing TTI cost modelling. This is a fully NFC patch to make things reviewable. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, zzheng, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73705
…processed expression first Summary: As far as i can tell this is still NFC. Initially in rL146438 it was added at the top of the function, later rL238507 dethroned it, and rL244474 did it again. I'm not sure if we have already checked the cost of this expansion, we should be doing that again. Reviewers: reames, mkazantsev, wmi, sanjoy, atrick, igor-laevsky Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73706
…ided Summary: Currently, as per `check-llvm`, we never call `SCEVExpander::isHighCostExpansion()` with null TTI, so this appears to be a safe restriction. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: javed.absar, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73712
…g - model cast cost Summary: This is not a NFC, although it does not change any of the existing tests. I'm not really sure if we should have specific tests for the cost modelling itself. This is the first patch that actually makes `SCEVExpander::isHighCostExpansionHelper()` account for the cost of the SCEV expression, and consider the budget available, by modelling cast expressions. I believe the logic itself is "pretty obviously correct" - from budget, we need to subtract the cost of the cast expression from inner type `Op->getType()` to the `S->getType()` type, and recurse into the expression we are casting. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: xbolva00, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73716
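The budget rule for casts described above can be modelled with a toy expression tree. This is a sketch under assumptions, not the SCEVExpander code: `Leaf`, `Cast`, `nest_casts` and the per-cast cost of 1 are hypothetical stand-ins for SCEV nodes and the cost that TTI would report.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    """A value that is already available, so expanding it costs nothing."""


@dataclass
class Cast:
    operand: object
    cost: int = 1  # hypothetical cost of this cast, as TTI might report it


def is_high_cost(expr, budget):
    """Subtract each cast's cost from the budget, then look at its operand."""
    while isinstance(expr, Cast):
        budget -= expr.cost
        if budget < 0:
            return True  # budget exhausted: treat the expansion as high-cost
        expr = expr.operand
    return False


def nest_casts(depth):
    """Build a chain of `depth` casts around a leaf."""
    expr = Leaf()
    for _ in range(depth):
        expr = Cast(expr)
    return expr
```

With a budget of 4 and unit-cost casts, a chain of four casts fits and a chain of five does not.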
…power-of-two as LShr Summary: Like with casts, we need to subtract the cost of the `lshr` instruction from the budget, and recurse into the LHS operand. Seems "pretty obviously correct" to me? Note that there are a number of other shortcuts we //could// cost-model:
* `... + (-1 * ...)` -> `... - ...` <- likely very frequent case
* `x - (rem x, power-of-2)`, which is currently `(x udiv power-of-2) * power-of-2` -> `x & -log2(power-of-2)`
* `rem x, power-of-2`, which is currently `x - ((x udiv power-of-2) * power-of-2)` -> `x & log2(power-of-2)-1`
* `... * power-of-2` -> `... << log2(power-of-2)` <- likely not very beneficial
Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73718
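The power-of-two shortcuts listed above can be checked numerically for non-negative integers. In this Python sketch (the function name `power_of_two_identities` is hypothetical), `p = 1 << k` is the power of two, and the two masks are written concretely as `-p` (clears the low `k` bits) and `p - 1` (keeps the low `k` bits), which is one reading of the shorthand in the list.

```python
def power_of_two_identities(x, k):
    """Return (expanded form, shortcut form) pairs for p = 2**k, x >= 0."""
    p = 1 << k
    return [
        (x // p, x >> k),                # udiv by a power of two is a logical shift
        (x - (x % p), x & -p),           # x - (rem x, p): clear the low k bits
        ((x // p) * p, x & -p),          # same value, written as (x udiv p) * p
        (x - ((x // p) * p), x & (p - 1)),  # rem x, p: keep the low k bits
        (x * p, x << k),                 # mul by a power of two is a left shift
    ]
```

Each pair should hold equal values for any non-negative `x`, which is what makes the cheaper form a valid replacement.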
…_test2.ll %tmp prefix confuses auto-update scripts
Summary: If we don't believe this UDiv is actually a LShr in disguise, things are much worse. First, we try to see if this UDiv actually originates from user code, by looking for `S + 1`, and if found considering this UDiv to be free. But otherwise, we always considered this UDiv to be high-cost. However that is no longer the case with TTI-driven cost model: our default budget is 4, which matches the default cost of UDiv, so now we allow a single UDiv to not be counted as high-cost. While that is the case, it is evident this is actually a regression due to the fact that cost-modelling is incomplete - we did not account for the `add`, `mul` costs yet. That is being addressed in D73728. Cost-modelling for UDiv also seems pretty straight-forward: subtract cost of the UDiv itself, and recurse into both the LHS and RHS. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73722
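The arithmetic behind the regression noted above can be made explicit. The budget of 4 and the UDiv cost of 4 come from the summary; the function name and the idea of passing operand costs as plain integers are assumptions of this Python sketch.

```python
DEFAULT_BUDGET = 4  # default budget stated in the summary
UDIV_COST = 4       # default cost of a UDiv stated in the summary


def udiv_is_high_cost(lhs_cost, rhs_cost, budget=DEFAULT_BUDGET):
    """Charge for the UDiv itself, then for both of its operands."""
    remaining = budget - UDIV_COST - lhs_cost - rhs_cost
    return remaining < 0
```

A lone UDiv with free operands leaves a remaining budget of exactly 0, so it is no longer reported as high-cost; any operand with a non-zero cost pushes it over.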
Summary: While this resolves the regression from D73722 in `llvm/test/Transforms/IndVarSimplify/exit_value_test2.ll`, this now regresses the `llvm/test/Transforms/IndVarSimplify/elim-extend.ll` `@nestedIV` test: we can no longer perform that expansion within the default budget of `4`, but require a budget of `6`. That regression is being addressed by D73777.

The basic idea here is simple.
```
Op0, Op1, Op2 ...
 |    |    |
 \--+--/   |
    |      |
    \---+---/
```
I.e. given N operands, we will have N-1 operations, so we have to add the cost of an add (mul) for **every** Op processed, **except** the first one, plus we need to recurse into *every* Op. I'm guessing there's already canonicalization that ensures we won't have `1` operand in `scMulExpr`, and no `0` in `scAddExpr`/`scMulExpr`.

Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73728
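The N-1 counting rule for n-ary add and mul expressions reduces to a one-line formula. In this Python sketch, `nary_cost` is a hypothetical name, operand costs are given as already-computed integers instead of being found by recursion, and the default operation cost of 1 is an assumption.

```python
def nary_cost(operand_costs, op_cost=1):
    """Cost of an n-ary add/mul: N-1 operations plus the cost of every operand."""
    n = len(operand_costs)
    if n < 2:
        raise ValueError("an n-ary add/mul needs at least two operands")
    return (n - 1) * op_cost + sum(operand_costs)
```

For example, three free operands need two additions, so `nary_cost([0, 0, 0])` is 2.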
…al recurrence Summary: So, i wouldn't call this *obviously* correct, but i think i got it right this time :) Roughly, we have
```
Op0*x^0 + Op1*x^1 + Op2*x^2 ...
```
where `Op_{n} * x^{n}` is called a term, and `n` the degree of the term. Due to the way they are stored internally in `SCEVAddRecExpr`, i believe we can have `Op_{n}` be `0`, so we should not charge for those. I think it is most straightforward to count the cost in 4 steps:
1. First, count it the same way we counted `scAddExpr`, but be sure to skip terms with zero constants. Much like with an `add` expr, we will have one less addition than the number of terms.
2. Each non-constant term (term degree >= 1) requires a multiplication between the `Op_{n}` and `x^{n}`. But again, only charge for it if it is required: `Op_{n}` must not be 0 (no term) or 1 (no multiplication needed), and obviously don't charge constant terms (`x^0 == 1`).
3. We must charge for all the `x^0`..`x^{poly_degree}` themselves. Since `x^{poly_degree}` is `x * x * ... * x`, i.e. `poly_degree` `x`'es multiplied, for the final `poly_degree` term we again require `poly_degree-1` multiplications. Note that all the `x^{0}`..`x^{poly_degree-1}` will be computed for free along the way there.
4. And finally, the operands themselves. Here, much like with add/mul exprs, we really don't look for preexisting instructions.
Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73741
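The four counting steps above can be written down directly. This Python sketch follows the summary's polynomial view rather than the real `SCEVAddRecExpr` data structure: `addrec_cost` is a hypothetical name, coefficients are modelled as integers (constants) or strings (symbolic values), unit add/mul costs are assumed, and `operand_cost` defaults to treating every operand as free.

```python
def addrec_cost(coeffs, add_cost=1, mul_cost=1, operand_cost=lambda op: 0):
    """Cost of Op0*x^0 + Op1*x^1 + ... with coeffs = [Op0, Op1, ...]."""
    degree = len(coeffs) - 1
    # Terms whose coefficient is the constant 0 do not exist at all.
    terms = [(n, op) for n, op in enumerate(coeffs) if op != 0]
    # Step 1: one less addition than the number of (non-zero) terms.
    cost = add_cost * max(len(terms) - 1, 0)
    # Step 2: Op_n * x^n, only for degree >= 1 and a coefficient other than 1.
    cost += mul_cost * sum(1 for n, op in terms if n >= 1 and op != 1)
    # Step 3: x^2 .. x^degree take degree-1 multiplications in total.
    cost += mul_cost * max(degree - 1, 0)
    # Step 4: the operands themselves.
    cost += sum(operand_cost(op) for _, op in terms)
    return cost
```

For instance, `5 + 1*x` costs a single addition, while `a + 3*x^2` costs one addition, one multiplication for `3 * x^2`, and one multiplication to form `x^2`.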
…(PR44668) Summary: Previously we simply always said that `SCEVMinMaxExpr` is too costly to expand. But this isn't really true: it expands into just a comparison+swap pair. And again, much like with add/mul, there will be one less such pair than the number of operands. And we need to count the cost of the operands themselves. This does change a number of testcases, and as far as i can tell, all of these changes are improvements, in the sense that we fixed up more latches to do the [in]equality comparison. This concludes the cost-modelling changes; no other SCEV expressions exist as of now. This is a part of addressing [[ https://bugs.llvm.org/show_bug.cgi?id=44668 | PR44668 ]]. Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73744
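The min/max rule has the same shape as the add/mul rule, with a comparison+swap pair in place of a single operation. In this Python sketch, `minmax_cost` is a hypothetical name, the swap is modelled as a select, and the unit costs for the comparison and the select are assumptions.

```python
def minmax_cost(operand_costs, cmp_cost=1, select_cost=1):
    """N operands need N-1 comparison+select pairs, plus the operands' own cost."""
    n = len(operand_costs)
    if n < 2:
        raise ValueError("a min/max expression needs at least two operands")
    return (n - 1) * (cmp_cost + select_cost) + sum(operand_costs)
```

Under these assumed unit costs, a two-operand min or max of free values costs 2, which fits within the default budget of 4 mentioned earlier in this series.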
… if cheap (PR44668) Summary: Replacing uses of IV outside of the loop is likely generally useful, but `rewriteLoopExitValues()` is cautious, and if it isn't told to always perform the replacement, and there are hard uses of IV in loop, it doesn't replace. In [[ https://bugs.llvm.org/show_bug.cgi?id=44668 | PR44668 ]], that prevents `-indvars` from replacing uses of induction variable after the loop, which might be one of the optimization failures preventing that code from being vectorized. Instead, now that the cost model is fixed, i believe we should be a little bit more optimistic, and also perform replacement if we believe it is within our budget. Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=44668 | PR44668 ]]. Reviewers: reames, mkazantsev, asbirlea, fhahn, skatkov Reviewed By: mkazantsev Subscribers: nikic, hiraditya, zzheng, javed.absar, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73501
…sHighCostExpansion() Summary: This addresses the `llvm/test/Transforms/IndVarSimplify/elim-extend.ll` `@nestedIV` regression from D73728 Reviewers: reames, mkazantsev, wmi, sanjoy Reviewed By: mkazantsev Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73777
…ress. Verifies that an argument passed to __builtin_frame_address or __builtin_return_address is within the range [0, 0xFFFF] Differential revision: https://reviews.llvm.org/D66839 Re-committed after fixed: c93112d
CONFLICT (content): Merge conflict in clang/lib/Sema/SemaChecking.cpp
Signed-off-by: Alexey Sotkin <alexey.sotkin@intel.com>
Signed-off-by: Alexey Sotkin <alexey.sotkin@intel.com>
Relax a limitation of the SPIR-V writer. The memset intrinsic is mapped to `OpCopyMemorySized`, causing it to be converted to memcpy when converting back from SPIR-V to LLVM. Fixes KhronosGroup/SPIRV-LLVM-Translator#357
atomicrmw with nand, fadd and fsub operations can't be represented in SPIR-V. Signed-off-by: Alexey Sotkin <alexey.sotkin@intel.com>
Signed-off-by: Vladimir Lazarev <vladimir.lazarev@intel.com>
vladimirlaz pushed a commit to vladimirlaz/llvm that referenced this pull request on Sep 21, 2021
… update (intel#1192) DecorationSingleElementVectorINTEL now has one extra operand representing the number of pointer stars owned by the vector element. It helps to restore scalars after translating back. E.g.: <i32* x 1>** -> i32*** "VCSingleElementVector"="1". It can now also be applied to global variables. Original commit: KhronosGroup/SPIRV-LLVM-Translator@4eb63bf
alexbatashev added a commit to alexbatashev/llvm that referenced this pull request on Sep 24, 2021
* upstream/sycl: (2344 commits)
  [ESIMD] Rename slm_load4/slm_store4 to slm_load_rgba/slm_store_rgba (intel#4158)
  [SYCL] Avoid re-computing group_range in nd_item::get_group_range() (intel#4621)
  [clang-offload-extract] Ignore zero padding in .tgting section (intel#4622)
  [Driver][SYCL] Fix -fsycl-help output when redirected (intel#4619)
  [Driver][SYCL][FPGA] Do not unbundle aoco as an archive for hardware (intel#4477)
  [Driver][SYCL] Fix offload-bundler and offload-deps triples (intel#4616)
  [SYCL] Fix bit_cast for half type (intel#4603)
  [SYCL] Fix a typo in accessor::get_range method (intel#4556)
  [SYCL] Detach allocas from their dependencies regardless of linked alloca presence (intel#4573)
  [SYCL][L0] Make sure that we only query/sync host-visible events from the host. (intel#4613)
  Fix tests with wrong alias metadata
  [Driver][SYCL] Fixup arguments to llc call for PIC and code-model (intel#4614)
  [SYCL][L0] Add ownership control for LeveL-Zero kernel_bundle interop. (intel#4576)
  [SYCL][Driver] Expose llvm-foreach --jobs functionality through a driver option (intel#4543)
  [SYCL] Prevent stream buffer leak on constructor exception (intel#4594)
  [ESIMD] Replace mask_type_t with simd_mask to represent Gen predicates. (intel#4230)
  Fix for a bunch of fixed point integer SPIR-V instructions (intel#1213)
  Amend SingleElementVectorINTEL decoration use cases according to spec update (intel#1192)
  Enforce UserSemantic decoration if no FPGA decorations found
  [SYCL][CUDA] Fix context scope in kernel launch (intel#4606)
  ...
LLVM: 6201f66
LLVM-SPIRV-Translator: 39edc1dc