LLVM and LLVM-SPIRV-Translator pulldown #1509
Closed
If we have a must-tail call the callee and caller need to have matching ABIs. Part of that is alignment which we might modify when we deduce alignment of arguments of either. Since we would need to keep them in sync, which is not as simple, we simply avoid deducing alignment for arguments of the must-tail caller or callee. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D76673
This reverts commit 0071eaa. Inputs/noop-main.ll wasn't checked in, so this breaks check-llvm everywhere.
Use DL & ABI information for better alignment deduction, e.g., if a type is accessed and the ABI specifies an alignment requirement for such an access we can use it. This is based on a patch by @lebedev.ri and inspired by getBaseAlign in Loads.cpp. Depends on D76673. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D76674
It could happen that we delete the first function in the SCC in the future so we should be careful accessing `Functions` after the manifest stage.
While D68850 allowed functions to be deleted I accidentally saved some version of the function to be used once a suitable prefix was found. This turned out to be problematic when the occasionally deleted function is also occasionally modified. The test case is adjusted to resemble the case in which the problem was found. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D76586
This cannot be triggered right now, as far as I know, but it doesn't make sense to deduce a constant range on arguments of declarations. Exposed during testing of AAValueSimplify extensions.
There was a TODO in genericValueTraversal to provide the context instruction and due to the lack of it users that wanted one just used something available. Unfortunately, using a fixed instruction is wrong in the presence of PHIs so we need to update the context instruction properly. Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D76870
…e or library Summary: cmake fails with an error when attempting to evaluate $<TARGET_FILE:tgt> where `tgt` is defined via an `add_custom_target` and thus the `TYPE` is `UTILITY`. Requesting a TARGET_FILE only works on an `EXECUTABLE` or one of a few different types of `X_LIBRARY` (e.g. added via `add_library` or `add_executable`). The logic as implemented in cmake is below: enum TargetType { EXECUTABLE, STATIC_LIBRARY, SHARED_LIBRARY, MODULE_LIBRARY, OBJECT_LIBRARY, UTILITY, GLOBAL_TARGET, INTERFACE_LIBRARY, UNKNOWN_LIBRARY }; if (target->GetType() >= cmStateEnums::OBJECT_LIBRARY && target->GetType() != cmStateEnums::UNKNOWN_LIBRARY) { ::reportError(context, content->GetOriginalExpression(), "Target \"" + name + "\" is not an executable or library."); return nullptr; } This has always been the case back to at least 3.12 (furthest I checked) but this is causing a new failure in cmake 3.17 while evaluating ExternalProjectAdd. Subscribers: mgorny, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77284
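The check quoted in this commit message can be modelled outside of CMake. The following Python sketch mirrors the enum order given above; the helper name `target_file_allowed` is hypothetical and is not part of CMake.

```python
from enum import IntEnum


class TargetType(IntEnum):
    # Same order as the enum quoted in the commit message.
    EXECUTABLE = 0
    STATIC_LIBRARY = 1
    SHARED_LIBRARY = 2
    MODULE_LIBRARY = 3
    OBJECT_LIBRARY = 4
    UTILITY = 5
    GLOBAL_TARGET = 6
    INTERFACE_LIBRARY = 7
    UNKNOWN_LIBRARY = 8


def target_file_allowed(target_type):
    """Hypothetical helper: True when the quoted error branch is not taken."""
    rejected = (target_type >= TargetType.OBJECT_LIBRARY
                and target_type != TargetType.UNKNOWN_LIBRARY)
    return not rejected


if __name__ == "__main__":
    for t in TargetType:
        print(t.name, target_file_allowed(t))
```

Under this model a target created with `add_custom_target` (type `UTILITY`) is rejected, which matches the failure described in the commit message.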
…int’ Intrinsic The requirement for deopt parameter to be in gc parameter if it can be modified by GC is very strong and difficult to follow. The key example of why this can't work: %p1 = bitcast i8* %p to i8* statepoint [gc = (%p1)], [deopt = (%p1)] The optimizer is allowed to replace either use (or both) of %p1 with %p. If it updates only one of the two (entirely legal), the two sets do not overlap. So this change removes the strong wording. Reviewers: reames, dantrushin Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D77122
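The argument in this commit message can be illustrated with a toy model. This Python sketch treats the `gc` and `deopt` operand lists as lists of value names and rewrites only one of them; the function `rewrite_uses` is a hypothetical stand-in for an optimizer replacing a use, not an LLVM API.

```python
def rewrite_uses(operands, old, new):
    """Return a copy of the operand list with every `old` replaced by `new`."""
    return [new if value == old else value for value in operands]


# Both lists start out referring to the bitcast result %p1.
gc_args = ["%p1"]
deopt_args = ["%p1"]

# The optimizer may legally rewrite only the deopt use to the original %p.
deopt_after = rewrite_uses(deopt_args, "%p1", "%p")

# The two operand sets no longer share any value.
overlap = set(gc_args) & set(deopt_after)
```

After the rewrite `overlap` is empty even though both operands still denote the same pointer, which is why a syntactic "deopt value must also be a gc value" rule cannot be relied upon.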
…vtable symbol. In most cases, LLD prints its multiline diagnostic messages starting additional lines with ">>> ". That greatly helps external tools to parse the output, simplifying combining several lines of the log back into one message. The patch fixes the only message I found that does not follow the common pattern. Differential Revision: https://reviews.llvm.org/D77132
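As an illustration of why the uniform prefix helps tooling, here is a minimal Python sketch that folds continuation lines back into their message. The function name `join_diagnostics` and the sample log are made up for this example; the sample is not real LLD output.

```python
def join_diagnostics(lines, marker=">>> "):
    """Merge lines starting with `marker` into the preceding message."""
    messages = []
    for line in lines:
        if line.startswith(marker) and messages:
            # Continuation line: append its payload to the current message.
            messages[-1] += " " + line[len(marker):]
        else:
            # Any other line starts a new message.
            messages.append(line)
    return messages


# Made-up log used only to exercise the function.
sample = [
    "error: undefined symbol: foo",
    ">>> referenced by a.o",
    ">>> referenced by b.o",
    "warning: something else",
]

merged = join_diagnostics(sample)
```

A message whose extra lines lack the prefix would be split into several entries by a parser like this, which is the situation the patch avoids.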
…ns.h This is not supposed to change anything but allows us to reuse the math functions separately from the device functions, e.g., source them at different times. This will be used by the OpenMP overlay. This also adds two `return` keywords that were missing. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D77238
The math wrapper handling is going to be replaced shortly and d1705c1 was actually a precursor for that.
…form Follow-up of D76591 and D76907
It was added by D76591 for migration purposes (not all printBranchOperand users have migrated to the overload with `uint64_t Address`). Now that all have been migrated, the parameter can go away.
CONFLICT (content): Merge conflict in clang/lib/Sema/Sema.cpp
This commit removes support for building against the system libc++abi, which was supported on Apple platforms. This is basically never what we want to do, since libc++ and libc++abi are coupled and building a trunk libc++ against an older libc++abi can lead to incompatibilities (and good luck debugging them!). It might have made some sense to support that when the monorepo did not exist, however I don't think this is anything but a footgun nowadays. Furthermore, based on the newly-made assumption that we're building against the monorepo libc++abi, we can simplify the search path logic for finding libc++abi. This area of our build system has a lot of technical debt accumulated, and it's surprisingly difficult to change. We've tried different things and failed several times in the past. I did test this change on our Docker image for the build bots and on Apple platforms, however it is possible that this breaks some unknown configuration, in which case it should be fine to revert this (so we can try again!).
Summary: As noted in documentation, different repetition modes have different trade-offs: > .. option:: -repetition-mode=[duplicate|loop] > > Specify the repetition mode. `duplicate` will create a large, straight line > basic block with `num-repetitions` copies of the snippet. `loop` will wrap > the snippet in a loop which will be run `num-repetitions` times. The `loop` > mode tends to better hide the effects of the CPU frontend on architectures > that cache decoded instructions, but consumes a register for counting > iterations. Indeed. Example: >>! In D74156#1873657, @lebedev.ri wrote: > At least for `CMOV`, i'm seeing wildly different results > | | Latency | RThroughput | > | duplicate | 1 | 0.8 | > | loop | 2 | 0.6 | > where latency=1 seems correct, and i'd expect the througput to be close to 1/2 (since there are two execution units). This isn't great for analysis, at least for schedule model development. As discussed in excruciating detail in >>! In D74156#1924514, @gchatelet wrote: >>>! In D74156#1920632, @lebedev.ri wrote: >> ... did that explanation of the question i'm having made any sense? > > Thx for digging in the conversation ! > Ok it makes more sense now. > > I discussed it a bit with @courbet: > - We want the analysis tool to stay simple so we'd rather not make it knowledgeable of the repetition mode. > - We'd like to still be able to select either repetition mode to dig into special cases > > So we could add a third `min` repetition mode that would run both and take the minimum. It could be the default option. > Would you have some time to look what it would take to add this third mode? there appears to be an agreement that it is indeed sub-par, and that we should provide an optional, measurement (not analysis!) -time way to rectify the situation. However, the solution isn't entirely straightforward.
We can't just add an actual 'multiplexer' `MinSnippetRepetitor`, because if we just concatenate snippets produced by `DuplicateSnippetRepetitor` and `LoopSnippetRepetitor` and run+measure that, the measurement will naturally be different from what we'd get by running+measuring them separately and taking the min. ([[ https://www.wolframalpha.com/input/?i=%28x%2By%29%2F2+%21%3D+min%28x%2C+y%29 | `time(D+L)/2 != min(time(D), time(L))` ]]) Also, it seems best to me to have a single snippet instead of generating a snippet per repetition mode, since the only difference here is that the loop repetition mode reserves one register for the loop counter. As far as i can tell, we can either teach `BenchmarkRunner::runConfiguration()` to produce a single report given multiple repetitors (as in the patch), or do that one layer higher: don't modify `BenchmarkRunner::runConfiguration()`, produce multiple reports, don't actually print each one, but aggregate them somehow and only print the final one. Initially i went ahead with the latter approach, but it didn't look like a natural fit; the former (as in the diff) does seem like a better fit to me. There's also a question of the test coverage.
It sure currently does work here: ``` $ ./bin/llvm-exegesis --opcode-name=CMOV64rr --mode=inverse_throughput --repetition-mode=duplicate Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-8fb949.o --- mode: inverse_throughput key: instructions: - 'CMOV64rr RAX RAX R11 i_0x0' - 'CMOV64rr RBP RBP R15 i_0x0' - 'CMOV64rr RBX RBX RBX i_0x0' - 'CMOV64rr RCX RCX RBX i_0x0' - 'CMOV64rr RDI RDI R10 i_0x0' - 'CMOV64rr RDX RDX RAX i_0x0' - 'CMOV64rr RSI RSI RAX i_0x0' - 'CMOV64rr R8 R8 R8 i_0x0' - 'CMOV64rr R9 R9 RDX i_0x0' - 'CMOV64rr R10 R10 RBX i_0x0' - 'CMOV64rr R11 R11 R14 i_0x0' - 'CMOV64rr R12 R12 R9 i_0x0' - 'CMOV64rr R13 R13 R12 i_0x0' - 'CMOV64rr R14 R14 R15 i_0x0' - 'CMOV64rr R15 R15 R13 i_0x0' config: '' register_initial_values: - 'RAX=0x0' - 'R11=0x0' - 'EFLAGS=0x0' - 'RBP=0x0' - 'R15=0x0' - 'RBX=0x0' - 'RCX=0x0' - 'RDI=0x0' - 'R10=0x0' - 'RDX=0x0' - 'RSI=0x0' - 'R8=0x0' - 'R9=0x0' - 'R14=0x0' - 'R12=0x0' - 'R13=0x0' cpu_name: bdver2 llvm_triple: x86_64-unknown-linux-gnu num_repetitions: 10000 measurements: - { key: inverse_throughput, value: 0.819, per_snippet_value: 12.285 } error: '' info: instruction has tied variables, using static renaming. assembled_snippet: 5541574156415541545348B8000000000000000049BB00000000000000004883EC08C7042400000000C7442404000000009D48BD000000000000000049BF000000000000000048BB000000000000000048B9000000000000000048BF000000000000000049BA000000000000000048BA000000000000000048BE000000000000000049B8000000000000000049B9000000000000000049BE000000000000000049BC000000000000000049BD0000000000000000490F40C3490F40EF480F40DB480F40CB490F40FA480F40D0480F40F04D0F40C04C0F40CA4C0F40D34D0F40DE4D0F40E14D0F40EC4D0F40F74D0F40FD490F40C35B415C415D415E415F5DC3 ... 
$ ./bin/llvm-exegesis --opcode-name=CMOV64rr --mode=inverse_throughput --repetition-mode=loop Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-051eb3.o --- mode: inverse_throughput key: instructions: - 'CMOV64rr RAX RAX R11 i_0x0' - 'CMOV64rr RBP RBP RSI i_0x0' - 'CMOV64rr RBX RBX R9 i_0x0' - 'CMOV64rr RCX RCX RSI i_0x0' - 'CMOV64rr RDI RDI RBP i_0x0' - 'CMOV64rr RDX RDX R9 i_0x0' - 'CMOV64rr RSI RSI RDI i_0x0' - 'CMOV64rr R9 R9 R12 i_0x0' - 'CMOV64rr R10 R10 R11 i_0x0' - 'CMOV64rr R11 R11 R9 i_0x0' - 'CMOV64rr R12 R12 RBP i_0x0' - 'CMOV64rr R13 R13 RSI i_0x0' - 'CMOV64rr R14 R14 R14 i_0x0' - 'CMOV64rr R15 R15 R10 i_0x0' config: '' register_initial_values: - 'RAX=0x0' - 'R11=0x0' - 'EFLAGS=0x0' - 'RBP=0x0' - 'RSI=0x0' - 'RBX=0x0' - 'R9=0x0' - 'RCX=0x0' - 'RDI=0x0' - 'RDX=0x0' - 'R12=0x0' - 'R10=0x0' - 'R13=0x0' - 'R14=0x0' - 'R15=0x0' cpu_name: bdver2 llvm_triple: x86_64-unknown-linux-gnu num_repetitions: 10000 measurements: - { key: inverse_throughput, value: 0.6083, per_snippet_value: 8.5162 } error: '' info: instruction has tied variables, using static renaming. assembled_snippet: 5541574156415541545348B8000000000000000049BB00000000000000004883EC08C7042400000000C7442404000000009D48BD000000000000000048BE000000000000000048BB000000000000000049B9000000000000000048B9000000000000000048BF000000000000000048BA000000000000000049BC000000000000000049BA000000000000000049BD000000000000000049BE000000000000000049BF000000000000000049B80200000000000000490F40C3480F40EE490F40D9480F40CE480F40FD490F40D1480F40F74D0F40CC4D0F40D34D0F40D94C0F40E54C0F40EE4D0F40F64D0F40FA4983C0FF75C25B415C415D415E415F5DC3 ... 
$ ./bin/llvm-exegesis --opcode-name=CMOV64rr --mode=inverse_throughput --repetition-mode=min Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-c7a47d.o Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-2581f1.o --- mode: inverse_throughput key: instructions: - 'CMOV64rr RAX RAX R11 i_0x0' - 'CMOV64rr RBP RBP R10 i_0x0' - 'CMOV64rr RBX RBX R10 i_0x0' - 'CMOV64rr RCX RCX RDX i_0x0' - 'CMOV64rr RDI RDI RAX i_0x0' - 'CMOV64rr RDX RDX R9 i_0x0' - 'CMOV64rr RSI RSI RAX i_0x0' - 'CMOV64rr R9 R9 RBX i_0x0' - 'CMOV64rr R10 R10 R12 i_0x0' - 'CMOV64rr R11 R11 RDI i_0x0' - 'CMOV64rr R12 R12 RDI i_0x0' - 'CMOV64rr R13 R13 RDI i_0x0' - 'CMOV64rr R14 R14 R9 i_0x0' - 'CMOV64rr R15 R15 RBP i_0x0' config: '' register_initial_values: - 'RAX=0x0' - 'R11=0x0' - 'EFLAGS=0x0' - 'RBP=0x0' - 'R10=0x0' - 'RBX=0x0' - 'RCX=0x0' - 'RDX=0x0' - 'RDI=0x0' - 'R9=0x0' - 'RSI=0x0' - 'R12=0x0' - 'R13=0x0' - 'R14=0x0' - 'R15=0x0' cpu_name: bdver2 llvm_triple: x86_64-unknown-linux-gnu num_repetitions: 10000 measurements: - { key: inverse_throughput, value: 0.6073, per_snippet_value: 8.5022 } error: '' info: instruction has tied variables, using static renaming. 
assembled_snippet: 5541574156415541545348B8000000000000000049BB00000000000000004883EC08C7042400000000C7442404000000009D48BD000000000000000049BA000000000000000048BB000000000000000048B9000000000000000048BA000000000000000048BF000000000000000049B9000000000000000048BE000000000000000049BC000000000000000049BD000000000000000049BE000000000000000049BF0000000000000000490F40C3490F40EA490F40DA480F40CA480F40F8490F40D1480F40F04C0F40CB4D0F40D44C0F40DF4C0F40E74C0F40EF4D0F40F14C0F40FD490F40C3490F40EA5B415C415D415E415F5DC35541574156415541545348B8000000000000000049BB00000000000000004883EC08C7042400000000C7442404000000009D48BD000000000000000049BA000000000000000048BB000000000000000048B9000000000000000048BA000000000000000048BF000000000000000049B9000000000000000048BE000000000000000049BC000000000000000049BD000000000000000049BE000000000000000049BF000000000000000049B80200000000000000490F40C3490F40EA490F40DA480F40CA480F40F8490F40D1480F40F04C0F40CB4D0F40D44C0F40DF4C0F40E74C0F40EF4D0F40F14C0F40FD4983C0FF75C25B415C415D415E415F5DC3 ... ``` but i open to suggestions as to how test that. I also have gone with the suggestion to default to this new mode. This was irking me for some time, so i'm happy to finally see progress here. Looking forward to feedback. Reviewers: courbet, gchatelet Reviewed By: courbet, gchatelet Subscribers: mstojanovic, RKSimon, llvm-commits, courbet, gchatelet Tags: #llvm Differential Revision: https://reviews.llvm.org/D76921
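The inequality `time(D+L)/2 != min(time(D), time(L))` from this commit message can be checked with the `inverse_throughput` values reported by the `duplicate` and `loop` runs quoted in it. This Python sketch is only arithmetic on those two numbers; treating a concatenated snippet as the plain average of the two modes is the commit message's own simplification, not a measurement.

```python
# inverse_throughput values copied from the two runs quoted above.
duplicate = 0.819   # --repetition-mode=duplicate
loop = 0.6083       # --repetition-mode=loop


def averaged(d, l):
    """What a concatenated D+L snippet would report under the simplification."""
    return (d + l) / 2


def minimum(d, l):
    """What measuring each mode separately and taking the min reports."""
    return min(d, l)


avg_value = averaged(duplicate, loop)
min_value = minimum(duplicate, loop)
print(f"average: {avg_value:.5f}, min: {min_value:.4f}")
```

The average lands between the two modes while the minimum simply picks the `loop` figure, so the two aggregation strategies disagree whenever the modes disagree. The `min` run quoted above was a separate measurement, so its 0.6073 is close to, but not identical with, the `loop` value used here.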
In d1705c1 (D77238) we accidentally included subsequent changes and did not only move the code into a new file (which was the intention). We undo the changes now and re-introduce them with the appropriate test changes later.
This is a cleanup and normalization patch that also enables reuse with Flang later on. A follow up will clean up and move the directive -> clauses mapping. Differential Revision: https://reviews.llvm.org/D77112
…d/OpenMP`" This reverts commit c18d559. Bots have reported uses that need changing, e.g., clang-tools-extra/clang-tidy/openmp/UseDefaultNoneCheck.cp as reported by http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/46591
Summary: The assertion is almost correct, but it fails on refs from non-preamble Reviewers: sammccall Reviewed By: sammccall Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77222
Replace pattern getContents().size with universe function call
This reverts commit a157cde.
Summary: For more details about this instruction, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Reviewers: craig.topper, RKSimon, LuoYuanke Reviewed By: craig.topper Subscribers: mgorny, hiraditya, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77193
This work prepares us for the overall goal of clean shutdown on user keyboard interrupt [Ctrl+C].
Summary: Reason: the option has an effect on preprocessing. Also see thread: http://lists.llvm.org/pipermail/cfe-dev/2020-March/065014.html Reviewers: chill, efriedma Reviewed By: efriedma Subscribers: efriedma, danielkiss, cfe-commits, kristof.beyls Tags: #clang Differential Revision: https://reviews.llvm.org/D77131
…eg in PowerPC special fma compiler builtins" The new test case causes bot failures. This reverts commit ba87430.
It is an obvious part of D77326. It removes some needless deep indentation and some redundant statements. It prepares the code for a cleaner next patch: DWARF index callbacks in D77327.
… (LVI) Gadgets Adds a new data structure, ImmutableGraph, and uses RDF to find LVI gadgets and add them to a MachineGadgetGraph. More specifically, a new X86 machine pass finds Load Value Injection (LVI) gadgets consisting of a load from memory (i.e., SOURCE), and any operation that may transmit the value loaded from memory over a covert channel, or use the value loaded from memory to determine a branch/call target (i.e., SINK). Also adds a new target feature to X86: +lvi-load-hardening The feature can be added via the clang CLI using -mlvi-hardening. Differential Revision: https://reviews.llvm.org/D75936
Graceful lit shutdown on user keyboard interrupt [Ctrl+C] was a longstanding goal of mine. After a few refactorings this revision finally enables it. We use the following strategy to deal with KeyboardInterrupt: https://noswap.com/blog/python-multiprocessing-keyboardinterrupt Printing of a helpful summary for interrupted runs (just as the one for completed runs) will be tackled in future revisions. Reviewed By: serge-sans-paille, rnk Differential Revision: https://reviews.llvm.org/D77365
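For readers unfamiliar with the general pattern, here is a minimal Python sketch of one common way to handle `KeyboardInterrupt` with `multiprocessing`: worker processes ignore SIGINT and the parent terminates the pool when interrupted. It is an illustration under that assumption, not lit's actual code; `init_worker`, `run_one`, and `run_all` are hypothetical names.

```python
import multiprocessing
import signal


def init_worker():
    # Workers ignore SIGINT so that only the parent process sees Ctrl+C.
    signal.signal(signal.SIGINT, signal.SIG_IGN)


def run_one(test_id):
    # Hypothetical stand-in for executing a single test.
    return test_id * test_id


def run_all(test_ids, workers=2):
    pool = multiprocessing.Pool(workers, initializer=init_worker)
    try:
        results = pool.map_async(run_one, test_ids).get()
    except BaseException:
        # Ctrl+C arrives here as KeyboardInterrupt; any other failure
        # takes the same path. Stop the workers immediately and re-raise.
        pool.terminate()
        pool.join()
        raise
    # Normal completion: let the workers drain and exit.
    pool.close()
    pool.join()
    return results


if __name__ == "__main__":
    print(run_all([1, 2, 3]))
```

Keeping signal handling in the parent means there is a single place that decides how to shut down, which is what makes a clean exit possible.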
Summary: Linalg makes it possible to interface codegen with externally precompiled HPC libraries. The mechanism to allow such interop uses a normalized ABI and the emission of C interface wrappers. The mechanism controlling the emission of these C interfaces is too aggressive and makes it very easy to obtain undefined symbols for external functions (e.g. the ones coming from libm). This revision uses the newly introduced llvm.emit_c_interface function attribute which allows controlling this behavior at a function granularity. As a consequence LinalgToLLVM does not need to activate the C wrapper emission when adding the StdToLLVM patterns. Differential Revision: https://reviews.llvm.org/D77364
See rationale here: https://reviews.llvm.org/D76173#1922916 Time to compile Attr.h in isolation goes from 2.6s to 1.8s. Original patch by Johannes, plus some additions from Reid to fix some clang tooling targets. Effect on transitive includes is marginal, though: $ diff -u <(sort thedeps-before.txt) <(sort thedeps-after.txt) \ | grep '^[-+] ' | sort | uniq -c | sort -nr 104 - /usr/local/google/home/rnk/llvm-project/clang/include/clang/AST/OpenMPClause.h 87 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/Frontend/OpenMP/OMPContext.h 19 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/ADT/SmallSet.h 19 - /usr/local/google/home/rnk/llvm-project/llvm/include/llvm/ADT/SetVector.h 14 - /usr/include/c++/9/set ... Differential Revision: https://reviews.llvm.org/D76184
This replaces SPIR-V assembly tests for translation of FPFastMathMode decorations with transcoding tests, adding a single SPIR-V assembly test for the translation of the combination of the Fast and NotNaN flags.
Done automatically using: ```bash $ for file in $(grep -r "llvm-spirv" --exclude-dir=transcoding --exclude-dir=DebugInfo --exclude="*.sp*" -B 10 -A 10 | grep "llvm-spirv -r" | grep -Po "^[\w\d/.]+(?=:.*$)" | sort | uniq); do git mv $file transcoding/$file; done ``` Signed-off-by: Alexey Sachkov <alexey.sachkov@intel.com>
This patch features two things: - fixed translation of OpControlBarrier into OpenCL 2.0 built-ins: Scope of the operation is now respected and we can either get work_group_barrier or sub_group_barrier depending on it - improved the translation so it handles non-constant 'Memory Semantic' operand of the instruction Signed-off-by: Alexey Sachkov <alexey.sachkov@intel.com>
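The scope-dependent choice described in this commit message can be sketched as a small lookup. The numeric values for the Workgroup and Subgroup scopes are taken from the SPIR-V `Scope` enumeration; the function `barrier_builtin` is a hypothetical illustration, and raising an error for other scopes is an assumption of this sketch rather than the translator's documented behaviour.

```python
# Values from the SPIR-V Scope enumeration.
SCOPE_WORKGROUP = 2
SCOPE_SUBGROUP = 3


def barrier_builtin(execution_scope):
    """Pick the OpenCL 2.0 barrier built-in name for an OpControlBarrier scope."""
    if execution_scope == SCOPE_WORKGROUP:
        return "work_group_barrier"
    if execution_scope == SCOPE_SUBGROUP:
        return "sub_group_barrier"
    # Assumption of this sketch: other scopes are not handled here.
    raise ValueError(f"unsupported execution scope: {execution_scope}")


if __name__ == "__main__":
    print(barrier_builtin(SCOPE_WORKGROUP))
    print(barrier_builtin(SCOPE_SUBGROUP))
```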
This extension relaxes the restriction that OpTypeInt must have a width of 32 bits. Ints of arbitrary bit widths can be beneficial on targets that can exploit narrower widths such as FPGAs. Signed-off-by: Viktoria Maksimova <viktoria.maksimova@intel.com>
The CreateShuffleVector API only accepts `int` or `uint32_t` ArrayRef mask arguments.
- Apply renaming OPT_cuda_gpu_arch_EQ to OPT_offload_arch_EQ in clang driver. - Apply space changes in AST dump to LIT test - Add file-table-tform to check-llvm dependencies Signed-off-by: Vladimir Lazarev <vladimir.lazarev@intel.com>
@vladimirlaz please use the following notation to refer to a SPIRV-LLVM-Translator commit: |
jsji pushed a commit that referenced this pull request Jan 11, 2024
Use target triple's subarch component, if present, for setting exact SPIR-V version for the SPIR-V emission. Resolves #1509. Original commit: KhronosGroup/SPIRV-LLVM-Translator@9f29f97
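To make the idea concrete, here is a hedged Python sketch that extracts a version from the architecture component of a triple. The triple spelling used here (for example `spirv64v1.2-unknown-unknown`) is an assumption for illustration, and `spirv_version_from_triple` is a hypothetical helper, not the translator's API.

```python
import re


def spirv_version_from_triple(triple):
    """Return (major, minor) if the arch component carries a version, else None."""
    arch = triple.split("-", 1)[0]
    # Assumed spelling: "spirv", optional 32/64, then "v<major>.<minor>".
    match = re.fullmatch(r"spirv(?:32|64)?v(\d+)\.(\d+)", arch)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    print(spirv_version_from_triple("spirv64v1.2-unknown-unknown"))
    print(spirv_version_from_triple("spirv64-unknown-unknown"))
```

When no subarch is present the sketch returns `None`, leaving the caller free to fall back to whatever default version it would otherwise use.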
LLVM: ba1ffd2
LLVM-SPIRV-Translator: KhronosGroup/SPIRV-LLVM-Translator@207924d