forked from llvm-mirror/llvm
-
Notifications
You must be signed in to change notification settings - Fork 16
AMD OpenCL 1.2 adapter pass #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Handles 1.2 builtin calls and create wrapper functions that call 2.0 counterpart inside.
tstellarAMD
pushed a commit
that referenced
this pull request
Oct 19, 2016
…ump tables The TBB and TBH instructions in Thumb-2 allow jump tables to be compressed into sequences of bytes or shorts respectively. These instructions do not exist in Thumb-1, however it is possible to synthesize them out of a sequence of other instructions. It turns out this sequence is so short that it's almost never a lose for performance and is ALWAYS a significant win for code size. TBB example: Before: lsls r0, r0, #2 After: add r0, pc adr r1, .LJTI0_0 ldrb r0, [r0, #6] ldr r0, [r0, r1] lsls r0, r0, #1 mov pc, r0 add pc, r0 => No change in prologue code size or dynamic instruction count. Jump table shrunk by a factor of 4. The only case that can increase dynamic instruction count is the TBH case: Before: lsls r0, r4, #2 After: lsls r4, r4, #1 adr r1, .LJTI0_0 add r4, pc ldr r0, [r0, r1] ldrh r4, [r4, #6] mov pc, r0 lsls r4, r4, #1 add pc, r4 => 1 more instruction in prologue. Jump table shrunk by a factor of 2. So there is an argument that this should be disabled when optimizing for performance (and a TBH needs to be generated). I'm not so sure about that in practice, because on small cores with Thumb-1 performance is often tied to code size. But I'm willing to turn it off when optimizing for performance if people want (also note that TBHs are fairly rare in practice!) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284580 91177308-0d34-0410-b5e6-96231b3b80d8
tstellarAMD
pushed a commit
that referenced
this pull request
Oct 27, 2016
…tions. Instead of cmp w0, #1 orr w8, wzr, #0x1 cneg w0, w8, ne we now generate cmp w0, #1 csinv w0, w0, wzr, eq PR28965 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285217 91177308-0d34-0410-b5e6-96231b3b80d8
tstellarAMD
pushed a commit
that referenced
this pull request
Nov 2, 2016
…ump tables [Reapplying r284580 and r285917 with fix and testing to ensure emitted jump tables for Thumb-1 have 4-byte alignment] The TBB and TBH instructions in Thumb-2 allow jump tables to be compressed into sequences of bytes or shorts respectively. These instructions do not exist in Thumb-1, however it is possible to synthesize them out of a sequence of other instructions. It turns out this sequence is so short that it's almost never a lose for performance and is ALWAYS a significant win for code size. TBB example: Before: lsls r0, r0, #2 After: add r0, pc adr r1, .LJTI0_0 ldrb r0, [r0, #6] ldr r0, [r0, r1] lsls r0, r0, #1 mov pc, r0 add pc, r0 => No change in prologue code size or dynamic instruction count. Jump table shrunk by a factor of 4. The only case that can increase dynamic instruction count is the TBH case: Before: lsls r0, r4, #2 After: lsls r4, r4, #1 adr r1, .LJTI0_0 add r4, pc ldr r0, [r0, r1] ldrh r4, [r4, #6] mov pc, r0 lsls r4, r4, #1 add pc, r4 => 1 more instruction in prologue. Jump table shrunk by a factor of 2. So there is an argument that this should be disabled when optimizing for performance (and a TBH needs to be generated). I'm not so sure about that in practice, because on small cores with Thumb-1 performance is often tied to code size. But I'm willing to turn it off when optimizing for performance if people want (also note that TBHs are fairly rare in practice!) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285690 91177308-0d34-0410-b5e6-96231b3b80d8
tstellarAMD
pushed a commit
that referenced
this pull request
Nov 8, 2016
…line asm register selection on AArch64. Without this patch, register allocation for the example below fails. define half @test(half %a1, half %a2) #0 { entry: %0 = tail call half asm "sqrshl ${0:h}, ${1:h}, ${2:h}", "=w,w,w" (half %a1, half %a2) #1 ret half %0 } Patch by Florian Hahn. Differential Revision: https://reviews.llvm.org/D25080 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286111 91177308-0d34-0410-b5e6-96231b3b80d8
tstellarAMD
pushed a commit
that referenced
this pull request
Dec 7, 2016
…Idx. NFC. Summary: This is NFC but prevents assertions when PartialMappingIdx is tablegen-erated. The assumptions were: 1) FirstGPR is 0 2) FirstGPR is the first of the First* enumerators. GPR32 is changed to 1 to demonstrate that assumption #1 is fixed. #2 will be covered by a subsequent patch that tablegen-erates information and swaps the order of GPR and FPR as a side effect. Depends on D27336 Reviewers: ab, t.p.northover, qcolombet Subscribers: aemerson, rengolin, vkalintiris, dberris, rovka, llvm-commits Differential Revision: https://reviews.llvm.org/D27337 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288812 91177308-0d34-0410-b5e6-96231b3b80d8
tstellarAMD
pushed a commit
that referenced
this pull request
Dec 14, 2016
This reverts r289215 (git SHA1 cb7b86a). It breaks the ubsan build because a DenseMap that keys off of `AssertingVH<T>` will hit UB when it tries to cast the empty and tombstone keys to `T *` (due to insufficient alignment). This is the relevant stack trace (thanks to Mike Aizatsky): #0 0x25cf100 in llvm::AssertingVH<llvm::PHINode>::getValPtr() const llvm/include/llvm/IR/ValueHandle.h:212:39 #1 0x25cea20 in llvm::AssertingVH<llvm::PHINode>::operator=(llvm::AssertingVH<llvm::PHINode> const&) llvm/include/llvm/IR/ValueHandle.h:234:19 #2 0x25d0092 in llvm::DenseMapBase<llvm::DenseMap<llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >, llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >::clear() llvm/include/llvm/ADT/DenseMap.h:113:23 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289482 91177308-0d34-0410-b5e6-96231b3b80d8
tstellarAMD
pushed a commit
that referenced
this pull request
Dec 14, 2016
This change re-lands r289215, by reverting r289482. The underlying issue that caused it to be reverted has been fixed by Tim Northover in r289496. Original commit message for r289215: [SCEVExpander] Use llvm data structures; NFC Original commit message for r289482: Revert "[SCEVExpander] Use llvm data structures; NFC" This reverts r289215 (git SHA1 cb7b86a). It breaks the ubsan build because a DenseMap that keys off of `AssertingVH<T>` will hit UB when it tries to cast the empty and tombstone keys to `T *` (due to insufficient alignment). This is the relevant stack trace (thanks to Mike Aizatsky): #0 0x25cf100 in llvm::AssertingVH<llvm::PHINode>::getValPtr() const llvm/include/llvm/IR/ValueHandle.h:212:39 #1 0x25cea20 in llvm::AssertingVH<llvm::PHINode>::operator=(llvm::AssertingVH<llvm::PHINode> const&) llvm/include/llvm/IR/ValueHandle.h:234:19 #2 0x25d0092 in llvm::DenseMapBase<llvm::DenseMap<llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >, llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >::clear() llvm/include/llvm/ADT/DenseMap.h:113:23 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289602 91177308-0d34-0410-b5e6-96231b3b80d8
tstellarAMD
pushed a commit
that referenced
this pull request
Feb 1, 2017
#0 0x89cdeb in operator new[](unsigned long) /code/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:84:37 #1 0x4ec87c4 in llvm::RegisterBankInfo::ValueMapping const* llvm::RegisterBankInfo::getOperandsMapping<llvm::RegisterBankInfo::ValueMapping const* const*>(llvm::RegisterBankInfo::ValueMapping const* const*, llvm::RegisterBankInfo::ValueMapping const* const*) const /code/llvm/lib/CodeGen/GlobalISel/RegisterBankInfo.cpp:297:9 #2 0x9327ee in llvm::AArch64RegisterBankInfo::getInstrMapping(llvm::MachineInstr const&) const /code/llvm/lib/Target/AArch64/AArch64RegisterBankInfo.cpp:540:30 #3 0x4eb8d07 in llvm::RegBankSelect::assignInstr(llvm::MachineInstr&) /code/llvm/lib/CodeGen/GlobalISel/RegBankSelect.cpp:546:24 #4 0x4eb9dd2 in llvm::RegBankSelect::runOnMachineFunction(llvm::MachineFunction&) /code/llvm/lib/CodeGen/GlobalISel/RegBankSelect.cpp:624:12 #5 0x3141875 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /code/llvm/lib/CodeGen/MachineFunctionPass.cpp:62:13 #6 0x396128d in llvm::FPPassManager::runOnFunction(llvm::Function&) /code/llvm/lib/IR/LegacyPassManager.cpp:1513:27 #7 0x3961832 in llvm::FPPassManager::runOnModule(llvm::Module&) /code/llvm/lib/IR/LegacyPassManager.cpp:1534:16 #8 0x3962540 in runOnModule /code/llvm/lib/IR/LegacyPassManager.cpp:1590:27 #9 0x3962540 in llvm::legacy::PassManagerImpl::run(llvm::Module&) /code/llvm/lib/IR/LegacyPassManager.cpp:1693 #10 0x8ae368 in compileModule(char**, llvm::LLVMContext&) /code/llvm/tools/llc/llc.cpp:562:8 #11 0x8a7a1b in main /code/llvm/tools/llc/llc.cpp:316:22 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293351 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Mar 3, 2017
…tores. This patch extends the current functionality of the AArch64 redundant copy elimination pass to handle non-zero cases such as: BB#0: cmp x0, #1 b.eq .LBB0_1 .LBB0_1: orr x0, xzr, #0x1 ; <-- redundant copy; x0 known to hold #1. Differential Revision: https://reviews.llvm.org/D29344 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296809 91177308-0d34-0410-b5e6-96231b3b80d8
whchung
added a commit
that referenced
this pull request
Mar 16, 2017
…ge (#1) * Additional patches for AMDGPU backend to cope with new address space mapping Contributed by Gregory Rodgers * Fix AMDGPU address space mapping Conflicts: lib/Target/AMDGPU/AMDGPU.h
kzhuravl
pushed a commit
that referenced
this pull request
Mar 24, 2017
Summary: To support negative immediates for certain arithmetic instructions, the instruction is converted to the inverse instruction with a negated (or inverted) immediate. For example, "ADD r0, r1, #FFFFFFFF" cannot be encoded as an ADD instruction. However, "SUB r0, r1, #1" is equivalent. These conversions are different from instruction aliases. An alias maps several assembler instructions onto one encoding. A conversion, however, maps an *invalid* instruction--e.g. with an immediate that cannot be represented in the encoding--to a different (but equivalent) instruction. Several instructions with negative immediates were being converted already, but this was not systematically tested, nor did it cover all instructions. This patch implements all possible substitutions for ARM, Thumb1 and Thumb2 assembler and adds tests. It also adds a feature flag (-mattr=+no-neg-immediates) to turn these substitutions off. This is helpful for users who want their code to assemble to exactly what they wrote. Reviewers: t.p.northover, rovka, samparker, javed.absar, peter.smith, rengolin Reviewed By: javed.absar Subscribers: aadg, aemerson, llvm-commits Differential Revision: https://reviews.llvm.org/D30571 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298380 91177308-0d34-0410-b5e6-96231b3b80d8
whchung
added a commit
that referenced
this pull request
Mar 31, 2017
…ge (#1) * Additional patches for AMDGPU backend to cope with new address space mapping Contributed by Gregory Rodgers * Fix AMDGPU address space mapping Conflicts: lib/Target/AMDGPU/AMDGPU.h Conflicts: lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
kzhuravl
pushed a commit
that referenced
this pull request
May 11, 2017
…merged" This reverts r302712. The change fails with ASAN enabled: ERROR: AddressSanitizer: use-after-poison on address ... at ... READ of size 2 at ... thread T0 #0 ... in llvm::SDNode::getNumValues() const <snip>/include/llvm/CodeGen/SelectionDAGNodes.h:855:42 #1 ... in llvm::SDNode::hasAnyUseOfValue(unsigned int) const <snip>/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7270:3 #2 ... in llvm::SDValue::use_empty() const <snip> include/llvm/CodeGen/SelectionDAGNodes.h:1042:17 #3 ... in (anonymous namespace)::DAGCombiner::MergeConsecutiveStores(llvm::StoreSDNode*) <snip>/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12944:7 Reviewers: niravd Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33081 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302746 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
May 18, 2017
There is often a lot of boilerplate code required to visit a type record or type stream. The #1 use case is that you have a sequence of bytes that represent one or more records, and you want to deserialize each one, switch on it, and call a callback with the deserialized record that the user can examine. Currently this requires at least 6 lines of code: codeview::TypeVisitorCallbackPipeline Pipeline; Pipeline.addCallbackToPipeline(Deserializer); Pipeline.addCallbackToPipeline(MyCallbacks); codeview::CVTypeVisitor Visitor(Pipeline); consumeError(Visitor.visitTypeRecord(Record)); With this patch, it becomes one line of code: consumeError(codeview::visitTypeRecord(Record, MyCallbacks)); This is done by having the deserialization happen internally inside of the visitTypeRecord function. Since this is occasionally not desirable, the function provides a 3rd parameter that can be used to change this behavior. Hopefully this can significantly reduce the barrier to entry to using the visitation infrastructure. Differential Revision: https://reviews.llvm.org/D33245 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303271 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
May 29, 2017
error C2971: 'llvm::ManagedStatic': template parameter 'Creator': 'CreateDefaultTimerGroup': a variable with non-static storage duration cannot be used as a non-type argument git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304157 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Jun 3, 2017
…ruction Op1 (RHS) is a constant, so putting it on the LHS makes us churn through visitICmp an extra time to canonicalize it: INSTCOMBINE ITERATION #1 on cmpnot IC: ADDING: 3 instrs to worklist IC: Visiting: %notx = xor i8 %x, -1 IC: Visiting: %cmp = icmp sgt i8 %notx, 42 IC: Old = %cmp = icmp sgt i8 %notx, 42 New = <badref> = icmp sgt i8 -43, %x IC: ADD: %cmp = icmp sgt i8 -43, %x IC: ERASE %1 = icmp sgt i8 %notx, 42 IC: ADD: %notx = xor i8 %x, -1 IC: DCE: %notx = xor i8 %x, -1 IC: ERASE %notx = xor i8 %x, -1 IC: Visiting: %cmp = icmp sgt i8 -43, %x IC: Mod = %cmp = icmp sgt i8 -43, %x New = %cmp = icmp slt i8 %x, -43 IC: ADD: %cmp = icmp slt i8 %x, -43 IC: Visiting: %cmp = icmp slt i8 %x, -43 IC: Visiting: ret i1 %cmp If we create the swapped ICmp directly, we go faster: INSTCOMBINE ITERATION #1 on cmpnot IC: ADDING: 3 instrs to worklist IC: Visiting: %notx = xor i8 %x, -1 IC: Visiting: %cmp = icmp sgt i8 %notx, 42 IC: Old = %cmp = icmp sgt i8 %notx, 42 New = <badref> = icmp slt i8 %x, -43 IC: ADD: %cmp = icmp slt i8 %x, -43 IC: ERASE %1 = icmp sgt i8 %notx, 42 IC: ADD: %notx = xor i8 %x, -1 IC: DCE: %notx = xor i8 %x, -1 IC: ERASE %notx = xor i8 %x, -1 IC: Visiting: %cmp = icmp slt i8 %x, -43 IC: Visiting: ret i1 %cmp git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304558 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Oct 10, 2017
Summary: This updates the SCCP solver to use of the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to `ret i32 2` with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315288 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Oct 13, 2017
…SCCP." This is r315288 & r315294, which were reverted due to stage2 bot failures. Summary: This updates the SCCP solver to use of the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to `ret i32 2` with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315593 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Oct 22, 2017
This fixes bugzilla 26810 https://bugs.llvm.org/show_bug.cgi?id=26810 This is intended to prevent sequences like: movl %ebp, 8(%esp) # 4-byte Spill movl %ecx, %ebp movl %ebx, %ecx movl %edi, %ebx movl %edx, %edi cltd idivl %esi movl %edi, %edx movl %ebx, %edi movl %ecx, %ebx movl %ebp, %ecx movl 16(%esp), %ebp # 4 - byte Reload Such sequences are created in 2 scenarios: Scenario #1: vreg0 is evicted from physreg0 by vreg1 Evictee vreg0 is intended for region splitting with split candidate physreg0 (the reg vreg0 was evicted from) Region splitting creates a local interval because of interference with the evictor vreg1 (normally region spliiting creates 2 interval, the "by reg" and "by stack" intervals. Local interval created when interference occurs.) one of the split intervals ends up evicting vreg2 from physreg1 Evictee vreg2 is intended for region splitting with split candidate physreg1 one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills Scenario #2 vreg0 is evicted from physreg0 by vreg1 vreg2 is evicted from physreg2 by vreg3 etc Evictee vreg0 is intended for region splitting with split candidate physreg1 Region splitting creates a local interval because of interference with the evictor vreg1 one of the split intervals ends up evicting back original evictor vreg1 from physreg0 (the reg vreg0 was evicted from) Another evictee vreg2 is intended for region splitting with split candidate physreg1 one of the split intervals ends up evicting vreg3 from physreg2 etc.. until someone spills As compile time was a concern, I've added a flag to control weather we do cost calculations for local intervals we expect to be created (it's on by default for X86 target, off for the rest). Differential Revision: https://reviews.llvm.org/D35816 Change-Id: Id9411ff7bbb845463d289ba2ae97737a1ee7cc39 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316295 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Oct 27, 2017
Summary: The code comments indicate that no effort has been spent on handling load/stores when the size isn't a multiple of the byte size correctly. However, the code only avoided types smaller than 8 bits. So for example a load of an i28 could still be considered as a candidate for vectorization. This patch adjusts the code to behave according to the code comment. The test case used to hit the following assert when trying to use "cast" an i32 to i28 using CreateBitOrPointerCast: opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed. #0 PrintStackTraceSignalHandler(void*) #1 SignalHandler(int) #2 __restore_rt #3 __GI_raise #4 __GI_abort #5 __GI___assert_fail #6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value*, llvm::Type*, llvm::Twine const&, llvm::Instruction*) #7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value*, llvm::Type*, llvm::Twine const&) #8 (anonymous namespace)::Vectorizer::vectorizeLoadChain(llvm::ArrayRef<llvm::Instruction*>, llvm::SmallPtrSet<llvm::Instruction*, 16u>*) Reviewers: arsenm Reviewed By: arsenm Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D39295 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316663 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Oct 27, 2017
Summary: We no longer add vectors of pointers as candidates for load/store vectorization. It does not seem to work anyway, but without this patch we can end up in asserts when trying to create casts between an integer type and the pointer of vectors type. The test case I've added used to assert like this when trying to cast between i64 and <2 x i16*>: opt: ../lib/IR/Instructions.cpp:2565: Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed. #0 PrintStackTraceSignalHandler(void*) #1 SignalHandler(int) #2 __restore_rt #3 __GI_raise #4 __GI_abort #5 __GI___assert_fail #6 llvm::CastInst::Create(llvm::Instruction::CastOps, llvm::Value*, llvm::Type*, llvm::Twine const&, llvm::Instruction*) #7 llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>::CreateBitOrPointerCast(llvm::Value*, llvm::Type*, llvm::Twine const&) #8 Vectorizer::vectorizeStoreChain(llvm::ArrayRef<llvm::Instruction*>, llvm::SmallPtrSet<llvm::Instruction*, 16u>*) Reviewers: arsenm Reviewed By: arsenm Subscribers: nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D39296 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316665 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Oct 30, 2017
…in IPSCCP. This version of the patch includes a fix addressing a stage2 LTO buildbot failure and addressed some additional nits. Original commit message: This updates the SCCP solver to use of the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to ret i32 2 with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316887 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Oct 30, 2017
…in IPSCCP. This version of the patch includes a fix addressing a stage2 LTO buildbot failure and addressed some additional nits. Original commit message: This updates the SCCP solver to use of the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to ret i32 2 with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316891 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Dec 17, 2017
We want to do this for 2 reasons: 1. Value tracking does not recognize the ashr variant, so it would fail to match for cases like D39766. 2. DAGCombiner does better at producing optimal codegen when we have the cmp+sel pattern. More detail about what happens in the backend: 1. DAGCombiner has a generic transform for all targets to convert the scalar cmp+sel variant of abs into the shift variant. That is the opposite of this IR canonicalization. 2. DAGCombiner has a generic transform for all targets to convert the vector cmp+sel variant of abs into either an ABS node or the shift variant. That is again the opposite of this IR canonicalization. 3. DAGCombiner has a generic transform for all targets to convert the exact shift variants produced by #1 or #2 into an ISD::ABS node. Note: It would be an efficiency improvement if we had #1 go directly to an ABS node when that's legal/custom. 4. The pattern matching above is incomplete, so it is possible to escape the intended/optimal codegen in a variety of ways. a. For #2, the vector path is missing the case for setlt with a '1' constant. b. For #3, we are missing a match for commuted versions of the shift variants. 5. Therefore, this IR canonicalization can only help get us to the optimal codegen. The version of cmp+sel produced by this patch will be recognized in the DAG and converted to an ABS node when possible or the shift sequence when not. 6. In the following examples with this patch applied, we may get conditional moves rather than the shift produced by the generic DAGCombiner transforms. The conditional move is created using a target-specific decision for any given target. Whether it is optimal or not for a particular subtarget may be up for debate. define i32 @abs_shifty(i32 %x) { %signbit = ashr i32 %x, 31 %add = add i32 %signbit, %x %abs = xor i32 %signbit, %add ret i32 %abs } define i32 @abs_cmpsubsel(i32 %x) { %cmp = icmp slt i32 %x, zeroinitializer %sub = sub i32 zeroinitializer, %x %abs = select i1 %cmp, i32 %sub, i32 %x ret i32 %abs } define <4 x i32> @abs_shifty_vec(<4 x i32> %x) { %signbit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31> %add = add <4 x i32> %signbit, %x %abs = xor <4 x i32> %signbit, %add ret <4 x i32> %abs } define <4 x i32> @abs_cmpsubsel_vec(<4 x i32> %x) { %cmp = icmp slt <4 x i32> %x, zeroinitializer %sub = sub <4 x i32> zeroinitializer, %x %abs = select <4 x i1> %cmp, <4 x i32> %sub, <4 x i32> %x ret <4 x i32> %abs } > $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=x86_64 -mattr=avx > abs_shifty: > movl %edi, %eax > negl %eax > cmovll %edi, %eax > retq > > abs_cmpsubsel: > movl %edi, %eax > negl %eax > cmovll %edi, %eax > retq > > abs_shifty_vec: > vpabsd %xmm0, %xmm0 > retq > > abs_cmpsubsel_vec: > vpabsd %xmm0, %xmm0 > retq > > $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=aarch64 > abs_shifty: > cmp w0, #0 // =0 > cneg w0, w0, mi > ret > > abs_cmpsubsel: > cmp w0, #0 // =0 > cneg w0, w0, mi > ret > > abs_shifty_vec: > abs v0.4s, v0.4s > ret > > abs_cmpsubsel_vec: > abs v0.4s, v0.4s > ret > > $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=powerpc64le > abs_shifty: > srawi 4, 3, 31 > add 3, 3, 4 > xor 3, 3, 4 > blr > > abs_cmpsubsel: > srawi 4, 3, 31 > add 3, 3, 4 > xor 3, 3, 4 > blr > > abs_shifty_vec: > vspltisw 3, -16 > vspltisw 4, 15 > vsubuwm 3, 4, 3 > vsraw 3, 2, 3 > vadduwm 2, 2, 3 > xxlxor 34, 34, 35 > blr > > abs_cmpsubsel_vec: > vspltisw 3, -16 > vspltisw 4, 15 > vsubuwm 3, 4, 3 > vsraw 3, 2, 3 > vadduwm 2, 2, 3 > xxlxor 34, 34, 35 > blr > Differential Revision: https://reviews.llvm.org/D40984 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320921 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Dec 29, 2017
Summary: I have been getting rather difficult to reproduce SIGBUS crashes when compiling certain FreeBSD sources, and their stack traces pointed squarely at `SelectionDAG::salvageDebugInfo()`: ``` Core was generated by `/usr/obj/share/dim/src/freebsd/clang600-import/amd64.amd64/tmp/usr/bin/cc -cc1 -'. Program terminated with signal SIGBUS, Bus error. #0 isInvalidated () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SDNodeDbgValue.h:115 115 bool isInvalidated() const { return Invalid; } (gdb) bt #0 isInvalidated () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SDNodeDbgValue.h:115 #1 salvageDebugInfo () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7116 #2 0x00000000033b2516 in operator() () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3595 #3 __invoke<(lambda at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3593:59) &, llvm::SDNode *, llvm::SDNode *> () at /usr/include/c++/v1/type_traits:4323 #4 __call<(lambda at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3593:59) &, llvm::SDNode *, llvm::SDNode *> () at /usr/include/c++/v1/__functional_base:349 #5 operator() () at /usr/include/c++/v1/functional:1562 #6 0x00000000033b0817 in operator() () at /usr/include/c++/v1/functional:1916 #7 NodeDeleted () at /share/dim/src/freebsd/clang600-import/contrib/llvm/include/llvm/CodeGen/SelectionDAG.h:293 #8 0x0000000003529dde in RemoveDeadNodes () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:610 #9 0x00000000035556df in MorphNodeTo () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:6794 #10 0x00000000033a9acc in MorphNode () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:2594 #11 0x00000000033ac80b in SelectCodeCommon () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:3601 #12 0x00000000023d464b in SelectCode () at /usr/obj/share/dim/src/freebsd/clang600-import/amd64.amd64/tmp/obj-tools/lib/clang/libllvm/X86GenDAGISel.inc:282902 #13 Select () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:3072 #14 0x00000000033a5afa in DoInstructionSelection () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:988 #15 0x00000000033a4e1a in CodeGenAndEmitDAG () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:868 #16 0x00000000033a2643 in SelectAllBasicBlocks () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1624 #17 0x000000000339f158 in runOnMachineFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:466 #18 0x00000000023d03c4 in runOnMachineFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp:175 #19 0x00000000035cc8c2 in runOnFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/MachineFunctionPass.cpp:62 #20 0x00000000030dca9a in runOnFunction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1520 #21 0x00000000030dccf3 in runOnModule () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1541 #22 0x00000000030dd228 in runOnModule () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1597 #23 run () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/IR/LegacyPassManager.cpp:1700 #24 0x00000000014db578 in EmitAssembly () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/CodeGen/BackendUtil.cpp:815 #25 EmitBackendOutput () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/CodeGen/BackendUtil.cpp:1181 #26 0x00000000014d5b26 in HandleTranslationUnit () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/CodeGen/CodeGenAction.cpp:292 #27 0x0000000001c4c332 in ParseAST () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/Parse/ParseAST.cpp:159 #28 0x00000000015d546c in Execute () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/Frontend/FrontendAction.cpp:897 #29 0x0000000001cec311 in ExecuteAction () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/Frontend/CompilerInstance.cpp:991 #30 0x00000000014b4f81 in ExecuteCompilerInvocation () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:252 #31 0x00000000014aa73f in cc1_main () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/tools/driver/cc1_main.cpp:221 #32 0x00000000014b2928 in ExecuteCC1Tool () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/tools/driver/driver.cpp:309 #33 main () at /share/dim/src/freebsd/clang600-import/contrib/llvm/tools/clang/tools/driver/driver.cpp:388 (gdb) frame 1 #1 salvageDebugInfo () at /share/dim/src/freebsd/clang600-import/contrib/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7116 7116 if (DV->isInvalidated()) (gdb) disassemble Dump of assembler code for function salvageDebugInfo(): [...] 0x0000000003557348 <+744>: nopl 0x0(%rax,%rax,1) 0x0000000003557350 <+752>: mov (%r12),%r13 => 0x0000000003557354 <+756>: cmpb $0x0,0x31(%r13) 0x0000000003557359 <+761>: jne 0x35573b0 <salvageDebugInfo()+848> (gdb) info registers [...] r13 0x5a5a5a5a5a5a5a5a 6510615555426900570 ``` The `0x5a5a5a5a5a5a5a5a` value in `r13` indicates the memory was either uninitialized, or already freed. Unfortunately I do not have a simple self-contained test case for this. However, it seems pretty clear that the call to `AddDbgValue()` in `salvageDebugInfo()` causes the problems, since it modifies `SelectionDag::DbgInfo` while looping through one of its DenseMaps: ``` void SelectionDAG::salvageDebugInfo(SDNode &N) { [...] for (auto DV : GetDbgValues(&N)) { if (DV->isInvalidated()) continue; [...] AddDbgValue(Clone, N0.getNode(), false); [...] } } ``` At least, if I comment out the `AddDbgValue()` call, the crashes go away. I propose to change this function slightly, similar to the `SelectionDAG::transferDbgValues()` function just above it, to save the cloned SDDbgValues in a separate SmallVector, and only call AddDbgValue() on them after the for loop is done. Reviewers: aprantl, bogner, bkramer, davide Reviewed By: davide Subscribers: davide, krytarowski, JDevlieghere, emaste, llvm-commits Differential Revision: https://reviews.llvm.org/D41589 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321545 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Feb 9, 2018
…168) This patch is the LLVM part of fixing the issues, described in https://bugs.llvm.org/show_bug.cgi?id=36168 * The representation of enumerator values in the debug info metadata now contains a boolean flag isUnsigned, which determines how the bits of the value are interpreted. * The DW_TAG_enumeration type DIE now always (for DWARF version >= 3) includes a DW_AT_type attribute, which refers to the underlying integer type, as suggested in DWARFv4 (5.7 Enumeration Type Entries). * The debug info metadata for enumeration type contains (in flags) indication whether this is a C++11 "fixed enum". * For C++11 enumeration with a fixed underlying type, the DIE also includes the DW_AT_enum_class attribute (for DWARF version >= 4). * Encoding of enumerator constants uses DW_FORM_sdata for signed values and DW_FORM_udata for unsigned values, as suggested by DWARFv4 (7.5.4 Attribute Encodings). The changes should be backwards compatible: * the isUnsigned attribute is optional and defaults to false. * if the underlying type for the enumeration is not available, the enumerator values are considered signed. * the FixedEnum flag defaults to clear. * the bitcode format for DIEnumerator stores the unsigned flag bit #1 of the first record element, so the format does not change and the zero previously stored there is consistent with the false default for IsUnsigned. Differential Revision: https://reviews.llvm.org/D42734 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324489 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Feb 16, 2018
The reference '&' is missing in the function parameter. If there are back-to-back optimizations in terms of dag node list like below: t29: i64,ch = load<LD4[bitcast (%struct.test_t* @test.t to i8*)+12](dereferenceable), zext from i32> t3, t43, undef:i64 t34: i64,ch = load<LD4[bitcast (%struct.test_t* @test.t to i8*)](dereferenceable), zext from i32> t3, t41, undef:i64 The bug will trigger a segfault for the added test case remove_truncate_5.ll: LLVMSymbolizer: error reading file: No such file or directory #0 0x000000000241c4d9 (llc+0x241c4d9) #1 0x000000000241c56a (llc+0x241c56a) #2 0x000000000241aa50 (llc+0x241aa50) ... #22 0x0000000000fd5edf (llc+0xfd5edf) #23 0x00007f0fe03bec05 __libc_start_main (/lib64/libc.so.6+0x21c05) #24 0x0000000000fd3e69 (llc+0xfd3e69) ... Segmentation fault Signed-off-by: Yonghong Song <yhs@fb.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325267 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Jul 13, 2018
Spectre variant #1 for x86. There is a lengthy, detailed RFC thread on llvm-dev which discusses the high level issues. High level discussion is probably best there. I've split the design document out of this patch and will land it separately once I update it to reflect the latest edits and updates to the Google doc used in the RFC thread. This patch is really just an initial step. It isn't quite ready for prime time and is only exposed via debugging flags. It has two major limitations currently: 1) It only supports x86-64, and only certain ABIs. Many assumptions are currently hard-coded and need to be factored out of the code here. 2) It doesn't include any options for more fine-grained control, either of which control flow edges are significant or which loads are important to be hardened. 3) The code is still quite rough and the testing lighter than I'd like. However, this is enough for people to begin using. I have had numerous requests from people to be able to experiment with this patch to understand the trade-offs it presents and how to use it. We would also like to encourage work to similar effect in other toolchains. The ARM folks are actively developing a system based on this for AArch64. We hope to merge this with their efforts when both are far enough along. But we also don't want to block making this available on that effort. Many thanks to the *numerous* people who helped along the way here. For this patch in particular, both Eric and Craig did a ton of review to even have confidence in it as an early, rough cut at this functionality. Differential Revision: https://reviews.llvm.org/D44824 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336990 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Jul 14, 2018
…ering" This reverts commit r337021. WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x1415cd65 in void write_signed<long>(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:95:7 #1 0x1415c900 in llvm::write_integer(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:121:3 #2 0x1472357f in llvm::raw_ostream::operator<<(long) /code/llvm-project/llvm/lib/Support/raw_ostream.cpp:117:3 #3 0x13bb9d4 in llvm::raw_ostream::operator<<(int) /code/llvm-project/llvm/include/llvm/Support/raw_ostream.h:210:18 #4 0x3c2bc18 in void printField<unsigned int, &(amd_kernel_code_s::amd_kernel_code_version_major)>(llvm::StringRef, amd_kernel_code_s const&, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:78:23 #5 0x3c250ba in llvm::printAmdKernelCodeField(amd_kernel_code_s const&, int, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:104:5 #6 0x3c27ca3 in llvm::dumpAmdKernelCode(amd_kernel_code_s const*, llvm::raw_ostream&, char const*) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:113:5 #7 0x3a46e6c in llvm::AMDGPUTargetAsmStreamer::EmitAMDKernelCodeT(amd_kernel_code_s const&) /code/llvm-project/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp:161:3 #8 0xd371e4 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:204:26 [...] Uninitialized value was created by an allocation of 'KernelCode' in the stack frame of function '_ZN4llvm16AMDGPUAsmPrinter21EmitFunctionBodyStartEv' #0 0xd36650 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:192 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337079 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Jul 17, 2018
This patch completes support for the following floating point instructions that take FP immediates: FADD* (addition) FSUB (subtract) FSUBR (subtract reverse form) FMUL* (multiplication) FMAX* (maximum) FMAXNM (maximum number) FMIN (maximum) FMINNM (maximum number) All operations are predicated and take a FP immediate operand, e.g. fadd z0.h, p0/m, z0.h, #0.5 fmin z0.s, p0/m, z0.s, #1.0 ^___________^ (tied) * Instructions added in a previous patch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337272 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Jul 19, 2018
changes that are intertwined here: 1) Extracting the tracing of predicate state through the CFG to its own function. 2) Creating a struct to manage the predicate state used throughout the pass. Doing #1 necessitates and motivates the particular approach for #2 as now the predicate management is spread across different functions focused on different aspects of it. A number of simplifications then fell out as a direct consequence. I went with an Optional to make it more natural to construct the MachineSSAUpdater object. This is probably the single largest outstanding refactoring step I have. Things get a bit more surgical from here. My current goal, beyond generally making this maintainable long-term, is to implement several improvements to how we do interprocedural tracking of predicate state. But I don't want to do that until the predicate state management and tracing is in reasonably clear state. Differential Revision: https://reviews.llvm.org/D49427 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337446 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Jul 23, 2018
A DAG-NOT-DAG is a CHECK-DAG group, X, followed by a CHECK-NOT group, N, followed by a CHECK-DAG group, Y. Let y be the initial directive of Y. This patch makes the following changes to the behavior: 1. Directives in N can no longer match within part of Y's match range just because y happens not to be the earliest match from Y. Specifically, this patch withdraws N's search range end from y's match range start to Y's match range start. 2. y can no longer match within X's match range, where a y match produced a reordering complaint, which is thus no longer possible. Specifically, this patch withdraws y's search range start from X's permitted range start to X's match range end, which was already the search range start for other members of Y. Both of these changes can only increase the number of test passes: #1 constrains the ability of CHECK-NOTs to match, and #2 expands the ability of CHECK-DAGs to match without complaints. These changes are based on discussions at: <http://lists.llvm.org/pipermail/llvm-dev/2018-May/123550.html> <https://reviews.llvm.org/D47106> which conclude that: 1. These changes simplify the FileCheck conceptual model. First, it makes search ranges for DAG-NOT-DAG more consistent with other cases. Second, it was confusing that y was treated differently from the rest of Y. 2. These changes add theoretical use cases for DAG-NOT-DAG that had no obvious means to be expressed otherwise. We can justify the first half of this assertion with the observation that these changes can only increase the number of test passes. 3. Reordering detection for DAG-NOT-DAG had no obvious real benefit. We don't have evidence from real uses cases to help us debate conclusions #2 and #3, but #1 at least seems intuitive. Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D48986 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337605 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Aug 9, 2018
Summary: Currently, in line with GCC, when specifying reserved registers like sp or pc on an inline asm() clobber list, we don't always preserve the original value across the statement. And in general, overwriting reserved registers can have surprising results. For example: ``` extern int bar(int[]); int foo(int i) { int a[i]; // VLA asm volatile( "mov r7, #1" : : : "r7" ); return 1 + bar(a); } ``` Compiled for thumb, this gives: ``` $ clang --target=arm-arm-none-eabi -march=armv7a -c test.c -o - -S -O1 -mthumb ... foo: .fnstart @ %bb.0: @ %entry .save {r4, r5, r6, r7, lr} push {r4, r5, r6, r7, lr} .setfp r7, sp, #12 add r7, sp, #12 .pad #4 sub sp, #4 movs r1, #7 add.w r0, r1, r0, lsl #2 bic r0, r0, #7 sub.w r0, sp, r0 mov sp, r0 @app mov.w r7, #1 @NO_APP bl bar adds r0, #1 sub.w r4, r7, #12 mov sp, r4 pop {r4, r5, r6, r7, pc} ... ``` r7 is used as the frame pointer for thumb targets, and this function needs to restore the SP from the FP because of the variable-length stack allocation a. r7 is clobbered by the inline assembly (and r7 is included in the clobber list), but LLVM does not preserve the value of the frame pointer across the assembly block. This type of behavior is similar to GCC's and has been discussed on the bugtracker: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11807 . No consensus seemed to have been reached on the way forward. Clang behavior has briefly been discussed on the CFE mailing (starting here: http://lists.llvm.org/pipermail/cfe-dev/2018-July/058392.html). I've opted for following Eli Friedman's advice to print warnings when there are reserved registers on the clobber list so as not to diverge from GCC behavior for now. The patch uses MachineRegisterInfo's target-specific knowledge of reserved registers, just before we convert the inline asm string in the AsmPrinter. If we find a reserved register, we print a warning: ``` repro.c:6:7: warning: inline asm clobber list contains reserved registers: R7 [-Winline-asm] "mov r7, #1" ^ ``` Reviewers: eli.friedman, olista01, javed.absar, efriedma Reviewed By: efriedma Subscribers: efriedma, eraman, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D49727 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339257 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Aug 30, 2018
Summary: This is a continuation of https://reviews.llvm.org/D49727 Below the original text, current changes in the comments: Currently, in line with GCC, when specifying reserved registers like sp or pc on an inline asm() clobber list, we don't always preserve the original value across the statement. And in general, overwriting reserved registers can have surprising results. For example: extern int bar(int[]); int foo(int i) { int a[i]; // VLA asm volatile( "mov r7, #1" : : : "r7" ); return 1 + bar(a); } Compiled for thumb, this gives: $ clang --target=arm-arm-none-eabi -march=armv7a -c test.c -o - -S -O1 -mthumb ... foo: .fnstart @ %bb.0: @ %entry .save {r4, r5, r6, r7, lr} push {r4, r5, r6, r7, lr} .setfp r7, sp, #12 add r7, sp, #12 .pad #4 sub sp, #4 movs r1, #7 add.w r0, r1, r0, lsl #2 bic r0, r0, #7 sub.w r0, sp, r0 mov sp, r0 @app mov.w r7, #1 @NO_APP bl bar adds r0, #1 sub.w r4, r7, #12 mov sp, r4 pop {r4, r5, r6, r7, pc} ... r7 is used as the frame pointer for thumb targets, and this function needs to restore the SP from the FP because of the variable-length stack allocation a. r7 is clobbered by the inline assembly (and r7 is included in the clobber list), but LLVM does not preserve the value of the frame pointer across the assembly block. This type of behavior is similar to GCC's and has been discussed on the bugtracker: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11807 . No consensus seemed to have been reached on the way forward. Clang behavior has briefly been discussed on the CFE mailing (starting here: http://lists.llvm.org/pipermail/cfe-dev/2018-July/058392.html). I've opted for following Eli Friedman's advice to print warnings when there are reserved registers on the clobber list so as not to diverge from GCC behavior for now. The patch uses MachineRegisterInfo's target-specific knowledge of reserved registers, just before we convert the inline asm string in the AsmPrinter. If we find a reserved register, we print a warning: repro.c:6:7: warning: inline asm clobber list contains reserved registers: R7 [-Winline-asm] "mov r7, #1" ^ Reviewers: efriedma, olista01, javed.absar Reviewed By: efriedma Subscribers: eraman, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D51165 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341062 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Sep 16, 2018
…uter loops. Summary: [VPlan] Implement vector code generation support for simple outer loops. Context: Patch Series #1 for outer loop vectorization support in LV using VPlan. (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html). This patch introduces vector code generation support for simple outer loops that are currently supported in the VPlanNativePath. Changes here essentially do the following: - force vector code generation using explicit vectorize_width - add conservative early returns in cost model and other places for VPlanNativePath - add code for setting up outer loop inductions - support for widening non-induction PHIs that can result from inner loops and uniform conditional branches - support for generating uniform inner branches We plan to add a handful C outer loop executable tests once the initial code generation support is committed. This patch is expected to be NFC for the inner loop vectorizer path. Since we are moving in the direction of supporting outer loop vectorization in LV, it may also be time to rename classes such as InnerLoopVectorizer. Reviewers: fhahn, rengolin, hsaito, dcaballe, mkuper, hfinkel, Ayal Reviewed By: fhahn, hsaito Subscribers: dmgreen, bollu, tschuett, rkruppe, rogfer01, llvm-commits Differential Revision: https://reviews.llvm.org/D50820 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342197 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Oct 8, 2018
…ression The following instruction: > str q28, [x0, #1*6*4*@] contains a @ which is parsed as an empty symbol. The parser returns true but has no error, so the assembler continues by ignoring the instruction. Differential Revision: https://reviews.llvm.org/D52645 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343961 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Oct 10, 2018
icmp ne (and X, 1), 0 --> trunc X to N x i1 Ideally, we'd do the same for scalars, but there will likely be regressions unless we add more trunc folds as we're doing here for vectors. The motivating vector case is from PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549 define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) { %c = fcmp ole <4 x float> %x, %y %s = sext <4 x i1> %c to <4 x i32> %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1> %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3> %cond = or <4 x i32> %s1, %s2 %condtr = trunc <4 x i32> %cond to <4 x i1> %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w ret <4 x float> %r } Here's a sampling of the vector codegen for that case using mask+icmp (current behavior) vs. trunc (with this patch): AVX before: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vandps LCPI0_0(%rip), %xmm0, %xmm0 vxorps %xmm1, %xmm1, %xmm1 vpcmpeqd %xmm1, %xmm0, %xmm0 vblendvps %xmm0, %xmm3, %xmm2, %xmm0 AVX after: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vblendvps %xmm0, %xmm2, %xmm3, %xmm0 AVX512f before: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vpbroadcastd LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1] vptestnmd %zmm1, %zmm0, %k1 vblendmps %zmm3, %zmm2, %zmm0 {%k1} AVX512f after: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vpslld $31, %xmm0, %xmm0 vptestmd %zmm0, %zmm0, %k1 vblendmps %zmm2, %zmm3, %zmm0 {%k1} AArch64 before: fcmge v0.4s, v1.4s, v0.4s zip1 v1.4s, v0.4s, v0.4s zip2 v0.4s, v0.4s, v0.4s orr v0.16b, v1.16b, v0.16b movi v1.4s, #1 and v0.16b, v0.16b, v1.16b cmeq v0.4s, v0.4s, #0 bsl v0.16b, v3.16b, v2.16b AArch64 after: fcmge v0.4s, v1.4s, v0.4s zip1 v1.4s, v0.4s, v0.4s zip2 v0.4s, v0.4s, v0.4s orr v0.16b, v1.16b, v0.16b bsl v0.16b, v2.16b, v3.16b PowerPC-le before: xvcmpgesp 34, 35, 34 vspltisw 0, 1 vmrglw 3, 2, 2 vmrghw 2, 2, 2 xxlor 0, 35, 34 xxlxor 35, 35, 35 xxland 34, 0, 32 vcmpequw 2, 2, 3 xxsel 34, 36, 37, 34 PowerPC-le after: xvcmpgesp 34, 35, 34 vmrglw 3, 2, 2 vmrghw 2, 2, 2 xxlor 0, 35, 34 xxsel 34, 37, 36, 0 Differential Revision: https://reviews.llvm.org/D52747 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344082 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Oct 12, 2018
Re-trying r344082 because it unintentionally included extra diffs. Original commit message: icmp ne (and X, 1), 0 --> trunc X to N x i1 Ideally, we'd do the same for scalars, but there will likely be regressions unless we add more trunc folds as we're doing here for vectors. The motivating vector case is from PR37549: https://bugs.llvm.org/show_bug.cgi?id=37549 define <4 x float> @bitwise_select(<4 x float> %x, <4 x float> %y, <4 x float> %z, <4 x float> %w) { %c = fcmp ole <4 x float> %x, %y %s = sext <4 x i1> %c to <4 x i32> %s1 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 0, i32 0, i32 1, i32 1> %s2 = shufflevector <4 x i32> %s, <4 x i32> undef, <4 x i32> <i32 2, i32 2, i32 3, i32 3> %cond = or <4 x i32> %s1, %s2 %condtr = trunc <4 x i32> %cond to <4 x i1> %r = select <4 x i1> %condtr, <4 x float> %z, <4 x float> %w ret <4 x float> %r } Here's a sampling of the vector codegen for that case using mask+icmp (current behavior) vs. trunc (with this patch): AVX before: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vandps LCPI0_0(%rip), %xmm0, %xmm0 vxorps %xmm1, %xmm1, %xmm1 vpcmpeqd %xmm1, %xmm0, %xmm0 vblendvps %xmm0, %xmm3, %xmm2, %xmm0 AVX after: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vblendvps %xmm0, %xmm2, %xmm3, %xmm0 AVX512f before: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vpbroadcastd LCPI0_0(%rip), %xmm1 ## xmm1 = [1,1,1,1] vptestnmd %zmm1, %zmm0, %k1 vblendmps %zmm3, %zmm2, %zmm0 {%k1} AVX512f after: vcmpleps %xmm1, %xmm0, %xmm0 vpermilps $80, %xmm0, %xmm1 ## xmm1 = xmm0[0,0,1,1] vpermilps $250, %xmm0, %xmm0 ## xmm0 = xmm0[2,2,3,3] vorps %xmm0, %xmm1, %xmm0 vpslld $31, %xmm0, %xmm0 vptestmd %zmm0, %zmm0, %k1 vblendmps %zmm2, %zmm3, %zmm0 {%k1} AArch64 before: fcmge v0.4s, v1.4s, v0.4s zip1 v1.4s, v0.4s, v0.4s zip2 v0.4s, v0.4s, v0.4s orr v0.16b, v1.16b, v0.16b movi v1.4s, #1 and v0.16b, v0.16b, v1.16b cmeq v0.4s, v0.4s, #0 bsl v0.16b, v3.16b, v2.16b AArch64 after: fcmge v0.4s, v1.4s, v0.4s zip1 v1.4s, v0.4s, v0.4s zip2 v0.4s, v0.4s, v0.4s orr v0.16b, v1.16b, v0.16b bsl v0.16b, v2.16b, v3.16b PowerPC-le before: xvcmpgesp 34, 35, 34 vspltisw 0, 1 vmrglw 3, 2, 2 vmrghw 2, 2, 2 xxlor 0, 35, 34 xxlxor 35, 35, 35 xxland 34, 0, 32 vcmpequw 2, 2, 3 xxsel 34, 36, 37, 34 PowerPC-le after: xvcmpgesp 34, 35, 34 vmrglw 3, 2, 2 vmrghw 2, 2, 2 xxlor 0, 35, 34 xxsel 34, 37, 36, 0 Differential Revision: https://reviews.llvm.org/D52747 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344181 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Oct 26, 2018
…>> (32 - y) pattern" *Seems* to be breaking sanitizer-x86_64-linux-fast buildbot, the ELF/relocatable-versioned.s test: ==17758==MemorySanitizer CHECK failed: /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_allocator.cc:191 "((kBlockMagic)) == ((((u64*)addr)[0]))" (0x6a6cb03abcebc041, 0x0) #0 0x59716b in MsanCheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/msan/msan.cc:393 #1 0x586635 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_termination.cc:79 #2 0x57d5ff in __sanitizer::InternalFree(void*, __sanitizer::SizeClassAllocatorLocalCache<__sanitizer::SizeClassAllocator32<__sanitizer::AP32> >*) /b/sanitizer-x86_64-linux-fast/build/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_allocator.cc:191 #3 0x7fc21b24193f (/lib/x86_64-linux-gnu/libc.so.6+0x3593f) #4 0x7fc21b241999 in exit (/lib/x86_64-linux-gnu/libc.so.6+0x35999) #5 0x7fc21b22c2e7 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e7) #6 0x57c039 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/lld+0x57c039) This reverts commit r345014. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345017 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Nov 1, 2018
Summary: As a bonus, this arguably improves the code by making it simpler. gcc 8 on Ubuntu 18.10 reports the following: ==39667==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7fffffff8ae0 at pc 0x555555dbfc68 bp 0x7fffffff8760 sp 0x7fffffff8750 WRITE of size 8 at 0x7fffffff8ae0 thread T0 #0 0x555555dbfc67 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_Alloc_hider::_Alloc_hider(char*, std::allocator<char>&&) /usr/include/c++/8/bits/basic_string.h:149 #1 0x555555dbfc67 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&) /usr/include/c++/8/bits/basic_string.h:542 #2 0x555555dbfc67 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&) /usr/include/c++/8/bits/basic_string.h:6009 #3 0x555555dbfc67 in searchableFieldType /home/nha/amd/build/san/llvm-src/utils/TableGen/SearchableTableEmitter.cpp:168 (...) Address 0x7fffffff8ae0 is located in stack of thread T0 at offset 864 in frame #0 0x555555dbef3f in searchableFieldType /home/nha/amd/build/san/llvm-src/utils/TableGen/SearchableTableEmitter.cpp:148 Reviewers: fhahn, simon_tatham, kparzysz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53931 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345749 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Nov 3, 2018
ModuleSummaryIndex::exportToDot crashes when linking the Linux kernel under ThinLTO using LLVMgold.so. This is due to the exportToDot function trying to get the GUID of an empty ValueInfo. The root cause related to the fact that we attempt to get the GUID of an aliasee via its OriginalGUID recorded in the aliasee summary, and that is not always possible. Specifically, we cannot do this mapping when the value is internal linkage and there were other internal linkage symbols with the same name. There are 2 fixes for the problem included here. 1) In all cases where we can currently print the dot file from the command line (which is only via save-temps), we have a valid AliaseeGUID in the AliasSummary. Use that when it is available, so that we can get the correct aliasee GUID whenever possible. 2) However, if we were to invoke exportToDot from the debugger right after it is built during the initial analysis step (i.e. the per-module summary), we won't have the AliaseeGUID field populated. In that case, we have a fallback fix that will simply print "@"+GUID when we aren't able to get the GUID from the OriginalGUID. It simply checks if the VI is valid or not before attempting to get the name. Additionally, since getAliaseeGUID will assert that the AliaseeGUID is non-zero, guard the earlier fix #1 by a new function hasAliaseeGUID(). Reviewers: pcc, tmroeder Subscribers: evgeny777, mehdi_amini, inglorion, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D53986 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346055 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Nov 15, 2018
Flags variable was not initialized and later used (both isMBBSafeToOutlineFrom implementations assume it's initialized), which breaks test/CodeGen/AArch64/machine-outliner.mir. under memory sanitizer: MemorySanitizer: use-of-uninitialized-value #0 in llvm::AArch64InstrInfo::getOutliningType(llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>&, unsigned int) const llvm/lib/Target/AArch64/AArch64InstrInfo.cpp:5494:9 #1 in (anonymous namespace)::InstructionMapper::convertToUnsignedVec(llvm::MachineBasicBlock&, llvm::TargetInstrInfo const&) llvm/lib/CodeGen/MachineOutliner.cpp:772:19 #2 in (anonymous namespace)::MachineOutliner::populateMapper((anonymous namespace)::InstructionMapper&, llvm::Module&, llvm::MachineModuleInfo&) llvm/lib/CodeGen/MachineOutliner.cpp:1543:14 #3 in (anonymous namespace)::MachineOutliner::runOnModule(llvm::Module&) llvm/lib/CodeGen/MachineOutliner.cpp:1645:3 #4 in (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) llvm/lib/IR/LegacyPassManager.cpp:1744:27 #5 in llvm::legacy::PassManagerImpl::run(llvm::Module&) llvm/lib/IR/LegacyPassManager.cpp:1857:44 #6 in compileModule(char**, llvm::LLVMContext&) llvm/tools/llc/llc.cpp:597:8 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346761 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Jan 31, 2019
…implify (#1) This is the second of a series of patches simplifying llvm-symbolizer tests. See r352752 for the first. This one splits out 5 distinct test cases from llvm-symbolizer.test into separate tests, and simplifies them slightly by using --obj/positional arguments for the input file and addresses instead of stdin. See https://bugs.llvm.org/show_bug.cgi?id=40070#c1 for the motivation. Reviewed by: dblaikie Differential Revision: https://reviews.llvm.org/D57443 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@352753 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Mar 1, 2019
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355013 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Mar 7, 2019
Introduces memory leak in FunctionTest.GetPointerAlignment that breaks sanitizer buildbots: ``` ================================================================= ==2453==ERROR: LeakSanitizer: detected memory leaks Direct leak of 128 byte(s) in 1 object(s) allocated from: #0 0x610428 in operator new(unsigned long) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:105 #1 0x16936bc in llvm::User::operator new(unsigned long) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/User.cpp:151:19 #2 0x7c3fe9 in Create /b/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/IR/Function.h:144:12 #3 0x7c3fe9 in (anonymous namespace)::FunctionTest_GetPointerAlignment_Test::TestBody() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/unittests/IR/FunctionTest.cpp:136 #4 0x1a836a0 in HandleExceptionsInMethodIfSupported<testing::Test, void> /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc #5 0x1a836a0 in testing::Test::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2474 #6 0x1a85c55 in testing::TestInfo::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2656:11 #7 0x1a870d0 in testing::TestCase::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2774:28 #8 0x1aa5b84 in testing::internal::UnitTestImpl::RunAllTests() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:4649:43 #9 0x1aa4d30 in HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc #10 0x1aa4d30 in testing::UnitTest::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:4257 #11 0x1a6b656 in RUN_ALL_TESTS /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/include/gtest/gtest.h:2233:46 #12 0x1a6b656 in main /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/UnitTestMain/TestMain.cpp:50 #13 0x7f5af37a22e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) Indirect leak of 40 byte(s) in 1 object(s) allocated from: #0 0x610428 in operator new(unsigned long) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:105 #1 0x151be6b in make_unique<llvm::ValueSymbolTable> /b/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/ADT/STLExtras.h:1349:29 #2 0x151be6b in llvm::Function::Function(llvm::FunctionType*, llvm::GlobalValue::LinkageTypes, unsigned int, llvm::Twine const&, llvm::Module*) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/lib/IR/Function.cpp:241 #3 0x7c4006 in Create /b/sanitizer-x86_64-linux-bootstrap/build/llvm/include/llvm/IR/Function.h:144:16 #4 0x7c4006 in (anonymous namespace)::FunctionTest_GetPointerAlignment_Test::TestBody() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/unittests/IR/FunctionTest.cpp:136 #5 0x1a836a0 in HandleExceptionsInMethodIfSupported<testing::Test, void> /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc #6 0x1a836a0 in testing::Test::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2474 #7 0x1a85c55 in testing::TestInfo::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2656:11 #8 0x1a870d0 in testing::TestCase::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2774:28 #9 0x1aa5b84 in testing::internal::UnitTestImpl::RunAllTests() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:4649:43 #10 0x1aa4d30 in HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc #11 0x1aa4d30 in testing::UnitTest::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:4257 #12 0x1a6b656 in RUN_ALL_TESTS /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/include/gtest/gtest.h:2233:46 #13 0x1a6b656 in main /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/UnitTestMain/TestMain.cpp:50 #14 0x7f5af37a22e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) SUMMARY: AddressSanitizer: 168 byte(s) leaked in 2 allocation(s). ``` See http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/11358/steps/check-llvm%20asan/logs/stdio for more information. Also introduces use-of-uninitialized-value in ConstantsTest.FoldGlobalVariablePtr: ``` ==7070==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x14e703c in User /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/IR/User.h:79:5 #1 0x14e703c in Constant /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/IR/Constant.h:44 #2 0x14e703c in llvm::GlobalValue::GlobalValue(llvm::Type*, llvm::Value::ValueTy, llvm::Use*, unsigned int, llvm::GlobalValue::LinkageTypes, llvm::Twine const&, unsigned int) /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/IR/GlobalValue.h:78 #3 0x14e5467 in GlobalObject /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/IR/GlobalObject.h:34:9 #4 0x14e5467 in llvm::GlobalVariable::GlobalVariable(llvm::Type*, bool, llvm::GlobalValue::LinkageTypes, llvm::Constant*, llvm::Twine const&, llvm::GlobalValue::ThreadLocalMode, unsigned int, bool) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/Globals.cpp:314 #5 0x6938f1 in llvm::(anonymous namespace)::ConstantsTest_FoldGlobalVariablePtr_Test::TestBody() /b/sanitizer-x86_64-linux-fast/build/llvm/unittests/IR/ConstantsTest.cpp:565:18 #6 0x1a240a1 in HandleExceptionsInMethodIfSupported<testing::Test, void> /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc #7 0x1a240a1 in testing::Test::Run() /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc:2474 #8 0x1a26d26 in testing::TestInfo::Run() /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc:2656:11 #9 0x1a2815f in testing::TestCase::Run() /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc:2774:28 #10 0x1a43de8 in testing::internal::UnitTestImpl::RunAllTests() /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc:4649:43 #11 0x1a42c47 in HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc #12 0x1a42c47 in testing::UnitTest::Run() /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/src/gtest.cc:4257 #13 0x1a0dfba in RUN_ALL_TESTS /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/googletest/include/gtest/gtest.h:2233:46 #14 0x1a0dfba in main /b/sanitizer-x86_64-linux-fast/build/llvm/utils/unittest/UnitTestMain/TestMain.cpp:50 #15 0x7f2081c412e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) #16 0x4dff49 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/unittests/IR/IRTests+0x4dff49) SUMMARY: MemorySanitizer: use-of-uninitialized-value /b/sanitizer-x86_64-linux-fast/build/llvm/include/llvm/IR/User.h:79:5 in User ``` See http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/30222/steps/check-llvm%20msan/logs/stdio for more information. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355616 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Mar 22, 2019
Currently, the type id for a derived type is computed incorrectly. For example, type #1: int type #2: ptr to #1 For a global variable "int *a", type #1 will be attributed to variable "a". This is due to a bug which assigns the type id of the basetype of that derived type as the derived type's type id. This happens to "const", "volatile", "restrict", "typedef" and "pointer" types. This patch fixed this bug, fixed existing test cases and added a new one focusing on pointers plus other derived types. Signed-off-by: Yonghong Song <yhs@fb.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356727 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Apr 13, 2019
The size field of a location can be different for each entry, so it is useful to have this displayed in the output of llvm-readobj -stackmap. Below is an example of how the output would look: Record ID: 2882400000, instruction offset: 16 3 locations: #1: Constant 1, size: 8 #2: Constant 2, size: 8 #3: Constant 3, size: 8 0 live-outs: [ ] Patch By: jacob.hughes@kcl.ac.uk (with heavy modification by me) Differential Revision: https://reviews.llvm.org/D59169 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358324 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
May 3, 2019
Looks like just a minor oversight in the parsing code. Fixes https://bugs.llvm.org/show_bug.cgi?id=41504. Differential Revision: https://reviews.llvm.org/D60840 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359855 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
May 16, 2019
This is the first in a set of patches I have to improve testing of llvm-objdump. This patch targets --all-headers, --section, and --full-contents. In the --section case, it deletes a pre-canned binary which is only used by the one test and replaces it with yaml. Reviewed by: grimar, MaskRay Differential Revision: https://reviews.llvm.org/D61941 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360893 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Jun 12, 2019
Looks like a MachinePipeliner algorithm problem found by sanitizer-x86_64-linux-fast. I will backout this test first while investigating the problem to unblock buildbot. ==49637==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x614000002e08 at pc 0x000004364350 bp 0x7ffe228a3bd0 sp 0x7ffe228a3bc8 READ of size 4 at 0x614000002e08 thread T0 #0 0x436434f in llvm::SwingSchedulerDAG::checkValidNodeOrder(llvm::SmallVector<llvm::NodeSet, 8u> const&) const /b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachinePipeliner.cpp:3736:11 #1 0x4342cd0 in llvm::SwingSchedulerDAG::schedule() /b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachinePipeliner.cpp:486:3 #2 0x434042d in llvm::MachinePipeliner::swingModuloScheduler(llvm::MachineLoop&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachinePipeliner.cpp:385:7 #3 0x433eb90 in llvm::MachinePipeliner::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachinePipeliner.cpp:207:5 #4 0x428b7ea in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:13 #5 0x4d1a913 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1648:27 #6 0x4d1b192 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1685:16 #7 0x4d1c06d in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1752:27 #8 0x4d1c06d in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm/lib/IR/LegacyPassManager.cpp:1865 #9 0xa48ca3 in compileModule(char**, llvm::LLVMContext&) /b/sanitizer-x86_64-linux-fast/build/llvm/tools/llc/llc.cpp:611:8 #10 0xa4270f in main /b/sanitizer-x86_64-linux-fast/build/llvm/tools/llc/llc.cpp:365:22 #11 0x7fec902572e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) #12 0x971b69 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan/bin/llc+0x971b69) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363105 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Jun 22, 2019
On Windows ARM64, intrinsic __debugbreak is compiled into brk #0xF000 which is mapped to llvm.debugtrap in Clang. Instruction brk #F000 is the defined break point instruction on ARM64 which is recognized by Windows debugger and exception handling code, so llvm.debugtrap should map to it instead of redirecting to llvm.trap (brk #1) as the default implementation. Differential Revision: https://reviews.llvm.org/D63635 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364115 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Jul 11, 2019
Introduction ============ This patch added intial support for bpf program compile once and run everywhere (CO-RE). The main motivation is for bpf program which depends on kernel headers which may vary between different kernel versions. The initial discussion can be found at https://lwn.net/Articles/773198/. Currently, bpf program accesses kernel internal data structure through bpf_probe_read() helper. The idea is to capture the kernel data structure to be accessed through bpf_probe_read() and relocate them on different kernel versions. On each host, right before bpf program load, the bpfloader will look at the types of the native linux through vmlinux BTF, calculates proper access offset and patch the instruction. To accommodate this, three intrinsic functions preserve_{array,union,struct}_access_index are introduced which in clang will preserve the base pointer, struct/union/array access_index and struct/union debuginfo type information. Later, bpf IR pass can reconstruct the whole gep access chains without looking at gep itself. This patch did the following: . An IR pass is added to convert preserve_*_access_index to global variable who name encodes the getelementptr access pattern. The global variable has metadata attached to describe the corresponding struct/union debuginfo type. . An SimplifyPatchable MachineInstruction pass is added to remove unnecessary loads. . The BTF output pass is enhanced to generate relocation records located in .BTF.ext section. Typical CO-RE also needs support of global variables which can be assigned to different values to different hosts. For example, kernel version can be used to guard different versions of codes. This patch added the support for patchable externals as well. Example ======= The following is an example. struct pt_regs { long arg1; long arg2; }; struct sk_buff { int i; struct net_device *dev; }; #define _(x) (__builtin_preserve_access_index(x)) static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr) = (void *) 4; extern __attribute__((section(".BPF.patchable_externs"))) unsigned __kernel_version; int bpf_prog(struct pt_regs *ctx) { struct net_device *dev = 0; // ctx->arg* does not need bpf_probe_read if (__kernel_version >= 41608) bpf_probe_read(&dev, sizeof(dev), _(&((struct sk_buff *)ctx->arg1)->dev)); else bpf_probe_read(&dev, sizeof(dev), _(&((struct sk_buff *)ctx->arg2)->dev)); return dev != 0; } In the above, we want to translate the third argument of bpf_probe_read() as relocations. -bash-4.4$ clang -target bpf -O2 -g -S trace.c The compiler will generate two new subsections in .BTF.ext, OffsetReloc and ExternReloc. OffsetReloc is to record the structure member offset operations, and ExternalReloc is to record the external globals where only u8, u16, u32 and u64 are supported. BPFOffsetReloc Size struct SecLOffsetReloc for ELF section #1 A number of struct BPFOffsetReloc for ELF section #1 struct SecOffsetReloc for ELF section #2 A number of struct BPFOffsetReloc for ELF section #2 ... BPFExternReloc Size struct SecExternReloc for ELF section #1 A number of struct BPFExternReloc for ELF section #1 struct SecExternReloc for ELF section #2 A number of struct BPFExternReloc for ELF section #2 struct BPFOffsetReloc { uint32_t InsnOffset; ///< Byte offset in this section uint32_t TypeID; ///< TypeID for the relocation uint32_t OffsetNameOff; ///< The string to traverse types }; struct BPFExternReloc { uint32_t InsnOffset; ///< Byte offset in this section uint32_t ExternNameOff; ///< The string for external variable }; Note that only externs with attribute section ".BPF.patchable_externs" are considered for Extern Reloc which will be patched by bpf loader right before the load. For the above test case, two offset records and one extern record will be generated: OffsetReloc records: .long .Ltmp12 # Insn Offset .long 7 # TypeId .long 242 # Type Decode String .long .Ltmp18 # Insn Offset .long 7 # TypeId .long 242 # Type Decode String ExternReloc record: .long .Ltmp5 # Insn Offset .long 165 # External Variable In string table: .ascii "0:1" # string offset=242 .ascii "__kernel_version" # string offset=165 The default member offset can be calculated as the 2nd member offset (0 representing the 1st member) of struct "sk_buff". The asm code: .Ltmp5: .Ltmp6: r2 = 0 r3 = 41608 .Ltmp7: .Ltmp8: .loc 1 18 9 is_stmt 0 # t.c:18:9 .Ltmp9: if r3 > r2 goto LBB0_2 .Ltmp10: .Ltmp11: .loc 1 0 9 # t.c:0:9 .Ltmp12: r2 = 8 .Ltmp13: .loc 1 19 66 is_stmt 1 # t.c:19:66 .Ltmp14: .Ltmp15: r3 = *(u64 *)(r1 + 0) goto LBB0_3 .Ltmp16: .Ltmp17: LBB0_2: .loc 1 0 66 is_stmt 0 # t.c:0:66 .Ltmp18: r2 = 8 .loc 1 21 66 is_stmt 1 # t.c:21:66 .Ltmp19: r3 = *(u64 *)(r1 + 8) .Ltmp20: .Ltmp21: LBB0_3: .loc 1 0 66 is_stmt 0 # t.c:0:66 r3 += r2 r1 = r10 .Ltmp22: .Ltmp23: .Ltmp24: r1 += -8 r2 = 8 call 4 For instruction .Ltmp12 and .Ltmp18, "r2 = 8", the number 8 is the structure offset based on the current BTF. Loader needs to adjust it if it changes on the host. For instruction .Ltmp5, "r2 = 0", the external variable got a default value 0, loader needs to supply an appropriate value for the particular host. Compiling to generate object code and disassemble: 0000000000000000 bpf_prog: 0: b7 02 00 00 00 00 00 00 r2 = 0 1: 7b 2a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r2 2: b7 02 00 00 00 00 00 00 r2 = 0 3: b7 03 00 00 88 a2 00 00 r3 = 41608 4: 2d 23 03 00 00 00 00 00 if r3 > r2 goto +3 <LBB0_2> 5: b7 02 00 00 08 00 00 00 r2 = 8 6: 79 13 00 00 00 00 00 00 r3 = *(u64 *)(r1 + 0) 7: 05 00 02 00 00 00 00 00 goto +2 <LBB0_3> 0000000000000040 LBB0_2: 8: b7 02 00 00 08 00 00 00 r2 = 8 9: 79 13 08 00 00 00 00 00 r3 = *(u64 *)(r1 + 8) 0000000000000050 LBB0_3: 10: 0f 23 00 00 00 00 00 00 r3 += r2 11: bf a1 00 00 00 00 00 00 r1 = r10 12: 07 01 00 00 f8 ff ff ff r1 += -8 13: b7 02 00 00 08 00 00 00 r2 = 8 14: 85 00 00 00 04 00 00 00 call 4 Instructions #2, #5 and #8 need relocation resoutions from the loader. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D61524 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365503 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Jul 25, 2019
Summary: Every time PrettyPrinter::printInst is called, stdout is flushed and it makes llvm-objdump slow. This patches adds a string buffer to prevent stdout from being flushed. Benchmark results (./llvm-objdump-master: without this patch, ./bin/llvm-objcopy: with this patch): $ hyperfine --warmup 10 './llvm-objdump-master -d ./bin/llvm-objcopy' './bin/llvm-objdump -d ./bin/llvm-objcopy' Benchmark #1: ./llvm-objdump-master -d ./bin/llvm-objcopy Time (mean ± σ): 2.230 s ± 0.050 s [User: 1.533 s, System: 0.682 s] Range (min … max): 2.115 s … 2.278 s 10 runs Benchmark #2: ./bin/llvm-objdump -d ./bin/llvm-objcopy Time (mean ± σ): 386.4 ms ± 13.0 ms [User: 376.6 ms, System: 6.1 ms] Range (min … max): 366.1 ms … 407.0 ms 10 runs Summary './bin/llvm-objdump -d ./bin/llvm-objcopy' ran 5.77 ± 0.23 times faster than './llvm-objdump-master -d ./bin/llvm-objcopy' Reviewers: alexshap, Bigcheese, jhenderson, rupprecht, grimar, MaskRay Reviewed By: jhenderson, MaskRay Subscribers: dexonsmith, jhenderson, javed.absar, kristof.beyls, rupprecht, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D64969 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366984 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Aug 14, 2019
Summary: int mm(char *a, char *b) { return memcmp(a,b,16); } Currently: define dso_local i32 @mm(i8* nocapture readonly %a, i8* nocapture readonly %b) local_unnamed_addr #1 { entry: %call = tail call i32 @memcmp(i8* %a, i8* %b, i64 16) ret i32 %call } After patch: define dso_local i32 @mm(i8* nocapture readonly %a, i8* nocapture readonly %b) local_unnamed_addr #1 { entry: %call = tail call i32 @memcmp(i8* dereferenceable(16) %a, i8* dereferenceable(16) %b, i64 16) ret i32 %call } Reviewers: jdoerfert, efriedma Reviewed By: jdoerfert Subscribers: javed.absar, spatel, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66079 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368657 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Aug 27, 2019
Summary: Before we required the comparison against null to be "canonical", hence null to be operand #1. This patch allows null to be in either operand, similar to the handling of loaded globals that follows. Reviewers: sanjoy, hfinkel, aykevl, sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66321 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369158 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Sep 2, 2019
…ymbol that belongs to a section with a broken sh_name" It broke BB: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/16955/steps/test/logs/stdio Expected<T> must be checked before access or destruction. Unchecked Expected<T> contained error: a section [index 1] has an invalid sh_name (0xffff) offset which goes past the end of the section name string tableStack dump: 0. Program arguments: /srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm /srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/test/tools/llvm-nm/Output/format-sysv-section.test.tmp2.o --format=sysv #0 0x00000000008af7c4 PrintStackTraceSignalHandler(void*) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x8af7c4) #1 0x00000000008ad8be llvm::sys::RunSignalHandlers() (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x8ad8be) #2 0x00000000008afbd8 SignalHandler(int) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x8afbd8) #3 0x00007f0a6b989730 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12730) #4 0x00007f0a6b48d7bb raise (/lib/x86_64-linux-gnu/libc.so.6+0x377bb) #5 0x00007f0a6b478535 abort (/lib/x86_64-linux-gnu/libc.so.6+0x22535) #6 0x000000000042004b llvm::Expected<llvm::StringRef>::fatalUncheckedExpected() const (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x42004b) #7 0x00000000008367f5 (/sv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x8367f5) #8 0x0000000000817b80 llvm::object::IRObjectFile::findBitcodeInObject(llvm::object::ObjectFile const&) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x817b80) #9 0x0000000000838416 llvm::object::SymbolicFile::createSymbolicFile(llvm::MemoryBufferRef, llvm::file_magic, llvm::LLVMContext*) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x838416) #10 0x00000000007f36cb llvm::object::createBinary(llvm::MemoryBufferRef, llvm::LLVMContext*) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x7f36cb) #11 0x0000000000413123 dumpSymbolNamesFromFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x413123) #12 0x0000000000412e38 main (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x412e38) #13 0x00007f0a6b47a09b __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409b) #14 0x00000000004120da _start (/srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/llvm-nm+0x4120da) FileCheck error: '-' is empty. FileCheck command line: /srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.obj/bin/FileCheck /srv/llvm-buildbot-srcatch/llvm-build-dir/clang-x86_64-debian-fast/llvm.src/test/tools/llvm-nm/format-sysv-section.test --check-prefix=ERR -- git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370662 91177308-0d34-0410-b5e6-96231b3b80d8
kzhuravl
pushed a commit
that referenced
this pull request
Sep 19, 2019
Summary: This reverts commit r372204. This change causes build bot failures under msan: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/35236/steps/check-llvm%20msan/logs/stdio: ``` FAIL: LLVM :: DebugInfo/AArch64/asan-stack-vars.mir (19531 of 33579) ******************** TEST 'LLVM :: DebugInfo/AArch64/asan-stack-vars.mir' FAILED ******************** Script: -- : 'RUN: at line 1'; /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc -O0 -start-before=livedebugvalues -filetype=obj -o - /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-dwarfdump -v - | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir -- Exit Code: 2 Command Output (stderr): -- ==62894==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0xdfcafb in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3 #1 0xdfae8a in resolveFrameIndexReference /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1580:10 #2 0xdfae8a in llvm::AArch64FrameLowering::getFrameIndexReference(llvm::MachineFunction const&, int, unsigned int&) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1536 #3 0x46642c1 in (anonymous namespace)::LiveDebugValues::extractSpillBaseRegAndOffset(llvm::MachineInstr const&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:582:21 #4 0x4647cb3 in transferSpillOrRestoreInst /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:883:11 #5 0x4647cb3 in process /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1079 #6 0x4647cb3 in (anonymous namespace)::LiveDebugValues::ExtendRanges(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1361 #7 0x463ac0e in (anonymous namespace)::LiveDebugValues::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1415:18 #8 0x4854ef0 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:13 #9 0x53b0b01 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1648:27 #10 0x53b15f6 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1685:16 #11 0x53b298d in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1750:27 #12 0x53b298d in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1863 #13 0x905f21 in compileModule(char**, llvm::LLVMContext&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:601:8 #14 0x8fdc4e in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:355:22 #15 0x7f67673632e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) #16 0x882369 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc+0x882369) MemorySanitizer: use-of-uninitialized-value /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3 in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const Exiting error: -: The file was not recognized as a valid object file FileCheck error: '-' is empty. FileCheck command line: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir ``` Reviewers: bkramer Reviewed By: bkramer Subscribers: sdardis, aprantl, kristof.beyls, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67710 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372228 91177308-0d34-0410-b5e6-96231b3b80d8
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Handles 1.2 builtin calls and create wrapper functions that call 2.0 counterpart inside.