Skip to content

Draft: Revert "[OpenCL] Allow taking address of functions as an extension." #2

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

mlychkov
Copy link

This reverts commit abbdb56.

@mlychkov mlychkov marked this pull request as ready for review March 19, 2021 13:05
@mlychkov mlychkov changed the title Revert "[OpenCL] Allow taking address of functions as an extension." Draft: Revert "[OpenCL] Allow taking address of functions as an extension." Mar 19, 2021
@mlychkov mlychkov closed this Mar 19, 2021
@mlychkov mlychkov deleted the private/mlychkov/llvmspirv_pulldown_test branch March 19, 2021 13:08
vmaksimo pushed a commit that referenced this pull request Apr 6, 2021
This reverts commit 9be8f8b.
This breaks tsan on Ubuntu 16.04:

    $ cat tiny_race.c
    #include <pthread.h>
    int Global;
    void *Thread1(void *x) {
      Global = 42;
      return x;
    }
    int main() {
      pthread_t t;
      pthread_create(&t, NULL, Thread1, NULL);
      Global = 43;
      pthread_join(t, NULL);
      return Global;
    }
    $ out/gn/bin/clang -fsanitize=thread -g -O1 tiny_race.c --sysroot ~/src/chrome/src/build/linux/debian_sid_amd64-sysroot/
    $ docker run -v $PWD:/foo ubuntu:xenial /foo/a.out
    FATAL: ThreadSanitizer CHECK failed: ../../compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp:447 "((thr_beg)) >= ((tls_addr))" (0x7fddd76beb80, 0xfffffffffffff980)
        #0 <null> <null> (a.out+0x4960b6)
        #1 <null> <null> (a.out+0x4b677f)
        #2 <null> <null> (a.out+0x49cf94)
        #3 <null> <null> (a.out+0x499bd2)
        intel#4 <null> <null> (a.out+0x42aaf1)
        intel#5 <null> <null> (libpthread.so.0+0x76b9)
        intel#6 <null> <null> (libc.so.6+0x1074dc)

(Get the sysroot from here: https://commondatastorage.googleapis.com/chrome-linux-sysroot/toolchain/500976182686961e34974ea7bdc0a21fca32be06/debian_sid_amd64_sysroot.tar.xz)

Also reverts follow-on commits:
This reverts commit 58c62fd.
This reverts commit 31e541e.
vmaksimo pushed a commit that referenced this pull request Apr 6, 2021
  CONFLICT (content): Merge conflict in clang/lib/Sema/TreeTransform.h
  CONFLICT (content): Merge conflict in clang/lib/Sema/SemaStmtAttr.cpp
vmaksimo pushed a commit that referenced this pull request Apr 12, 2021
vmaksimo pushed a commit that referenced this pull request Apr 26, 2021
  CONFLICT (content): Merge conflict in clang/lib/Frontend/CompilerInvocation.cpp
vmaksimo pushed a commit that referenced this pull request Apr 26, 2021
vmaksimo pushed a commit that referenced this pull request Apr 26, 2021
vmaksimo pushed a commit that referenced this pull request May 4, 2021
Commit e3d8ee3 ("reland "[DebugInfo] Support to emit debugInfo
for extern variables"") added support to emit debugInfo for
extern variables if requested by the target. Currently, only
BPF target enables this feature by default.

As BPF ecosystem grows, callback function started to get
support, e.g., recently bpf_for_each_map_elem() is introduced
(https://lwn.net/Articles/846504/) with a callback function as an
argument. In the future we may have something like below as
a demonstration of use case :
    extern int do_work(int);
    long bpf_helper(void *callback_fn, void *callback_ctx, ...);
    long prog_main() {
        struct { ... } ctx = { ... };
        return bpf_helper(&do_work, &ctx, ...);
    }
Basically bpf helper may have a callback function and the
callback function is defined in another file or in the kernel.
In this case, we would like to know the debuginfo types for
do_work(), so the verifier can proper verify the safety of
bpf_helper() call.

For the following example,
    extern int do_work(int);
    long bpf_helper(void *callback_fn);
    long prog() {
        return bpf_helper(&do_work);
    }

Currently, there is no debuginfo generated for extern function do_work().
In the IR, we have,
    ...
    define dso_local i64 @prog() local_unnamed_addr #0 !dbg !7 {
    entry:
      %call = tail call i64 @bpf_helper(i8* bitcast (i32 (i32)* @do_work to i8*)) #2, !dbg !11
      ret i64 %call, !dbg !12
    }
    ...
    declare dso_local i32 @do_work(i32) #1
    ...

This patch added support for the above callback function use case, and
the generated IR looks like below:
    ...
    declare !dbg !17 dso_local i32 @do_work(i32) #1
    ...
    !17 = !DISubprogram(name: "do_work", scope: !1, file: !1, line: 1, type: !18, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized, retainedNodes: !2)
    !18 = !DISubroutineType(types: !19)
    !19 = !{!20, !20}
    !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)

The TargetInfo.allowDebugInfoForExternalVar is renamed to
TargetInfo.allowDebugInfoForExternalRef as now it guards
both extern variable and extern function debuginfo generation.

Differential Revision: https://reviews.llvm.org/D100567
vmaksimo pushed a commit that referenced this pull request May 4, 2021
…rtial type

llvm-dwarfdump crashed for Unit header with DW_UT_partial type.
-------------
llvm-dwarfdump: /tmp/llvm/include/llvm/ADT/Optional.h:197: T& llvm::optional_detail::OptionalStorage<T, true>::getValue() &
[with T = long unsigned int]: Assertion `hasVal' failed.
PLEASE submit a bug report to the technical support section of https://developer.amd.com/amd-aocc and include the crash backtrace.
Stack dump:
0.      Program arguments: llvm-dwarfdump -v /tmp/test/DebugInfo/X86/Output/dwarfdump-he
ader.s.tmp.o
 #0 0x00007f37d5ad8838 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /tmp/llvm/lib/Support/Unix/Signals.inc:565:0
 #1 0x00007f37d5ad88ef PrintStackTraceSignalHandler(void*) /tmp/llvm/lib/Support/Unix/Signals.inc:632:0
 #2 0x00007f37d5ad65bd llvm::sys::RunSignalHandlers() /tmp/llvm/lib/Support/Signals.cpp:71:0
 #3 0x00007f37d5ad81b9 SignalHandler(int) /tmp/llvm/lib/Support/Unix/Signals.inc:407:0
 intel#4 0x00007f37d4c26040 (/lib/x86_64-linux-gnu/libc.so.6+0x3f040)
 intel#5 0x00007f37d4c25fb7 raise /build/glibc-S9d2JN/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0
 intel#6 0x00007f37d4c27921 abort /build/glibc-S9d2JN/glibc-2.27/stdlib/abort.c:81:0
 intel#7 0x00007f37d4c1748a __assert_fail_base /build/glibc-S9d2JN/glibc-2.27/assert/assert.c:89:0
 intel#8 0x00007f37d4c17502 (/lib/x86_64-linux-gnu/libc.so.6+0x30502)
 intel#9 0x00007f37d7576b81 llvm::optional_detail::OptionalStorage<unsigned long, true>::getValue() & /tmp/llvm/include/llvm/ADT/Optional.h:198:0
 intel#10 0x00007f37d75726ac llvm::Optional<unsigned long>::operator*() && /tmp/llvm/include/llvm/ADT/Optional.h:309:0
 intel#11 0x00007f37d7582968 llvm::DWARFCompileUnit::dump(llvm::raw_ostream&, llvm::DIDumpOptions) /tmp/llvm/lib/DebugInfo/DWARF/DWARFCompileUnit.cpp:30:0
--------------

Patch by: @jini.susan

Reviewed By: @probinson

Differential Revision: https://reviews.llvm.org/D101255
vmaksimo pushed a commit that referenced this pull request May 4, 2021
This patch implements expansion of llvm.vp.* intrinsics
(https://llvm.org/docs/LangRef.html#vector-predication-intrinsics).

VP expansion is required for targets that do not implement VP code
generation. Since expansion is controllable with TTI, targets can switch
on the VP intrinsics they do support in their backend offering a smooth
transition strategy for VP code generation (VE, RISC-V V, ARM SVE,
AVX512, ..).

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D78203
vmaksimo pushed a commit that referenced this pull request May 4, 2021
  CONFLICT (content): Merge conflict in llvm/include/llvm/LinkAllPasses.h
vmaksimo pushed a commit that referenced this pull request May 18, 2021
…ing it

Having nested macros in the C code could cause clangd to fail an assert in clang::Preprocessor::setLoadedMacroDirective() and crash.

 #1 0x00000000007ace30 PrintStackTraceSignalHandler(void*) /qdelacru/llvm-project/llvm/lib/Support/Unix/Signals.inc:632:1
 #2 0x00000000007aaded llvm::sys::RunSignalHandlers() /qdelacru/llvm-project/llvm/lib/Support/Signals.cpp:76:20
 #3 0x00000000007ac7c1 SignalHandler(int) /qdelacru/llvm-project/llvm/lib/Support/Unix/Signals.inc:407:1
 intel#4 0x00007f096604db20 __restore_rt (/lib64/libpthread.so.0+0x12b20)
 intel#5 0x00007f0964b307ff raise (/lib64/libc.so.6+0x377ff)
 intel#6 0x00007f0964b1ac35 abort (/lib64/libc.so.6+0x21c35)
 intel#7 0x00007f0964b1ab09 _nl_load_domain.cold.0 (/lib64/libc.so.6+0x21b09)
 intel#8 0x00007f0964b28de6 (/lib64/libc.so.6+0x2fde6)
 intel#9 0x0000000001004d1a clang::Preprocessor::setLoadedMacroDirective(clang::IdentifierInfo*, clang::MacroDirective*, clang::MacroDirective*) /qdelacru/llvm-project/clang/lib/Lex/PPMacroExpansion.cpp:116:5

An example of the code that causes the assert failure:
```
...
```

During code completion in clangd, the macros will be loaded in loadMainFilePreambleMacros() by iterating over the macro names and calling PreambleIdentifiers->get(). Since these macro names are store in a StringSet (has StringMap underlying container), the order of the iterator is not guaranteed to be same as the order seen in the source code.

When clangd is trying to resolve nested macros it sometimes attempts to load them out of order which causes a macro to be stored twice. In the example above, ECHO2 macro gets resolved first, but since it uses another macro that has not been resolved it will try to resolve/store that as well. Now there are two MacroDirectives stored in the Preprocessor, ECHO and ECHO2. When clangd tries to load the next macro, ECHO, the preprocessor fails an assert in clang::Preprocessor::setLoadedMacroDirective() because there is already a MacroDirective stored for that macro name.

In this diff, I check if the macro is already inside the IdentifierTable and if it is skip it so that it is not resolved twice.

Reviewed By: kadircet

Differential Revision: https://reviews.llvm.org/D101870
vmaksimo pushed a commit that referenced this pull request May 18, 2021
vmaksimo pushed a commit that referenced this pull request May 25, 2021
Sample profile loader can be run in both LTO prelink and postlink. Currently the counts annoation in postilnk doesn't fully overwrite what's done in prelink. I'm adding a switch (`-overwrite-existing-weights=1`) to enable a full overwrite, which includes:

1. Clear old metadata for calls when their parent block has a zero count. This could be caused by prelink code duplication.

2. Clear indirect call metadata if somehow all the rest targets have a sum of zero count.

3. Overwrite branch weight for basic blocks.

With a CS profile, I was seeing #1 and #2 help reduce code size by preventing post-sample ICP and CGSCC inliner working on obsolete metadata, which come from a partial global inlining in prelink.  It's not expected to work well for non-CS case with a less-accurate post-inline count quality.

It's worth calling out that some prelink optimizations can damage counts quality in an irreversible way. One example is the loop rotate optimization. Due to lack of exact loop entry count (profiling can only give loop iteration count and loop exit count), moving one iteration out of the loop body leaves the rest iteration count unknown. We had to turn off prelink loop rotate to achieve a better postlink counts quality. A even better postlink counts quality can be archived by turning off prelink CGSCC inlining which is not context-sensitive.

Reviewed By: wenlei, wmi

Differential Revision: https://reviews.llvm.org/D102537
vmaksimo pushed a commit that referenced this pull request May 25, 2021
  CONFLICT (content): Merge conflict in clang/lib/Driver/ToolChains/Clang.cpp
vmaksimo pushed a commit that referenced this pull request May 25, 2021
  CONFLICT (content): Merge conflict in clang/test/Misc/nvptx.languageOptsOpenCL.cl
vmaksimo pushed a commit that referenced this pull request Jun 15, 2021
  CONFLICT (content): Merge conflict in llvm/test/Transforms/OpenMP/hide_mem_transfer_latency.ll
vmaksimo pushed a commit that referenced this pull request Jun 28, 2021
Rust's v0 name mangling scheme [1] is easy to disambiguate from other
name mangling schemes because symbols always start with `_R`. The llvm
Demangle library supports demangling the Rust v0 scheme. Use it to
demangle Rust symbols.

Added unit tests that check simple symbols. Ran LLDB built with this
patch to debug some Rust programs compiled with the v0 name mangling
scheme. Confirmed symbol names were demangled as expected.

Note: enabling the new name mangling scheme requires a nightly
toolchain:

```
$ cat main.rs
fn main() {
    println!("Hello world!");
}
$ $(rustup which --toolchain nightly rustc) -Z symbol-mangling-version=v0 main.rs -g
$ /home/asm/hacking/llvm/build/bin/lldb ./main --one-line 'b main.rs:2'
(lldb) target create "./main"
Current executable set to '/home/asm/hacking/llvm/rust/main' (x86_64).
(lldb) b main.rs:2
Breakpoint 1: where = main`main::main + 4 at main.rs:2:5, address = 0x00000000000076a4
(lldb) r
Process 948449 launched: '/home/asm/hacking/llvm/rust/main' (x86_64)
warning: (x86_64) /lib64/libgcc_s.so.1 No LZMA support found for reading .gnu_debugdata section
Process 948449 stopped
* thread #1, name = 'main', stop reason = breakpoint 1.1
    frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5
   1    fn main() {
-> 2        println!("Hello world!");
   3    }
(lldb) bt
error: need to add support for DW_TAG_base_type '()' encoded with DW_ATE = 0x7, bit_size = 0
* thread #1, name = 'main', stop reason = breakpoint 1.1
  * frame #0: 0x000055555555b6a4 main`main::main at main.rs:2:5
    frame #1: 0x000055555555b78b main`<fn() as core::ops::function::FnOnce<()>>::call_once((null)=(main`main::main at main.rs:1), (null)=<unavailable>) at function.rs:227:5
    frame #2: 0x000055555555b66e main`std::sys_common::backtrace::__rust_begin_short_backtrace::<fn(), ()>(f=(main`main::main at main.rs:1)) at backtrace.rs:125:18
    frame #3: 0x000055555555b851 main`std::rt::lang_start::<()>::{closure#0} at rt.rs:49:18
    frame intel#4: 0x000055555556c9f9 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] core::ops::function::impls::_$LT$impl$u20$core..ops..function..FnOnce$LT$A$GT$$u20$for$u20$$RF$F$GT$::call_once::h04259e4a34d07c2f at function.rs:259:13
    frame intel#5: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::do_call::hb8da45704d5cfbbf at panicking.rs:401:40
    frame intel#6: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panicking::try::h4beadc19a78fec52 at panicking.rs:365:19
    frame intel#7: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a [inlined] std::panic::catch_unwind::hc58016cd36ba81a4 at panic.rs:433:14
    frame intel#8: 0x000055555556c9f2 main`std::rt::lang_start_internal::hc51399759a90501a at rt.rs:34:21
    frame intel#9: 0x000055555555b830 main`std::rt::lang_start::<()>(main=(main`main::main at main.rs:1), argc=1, argv=0x00007fffffffcb18) at rt.rs:48:5
    frame intel#10: 0x000055555555b6fc main`main + 28
    frame intel#11: 0x00007ffff73f2493 libc.so.6`__libc_start_main + 243
    frame intel#12: 0x000055555555b59e main`_start + 46
(lldb)
```

[1]: rust-lang/rust#60705

Reviewed By: clayborg, teemperor

Differential Revision: https://reviews.llvm.org/D104054
vmaksimo pushed a commit that referenced this pull request Jul 5, 2021
For now, the source variable locations are printed at about the same
space as the comments for disassembled code, which can make some ranges
for variables disappear if a line contains comments, for example:

                                        ┠─ bar = W1
0:  add x0, x2, #2, lsl intel#12     // =8192┃
4:  add z31.d, z31.d, #65280    // =0xff00
8:  nop                                 ┻

The patch shifts the report a bit to allow printing comments up to
approximately 16 characters without interferences.

Differential Revision: https://reviews.llvm.org/D104700
vmaksimo pushed a commit that referenced this pull request Jul 5, 2021
Reference commit:
ad05778 [clang-offload-bundler] Add new unbundling mode 'a' for archives (intel#2840)

Also added an override keyword that was missed during eda76af.

  CONFLICT (content): Merge conflict in clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
  CONFLICT (content): Merge conflict in clang/test/Driver/clang-offload-bundler.c
vmaksimo pushed a commit that referenced this pull request Jul 5, 2021
While on regular Linux system (Fedora 34 GA, not updated):

* thread #1, name = '1', stop reason = hit program assert
    frame #0: 0x00007ffff7e242a2 libc.so.6`raise + 322
    frame #1: 0x00007ffff7e0d8a4 libc.so.6`abort + 278
    frame #2: 0x00007ffff7e0d789 libc.so.6`__assert_fail_base.cold + 15
    frame #3: 0x00007ffff7e1ca16 libc.so.6`__assert_fail + 70
  * frame intel#4: 0x00000000004011bd 1`main at assert.c:7:3

On Fedora 35 pre-release one gets:

* thread #1, name = '1', stop reason = signal SIGABRT
  * frame #0: 0x00007ffff7e48ee3 libc.so.6`pthread_kill@GLIBC_2.2.5 + 67
    frame #1: 0x00007ffff7dfb986 libc.so.6`raise + 22
    frame #2: 0x00007ffff7de5806 libc.so.6`abort + 230
    frame #3: 0x00007ffff7de571b libc.so.6`__assert_fail_base.cold + 15
    frame intel#4: 0x00007ffff7df4646 libc.so.6`__assert_fail + 70
    frame intel#5: 0x00000000004011bd 1`main at assert.c:7:3

I did not write a testcase as one needs the specific glibc. An
artificial test would just copy the changed source.

Reviewed By: mib

Differential Revision: https://reviews.llvm.org/D105133
vmaksimo pushed a commit that referenced this pull request Jul 12, 2021
  CONFLICT (content): Merge conflict in llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
vmaksimo pushed a commit that referenced this pull request Jul 12, 2021
  CONFLICT (content): Merge conflict in clang/lib/Sema/Sema.cpp
vmaksimo pushed a commit that referenced this pull request Jul 19, 2021
…mpd.so path

There is no guarantee that the space allocated in `libname`
is enough to accomodate the whole `dl_info.dli_fname`,
because it could e.g. have an suffix  - `.5`,
and that highlights another problem - what it should do about suffxies,
and should it do anything to resolve the symlinks before changing the filename?

```
$ LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/lib"  ./src/utilities/rstest/rstest -c /tmp/f49137920.NEF
dl_info.dli_fname "/usr/local/lib/libomp.so.5"
strlen(dl_info.dli_fname) 26
lib_path_length 14
lib_path_length + 12 26
=================================================================
==30949==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60300000002a at pc 0x000000548648 bp 0x7ffdfa0aa780 sp 0x7ffdfa0a9f40
WRITE of size 27 at 0x60300000002a thread T0
    #0 0x548647 in strcpy (/home/lebedevri/rawspeed/build-Clang-SANITIZE/src/utilities/rstest/rstest+0x548647)
    #1 0x7fb9e3e3d234 in ompd_init() /repositories/llvm-project/openmp/runtime/src/ompd-specific.cpp:102:5
    #2 0x7fb9e3dcb446 in __kmp_do_serial_initialize() /repositories/llvm-project/openmp/runtime/src/kmp_runtime.cpp:6742:3
    #3 0x7fb9e3dcb40b in __kmp_get_global_thread_id_reg /repositories/llvm-project/openmp/runtime/src/kmp_runtime.cpp:251:7
    intel#4 0x59e035 in main /home/lebedevri/rawspeed/build-Clang-SANITIZE/../src/utilities/rstest/rstest.cpp:491
    intel#5 0x7fb9e3762d09 in __libc_start_main csu/../csu/libc-start.c:308:16
    intel#6 0x4df449 in _start (/home/lebedevri/rawspeed/build-Clang-SANITIZE/src/utilities/rstest/rstest+0x4df449)

0x60300000002a is located 0 bytes to the right of 26-byte region [0x603000000010,0x60300000002a)
allocated by thread T0 here:
    #0 0x55cc5d in malloc (/home/lebedevri/rawspeed/build-Clang-SANITIZE/src/utilities/rstest/rstest+0x55cc5d)
    #1 0x7fb9e3e3d224 in ompd_init() /repositories/llvm-project/openmp/runtime/src/ompd-specific.cpp:101:17
    #2 0x7fb9e3762d09 in __libc_start_main csu/../csu/libc-start.c:308:16

SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/lebedevri/rawspeed/build-Clang-SANITIZE/src/utilities/rstest/rstest+0x548647) in strcpy
Shadow bytes around the buggy address:
  0x0c067fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c067fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c067fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c067fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c067fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c067fff8000: fa fa 00 00 00[02]fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==30949==ABORTING
Aborted
```
vmaksimo pushed a commit that referenced this pull request Jul 19, 2021
It turns out that D105040 broke `std::rel_ops`; we actually do need
both a one-template-parameter and a two-template-parameter version of
all the comparison operators, because if we have only the heterogeneous
two-parameter version, then `x > x` is ambiguous:

    template<class T, class U> int f(S<T>, S<U>) { return 1; }
    template<class T> int f(T, T) { return 2; }  // rel_ops
    S<int> s; f(s,s);  // ambiguous between #1 and #2

Adding the one-template-parameter version fixes the ambiguity:

    template<class T, class U> int f(S<T>, S<U>) { return 1; }
    template<class T> int f(T, T) { return 2; }  // rel_ops
    template<class T> int f(S<T>, S<T>) { return 3; }
    S<int> s; f(s,s);  // #3 beats both #1 and #2

We have the same problem with `reverse_iterator` as with `__wrap_iter`.
But so do libstdc++ and Microsoft, so we're not going to worry about it.

Differential Revision: https://reviews.llvm.org/D105894
vmaksimo pushed a commit that referenced this pull request Jul 19, 2021
…s (1/n)

libc++ has started splicing standard library headers into much more
fine-grained content for maintainability. It's very likely that outdated
and naive tooling (some of which is outside of LLVM's scope) will
suggest users include things such as `<__algorithm/find.h>` instead of
`<algorithm>`, and Hyrum's law suggests that users will eventually begin
to rely on this without the help of tooling. As such, this commit
intends to protect users from themselves, by making it a hard error for
anyone outside of the standard library to include libc++ detail headers.

This is the first of four patches. Patch #2 will solve the problem for
pre-processor `#include`s; patches #3 and intel#4 will solve the problem for
`<__tree>` and `<__hash_table>` (since I've never touched the test cases
that are failing for these two, I want to split them out into their own
commits to be extra careful). Patch intel#5 will concern itself with
`<__threading_support>`, which intersects with libcxxabi (which I know
even less about).

Differential Revision: https://reviews.llvm.org/D105932
vmaksimo pushed a commit that referenced this pull request Jul 19, 2021
  CONFLICT (content): Merge conflict in clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
vmaksimo pushed a commit that referenced this pull request Aug 4, 2021
This amends c5243c6 to fix formatting
continued function calls with BinPacking = false.

Differential Revision: https://reviews.llvm.org/D106773
vmaksimo pushed a commit that referenced this pull request Aug 4, 2021
There is a SIGSEGV at `DeduceTemplateArgumentsByTypeMatch`. The bug [#51171](https://bugs.llvm.org/show_bug.cgi?id=51171) was filled. The reproducer can be found at the bug description.

LIT test for the issue was added:
```
./bin/llvm-lit -v ../clang/test/SemaCXX/pr51171-crash.cpp
```

The debug stack trace is below:
```
 #0 0x00000000055afcb9 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/ivanmurashko/local/llvm-project/llvm/lib/Support/Unix/Signals.inc:565:22
 #1 0x00000000055afd70 PrintStackTraceSignalHandler(void*) /home/ivanmurashko/local/llvm-project/llvm/lib/Support/Unix/Signals.inc:632:1
 #2 0x00000000055add2d llvm::sys::RunSignalHandlers() /home/ivanmurashko/local/llvm-project/llvm/lib/Support/Signals.cpp:97:20
 #3 0x00000000055af701 SignalHandler(int) /home/ivanmurashko/local/llvm-project/llvm/lib/Support/Unix/Signals.inc:407:1
 intel#4 0x00007ffff7bc2b20 __restore_rt sigaction.c:0:0
 intel#5 0x00007ffff66a337f raise (/lib64/libc.so.6+0x3737f)
 intel#6 0x00007ffff668ddb5 abort (/lib64/libc.so.6+0x21db5)
 intel#7 0x00007ffff668dc89 _nl_load_domain.cold.0 loadmsgcat.c:0:0
 intel#8 0x00007ffff669ba76 .annobin___GI___assert_fail.end assert.c:0:0
 intel#9 0x000000000594b210 clang::QualType::getCommonPtr() const /home/ivanmurashko/local/llvm-project/clang/include/clang/AST/Type.h:684:5
intel#10 0x0000000005a12ca6 clang::QualType::getCanonicalType() const /home/ivanmurashko/local/llvm-project/clang/include/clang/AST/Type.h:6467:36
intel#11 0x0000000005a137a6 clang::ASTContext::getCanonicalType(clang::QualType) const /home/ivanmurashko/local/llvm-project/clang/include/clang/AST/ASTContext.h:2433:58
intel#12 0x0000000009204584 DeduceTemplateArgumentsByTypeMatch(clang::Sema&, clang::TemplateParameterList*, clang::QualType, clang::QualType, clang::sema::TemplateDeductionInfo&, llvm::SmallVectorImpl<clang::DeducedTemplateArgument>&, unsigned int, bool, bool) /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaTemplateDeduction.cpp:1355:54
intel#13 0x000000000920df0d clang::Sema::DeduceTemplateArguments(clang::FunctionTemplateDecl*, clang::TemplateArgumentListInfo*, clang::QualType, clang::FunctionDecl*&, clang::sema::TemplateDeductionInfo&, bool) /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaTemplateDeduction.cpp:4354:47
intel#14 0x0000000009012b09 (anonymous namespace)::AddressOfFunctionResolver::AddMatchingTemplateFunction(clang::FunctionTemplateDecl*, clang::DeclAccessPair const&) /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaOverload.cpp:12026:38
intel#15 0x0000000009013030 (anonymous namespace)::AddressOfFunctionResolver::FindAllFunctionsThatMatchTargetTypeExactly() /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaOverload.cpp:12119:9
intel#16 0x0000000009012679 (anonymous namespace)::AddressOfFunctionResolver::AddressOfFunctionResolver(clang::Sema&, clang::Expr*, clang::QualType const&, bool) /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaOverload.cpp:11931:5
intel#17 0x0000000009013c91 clang::Sema::ResolveAddressOfOverloadedFunction(clang::Expr*, clang::QualType, bool, clang::DeclAccessPair&, bool*) /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaOverload.cpp:12286:42
intel#18 0x0000000008fed85d IsStandardConversion(clang::Sema&, clang::Expr*, clang::QualType, bool, clang::StandardConversionSequence&, bool, bool) /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaOverload.cpp:1712:49
intel#19 0x0000000008fec8ea TryImplicitConversion(clang::Sema&, clang::Expr*, clang::QualType, bool, clang::Sema::AllowedExplicit, bool, bool, bool, bool) /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaOverload.cpp:1433:27
intel#20 0x0000000008ff90ba TryCopyInitialization(clang::Sema&, clang::Expr*, clang::QualType, bool, bool, bool, bool) /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaOverload.cpp:5273:71
intel#21 0x00000000090024fb clang::Sema::AddBuiltinCandidate(clang::QualType*, llvm::ArrayRef<clang::Expr*>, clang::OverloadCandidateSet&, bool, unsigned int) /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaOverload.cpp:7755:32
intel#22 0x000000000900513f (anonymous namespace)::BuiltinOperatorOverloadBuilder::addGenericBinaryArithmeticOverloads() /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaOverload.cpp:8633:30
intel#23 0x0000000009007624 clang::Sema::AddBuiltinOperatorCandidates(clang::OverloadedOperatorKind, clang::SourceLocation, llvm::ArrayRef<clang::Expr*>, clang::OverloadCandidateSet&) /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaOverload.cpp:9205:51
intel#24 0x0000000009018734 clang::Sema::LookupOverloadedBinOp(clang::OverloadCandidateSet&, clang::OverloadedOperatorKind, clang::UnresolvedSetImpl const&, llvm::ArrayRef<clang::Expr*>, bool) /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaOverload.cpp:13469:1
intel#25 0x0000000009018d56 clang::Sema::CreateOverloadedBinOp(clang::SourceLocation, clang::BinaryOperatorKind, clang::UnresolvedSetImpl const&, clang::Expr*, clang::Expr*, bool, bool, clang::FunctionDecl*) /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaOverload.cpp:13568:24
intel#26 0x0000000008b24797 BuildOverloadedBinOp(clang::Sema&, clang::Scope*, clang::SourceLocation, clang::BinaryOperatorKind, clang::Expr*, clang::Expr*) /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaExpr.cpp:14606:65
intel#27 0x0000000008b24ed5 clang::Sema::BuildBinOp(clang::Scope*, clang::SourceLocation, clang::BinaryOperatorKind, clang::Expr*, clang::Expr*) /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaExpr.cpp:14691:73
intel#28 0x0000000008b245d4 clang::Sema::ActOnBinOp(clang::Scope*, clang::SourceLocation, clang::tok::TokenKind, clang::Expr*, clang::Expr*) /home/ivanmurashko/local/llvm-project/clang/lib/Sema/SemaExpr.cpp:14566:1
intel#29 0x00000000085bfafb clang::Parser::ParseRHSOfBinaryExpression(clang::ActionResult<clang::Expr*, true>, clang::prec::Level) /home/ivanmurashko/local/llvm-project/clang/lib/Parse/ParseExpr.cpp:630:71
intel#30 0x00000000085bd922 clang::Parser::ParseAssignmentExpression(clang::Parser::TypeCastState) /home/ivanmurashko/local/llvm-project/clang/lib/Parse/ParseExpr.cpp:177:1
intel#31 0x00000000085cbbcd clang::Parser::ParseExpressionList(llvm::SmallVectorImpl<clang::Expr*>&, llvm::SmallVectorImpl<clang::SourceLocation>&, llvm::function_ref<void ()>) /home/ivanmurashko/local/llvm-project/clang/lib/Parse/ParseExpr.cpp:3368:40
intel#32 0x000000000857f49c clang::Parser::ParseDeclarationAfterDeclaratorAndAttributes(clang::Declarator&, clang::Parser::ParsedTemplateInfo const&, clang::Parser::ForRangeInit*) /home/ivanmurashko/local/llvm-project/clang/lib/Parse/ParseDecl.cpp:2416:5
intel#33 0x000000000857df16 clang::Parser::ParseDeclGroup(clang::ParsingDeclSpec&, clang::DeclaratorContext, clang::SourceLocation*, clang::Parser::ForRangeInit*) /home/ivanmurashko/local/llvm-project/clang/lib/Parse/ParseDecl.cpp:2092:65
intel#34 0x000000000855f07b clang::Parser::ParseDeclOrFunctionDefInternal(clang::ParsedAttributesWithRange&, clang::ParsingDeclSpec&, clang::AccessSpecifier) /home/ivanmurashko/local/llvm-project/clang/lib/Parse/Parser.cpp:1138:1
intel#35 0x000000000855f136 clang::Parser::ParseDeclarationOrFunctionDefinition(clang::ParsedAttributesWithRange&, clang::ParsingDeclSpec*, clang::AccessSpecifier) /home/ivanmurashko/local/llvm-project/clang/lib/Parse/Parser.cpp:1153:57
intel#36 0x000000000855e644 clang::Parser::ParseExternalDeclaration(clang::ParsedAttributesWithRange&, clang::ParsingDeclSpec*) /home/ivanmurashko/local/llvm-project/clang/lib/Parse/Parser.cpp:975:58
intel#37 0x000000000855d717 clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, bool) /home/ivanmurashko/local/llvm-project/clang/lib/Parse/Parser.cpp:720:42
intel#38 0x0000000008558e01 clang::ParseAST(clang::Sema&, bool, bool) /home/ivanmurashko/local/llvm-project/clang/lib/Parse/ParseAST.cpp:158:37
intel#39 0x000000000627a221 clang::ASTFrontendAction::ExecuteAction() /home/ivanmurashko/local/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1058:11
intel#40 0x0000000006bdcc31 clang::CodeGenAction::ExecuteAction() /home/ivanmurashko/local/llvm-project/clang/lib/CodeGen/CodeGenAction.cpp:1045:5
intel#41 0x0000000006279b4d clang::FrontendAction::Execute() /home/ivanmurashko/local/llvm-project/clang/lib/Frontend/FrontendAction.cpp:955:38
intel#42 0x00000000061c3fe9 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /home/ivanmurashko/local/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:974:42
intel#43 0x00000000063f9c5e clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /home/ivanmurashko/local/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:278:38
intel#44 0x0000000002603a03 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /home/ivanmurashko/local/llvm-project/clang/tools/driver/cc1_main.cpp:246:40
intel#45 0x00000000025f8a39 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) /home/ivanmurashko/local/llvm-project/clang/tools/driver/driver.cpp:338:20
intel#46 0x00000000025f9107 main /home/ivanmurashko/local/llvm-project/clang/tools/driver/driver.cpp:415:26
intel#47 0x00007ffff668f493 __libc_start_main (/lib64/libc.so.6+0x23493)
intel#48 0x00000000025f729e _start (/data/users/ivanmurashko/llvm-project/build/bin/clang-13+0x25f729e)
```

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D106583
vmaksimo pushed a commit that referenced this pull request Aug 10, 2021
  CONFLICT (content): Merge conflict in clang/lib/Frontend/CompilerInvocation.cpp
vmaksimo pushed a commit that referenced this pull request Aug 16, 2021
vmaksimo pushed a commit that referenced this pull request Aug 23, 2021
Change 636428c enabled BlockingRegion hooks for pthread_once().
Unfortunately this seems to cause crashes on Mac OS X which uses
pthread_once() from locations that seem to result in crashes:

| ThreadSanitizer:DEADLYSIGNAL
| ==31465==ERROR: ThreadSanitizer: stack-overflow on address 0x7ffee73fffd8 (pc 0x00010807fd2a bp 0x7ffee7400050 sp 0x7ffee73fffb0 T93815)
|     #0 __tsan::MetaMap::GetSync(__tsan::ThreadState*, unsigned long, unsigned long, bool, bool) tsan_sync.cpp:195 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x78d2a)
|     #1 __tsan::MutexPreLock(__tsan::ThreadState*, unsigned long, unsigned long, unsigned int) tsan_rtl_mutex.cpp:143 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x6cefc)
|     #2 wrap_pthread_mutex_lock sanitizer_common_interceptors.inc:4240 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x3dae0)
|     #3 flockfile <null>:2 (libsystem_c.dylib:x86_64+0x38a69)
|     intel#4 puts <null>:2 (libsystem_c.dylib:x86_64+0x3f69b)
|     intel#5 wrap_puts sanitizer_common_interceptors.inc (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x34d83)
|     intel#6 __tsan::OnPotentiallyBlockingRegionBegin() cxa_guard_acquire.cpp:8 (foo:x86_64+0x100000e48)
|     intel#7 wrap_pthread_once tsan_interceptors_posix.cpp:1512 (libclang_rt.tsan_osx_dynamic.dylib:x86_64+0x2f6e6)

From the stack trace it can be seen that the caller is unknown, and the
resulting stack-overflow seems to indicate that whoever the caller is
does not have enough stack space or otherwise is running in a limited
environment not yet ready for full instrumentation.

Fix it by reverting behaviour on Mac OS X to not call BlockingRegion
hooks from pthread_once().

Reported-by: azharudd

Reviewed By: glider

Differential Revision: https://reviews.llvm.org/D108305
vmaksimo pushed a commit that referenced this pull request Sep 1, 2021
  CONFLICT (content): Merge conflict in clang/lib/CodeGen/CodeGenModule.cpp
vmaksimo pushed a commit that referenced this pull request Sep 1, 2021
Reverted intel#4011 as part of the resolution (which would be reverted
otherwise in 2 commits) to fix the build. Essentially a cherry-pick of:

9189ef6 Mon Aug 30 01:59:34 2021  Alexey Bader:  Revert
"[SYCL][ESIMD][EMU] pi_esimd_cpu bringing up with CM library (intel#4011)"
(intel#4421)
vmaksimo pushed a commit that referenced this pull request Sep 1, 2021
  CONFLICT (content): Merge conflict in clang/lib/Sema/SemaType.cpp
  CONFLICT (content): Merge conflict in clang/lib/Sema/SemaDecl.cpp
vmaksimo pushed a commit that referenced this pull request Sep 6, 2021
This reverts commit 5cd63e9.

https://bugs.llvm.org/show_bug.cgi?id=51707

The sequence feeding in/out of the rev32/ushr isn't quite right:

 _swap:
         ldr     h0, [x0]
         ldr     h1, [x0, #2]
-        mov     v0.h[1], v1.h[0]
+        mov     v0.s[1], v1.s[0]
         rev32   v0.8b, v0.8b
         ushr    v0.2s, v0.2s, intel#16
-        mov     h1, v0.h[1]
+        mov     s1, v0.s[1]
         str     h0, [x0]
         str     h1, [x0, #2]
         ret
vmaksimo pushed a commit that referenced this pull request Sep 7, 2021
This reverts commit a2768b4.

Breaks sanitizer-x86_64-linux-fast buildbot:
https://lab.llvm.org/buildbot/#/builders/5/builds/11334

Log snippet:
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80
FAIL: LLVM :: Transforms/SampleProfile/early-inline.ll (65549 of 78729)
******************** TEST 'LLVM :: Transforms/SampleProfile/early-inline.ll' FAILED ********************
Script:
--
: 'RUN: at line 1';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/early-inline.ll -instcombine -sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/einline.prof -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/early-inline.ll
--
Exit Code: 2
Command Output (stderr):
--
/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53: runtime error: member call on null pointer of type 'llvm::sampleprof::FunctionSamples'
    #0 0x5a730f8 in shouldInlineCandidate /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53
    #1 0x5a730f8 in (anonymous namespace)::SampleProfileLoader::tryInlineCandidate((anonymous namespace)::InlineCandidate&, llvm::SmallVector<llvm::CallBase*, 8u>*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1178:21
    #2 0x5a6cda6 in inlineHotFunctions /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1105:13
    #3 0x5a6cda6 in (anonymous namespace)::SampleProfileLoader::emitAnnotations(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1633:16
    intel#4 0x5a5fcbe in runOnFunction /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2008:12
    intel#5 0x5a5fcbe in (anonymous namespace)::SampleProfileLoader::runOnModule(llvm::Module&, llvm::AnalysisManager<llvm::Module>*, llvm::ProfileSummaryInfo*, llvm::CallGraph*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1922:15
    intel#6 0x5a5de55 in llvm::SampleProfileLoaderPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2038:21
    intel#7 0x6552a01 in llvm::detail::PassModel<llvm::Module, llvm::SampleProfileLoaderPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:88:17
    intel#8 0x57f807c in llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/IR/PassManager.h:526:21
    intel#9 0x37c8522 in llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::StringRef>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/opt/NewPMDriver.cpp:489:7
    intel#10 0x37e7c11 in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/opt/opt.cpp:830:12
    intel#11 0x7fbf4de4009a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
    intel#12 0x379e519 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt+0x379e519)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53 in
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/early-inline.ll
--
********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80
FAIL: LLVM :: Transforms/SampleProfile/inline-cold.ll (65643 of 78729)
******************** TEST 'LLVM :: Transforms/SampleProfile/inline-cold.ll' FAILED ********************
Script:
--
: 'RUN: at line 4';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll -sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/inline-cold.prof -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=NOTINLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll
: 'RUN: at line 5';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll -passes=sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/inline-cold.prof -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=NOTINLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll
: 'RUN: at line 8';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll -sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/inline-cold.prof -sample-profile-inline-size -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=INLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll
: 'RUN: at line 11';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll -passes=sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/inline-cold.prof -sample-profile-inline-size -sample-profile-cold-inline-threshold=9999999 -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=INLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll
: 'RUN: at line 14';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll -passes=sample-profile -sample-profile-file=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/Inputs/inline-cold.prof -sample-profile-inline-size -sample-profile-cold-inline-threshold=-500 -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=NOTINLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll
--
Exit Code: 2
Command Output (stderr):
--
/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53: runtime error: member call on null pointer of type 'llvm::sampleprof::FunctionSamples'
    #0 0x5a730f8 in shouldInlineCandidate /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53
    #1 0x5a730f8 in (anonymous namespace)::SampleProfileLoader::tryInlineCandidate((anonymous namespace)::InlineCandidate&, llvm::SmallVector<llvm::CallBase*, 8u>*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1178:21
    #2 0x5a6cda6 in inlineHotFunctions /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1105:13
    #3 0x5a6cda6 in (anonymous namespace)::SampleProfileLoader::emitAnnotations(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1633:16
    intel#4 0x5a5fcbe in runOnFunction /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2008:12
    intel#5 0x5a5fcbe in (anonymous namespace)::SampleProfileLoader::runOnModule(llvm::Module&, llvm::AnalysisManager<llvm::Module>*, llvm::ProfileSummaryInfo*, llvm::CallGraph*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1922:15
    intel#6 0x5a5de55 in llvm::SampleProfileLoaderPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:2038:21
    intel#7 0x6552a01 in llvm::detail::PassModel<llvm::Module, llvm::SampleProfileLoaderPass, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:88:17
    intel#8 0x57f807c in llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/IR/PassManager.h:526:21
    intel#9 0x37c8522 in llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::StringRef>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/opt/NewPMDriver.cpp:489:7
    intel#10 0x37e7c11 in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/opt/opt.cpp:830:12
    intel#11 0x7fcd534a209a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
    intel#12 0x379e519 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/opt+0x379e519)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/IPO/SampleProfile.cpp:1309:53 in
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /b/sanitizer-x86_64-linux-fast/build/llvm_build_ubsan/bin/FileCheck -check-prefix=INLINE /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SampleProfile/inline-cold.ll
--
********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
********************
Failed Tests (2):
  LLVM :: Transforms/SampleProfile/early-inline.ll
  LLVM :: Transforms/SampleProfile/inline-cold.ll
vladimirlaz pushed a commit that referenced this pull request Sep 17, 2021
  CONFLICT (content): Merge conflict in clang/test/SemaSYCL/int128.cpp
  CONFLICT (content): Merge conflict in clang/test/SemaSYCL/float128.cpp
vmaksimo pushed a commit that referenced this pull request Oct 5, 2021
This patch re-introduces the fix in the commit llvm/llvm-project@66b0cebf7f736 by @yrnkrn

> In DwarfEHPrepare, after all passes are run, RewindFunction may be a dangling
>
> pointer to a dead function. To make sure it's valid, doFinalization nullptrs
> RewindFunction just like the constructor and so it will be found on next run.
>
> llvm-svn: 217737

It seems that the fix was not migrated to `DwarfEHPrepareLegacyPass`.

This patch also updates `llvm/test/CodeGen/X86/dwarf-eh-prepare.ll` to include `-run-twice` to exercise the cleanup. Without this patch `llvm-lit -v llvm/test/CodeGen/X86/dwarf-eh-prepare.ll` fails with

```
-- Testing: 1 tests, 1 workers --
FAIL: LLVM :: CodeGen/X86/dwarf-eh-prepare.ll (1 of 1)
******************** TEST 'LLVM :: CodeGen/X86/dwarf-eh-prepare.ll' FAILED ********************
Script:
--
: 'RUN: at line 1';   /home/arakaki/build/llvm-project/main/bin/opt -mtriple=x86_64-linux-gnu -dwarfehprepare -simplifycfg-require-and-preserve-domtree=1 -run-twice < /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll -S | /home/arakaki/build/llvm-project/main/bin/FileCheck /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll
--
Exit Code: 2

Command Output (stderr):
--
Referencing function in another module!
  call void @_Unwind_Resume(i8* %ehptr) #1
; ModuleID = '<stdin>'
void (i8*)* @_Unwind_Resume
; ModuleID = '<stdin>'
in function simple_cleanup_catch
LLVM ERROR: Broken function found, compilation aborted!
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.      Program arguments: /home/arakaki/build/llvm-project/main/bin/opt -mtriple=x86_64-linux-gnu -dwarfehprepare -simplifycfg-require-and-preserve-domtree=1 -run-twice -S
1.      Running pass 'Function Pass Manager' on module '<stdin>'.
2.      Running pass 'Module Verifier' on function '@simple_cleanup_catch'
 #0 0x000056121b570a2c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Unix/Signals.inc:569:0
 #1 0x000056121b56eb64 llvm::sys::RunSignalHandlers() /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Signals.cpp:97:0
 #2 0x000056121b56f28e SignalHandler(int) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/Unix/Signals.inc:397:0
 #3 0x00007fc7e9b22980 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12980)
 intel#4 0x00007fc7e87d3fb7 raise /build/glibc-S7xCS9/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0
 intel#5 0x00007fc7e87d5921 abort /build/glibc-S7xCS9/glibc-2.27/stdlib/abort.c:81:0
 intel#6 0x000056121b4e1386 llvm::raw_svector_ostream::raw_svector_ostream(llvm::SmallVectorImpl<char>&) /home/arakaki/repos/watch/llvm-project/llvm/include/llvm/Support/raw_ostream.h:674:0
 intel#7 0x000056121b4e1386 llvm::report_fatal_error(llvm::Twine const&, bool) /home/arakaki/repos/watch/llvm-project/llvm/lib/Support/ErrorHandling.cpp:114:0
 intel#8 0x000056121b4e1528 (/home/arakaki/build/llvm-project/main/bin/opt+0x29e3528)
 intel#9 0x000056121adfd03f llvm::raw_ostream::operator<<(llvm::StringRef) /home/arakaki/repos/watch/llvm-project/llvm/include/llvm/Support/raw_ostream.h:218:0
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /home/arakaki/build/llvm-project/main/bin/FileCheck /home/arakaki/repos/watch/llvm-project/llvm/test/CodeGen/X86/dwarf-eh-prepare.ll

--

********************
********************
Failed Tests (1):
  LLVM :: CodeGen/X86/dwarf-eh-prepare.ll

Testing Time: 0.22s
  Failed: 1
```

Reviewed By: loladiro

Differential Revision: https://reviews.llvm.org/D110979
vmaksimo pushed a commit that referenced this pull request Oct 12, 2021
Script for automatic 'opt' pipeline reduction for when using the new
pass-manager (NPM). Based around the '-print-pipeline-passes' option.

The reduction algorithm consists of several phases (steps).

Step #0: Verify that input fails with the given pipeline and make note of the
error code.

Step #1: Split pipeline in two starting from front and move forward as long as
first pipeline exits normally and the second pipeline fails with the expected
error code. Move on to step #2 with the IR from the split point and the
pipeline from the second invocation.

Step #2: Remove passes from end of the pipeline as long as the pipeline fails
with the expected error code.

Step #3: Make several sweeps over the remaining pipeline trying to remove one
pass at a time. Repeat sweeps until unable to remove any more passes.

Usage example:
./utils/reduce_pipeline.py --opt-binary=./build-all-Debug/bin/opt --input=input.ll --output=output.ll --passes=PIPELINE [EXTRA-OPT-ARGS ...]

Differential Revision: https://reviews.llvm.org/D110908
vmaksimo pushed a commit that referenced this pull request Oct 12, 2021
Although THREADLOCAL variables are supported on Darwin they cannot be
used very early on during process init (before dyld has set it up).

Unfortunately the checked lock is used before dyld has setup TLS leading
to an abort call (`_tlv_boostrap()` is never supposed to be called at
runtime).

To avoid this problem `SANITIZER_CHECK_DEADLOCKS` is now disabled on
Darwin platforms. This fixes running TSan tests (an possibly other
Sanitizers) when `COMPILER_RT_DEBUG=ON`.

For reference the crashing backtrace looks like this:

```
* thread #1, stop reason = signal SIGABRT
  * frame #0: 0x00000002044da0ae dyld`__abort_with_payload + 10
    frame #1: 0x00000002044f01af dyld`abort_with_payload_wrapper_internal + 80
    frame #2: 0x00000002044f01e1 dyld`abort_with_payload + 9
    frame #3: 0x000000010c989060 dyld_sim`abort_with_payload + 26
    frame intel#4: 0x000000010c94908b dyld_sim`dyld4::halt(char const*) + 375
    frame intel#5: 0x000000010c988f5c dyld_sim`abort + 16
    frame intel#6: 0x000000010c96104f dyld_sim`dyld4::APIs::_tlv_bootstrap() + 9
    frame intel#7: 0x000000010cd8d6d2 libclang_rt.tsan_iossim_dynamic.dylib`__sanitizer::CheckedMutex::LockImpl(this=<unavailable>, pc=<unavailable>) at sanitizer_mutex.cpp:218:58 [opt]
    frame intel#8: 0x000000010cd8a0f7 libclang_rt.tsan_iossim_dynamic.dylib`__sanitizer::Mutex::Lock() [inlined] __sanitizer::CheckedMutex::Lock(this=0x000000010d733c90) at sanitizer_mutex.h:124:5 [opt]
    frame intel#9: 0x000000010cd8a0ee libclang_rt.tsan_iossim_dynamic.dylib`__sanitizer::Mutex::Lock(this=0x000000010d733c90) at sanitizer_mutex.h:162:19 [opt]
    frame intel#10: 0x000000010cd8a0bf libclang_rt.tsan_iossim_dynamic.dylib`__sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock(this=0x000000030c7479a8, mu=<unavailable>) at sanitizer_mutex.h:364:10 [opt]
    frame intel#11: 0x000000010cd89819 libclang_rt.tsan_iossim_dynamic.dylib`__sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock(this=0x000000030c7479a8, mu=<unavailable>) at sanitizer_mutex.h:363:67 [opt]
    frame intel#12: 0x000000010cd8985b libclang_rt.tsan_iossim_dynamic.dylib`__sanitizer::LibIgnore::OnLibraryLoaded(this=0x000000010d72f480, name=0x0000000000000000) at sanitizer_libignore.cpp:39:8 [opt]
    frame intel#13: 0x000000010cda7aaa libclang_rt.tsan_iossim_dynamic.dylib`__tsan::InitializeLibIgnore() at tsan_interceptors_posix.cpp:219:16 [opt]
    frame intel#14: 0x000000010cdce0bb libclang_rt.tsan_iossim_dynamic.dylib`__tsan::Initialize(thr=0x0000000110141400) at tsan_rtl.cpp:403:3 [opt]
    frame intel#15: 0x000000010cda7b8e libclang_rt.tsan_iossim_dynamic.dylib`__tsan::ScopedInterceptor::ScopedInterceptor(__tsan::ThreadState*, char const*, unsigned long) [inlined] __tsan::LazyInitialize(thr=0x0000000110141400) at tsan_rtl.h:665:5 [opt]
    frame intel#16: 0x000000010cda7b86 libclang_rt.tsan_iossim_dynamic.dylib`__tsan::ScopedInterceptor::ScopedInterceptor(this=0x000000030c747af8, thr=0x0000000110141400, fname=<unavailable>, pc=4568918787) at tsan_interceptors_posix.cpp:247:3 [opt]
    frame intel#17: 0x000000010cda7bb9 libclang_rt.tsan_iossim_dynamic.dylib`__tsan::ScopedInterceptor::ScopedInterceptor(this=0x000000030c747af8, thr=<unavailable>, fname=<unavailable>, pc=<unavailable>) at tsan_interceptors_posix.cpp:246:59 [opt]
    frame intel#18: 0x000000010cdb72b7 libclang_rt.tsan_iossim_dynamic.dylib`::wrap_strlcpy(dst="\xd2", src="0xd1d398d1bb0a007b", size=20) at sanitizer_common_interceptors.inc:7386:3 [opt]
    frame intel#19: 0x0000000110542b03 libsystem_c.dylib`__guard_setup + 140
    frame intel#20: 0x00000001104f8ab4 libsystem_c.dylib`_libc_initializer + 65
    ...
```

rdar://83723445

Differential Revision: https://reviews.llvm.org/D111243
vmaksimo pushed a commit that referenced this pull request Nov 19, 2021
Fixes a CHECK-failure caused by glibc's pthread_getattr_np
implementation calling realloc.  Essentially, Thread::GenerateRandomTag
gets called during Thread::Init and before Thread::InitRandomState:

  HWAddressSanitizer: CHECK failed: hwasan_thread.cpp:134 "((random_buffer_)) != (0)" (0x0, 0x0) (tid=314)
    #0 0x55845475a662 in __hwasan::CheckUnwind()
    #1 0x558454778797 in __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long)
    #2 0x558454766461 in __hwasan::Thread::GenerateRandomTag(unsigned long)
    #3 0x55845475c58b in __hwasan::HwasanAllocate(__sanitizer::StackTrace*, unsigned long, unsigned long, bool)
    intel#4 0x55845475c80a in __hwasan::hwasan_realloc(void*, unsigned long, __sanitizer::StackTrace*)
    intel#5 0x5584547608aa in realloc
    intel#6 0x7f6f3a3d8c2c in pthread_getattr_np
    intel#7 0x5584547790dc in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*)
    intel#8 0x558454779651 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*)
    intel#9 0x558454761bca in __hwasan::Thread::InitStackAndTls(__hwasan::Thread::InitState const*)
    intel#10 0x558454761e5c in __hwasan::HwasanThreadList::CreateCurrentThread(__hwasan::Thread::InitState const*)
    intel#11 0x55845476184f in __hwasan_thread_enter
    intel#12 0x558454760def in HwasanThreadStartFunc(void*)
    intel#13 0x7f6f3a3d6fa2 in start_thread
    intel#14 0x7f6f3a15b4ce in __clone

Also reverts 7a3fb71, as it's now
unneeded.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D113045
vmaksimo pushed a commit that referenced this pull request Nov 25, 2021
This reverts commit 3cf4a2c.

It makes llc hang on the following test case.
```
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-unknown-linux-gnu"

define dso_local void @_PyUnicode_EncodeUTF16() local_unnamed_addr #0 {
entry:
  br label %while.body117.i

while.body117.i:                                  ; preds = %cleanup149.i, %entry
  %out.6269.i = phi i16* [ undef, %cleanup149.i ], [ undef, %entry ]
  %0 = load i16, i16* undef, align 2
  %1 = icmp eq i16 undef, -10240
  br i1 %1, label %fail.i, label %cleanup149.i

cleanup149.i:                                     ; preds = %while.body117.i
  %or130.i = call i16 @llvm.bswap.i16(i16 %0) #2
  store i16 %or130.i, i16* %out.6269.i, align 2
  br label %while.body117.i

fail.i:                                           ; preds = %while.body117.i
  ret void
}

; Function Attrs: nofree nosync nounwind readnone speculatable willreturn
declare i16 @llvm.bswap.i16(i16) #1

attributes #0 = { "target-features"="+neon,+v8a" }
attributes #1 = { nofree nosync nounwind readnone speculatable willreturn }
attributes #2 = { mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn "frame-pointer"="non-leaf" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+neon,+v8a" }
```
vmaksimo pushed a commit that referenced this pull request Dec 13, 2021
…cxx with compiler implied -lunwind

This does mostly the same as D112126, but for the runtimes cmake files.
Most of that is straightforward, but the interdependency between
libcxx and libunwind is tricky:

Libunwind is built at the same time as libcxx, but libunwind is not
installed yet. LIBCXXABI_USE_LLVM_UNWINDER makes libcxx link directly
against the just-built libunwind, but the compiler implicit -lunwind
isn't found. This patch avoids that by adding --unwindlib=none if
supported, if we are going to link explicitly against a newly built
unwinder anyway.

Since the previous attempt, this no longer uses
llvm_enable_language_nolink (and thus doesn't set
CMAKE_TRY_COMPILE_TARGET_TYPE=STATIC_LIBRARY during the compiler
sanity checks). Setting CMAKE_TRY_COMPILE_TARGET_TYPE=STATIC_LIBRARY
during compiler sanity checks makes cmake not learn about some
aspects of the compiler, which can make further find_library or
find_package fail. This caused OpenMP to not detect libelf and libffi,
disabling some OpenMP target plugins.

Instead, require the caller to set CMAKE_{C,CXX}_COMPILER_WORKS=YES
when building in a configuration with an incomplete toolchain.

Differential Revision: https://reviews.llvm.org/D113253
vmaksimo pushed a commit that referenced this pull request Dec 20, 2021
…turn to external addr part)

Before we have an issue with artificial LBR whose source is a return, recalling that "an internal code(A) can return to external address, then from the external address call a new internal code(B), making an artificial branch that looks like a return from A to B can confuse the unwinder". We just ignore the LBRs after this artificial LBR which can miss some samples. This change aims at fixing this by correctly unwinding them instead of ignoring them.

List some typical scenarios covered by this change.

1)  multiple sequential call back happen in external address, e.g.

```
[ext, call, foo] [foo, return, ext] [ext, call, bar]
```
Unwinder should avoid having foo return from bar. Wrong call stack is like [foo, bar]

2) the call stack before and after external call should be correctly unwinded.
```
 {call stack1}                                            {call stack2}
 [foo, call, ext]  [ext, call, bar]  [bar, return, ext]  [ext, return, foo ]
```
call stack 1 should be the same to call stack2. Both shouldn't be truncated

3) call stack should be truncated after call into external code since we can't do inlining with external code.

```
 [foo, call, ext]  [ext, call, bar]  [bar, call, baz] [baz, return, bar ] [bar, return, ext]
```
the call stack of code in baz should not include foo.

### Implementation:

We leverage artificial frame to fix #2 and #3: when we got a return artificial LBR, push an extra artificial frame to the stack. when we pop frame, check if the parent is an artificial frame to pop(fix #2). Therefore, call/ return artificial LBR is just the same as regular LBR which can keep the call stack.

While recording context on the trie, artificial frame is used as a tag indicating that we should truncate the call stack(fix #3).

To differentiate #1 and #2, we leverage `getCallAddrFromFrameAddr`.  Normally the target of the return should be the next inst of a call inst and `getCallAddrFromFrameAddr` will return the address of call inst. Otherwise, getCallAddrFromFrameAddr will return to 0 which is the case of #1.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D115550
vmaksimo pushed a commit that referenced this pull request Dec 27, 2021
…he parser"

This reverts commit b0e8667.

ASAN/UBSAN bot is broken with this trace:

[ RUN      ] FlatAffineConstraintsTest.FindSampleTest
llvm-project/mlir/include/mlir/Support/MathExtras.h:27:15: runtime error: signed integer overflow: 1229996100002 * 809999700000 cannot be represented in type 'long'
    #0 0x7f63ace960e4 in mlir::ceilDiv(long, long) llvm-project/mlir/include/mlir/Support/MathExtras.h:27:15
    #1 0x7f63ace8587e in ceil llvm-project/mlir/include/mlir/Analysis/Presburger/Fraction.h:57:42
    #2 0x7f63ace8587e in operator* llvm-project/llvm/include/llvm/ADT/STLExtras.h:347:42
    #3 0x7f63ace8587e in uninitialized_copy<llvm::mapped_iterator<mlir::Fraction *, long (*)(mlir::Fraction), long>, long *> include/c++/v1/__memory/uninitialized_algorithms.h:36:62
    intel#4 0x7f63ace8587e in uninitialized_copy<llvm::mapped_iterator<mlir::Fraction *, long (*)(mlir::Fraction), long>, long *> llvm-project/llvm/include/llvm/ADT/SmallVector.h:490:5
    intel#5 0x7f63ace8587e in append<llvm::mapped_iterator<mlir::Fraction *, long (*)(mlir::Fraction), long>, void> llvm-project/llvm/include/llvm/ADT/SmallVector.h:662:5
    intel#6 0x7f63ace8587e in SmallVector<llvm::mapped_iterator<mlir::Fraction *, long (*)(mlir::Fraction), long> > llvm-project/llvm/include/llvm/ADT/SmallVector.h:1204:11
    intel#7 0x7f63ace8587e in mlir::FlatAffineConstraints::findIntegerSample() const llvm-project/mlir/lib/Analysis/AffineStructures.cpp:1171:27
    intel#8 0x7f63ae95a84d in mlir::checkSample(bool, mlir::FlatAffineConstraints const&, mlir::TestFunction) llvm-project/mlir/unittests/Analysis/AffineStructuresTest.cpp:37:23
    intel#9 0x7f63ae957545 in mlir::FlatAffineConstraintsTest_FindSampleTest_Test::TestBody() llvm-project/mlir/unittests/Analysis/AffineStructuresTest.cpp:222:3
vmaksimo pushed a commit that referenced this pull request Jan 10, 2022
Segmentation fault in ompt_tsan_dependences function due to an unchecked NULL pointer dereference is as follows:

```
ThreadSanitizer:DEADLYSIGNAL
	==140865==ERROR: ThreadSanitizer: SEGV on unknown address 0x000000000050 (pc 0x7f217c2d3652 bp 0x7ffe8cfc7e00 sp 0x7ffe8cfc7d90 T140865)
	==140865==The signal is caused by a READ memory access.
	==140865==Hint: address points to the zero page.
	/usr/bin/addr2line: DWARF error: could not find variable specification at offset 1012a
	/usr/bin/addr2line: DWARF error: could not find variable specification at offset 133b5
	/usr/bin/addr2line: DWARF error: could not find variable specification at offset 1371a
	/usr/bin/addr2line: DWARF error: could not find variable specification at offset 13a58
	#0 ompt_tsan_dependences(ompt_data_t*, ompt_dependence_t const*, int) /ptmp/bhararit/llvm-project/openmp/tools/archer/ompt-tsan.cpp:1004 (libarcher.so+0x15652)
	#1 __kmpc_doacross_post /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_csupport.cpp:4280 (libomp.so+0x74d98)
	#2 .omp_outlined. for_ordered_01.c:? (for_ordered_01.exe+0x5186cb)
	#3 __kmp_invoke_microtask /ptmp/bhararit/llvm-project/openmp/runtime/src/z_Linux_asm.S:1166 (libomp.so+0x14e592)
	intel#4 __kmp_invoke_task_func /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_runtime.cpp:7556 (libomp.so+0x909ad)
	intel#5 __kmp_fork_call /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_runtime.cpp:2284 (libomp.so+0x8461a)
	intel#6 __kmpc_fork_call /ptmp/bhararit/llvm-project/openmp/runtime/src/kmp_csupport.cpp:308 (libomp.so+0x6db55)
	intel#7 main ??:? (for_ordered_01.exe+0x51828f)
	intel#8 __libc_start_main ??:? (libc.so.6+0x24349)
	intel#9 _start /home/abuild/rpmbuild/BUILD/glibc-2.26/csu/../sysdeps/x86_64/start.S:120 (for_ordered_01.exe+0x4214e9)

	ThreadSanitizer can not provide additional info.
	SUMMARY: ThreadSanitizer: SEGV /ptmp/bhararit/llvm-project/openmp/tools/archer/ompt-tsan.cpp:1004 in ompt_tsan_dependences(ompt_data_t*, ompt_dependence_t const*, int)
	==140865==ABORTING
```

	To reproduce the error, use the following openmp code snippet:

```
/* initialise  testMatrixInt Matrix, cols, r and c */
	  #pragma omp parallel private(r,c) shared(testMatrixInt)
	    {
	      #pragma omp for ordered(2)
	      for (r=1; r < rows; r++) {
	        for (c=1; c < cols; c++) {
	          #pragma omp ordered depend(sink:r-1, c+1) depend(sink:r-1,c-1)
	          testMatrixInt[r][c] = (testMatrixInt[r-1][c] + testMatrixInt[r-1][c-1]) % cols ;
	          #pragma omp ordered depend (source)
	        }
	      }
	    }
```

	Compilation:
```
clang -g -stdlib=libc++ -fsanitize=thread -fopenmp -larcher test_case.c
```

	It seems like the changes introduced by the commit https://reviews.llvm.org/D114005 causes this particular SEGV while using Archer.

Reviewed By: protze.joachim

Differential Revision: https://reviews.llvm.org/D115328
vmaksimo pushed a commit that referenced this pull request Jan 10, 2022
For code below:
        {
                r7 = addasl(r3,r0,#2)
                r8 = addasl(r3,r2,#2)
                r5 = memw(r3+r0<<#2)
                r6 = memw(r3+r2<<#2)
        }
        {
                p1 = cmp.gtu(r6,r5)
                if (p1.new) memw(r8+#0) = r5
                if (p1.new) memw(r7+#0) = r6
        }
        {
                r0 = mux(p1,r2,r4)

        }

In packetizer, a new packet is created for the cmp instruction since
there arent enough resources in previous packet. Also it is determined
that the cmp stalls by 2 cycles since it depends on the prior load of r5.
In current packetizer implementation, the predicated store is evaluated
for whether it can go in the same packet as compare, and since the compare
stalls, the stall of the predicated store does not matter and it can go in
the same packet as the cmp. However the predicated store will stall for
more cycles because of its dependence on the addasl instruction and to
avoid that stall we can put it in a new packet.

Improve the packetizer to check if an instruction being added to packet
will stall longer than instruction already in packet and if so create a
new packet.
vmaksimo pushed a commit that referenced this pull request Feb 7, 2022
We experienced some deadlocks when we used multiple threads for logging
using `scan-builds` intercept-build tool when we used multiple threads by
e.g. logging `make -j16`

```
(gdb) bt
#0  0x00007f2bb3aff110 in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f2bb3af70a3 in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f2bb3d152e4 in ?? ()
#3  0x00007ffcc5f0cc80 in ?? ()
intel#4  0x00007f2bb3d2bf5b in ?? () from /lib64/ld-linux-x86-64.so.2
intel#5  0x00007f2bb3b5da27 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
intel#6  0x00007f2bb3b5dbe0 in exit () from /lib/x86_64-linux-gnu/libc.so.6
intel#7  0x00007f2bb3d144ee in ?? ()
intel#8  0x746e692f706d742f in ?? ()
intel#9  0x692d747065637265 in ?? ()
intel#10 0x2f653631326b3034 in ?? ()
intel#11 0x646d632e35353532 in ?? ()
intel#12 0x0000000000000000 in ?? ()
```

I think the gcc's exit call caused the injected `libear.so` to be unloaded
by the `ld`, which in turn called the `void on_unload() __attribute__((destructor))`.
That tried to acquire an already locked mutex which was left locked in the
`bear_report_call()` call, that probably encountered some error and
returned early when it forgot to unlock the mutex.

All of these are speculation since from the backtrace I could not verify
if frames 2 and 3 are in fact corresponding to the `libear.so` module.
But I think it's a fairly safe bet.

So, hereby I'm releasing the held mutex on *all paths*, even if some failure
happens.

PS: I would use lock_guards, but it's C.

Reviewed-by: NoQ

Differential Revision: https://reviews.llvm.org/D118439
vmaksimo pushed a commit that referenced this pull request Feb 14, 2022
There is a clangd crash at `__memcmp_avx2_movbe`. Short problem description is below.

The method `HeaderIncludes::addExistingInclude` stores `Include` objects by reference at 2 places: `ExistingIncludes` (primary storage) and `IncludesByPriority` (pointer to the object's location at ExistingIncludes). `ExistingIncludes` is a map where value is a `SmallVector`. A new element is inserted by `push_back`. The operation might do resize. As result pointers stored at `IncludesByPriority` might become invalid.

Typical stack trace
```
    frame #0: 0x00007f11460dcd94 libc.so.6`__memcmp_avx2_movbe + 308
    frame #1: 0x00000000004782b8 clangd`llvm::StringRef::compareMemory(Lhs="
\"t2.h\"", Rhs="", Length=6) at StringRef.h:76:22
    frame #2: 0x0000000000701253 clangd`llvm::StringRef::compare(this=0x0000
7f10de7d8610, RHS=(Data = "", Length = 7166742329480737377)) const at String
Ref.h:206:34
  * frame #3: 0x00000000007603ab clangd`llvm::operator<(llvm::StringRef, llv
m::StringRef)(LHS=(Data = "\"t2.h\"", Length = 6), RHS=(Data = "", Length =
7166742329480737377)) at StringRef.h:907:23
    frame intel#4: 0x0000000002d0ad9f clangd`clang::tooling::HeaderIncludes::inse
rt(this=0x00007f10de7fb1a0, IncludeName=(Data = "t2.h\"", Length = 4), IsAng
led=false) const at HeaderIncludes.cpp:365:22
    frame intel#5: 0x00000000012ebfdd clangd`clang::clangd::IncludeInserter::inse
rt(this=0x00007f10de7fb148, VerbatimHeader=(Data = "\"t2.h\"", Length = 6))
const at Headers.cpp:262:70
```

A unit test test for the crash was created (`HeaderIncludesTest.RepeatedIncludes`). The proposed solution is to use std::list instead of llvm::SmallVector

Test Plan
```
./tools/clang/unittests/Tooling/ToolingTests --gtest_filter=HeaderIncludesTest.RepeatedIncludes
```

Reviewed By: sammccall

Differential Revision: https://reviews.llvm.org/D118755
vmaksimo pushed a commit that referenced this pull request Mar 9, 2022
This patch fixes a data race in IOHandlerProcessSTDIO. The race is
happens between the main thread and the event handling thread. The main
thread is running the IOHandler (IOHandlerProcessSTDIO::Run()) when an
event comes in that makes us pop the process IO handler which involves
cancelling the IOHandler (IOHandlerProcessSTDIO::Cancel). The latter
calls SetIsDone(true) which modifies m_is_done. At the same time, we
have the main thread reading the variable through GetIsDone().

This patch avoids the race by using a mutex to synchronize the two
threads. On the event thread, in IOHandlerProcessSTDIO ::Cancel method,
we obtain the lock before changing the value of m_is_done. On the main
thread, in IOHandlerProcessSTDIO::Run(), we obtain the lock before
reading the value of m_is_done. Additionally, we delay calling SetIsDone
until after the loop exists, to avoid a potential race between the two
writes.

  Write of size 1 at 0x00010b66bb68 by thread T7 (mutexes: write M2862, write M718324145051843688):
    #0 lldb_private::IOHandler::SetIsDone(bool) IOHandler.h:90 (liblldb.15.0.0git.dylib:arm64+0x971d84)
    #1 IOHandlerProcessSTDIO::Cancel() Process.cpp:4382 (liblldb.15.0.0git.dylib:arm64+0x5ddfec)
    #2 lldb_private::Debugger::PopIOHandler(std::__1::shared_ptr<lldb_private::IOHandler> const&) Debugger.cpp:1156 (liblldb.15.0.0git.dylib:arm64+0x3cb2a8)
    #3 lldb_private::Debugger::RemoveIOHandler(std::__1::shared_ptr<lldb_private::IOHandler> const&) Debugger.cpp:1063 (liblldb.15.0.0git.dylib:arm64+0x3cbd2c)
    intel#4 lldb_private::Process::PopProcessIOHandler() Process.cpp:4487 (liblldb.15.0.0git.dylib:arm64+0x5c583c)
    intel#5 lldb_private::Debugger::HandleProcessEvent(std::__1::shared_ptr<lldb_private::Event> const&) Debugger.cpp:1549 (liblldb.15.0.0git.dylib:arm64+0x3ceabc)
    intel#6 lldb_private::Debugger::DefaultEventHandler() Debugger.cpp:1622 (liblldb.15.0.0git.dylib:arm64+0x3cf2c0)
    intel#7 std::__1::__function::__func<lldb_private::Debugger::StartEventHandlerThread()::$_2, std::__1::allocator<lldb_private::Debugger::StartEventHandlerThread()::$_2>, void* ()>::operator()() function.h:352 (liblldb.15.0.0git.dylib:arm64+0x3d1bd8)
    intel#8 lldb_private::HostNativeThreadBase::ThreadCreateTrampoline(void*) HostNativeThreadBase.cpp:62 (liblldb.15.0.0git.dylib:arm64+0x4c71ac)
    intel#9 lldb_private::HostThreadMacOSX::ThreadCreateTrampoline(void*) HostThreadMacOSX.mm:18 (liblldb.15.0.0git.dylib:arm64+0x29ef544)

  Previous read of size 1 at 0x00010b66bb68 by main thread:
    #0 lldb_private::IOHandler::GetIsDone() IOHandler.h:92 (liblldb.15.0.0git.dylib:arm64+0x971db8)
    #1 IOHandlerProcessSTDIO::Run() Process.cpp:4339 (liblldb.15.0.0git.dylib:arm64+0x5ddc7c)
    #2 lldb_private::Debugger::RunIOHandlers() Debugger.cpp:982 (liblldb.15.0.0git.dylib:arm64+0x3cb48c)
    #3 lldb_private::CommandInterpreter::RunCommandInterpreter(lldb_private::CommandInterpreterRunOptions&) CommandInterpreter.cpp:3298 (liblldb.15.0.0git.dylib:arm64+0x506478)
    intel#4 lldb::SBDebugger::RunCommandInterpreter(bool, bool) SBDebugger.cpp:1166 (liblldb.15.0.0git.dylib:arm64+0x53604)
    intel#5 Driver::MainLoop() Driver.cpp:634 (lldb:arm64+0x100006294)
    intel#6 main Driver.cpp:853 (lldb:arm64+0x100007344)

Differential revision: https://reviews.llvm.org/D120762
vmaksimo pushed a commit that referenced this pull request Jan 16, 2025
…6386)

The overloads for single_task and parallel_for in the
sycl_ext_oneapi_kernel_properties extension are being deprecated as
mentioned in intel#14785. So I'm rewriting
tests containg such overloads so that they can still run after the
deprecation.

---------

Signed-off-by: Hu, Peisen <peisen.hu@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant