Skip to content

Conversation

@RitanyaB
Copy link
Owner

..

Comment on lines +5036 to +5038
IntegerLiteral IfCond(getContext(), TrueOrFalse,
getContext().getIntTypeForBitwidth(32, /*Signed=*/0),
SourceLocation());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
IntegerLiteral IfCond(getContext(), TrueOrFalse,
getContext().getIntTypeForBitwidth(32, /*Signed=*/0),
SourceLocation());
IntegerLiteral IfCond(getContext(), TrueOrFalse,
getContext().getIntTypeForBitwidth(32, /*Signed=*/0),
SourceLocation());

Comment on lines +5039 to +5040
CGM.getOpenMPRuntime().emitTaskCall(*this, S.getBeginLoc(), S, OutlinedFn,
SharedsTy, CapturedStruct, &IfCond, Data);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CGM.getOpenMPRuntime().emitTaskCall(*this, S.getBeginLoc(), S, OutlinedFn,
SharedsTy, CapturedStruct, &IfCond, Data);
CGM.getOpenMPRuntime().emitTaskCall(*this, S.getBeginLoc(), S, OutlinedFn,
SharedsTy, CapturedStruct, &IfCond, Data);

Comment on lines +5070 to +5076
Address(CGF.EmitScalarConversion(
Replacement.getPointer(), CGF.getContext().VoidPtrTy,
CGF.getContext().getPointerType(
Data.ReductionCopies[Cnt]->getType()),
Data.ReductionCopies[Cnt]->getExprLoc()),
CGF.ConvertTypeForMem(Data.ReductionCopies[Cnt]->getType()),
Replacement.getAlignment());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Address(CGF.EmitScalarConversion(
Replacement.getPointer(), CGF.getContext().VoidPtrTy,
CGF.getContext().getPointerType(
Data.ReductionCopies[Cnt]->getType()),
Data.ReductionCopies[Cnt]->getExprLoc()),
CGF.ConvertTypeForMem(Data.ReductionCopies[Cnt]->getType()),
Replacement.getAlignment());
Address(CGF.EmitScalarConversion(
Replacement.getPointer(), CGF.getContext().VoidPtrTy,
CGF.getContext().getPointerType(
Data.ReductionCopies[Cnt]->getType()),
Data.ReductionCopies[Cnt]->getExprLoc()),
CGF.ConvertTypeForMem(Data.ReductionCopies[Cnt]->getType()),
Replacement.getAlignment());

Replacement.getPointer(), CGF.getContext().VoidPtrTy,
CGF.getContext().getPointerType(InRedPrivs[Cnt]->getType()),
InRedPrivs[Cnt]->getExprLoc()),
CGF.ConvertTypeForMem(InRedPrivs[Cnt]->getType()),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CGF.ConvertTypeForMem(InRedPrivs[Cnt]->getType()),
CGF.ConvertTypeForMem(InRedPrivs[Cnt]->getType()),

const RegionCodeGenTy &BodyGen,
OMPTargetDataInfo &InputInfo);

OMPTargetDataInfo &InputInfo);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an unnecessary change

Comment on lines +3502 to +3505
OMPTaskDataTy &Data,
CodeGenFunction &CGF,
const CapturedStmt *CS,
OMPPrivateScope &Scope);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
OMPTaskDataTy &Data,
CodeGenFunction &CGF,
const CapturedStmt *CS,
OMPPrivateScope &Scope);
OMPTaskDataTy &Data,
CodeGenFunction &CGF,
const CapturedStmt *CS,
OMPPrivateScope &Scope);

Comment on lines +4570 to +4571
(isOpenMPTaskingDirective(DSAStack->getCurrentDirective()) ||
DSAStack->getCurrentDirective() == OMPD_target) &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(isOpenMPTaskingDirective(DSAStack->getCurrentDirective()) ||
DSAStack->getCurrentDirective() == OMPD_target) &&
(isOpenMPTaskingDirective(DSAStack->getCurrentDirective()) ||
DSAStack->getCurrentDirective() == OMPD_target) &&

RitanyaB pushed a commit that referenced this pull request Mar 28, 2023
Previously we only looked at the si_signo field, so you got:
```
(lldb) bt
* thread #1, name = 'a.out.mte', stop reason = signal SIGSEGV
  * frame #0: 0x00000000004007f4
```
This patch adds si_code so we can show:
```
(lldb) bt
* thread #1, name = 'a.out.mte', stop reason = signal SIGSEGV: sync tag check fault
  * frame #0: 0x00000000004007f4
```

The order of errno and code was incorrect in ElfLinuxSigInfo::Parse.
It was the order that a "swapped" siginfo arch would use, which for Linux,
is only MIPS. We removed MIPS Linux support some time ago.

See:
https://github.com/torvalds/linux/blob/fe15c26ee26efa11741a7b632e9f23b01aca4cc6/include/uapi/asm-generic/siginfo.h#L121

A test is added using memory tagging faults. Which were the original
motivation for the changes.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D146045
RitanyaB pushed a commit that referenced this pull request Mar 28, 2023
This change prevents rare deadlocks observed for specific macOS/iOS GUI
applications which issue many `dlopen()` calls from multiple different
threads at startup and where TSan finds and reports a race during
startup.  Providing a reliable test for this has been deemed infeasible.

Although I've only observed this deadlock on Apple platforms,
conceptually the cause is not confined to Apple code so the fix lives in
platform-independent code.

Deadlock scenario:
```
Thread 2                    | Thread 4
ReportRace()                |
Lock internal TSan mutexes  |
  &ctx->slot_mtx            |
                            | dlopen() interceptor
                            | OnLibraryLoaded()
                            | MemoryMappingLayout::DumpListOfModules()
                            | calls dyld API, which takes internal lock
                            | lock() interceptor
                            | TSan tries to take internal mutexes again
                            |   &ctx->slot_mtx
call into symbolizer        |
MemoryMappingLayout::DumpListOfModules()
calls dyld API, which hangs on trying to take lock
```
Resulting in:
* Thread 2 has internal TSan mutex, blocked on dyld lock
* Thread 4 has dyld lock, blocked on internal TSan mutex

The fix prevents this situation by not intercepting any of the calls
originating from `MemoryMappingLayout::DumpListOfModules()`.

Stack traces for deadlock between ReportRace() and dlopen() interceptor:
```
thread #2, queue = 'com.apple.root.default-qos'
  frame #0: libsystem_kernel.dylib
  frame #1: libclang_rt.tsan_osx_dynamic.dylib`::wrap_os_unfair_lock_lock_with_options(lock=<unavailable>, options=<unavailable>) at tsan_interceptors_mac.cpp:306:3
  frame #2: dyld`dyld4::RuntimeLocks::withLoadersReadLock(this=0x000000016f21b1e0, work=0x00000001814523c0) block_pointer) at DyldRuntimeState.cpp:227:28
  frame #3: dyld`dyld4::APIs::_dyld_get_image_header(this=0x0000000101012a20, imageIndex=614) at DyldAPIs.cpp:240:11
  frame llvm#4: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::CurrentImageHeader(this=<unavailable>) at sanitizer_procmaps_mac.cpp:391:35
  frame llvm#5: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::Next(this=0x000000016f2a2800, segment=0x000000016f2a2738) at sanitizer_procmaps_mac.cpp:397:51
  frame llvm#6: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::DumpListOfModules(this=0x000000016f2a2800, modules=0x00000001011000a0) at sanitizer_procmaps_mac.cpp:460:10
  frame llvm#7: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::ListOfModules::init(this=0x00000001011000a0) at sanitizer_mac.cpp:610:18
  frame llvm#8: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Symbolizer::FindModuleForAddress(unsigned long) [inlined] __sanitizer::Symbolizer::RefreshModules(this=0x0000000101100078) at sanitizer_symbolizer_libcdep.cpp:185:12
  frame llvm#9: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Symbolizer::FindModuleForAddress(this=0x0000000101100078, address=6465454512) at sanitizer_symbolizer_libcdep.cpp:204:5
  frame llvm#10: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Symbolizer::SymbolizePC(this=0x0000000101100078, addr=6465454512) at sanitizer_symbolizer_libcdep.cpp:88:15
  frame llvm#11: libclang_rt.tsan_osx_dynamic.dylib`__tsan::SymbolizeCode(addr=6465454512) at tsan_symbolize.cpp:106:35
  frame llvm#12: libclang_rt.tsan_osx_dynamic.dylib`__tsan::SymbolizeStack(trace=StackTrace @ 0x0000600002d66d00) at tsan_rtl_report.cpp:112:28
  frame llvm#13: libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedReportBase::AddMemoryAccess(this=0x000000016f2a2a90, addr=4381057136, external_tag=<unavailable>, s=<unavailable>, tid=<unavailable>, stack=<unavailable>, mset=0x00000001012fc310) at tsan_rtl_report.cpp:190:16
  frame llvm#14: libclang_rt.tsan_osx_dynamic.dylib`__tsan::ReportRace(thr=0x00000001012fc000, shadow_mem=0x000008020a4340e0, cur=<unavailable>, old=<unavailable>, typ0=1) at tsan_rtl_report.cpp:795:9
  frame llvm#15: libclang_rt.tsan_osx_dynamic.dylib`__tsan::DoReportRace(thr=0x00000001012fc000, shadow_mem=0x000008020a4340e0, cur=Shadow @ x22, old=Shadow @ 0x0000600002d6b4f0, typ=1) at tsan_rtl_access.cpp:166:3
  frame llvm#16: libclang_rt.tsan_osx_dynamic.dylib`::__tsan_read8(void *) at tsan_rtl_access.cpp:220:5
  frame llvm#17: libclang_rt.tsan_osx_dynamic.dylib`::__tsan_read8(void *) [inlined] __tsan::MemoryAccess(thr=0x00000001012fc000, pc=<unavailable>, addr=<unavailable>, size=8, typ=1) at tsan_rtl_access.cpp:442:3
  frame llvm#18: libclang_rt.tsan_osx_dynamic.dylib`::__tsan_read8(addr=<unavailable>) at tsan_interface.inc:34:3
  <call into TSan from from instrumented code>

thread llvm#4, queue = 'com.apple.dock.fullscreen'
  frame #0:  libsystem_kernel.dylib
  frame #1:  libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::FutexWait(p=<unavailable>, cmp=<unavailable>) at sanitizer_mac.cpp:540:3
  frame #2:  libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Semaphore::Wait(this=<unavailable>) at sanitizer_mutex.cpp:35:7
  frame #3:  libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::Mutex::Lock(this=0x0000000102992a80) at sanitizer_mutex.h:196:18
  frame llvm#4:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() [inlined] __sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock(this=<unavailable>, mu=0x0000000102992a80) at sanitizer_mutex.h:383:10
  frame llvm#5:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() [inlined] __sanitizer::GenericScopedLock<__sanitizer::Mutex>::GenericScopedLock(this=<unavailable>, mu=0x0000000102992a80) at sanitizer_mutex.h:382:77
  frame llvm#6:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() at tsan_rtl.h:708:10
  frame llvm#7:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() [inlined] __tsan::TryTraceFunc(thr=0x000000010f084000, pc=0) at tsan_rtl.h:751:7
  frame llvm#8:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor() [inlined] __tsan::FuncExit(thr=0x000000010f084000) at tsan_rtl.h:798:7
  frame llvm#9:  libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor(this=0x000000016f3ba280) at tsan_interceptors_posix.cpp:300:5
  frame llvm#10: libclang_rt.tsan_osx_dynamic.dylib`__tsan::ScopedInterceptor::~ScopedInterceptor(this=<unavailable>) at tsan_interceptors_posix.cpp:293:41
  frame llvm#11: libclang_rt.tsan_osx_dynamic.dylib`::wrap_os_unfair_lock_lock_with_options(lock=0x000000016f21b1e8, options=OS_UNFAIR_LOCK_NONE) at tsan_interceptors_mac.cpp:310:1
  frame llvm#12: dyld`dyld4::RuntimeLocks::withLoadersReadLock(this=0x000000016f21b1e0, work=0x00000001814525d4) block_pointer) at DyldRuntimeState.cpp:227:28
  frame llvm#13: dyld`dyld4::APIs::_dyld_get_image_vmaddr_slide(this=0x0000000101012a20, imageIndex=412) at DyldAPIs.cpp:273:11
  frame llvm#14: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::Next(__sanitizer::MemoryMappedSegment*) at sanitizer_procmaps_mac.cpp:286:17
  frame llvm#15: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::Next(this=0x000000016f3ba560, segment=0x000000016f3ba498) at sanitizer_procmaps_mac.cpp:432:15
  frame llvm#16: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::MemoryMappingLayout::DumpListOfModules(this=0x000000016f3ba560, modules=0x000000016f3ba618) at sanitizer_procmaps_mac.cpp:460:10
  frame llvm#17: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::ListOfModules::init(this=0x000000016f3ba618) at sanitizer_mac.cpp:610:18
  frame llvm#18: libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::LibIgnore::OnLibraryLoaded(this=0x0000000101f3aa40, name="<some library>") at sanitizer_libignore.cpp:54:11
  frame llvm#19: libclang_rt.tsan_osx_dynamic.dylib`::wrap_dlopen(filename="<some library>", flag=<unavailable>) at sanitizer_common_interceptors.inc:6466:3
  <library code>
```

rdar://106766395

Differential Revision: https://reviews.llvm.org/D146593
RitanyaB pushed a commit that referenced this pull request May 8, 2023
…callback

The `TypeSystemMap::m_mutex` guards against concurrent modifications
of members of `TypeSystemMap`. In particular, `m_map`.

`TypeSystemMap::ForEach` iterates through the entire `m_map` calling
a user-specified callback for each entry. This is all done while
`m_mutex` is locked. However, there's nothing that guarantees that
the callback itself won't call back into `TypeSystemMap` APIs on the
same thread. This lead to double-locking `m_mutex`, which is undefined
behaviour. We've seen this cause a deadlock in the swift plugin with
following backtrace:

```

int main() {
    std::unique_ptr<int> up = std::make_unique<int>(5);

    volatile int val = *up;
    return val;
}

clang++ -std=c++2a -g -O1 main.cpp

./bin/lldb -o “br se -p return” -o run -o “v *up” -o “expr *up” -b
```

```
frame llvm#4: std::lock_guard<std::mutex>::lock_guard
frame llvm#5: lldb_private::TypeSystemMap::GetTypeSystemForLanguage <<<< Lock #2
frame llvm#6: lldb_private::TypeSystemMap::GetTypeSystemForLanguage
frame llvm#7: lldb_private::Target::GetScratchTypeSystemForLanguage
...
frame llvm#26: lldb_private::SwiftASTContext::LoadLibraryUsingPaths
frame llvm#27: lldb_private::SwiftASTContext::LoadModule
frame llvm#30: swift::ModuleDecl::collectLinkLibraries
frame llvm#31: lldb_private::SwiftASTContext::LoadModule
frame llvm#34: lldb_private::SwiftASTContext::GetCompileUnitImportsImpl
frame llvm#35: lldb_private::SwiftASTContext::PerformCompileUnitImports
frame llvm#36: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetSwiftASTContext
frame llvm#37: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetPersistentExpressionState
frame llvm#38: lldb_private::Target::GetPersistentSymbol
frame llvm#41: lldb_private::TypeSystemMap::ForEach                 <<<< Lock #1
frame llvm#42: lldb_private::Target::GetPersistentSymbol
frame llvm#43: lldb_private::IRExecutionUnit::FindInUserDefinedSymbols
frame llvm#44: lldb_private::IRExecutionUnit::FindSymbol
frame llvm#45: lldb_private::IRExecutionUnit::MemoryManager::GetSymbolAddressAndPresence
frame llvm#46: lldb_private::IRExecutionUnit::MemoryManager::findSymbol
frame llvm#47: non-virtual thunk to lldb_private::IRExecutionUnit::MemoryManager::findSymbol
frame llvm#48: llvm::LinkingSymbolResolver::findSymbol
frame llvm#49: llvm::LegacyJITSymbolResolver::lookup
frame llvm#50: llvm::RuntimeDyldImpl::resolveExternalSymbols
frame llvm#51: llvm::RuntimeDyldImpl::resolveRelocations
frame llvm#52: llvm::MCJIT::finalizeLoadedModules
frame llvm#53: llvm::MCJIT::finalizeObject
frame llvm#54: lldb_private::IRExecutionUnit::ReportAllocations
frame llvm#55: lldb_private::IRExecutionUnit::GetRunnableInfo
frame llvm#56: lldb_private::ClangExpressionParser::PrepareForExecution
frame llvm#57: lldb_private::ClangUserExpression::TryParse
frame llvm#58: lldb_private::ClangUserExpression::Parse
```

Our solution is to simply iterate over a local copy of `m_map`.

**Testing**

* Confirmed on manual reproducer (would reproduce 100% of the time
  before the patch)

Differential Revision: https://reviews.llvm.org/D149949
RitanyaB pushed a commit that referenced this pull request Jan 4, 2024
The upstream test relies on jump-tables, which are lowered in
dramatically different ways with later arm64e/ptrauth patches.

Concretely, it's failing for at least two reasons:
- ptrauth removes x16/x17 from tcGPR64 to prevent indirect tail-calls
  from using either register as the callee, conflicting with their usage
  as scratch for the tail-call LR auth checking sequence.  In the
  1/2_available_regs_left tests, this causes the MI scheduler to move
  the load up across some of the inlineasm register clobbers.

- ptrauth adds an x16/x17-using pseudo for jump-table dispatch, which
  looks somewhat different from the regular jump-table dispatch codegen
  by itself, but also prevents compression currently.

They seem like sensible changes.  But they mean the tests aren't really
testing what they're intented to, because there's always an implicit
x16/x17 clobber when using jump-tables.

This updates the test in a way that should work identically regardless
of ptrauth support, with one exception, #1 above, which merely reorders
the load/inlineasm w.r.t. eachother.
I verified the tests still fail the live-reg assertions when
applicable.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants