Description
Describe the bug
HostAccessorDeadLockTest.CheckThreadOrder
unit test sporadically hangs when launched on M1 hardware.
To Reproduce
Just build the compiler and launch the test on M1 hardware. The hang is not stable and the test sometimes passes:
asachkov@Alexeys-MacBook-Pro build % ./tools/sycl/unittests/thread_safety/ThreadSafetyTests
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from HostAccessorDeadLockTest
[ RUN ] HostAccessorDeadLockTest.CheckThreadOrder
^C
asachkov@Alexeys-MacBook-Pro build % ./tools/sycl/unittests/thread_safety/ThreadSafetyTests
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from HostAccessorDeadLockTest
[ RUN ] HostAccessorDeadLockTest.CheckThreadOrder
[ OK ] HostAccessorDeadLockTest.CheckThreadOrder (1 ms)
[----------] 1 test from HostAccessorDeadLockTest (1 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (1 ms total)
[ PASSED ] 1 test.
asachkov@Alexeys-MacBook-Pro build % ./tools/sycl/unittests/thread_safety/ThreadSafetyTests
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from HostAccessorDeadLockTest
[ RUN ] HostAccessorDeadLockTest.CheckThreadOrder
^C
asachkov@Alexeys-MacBook-Pro build % ./tools/sycl/unittests/thread_safety/ThreadSafetyTests
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from HostAccessorDeadLockTest
[ RUN ] HostAccessorDeadLockTest.CheckThreadOrder
[ OK ] HostAccessorDeadLockTest.CheckThreadOrder (1 ms)
[----------] 1 test from HostAccessorDeadLockTest (1 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (2 ms total)
[ PASSED ] 1 test.
asachkov@Alexeys-MacBook-Pro build % ./tools/sycl/unittests/thread_safety/ThreadSafetyTests
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from HostAccessorDeadLockTest
[ RUN ] HostAccessorDeadLockTest.CheckThreadOrder
[ OK ] HostAccessorDeadLockTest.CheckThreadOrder (1 ms)
[----------] 1 test from HostAccessorDeadLockTest (1 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (1 ms total)
[ PASSED ] 1 test.
asachkov@Alexeys-MacBook-Pro build %
asachkov@Alexeys-MacBook-Pro build % ./tools/sycl/unittests/thread_safety/ThreadSafetyTests
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from HostAccessorDeadLockTest
[ RUN ] HostAccessorDeadLockTest.CheckThreadOrder
^C
asachkov@Alexeys-MacBook-Pro build % ./tools/sycl/unittests/thread_safety/ThreadSafetyTests
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from HostAccessorDeadLockTest
[ RUN ] HostAccessorDeadLockTest.CheckThreadOrder
[ OK ] HostAccessorDeadLockTest.CheckThreadOrder (1 ms)
[----------] 1 test from HostAccessorDeadLockTest (1 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (1 ms total)
[ PASSED ] 1 test.
asachkov@Alexeys-MacBook-Pro build % ./tools/sycl/unittests/thread_safety/ThreadSafetyTests
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from HostAccessorDeadLockTest
[ RUN ] HostAccessorDeadLockTest.CheckThreadOrder
^C
asachkov@Alexeys-MacBook-Pro build % ./tools/sycl/unittests/thread_safety/ThreadSafetyTests
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from HostAccessorDeadLockTest
[ RUN ] HostAccessorDeadLockTest.CheckThreadOrder
[ OK ] HostAccessorDeadLockTest.CheckThreadOrder (0 ms)
[----------] 1 test from HostAccessorDeadLockTest (0 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (0 ms total)
[ PASSED ] 1 test.
As you can see from the log above, it failed in ~50% cases for me. If you reduce the number of threads in the test down to two, it almost always passes (if not always).
Environment (please complete the following information):
- OS: MacOS 12.6
- Target device and vendor: Apple M1 Max
- DPC++ version: nearly the most up-to-date commit at the time of writing this
Additional context
Per-thread backtrace:
Thread 1:
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
* frame #0: 0x000000018d4b0834 libsystem_kernel.dylib`__ulock_wait + 8
frame #1: 0x000000018d4ee5a0 libsystem_pthread.dylib`_pthread_join + 444
frame #2: 0x000000018d447a14 libc++.1.dylib`std::__1::thread::join() + 36
frame #3: 0x00000001000050ac ThreadSafetyTests`(anonymous namespace)::HostAccessorDeadLockTest_CheckThr
eadOrder_Test::TestBody() + 1028
frame #4: 0x00000001000c5444 ThreadSafetyTests`testing::Test::Run() + 488
frame #5: 0x00000001000c62e4 ThreadSafetyTests`testing::TestInfo::Run() + 528
frame #6: 0x00000001000c6a48 ThreadSafetyTests`testing::TestSuite::Run() + 280
frame #7: 0x00000001000d3714 ThreadSafetyTests`testing::internal::UnitTestImpl::RunAllTests() + 1740
frame #8: 0x00000001000d2fd0 ThreadSafetyTests`testing::UnitTest::Run() + 124
frame #9: 0x00000001000c106c ThreadSafetyTests`main + 148
frame #10: 0x00000001001b908c dyld`start + 520
Thread 2:
thread #2
frame #0: 0x000000018d4b2270 libsystem_kernel.dylib`__psynch_cvwait + 8
frame #1: 0x000000018d4ec83c libsystem_pthread.dylib`_pthread_cond_wait + 1236
frame #2: 0x000000018d43b284 libc++.1.dylib`std::__1::condition_variable::wait(std::__1::unique_lock<st
d::__1::mutex>&) + 28
frame #3: 0x000000018d43df2c libc++.1.dylib`std::__1::__shared_mutex_base::lock_shared() + 92
frame #4: 0x000000010007c370 ThreadSafetyTests`sycl::_V1::detail::Scheduler::releaseHostAccessor(sycl::
_V1::detail::AccessorImplHost*) + 52
frame #5: 0x000000010000b67c ThreadSafetyTests`sycl::_V1::detail::AccessorImplHost::~AccessorImplHost()
+ 44
frame #6: 0x0000000100089fd4 ThreadSafetyTests`std::__1::__shared_ptr_pointer<sycl::_V1::detail::Access
orImplHost*, std::__1::shared_ptr<sycl::_V1::detail::AccessorImplHost>::__shared_ptr_default_delete<sycl::_
V1::detail::AccessorImplHost, sycl::_V1::detail::AccessorImplHost>, std::__1::allocator<sycl::_V1::detail::
AccessorImplHost> >::__on_zero_shared() + 20
frame #7: 0x0000000100005928 ThreadSafetyTests`void* std::__1::__thread_proxy<std::__1::tuple<std::__1:
:unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, (anonymous na
mespace)::HostAccessorDeadLockTest_CheckThreadOrder_Test::TestBody()::$_0, unsigned long> >(void*) + 392
frame #8: 0x000000018d4ec26c libsystem_pthread.dylib`_pthread_start + 148
Thread 3:
thread #3
frame #0: 0x000000018d4b2270 libsystem_kernel.dylib`__psynch_cvwait + 8
frame #1: 0x000000018d4ec83c libsystem_pthread.dylib`_pthread_cond_wait + 1236
frame #2: 0x000000018d43b284 libc++.1.dylib`std::__1::condition_variable::wait(std::__1::unique_lock<st
d::__1::mutex>&) + 28
frame #3: 0x000000018d43de18 libc++.1.dylib`std::__1::__shared_mutex_base::lock() + 100
frame #4: 0x000000010007c0d0 ThreadSafetyTests`sycl::_V1::detail::Scheduler::addHostAccessor(sycl::_V1:
:detail::AccessorImplHost*) + 60
frame #5: 0x000000010000b748 ThreadSafetyTests`sycl::_V1::detail::addHostAccessorAndWait(sycl::_V1::det
ail::AccessorImplHost*) + 36
frame #6: 0x0000000100005cd4 ThreadSafetyTests`sycl::_V1::accessor<unsigned long, 1, (sycl::_V1::access
::mode)1026, (sycl::_V1::access::target)2018, (sycl::_V1::access::placeholder)0, sycl::_V1::ext::oneapi::ac
cessor_property_list<> >::accessor<unsigned long, 1, sycl::_V1::detail::aligned_allocator<unsigned long>, v
oid>(sycl::_V1::buffer<unsigned long, 1, sycl::_V1::detail::aligned_allocator<unsigned long>, std::__1::ena
ble_if<(((1) > (0))) && ((1) <= (3)), void>::type>&, sycl::_V1::property_list const&, sycl::_V1::detail::co
de_location) + 488
frame #7: 0x0000000100005a4c ThreadSafetyTests`sycl::_V1::accessor<unsigned long, 1, (sycl::_V1::access
::mode)1026, (sycl::_V1::access::target)2018, (sycl::_V1::access::placeholder)0, sycl::_V1::ext::oneapi::ac
cessor_property_list<> > sycl::_V1::buffer<unsigned long, 1, sycl::_V1::detail::aligned_allocator<unsigned
long>, void>::get_access<(sycl::_V1::access::mode)1026>(sycl::_V1::detail::code_location) + 64
frame #8: 0x0000000100005888 ThreadSafetyTests`void* std::__1::__thread_proxy<std::__1::tuple<std::__1:
:unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, (anonymous na
mespace)::HostAccessorDeadLockTest_CheckThreadOrder_Test::TestBody()::$_0, unsigned long> >(void*) + 232
frame #9: 0x000000018d4ec26c libsystem_pthread.dylib`_pthread_start + 148
Thread 4:
thread #4
frame #0: 0x0000000100060508 ThreadSafetyTests`sycl::_V1::detail::Command::enqueue(sycl::_V1::detail::E
nqueueResultT&, sycl::_V1::detail::BlockingT, std::__1::vector<sycl::_V1::detail::Command*, std::__1::alloc
ator<sycl::_V1::detail::Command*> >&) + 476
frame #1: 0x000000010007e900 ThreadSafetyTests`sycl::_V1::detail::Scheduler::GraphProcessor::enqueueCom
mand(sycl::_V1::detail::Command*, sycl::_V1::detail::EnqueueResultT&, std::__1::vector<sycl::_V1::detail::C
ommand*, std::__1::allocator<sycl::_V1::detail::Command*> >&, sycl::_V1::detail::BlockingT) + 272
frame #2: 0x000000010007e704 ThreadSafetyTests`sycl::_V1::detail::Scheduler::GraphProcessor::waitForEve
nt(std::__1::shared_ptr<sycl::_V1::detail::event_impl>, std::__1::shared_lock<std::__1::shared_timed_mutex>
&, std::__1::vector<sycl::_V1::detail::Command*, std::__1::allocator<sycl::_V1::detail::Command*> >&, bool)
+ 76
frame #3: 0x000000010007bbc0 ThreadSafetyTests`sycl::_V1::detail::Scheduler::waitForEvent(std::__1::sha
red_ptr<sycl::_V1::detail::event_impl>) + 84
frame #4: 0x0000000100035d88 ThreadSafetyTests`sycl::_V1::detail::event_impl::wait(std::__1::shared_ptr
<sycl::_V1::detail::event_impl>) + 476
frame #5: 0x000000010000b768 ThreadSafetyTests`sycl::_V1::detail::addHostAccessorAndWait(sycl::_V1::det
ail::AccessorImplHost*) + 68
frame #6: 0x0000000100005cd4 ThreadSafetyTests`sycl::_V1::accessor<unsigned long, 1, (sycl::_V1::access
::mode)1026, (sycl::_V1::access::target)2018, (sycl::_V1::access::placeholder)0, sycl::_V1::ext::oneapi::ac
cessor_property_list<> >::accessor<unsigned long, 1, sycl::_V1::detail::aligned_allocator<unsigned long>, v
oid>(sycl::_V1::buffer<unsigned long, 1, sycl::_V1::detail::aligned_allocator<unsigned long>, std::__1::ena
ble_if<(((1) > (0))) && ((1) <= (3)), void>::type>&, sycl::_V1::property_list const&, sycl::_V1::detail::co
de_location) + 488
frame #7: 0x0000000100005a4c ThreadSafetyTests`sycl::_V1::accessor<unsigned long, 1, (sycl::_V1::access
::mode)1026, (sycl::_V1::access::target)2018, (sycl::_V1::access::placeholder)0, sycl::_V1::ext::oneapi::ac
cessor_property_list<> > sycl::_V1::buffer<unsigned long, 1, sycl::_V1::detail::aligned_allocator<unsigned
long>, void>::get_access<(sycl::_V1::access::mode)1026>(sycl::_V1::detail::code_location) + 64
frame #8: 0x0000000100005888 ThreadSafetyTests`void* std::__1::__thread_proxy<std::__1::tuple<std::__1:
:unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, (anonymous na
mespace)::HostAccessorDeadLockTest_CheckThreadOrder_Test::TestBody()::$_0, unsigned long> >(void*) + 232
frame #9: 0x000000018d4ec26c libsystem_pthread.dylib`_pthread_start + 148
So it looks like at least two of those threads are blocked on the following locks:
We already had this problem reported and fixed in #889, but the lock was returned back (with additional changes) in #1597 and it was later refactored to operated on separate read/write locks in #2292