TiFlash crash on releasing snapshot #6303

Closed
JaySon-Huang opened this issue Nov 14, 2022 · 5 comments

@JaySon-Huang
Contributor

JaySon-Huang commented Nov 14, 2022

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

After running ALTER TABLE xxx COMPACT TIFLASH REPLICA, TiFlash (v6.1.0) crashes with the stack traces below.

2. What did you expect to see? (Required)

3. What did you see instead (Required)

[2022/11/13 18:42:10.426 +08:00] [ERROR] [BaseDaemon.cpp:377] [BaseDaemon:########################################] [thread_id=57]
[2022/11/13 18:42:10.427 +08:00] [ERROR] [BaseDaemon.cpp:378] ["BaseDaemon:(from thread 34) Received signal Segmentation fault(11)."] [thread_id=57]
[2022/11/13 18:42:10.427 +08:00] [ERROR] [BaseDaemon.cpp:408] ["BaseDaemon:Address: 0xb32ed"] [thread_id=57]
[2022/11/13 18:42:10.427 +08:00] [ERROR] [BaseDaemon.cpp:414] ["BaseDaemon:Access: read."] [thread_id=57]
[2022/11/13 18:42:10.427 +08:00] [ERROR] [BaseDaemon.cpp:423] ["BaseDaemon:Address not mapped to object."] [thread_id=57]
[2022/11/13 18:42:12.720 +08:00] [ERROR] [BaseDaemon.cpp:570] ["BaseDaemon:
       0x1ed2681    faultSignalHandler(int, siginfo_t*, void*) [tiflash+32319105]
                    libs/libdaemon/src/BaseDaemon.cpp:221
  0x7fc13a0875d0    <unknown symbol> [libpthread.so.0+62928]
       0x76fc3a6    void std::__1::__hash_table<std::__1::__hash_value_type<unsigned long, unsigned long>, std::__1::__unordered_map_hasher<unsigned long, std::__1::__hash_value_type<unsigned long, unsigned long>, std::__1::hash<unsigned long>, std::__1::equal_to<unsigned long>, true>, std::__1::__unordered_map_equal<unsigned long, std::__1::__hash_value_type<unsigned long, unsigned long>, std::__1::equal_to<unsigned long>, std::__1::hash<unsigned long>, true>, std::__1::allocator<std::__1::__hash_value_type<unsigned long, unsigned long> > >::__assign_multi<std::__1::__hash_const_iterator<std::__1::__hash_node<std::__1::__hash_value_type<unsigned long, unsigned long>, void*>*> >(std::__1::__hash_const_iterator<std::__1::__hash_node<std::__1::__hash_value_type<unsigned long, unsigned long>, void*>*>, std::__1::__hash_const_iterator<std::__1::__hash_node<std::__1::__hash_value_type<unsigned long, unsigned long>, void*>*>) [tiflash+124765094]
                    /usr/local/bin/../include/c++/v1/__hash_table:1769
       0x7a1b9ba    DB::PS::V2::PageEntriesForDelta::compactDeltaAndBase(std::__1::shared_ptr<DB::PS::V2::PageEntriesForDelta> const&, std::__1::shared_ptr<DB::PS::V2::PageEntriesForDelta> const&) [tiflash+128039354]
                    dbms/src/Storages/Page/V2/PageEntries.h:578
       0x7a1946b    DB::PS::V2::PageEntriesVersionSetWithDelta::compactOnDeltaRelease(std::__1::shared_ptr<DB::PS::V2::PageEntriesForDelta>) [tiflash+128029803]
                    dbms/src/Storages/Page/V2/VersionSet/PageEntriesVersionSetWithDelta.cpp:214
       0x7a1f4da    DB::PS::V2::PageEntriesVersionSetWithDelta::Snapshot::~Snapshot() [tiflash+128054490]
                    dbms/src/Storages/Page/V2/VersionSet/PageEntriesVersionSetWithDelta.h:143
       0x78aca6f    DB::DM::DeltaValueSpace::compact(DB::DM::DMContext&) [tiflash+126536303]
                    dbms/src/Storages/DeltaMerge/Delta/DeltaValueSpace.cpp:270
       0x77d3f1a    DB::DM::Segment::compactDelta(DB::DM::DMContext&) [tiflash+125648666]
                    dbms/src/Storages/DeltaMerge/Segment.cpp:1338
       0x7799d54    DB::DM::DeltaMergeStore::handleBackgroundTask(bool) [tiflash+125410644]
                    dbms/src/Storages/DeltaMerge/DeltaMergeStore.cpp:1607
       0x76d16e1    DB::BackgroundProcessingPool::threadFunction() [tiflash+124589793]
                    dbms/src/Storages/BackgroundProcessingPool.cpp:233
       0x76d2061    void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, DB::BackgroundProcessingPool::BackgroundProcessingPool(int)::$_2> >(void*) [tiflash+124592225]
                    /usr/local/bin/../include/c++/v1/thread:291
  0x7fc13a07fdd5    start_thread [libpthread.so.0+32213]"] [thread_id=57]

[2022/11/13 18:42:12.721 +08:00] [ERROR] [BaseDaemon.cpp:406] ["BaseDaemon:Address: NULL pointer."] [thread_id=57]
[2022/11/13 18:42:12.721 +08:00] [ERROR] [BaseDaemon.cpp:414] ["BaseDaemon:Access: read."] [thread_id=57]
[2022/11/13 18:42:12.721 +08:00] [ERROR] [BaseDaemon.cpp:426] ["BaseDaemon:Unknown si_code."] [thread_id=57]
[2022/11/13 18:42:12.721 +08:00] [ERROR] [BaseDaemon.cpp:369] ["BaseDaemon:(from thread 19) Terminate called after throwing an instance of std::length_error what(): vector Stack trace:
       0x1d272f3    StackTrace::StackTrace() [tiflash+30569203]
                    dbms/src/Common/StackTrace.cpp:23
       0x1ed22af    terminate_handler() [tiflash+32318127]
                    libs/libdaemon/src/BaseDaemon.cpp:634
  0x7fc13dee1a13    std::__terminate(void (*)()) [libc++abi.so.1+236051]
  0x7fc13dee19b8    std::terminate() [libc++abi.so.1+235960]
       0x1d10ebb    __clang_call_terminate [tiflash+30478011]
       0x7a1f5c7    DB::PS::V2::PageEntriesVersionSetWithDelta::Snapshot::~Snapshot() [tiflash+128054727]
                    dbms/src/Storages/Page/V2/VersionSet/PageEntriesVersionSetWithDelta.h:149
       0x78aca6f    DB::DM::DeltaValueSpace::compact(DB::DM::DMContext&) [tiflash+126536303]
                    dbms/src/Storages/DeltaMerge/Delta/DeltaValueSpace.cpp:270
       0x77d3f1a    DB::DM::Segment::compactDelta(DB::DM::DMContext&) [tiflash+125648666]
                    dbms/src/Storages/DeltaMerge/Segment.cp"] [thread_id=57]
[2022/11/13 18:42:12.721 +08:00] [ERROR] [BaseDaemon.cpp:377] [BaseDaemon:########################################] [thread_id=57]
[2022/11/13 18:42:12.721 +08:00] [ERROR] [BaseDaemon.cpp:378] ["BaseDaemon:(from thread 19) Received signal Aborted(6)."] [thread_id=57]
[2022/11/13 18:42:12.722 +08:00] [ERROR] [BaseDaemon.cpp:369] ["BaseDaemon:(from thread 27) Terminate called after throwing an instance of std::length_error what(): vector Stack trace:
       0x1d272f3    StackTrace::StackTrace() [tiflash+30569203]
                    dbms/src/Common/StackTrace.cpp:23
       0x1ed22af    terminate_handler() [tiflash+32318127]
                    libs/libdaemon/src/BaseDaemon.cpp:634
  0x7fc13dee1a13    std::__terminate(void (*)()) [libc++abi.so.1+236051]
  0x7fc13dee19b8    std::terminate() [libc++abi.so.1+235960]
       0x1d10ebb    __clang_call_terminate [tiflash+30478011]
       0x7a1f5c7    DB::PS::V2::PageEntriesVersionSetWithDelta::Snapshot::~Snapshot() [tiflash+128054727]
                    dbms/src/Storages/Page/V2/VersionSet/PageEntriesVersionSetWithDelta.h:149
       0x78aca6f    DB::DM::DeltaValueSpace::compact(DB::DM::DMContext&) [tiflash+126536303]
                    dbms/src/Storages/DeltaMerge/Delta/DeltaValueSpace.cpp:270
       0x77d3f1a    DB::DM::Segment::compactDelta(DB::DM::DMContext&) [tiflash+125648666]
                    dbms/src/Storages/DeltaMerge/Segment.cp"] [thread_id=57]
[2022/11/13 18:42:12.722 +08:00] [ERROR] [BaseDaemon.cpp:377] [BaseDaemon:########################################] [thread_id=57]

4. What is your TiFlash version? (Required)

v6.1.0

JaySon-Huang added the type/bug label Nov 14, 2022
JaySon-Huang self-assigned this Nov 14, 2022
@JaySon-Huang
Contributor Author

New error stack traces

[2022/11/14 23:02:53.339 +08:00] [ERROR] [BaseDaemon.cpp:377] [BaseDaemon:########################################] [thread_id=56]
[2022/11/14 23:02:53.339 +08:00] [ERROR] [BaseDaemon.cpp:378] ["BaseDaemon:(from thread 15) Received signal Segmentation fault(11)."] [thread_id=56]
[2022/11/14 23:02:53.339 +08:00] [ERROR] [BaseDaemon.cpp:408] ["BaseDaemon:Address: 0x13536f"] [thread_id=56]
[2022/11/14 23:02:53.339 +08:00] [ERROR] [BaseDaemon.cpp:414] ["BaseDaemon:Access: read."] [thread_id=56]
[2022/11/14 23:02:53.339 +08:00] [ERROR] [BaseDaemon.cpp:423] ["BaseDaemon:Address not mapped to object."] [thread_id=56]
[2022/11/14 23:02:55.507 +08:00] [ERROR] [BaseDaemon.cpp:570] ["BaseDaemon:
       0x1ed2681    faultSignalHandler(int, siginfo_t*, void*) [tiflash+32319105]
                    libs/libdaemon/src/BaseDaemon.cpp:221
  0x7feadf68d5d0    <unknown symbol> [libpthread.so.0+62928]
       0x1e5f65f    std::__1::__hash_table<std::__1::__hash_value_type<unsigned long, unsigned long>, std::__1::__unordered_map_hasher<unsigned long, std::__1::__hash_value_type<unsigned long, unsigned long>, std::__1::hash<unsigned long>, std::__1::equal_to<unsigned long>, true>, std::__1::__unordered_map_equal<unsigned long, std::__1::__hash_value_type<unsigned long, unsigned long>, std::__1::equal_to<unsigned long>, std::__1::hash<unsigned long>, true>, std::__1::allocator<std::__1::__hash_value_type<unsigned long, unsigned long> > >::__rehash(unsigned long) [tiflash+31848031]
                    /usr/local/bin/../include/c++/v1/__hash_table:2374
       0x76fc76f    std::__1::__hash_table<std::__1::__hash_value_type<unsigned long, unsigned long>, std::__1::__unordered_map_hasher<unsigned long, std::__1::__hash_value_type<unsigned long, unsigned long>, std::__1::hash<unsigned long>, std::__1::equal_to<unsigned long>, true>, std::__1::__unordered_map_equal<unsigned long, std::__1::__hash_value_type<unsigned long, unsigned long>, std::__1::equal_to<unsigned long>, std::__1::hash<unsigned long>, true>, std::__1::allocator<std::__1::__hash_value_type<unsigned long, unsigned long> > >::__node_insert_multi_prepare(unsigned long, std::__1::__hash_value_type<unsigned long, unsigned long>&) [tiflash+124766063]
                    /usr/local/bin/../include/c++/v1/__hash_table:1944
       0x76fc46e    std::__1::__hash_table<std::__1::__hash_value_type<unsigned long, unsigned long>, std::__1::__unordered_map_hasher<unsigned long, std::__1::__hash_value_type<unsigned long, unsigned long>, std::__1::hash<unsigned long>, std::__1::equal_to<unsigned long>, true>, std::__1::__unordered_map_equal<unsigned long, std::__1::__hash_value_type<unsigned long, unsigned long>, std::__1::equal_to<unsigned long>, std::__1::hash<unsigned long>, true>, std::__1::allocator<std::__1::__hash_value_type<unsigned long, unsigned long> > >::__node_insert_multi(std::__1::__hash_node<std::__1::__hash_value_type<unsigned long, unsigned long>, void*>*) [tiflash+124765294]
                    /usr/local/bin/../include/c++/v1/__hash_table:2017
       0x76fc3ca    void std::__1::__hash_table<std::__1::__hash_value_type<unsigned long, unsigned long>, std::__1::__unordered_map_hasher<unsigned long, std::__1::__hash_value_type<unsigned long, unsigned long>, std::__1::hash<unsigned long>, std::__1::equal_to<unsigned long>, true>, std::__1::__unordered_map_equal<unsigned long, std::__1::__hash_value_type<unsigned long, unsigned long>, std::__1::equal_to<unsigned long>, std::__1::hash<unsigned long>, true>, std::__1::allocator<std::__1::__hash_value_type<unsigned long, unsigned long> > >::__assign_multi<std::__1::__hash_const_iterator<std::__1::__hash_node<std::__1::__hash_value_type<unsigned long, unsigned long>, void*>*> >(std::__1::__hash_const_iterator<std::__1::__hash_node<std::__1::__hash_value_type<unsigned long, unsigned long>, void*>*>, std::__1::__hash_const_iterator<std::__1::__hash_node<std::__1::__hash_value_type<unsigned long, unsigned long>, void*>*>) [tiflash+124765130]
                    /usr/local/bin/../include/c++/v1/__hash_table:1769
       0x7a1b9ba    DB::PS::V2::PageEntriesForDelta::compactDeltaAndBase(std::__1::shared_ptr<DB::PS::V2::PageEntriesForDelta> const&, std::__1::shared_ptr<DB::PS::V2::PageEntriesForDelta> const&) [tiflash+128039354]
                    dbms/src/Storages/Page/V2/PageEntries.h:578
       0x7a1946b    DB::PS::V2::PageEntriesVersionSetWithDelta::compactOnDeltaRelease(std::__1::shared_ptr<DB::PS::V2::PageEntriesForDelta>) [tiflash+128029803]
                    dbms/src/Storages/Page/V2/VersionSet/PageEntriesVersionSetWithDelta.cpp:214
       0x7a1f4da    DB::PS::V2::PageEntriesVersionSetWithDelta::Snapshot::~Snapshot() [tiflash+128054490]
                    dbms/src/Storages/Page/V2/VersionSet/PageEntriesVersionSetWithDelta.h:143
       0x78aca6f    DB::DM::DeltaValueSpace::compact(DB::DM::DMContext&) [tiflash+126536303]
                    dbms/src/Storages/DeltaMerge/Delta/DeltaValueSpace.cpp:270
       0x77d3f1a    DB::DM::Segment::compactDelta(DB::DM::DMContext&) [tiflash+125648666]
                    dbms/src/Storages/DeltaMerge/Segment.cpp:1338
       0x7799d54    DB::DM::DeltaMergeStore::handleBackgroundTask(bool) [tiflash+125410644]
                    dbms/src/Storages/DeltaMerge/DeltaMergeStore.cpp:1607
       0x76d16e1    DB::BackgroundProcessingPool::threadFunction() [tiflash+124589793]
                    dbms/src/Storages/BackgroundProcessingPool.cpp:233
       0x76d2061    void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, DB::BackgroundProcessingPool::BackgroundProcessingPool(int)::$_2> >(void*) [tiflash+124592225]
                    /usr/local/bin/../include/c++/v1/thread:291
  0x7feadf685dd5    start_thread [libpthread.so.0+32213]"] [thread_id=56]

[2022/11/14 23:02:55.507 +08:00] [ERROR] [BaseDaemon.cpp:377] [BaseDaemon:########################################] [thread_id=56]
[2022/11/14 23:02:55.507 +08:00] [ERROR] [BaseDaemon.cpp:378] ["BaseDaemon:(from thread 28) Received signal Segmentation fault(11)."] [thread_id=56]
[2022/11/14 23:02:55.507 +08:00] [ERROR] [BaseDaemon.cpp:408] ["BaseDaemon:Address: 0x3912"] [thread_id=56]
[2022/11/14 23:02:55.507 +08:00] [ERROR] [BaseDaemon.cpp:414] ["BaseDaemon:Access: read."] [thread_id=56]
[2022/11/14 23:02:55.507 +08:00] [ERROR] [BaseDaemon.cpp:423] ["BaseDaemon:Address not mapped to object."] [thread_id=56]
[2022/11/14 23:43:27.964 +08:00] [ERROR] [BaseDaemon.cpp:570] ["BaseDaemon:
       0x1ed2681    faultSignalHandler(int, siginfo_t*, void*) [tiflash+32319105]
                    libs/libdaemon/src/BaseDaemon.cpp:221
  0x7f128fd6a5d0    <unknown symbol> [libpthread.so.0+62928]
       0x79e9339    std::__1::unordered_map<unsigned long, unsigned long, std::__1::hash<unsigned long>, std::__1::equal_to<unsigned long>, std::__1::allocator<std::__1::pair<unsigned long const, unsigned long> > >::erase(unsigned long const&) [tiflash+127832889]
                    /usr/local/bin/../include/c++/v1/unordered_map:1272
       0x7a1c655    DB::PS::V2::PageEntriesForDelta::merge(DB::PS::V2::PageEntriesForDelta&) [tiflash+128042581]
                    dbms/src/Storages/Page/V2/PageEntries.h:625
       0x7a1ba20    DB::PS::V2::PageEntriesForDelta::compactDeltaAndBase(std::__1::shared_ptr<DB::PS::V2::PageEntriesForDelta> const&, std::__1::shared_ptr<DB::PS::V2::PageEntriesForDelta> const&) [tiflash+128039456]
                    dbms/src/Storages/Page/V2/PageEntries.h:580
       0x7a1946b    DB::PS::V2::PageEntriesVersionSetWithDelta::compactOnDeltaRelease(std::__1::shared_ptr<DB::PS::V2::PageEntriesForDelta>) [tiflash+128029803]
                    dbms/src/Storages/Page/V2/VersionSet/PageEntriesVersionSetWithDelta.cpp:214
       0x7a1f4da    DB::PS::V2::PageEntriesVersionSetWithDelta::Snapshot::~Snapshot() [tiflash+128054490]
                    dbms/src/Storages/Page/V2/VersionSet/PageEntriesVersionSetWithDelta.h:143
       0x78aca6f    DB::DM::DeltaValueSpace::compact(DB::DM::DMContext&) [tiflash+126536303]
                    dbms/src/Storages/DeltaMerge/Delta/DeltaValueSpace.cpp:270
       0x77d3f1a    DB::DM::Segment::compactDelta(DB::DM::DMContext&) [tiflash+125648666]
                    dbms/src/Storages/DeltaMerge/Segment.cpp:1338
       0x7799d54    DB::DM::DeltaMergeStore::handleBackgroundTask(bool) [tiflash+125410644]
                    dbms/src/Storages/DeltaMerge/DeltaMergeStore.cpp:1607
       0x76d16e1    DB::BackgroundProcessingPool::threadFunction() [tiflash+124589793]
                    dbms/src/Storages/BackgroundProcessingPool.cpp:233
       0x76d2061    void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, DB::BackgroundProcessingPool::BackgroundProcessingPool(int)::$_2> >(void*) [tiflash+124592225]
                    /usr/local/bin/../include/c++/v1/thread:291
  0x7f128fd62dd5    start_thread [libpthread.so.0+32213]"] [thread_id=56]

@lidezhu
Contributor

lidezhu commented Nov 15, 2022

There are also some stack traces that are not related to releasing a TiFlash snapshot:
[2022/11/14 23:30:23.045 +08:00] [ERROR] [BaseDaemon.cpp:377] [BaseDaemon:########################################] [thread_id=56]
[2022/11/14 23:30:23.045 +08:00] [ERROR] [BaseDaemon.cpp:378] ["BaseDaemon:(from thread 57) Received signal Segmentation fault(11)."] [thread_id=56]
[2022/11/14 23:30:23.045 +08:00] [ERROR] [BaseDaemon.cpp:406] ["BaseDaemon:Address: NULL pointer."] [thread_id=56]
[2022/11/14 23:30:23.045 +08:00] [ERROR] [BaseDaemon.cpp:414] ["BaseDaemon:Access: read."] [thread_id=56]
[2022/11/14 23:30:23.045 +08:00] [ERROR] [BaseDaemon.cpp:426] ["BaseDaemon:Unknown si_code."] [thread_id=56]
[2022/11/14 23:30:23.052 +08:00] [ERROR] [BaseDaemon.cpp:570] ["BaseDaemon:
0x1ed2681\tfaultSignalHandler(int, siginfo_t*, void*) [tiflash+32319105]
\tlibs/libdaemon/src/BaseDaemon.cpp:221
0x7f89142525d0\t [libpthread.so.0+62928]
0x7f89176491ef\trocksdb::MemTable::KeyComparator::operator()(char const*, rocksdb::Slice const&) const [libtiflash_proxy.so+44941807]
0x7f8917647584\trocksdb::DoublySkipList<rocksdb::MemTableRep::KeyComparator const&>::InsertPrevListCAS(rocksdb::DoublySkipList<rocksdb::MemTableRep::KeyComparator const&>::Node*, rocksdb::DoublySkipList<rocksdb::MemTableRep::KeyComparator const&>::Splice*, rocksdb::Slice const&) [libtiflash_proxy.so+44934532]
0x7f8917647292\tbool rocksdb::DoublySkipList<rocksdb::MemTableRep::KeyComparator const&>::Insert(char const*, rocksdb::DoublySkipList<rocksdb::MemTableRep::KeyComparator const&>::Splice*, bool) [libtiflash_proxy.so+44933778]
0x7f8917646477\trocksdb::(anonymous namespace)::SkipListReprocksdb::DoublySkipList::InsertKeyConcurrently(void*) [libtiflash_proxy.so+44930167]
0x7f8917649a15\trocksdb::MemTable::Add(unsigned long, rocksdb::ValueType, rocksdb::Slice const&, rocksdb::Slice const&, bool, rocksdb::MemTablePostProcessInfo*) [libtiflash_proxy.so+44943893]
0x7f8917790b08\trocksdb::MemTableInserter::PutCFImpl(unsigned int, rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::ValueType) [libtiflash_proxy.so+46283528]
0x7f891778ee04\trocksdb::MemTableInserter::PutCF(unsigned int, rocksdb::Slice const&, rocksdb::Slice const&) [libtiflash_proxy.so+46276100]
0x7f891778ab12\trocksdb::WriteBatch::Iterate(rocksdb::WriteBatch::Handler*) const [libtiflash_proxy.so+46258962]
0x7f891778e258\trocksdb::WriteBatchInternal::InsertInto(rocksdb::WriteThread::Writer*, unsigned long, rocksdb::ColumnFamilyMemTables*, rocksdb::FlushScheduler*, bool, unsigned long, rocksdb::DB*, bool, bool, unsigned long, bool) [libtiflash_proxy.so+46273112]
0x7f89178a1658\trocksdb::DBImpl::PipelinedWriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*) [libtiflash_proxy.so+47400536]
0x7f891789b858\trocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*) [libtiflash_proxy.so+47376472]
0x7f891789b328\trocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*) [libtiflash_proxy.so+47375144]
0x7f8917617a67\tcrocksdb_write [libtiflash_proxy.so+44739175]
0x7f89159af842\tengine_rocks::raft_engine::$LT$impl$u20$engine_traits..raft_engine..RaftEngine$u20$for$u20$engine_rocks..engine..RocksEngine$GT$::consume::hf6ee5ed393aad2a3 [libtiflash_proxy.so+14952514]
0x7f89159af94d\tengine_rocks::raft_engine::$LT$impl$u20$engine_traits..raft_engine..RaftEngine$u20$for$u20$engine_rocks..engine..RocksEngine$GT$::consume_and_shrink::hfb711f9c27030b7e [libtiflash_proxy.so+14952781]
0x7f89167f1ba2\traftstore::store::async_io::write::Worker$LT$EK$C$ER$C$N$C$T$GT$::write_to_db::hdb8b8f65063fed69 [libtiflash_proxy.so+29903778]
0x7f89163399ec\tbatch_system::batch::Poller$LT$N$C$C$C$Handler$GT$::poll::hd6ab13beb1469de7 [libtiflash_proxy.so+24955372]
0x7f89169effbe\tstd::sys_common::backtrace::__rust_begin_short_backtrace::h716f8f3468ff30a3 [libtiflash_proxy.so+31993790]
0x7f891622a0a7\tcore::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::he4699ce7755d8105 [libtiflash_proxy.so+23842983]
0x7f8916c00fca\tstd::sys::unix::thread::Thread::new::thread_start::hd39c5f08bdcda277 [libtiflash_proxy.so+34160586]
0x7f891424add5\tstart_thread [libpthread.so.0+32213]"] [thread_id=56]
[2022/11/14 23:30:23.052 +08:00] [ERROR] [BaseDaemon.cpp:377] [BaseDaemon:########################################] [thread_id=56]

@JinheLin
Contributor

JinheLin commented Nov 15, 2022

[2022/11/13 18:59:22.741 +08:00] [ERROR] [BaseDaemon.cpp:377] [BaseDaemon:########################################] [thread_id=50]
[2022/11/13 18:59:22.741 +08:00] [ERROR] [BaseDaemon.cpp:378] ["BaseDaemon:(from thread 63) Received signal Segmentation fault(11)."] [thread_id=50]
[2022/11/13 18:59:22.741 +08:00] [ERROR] [BaseDaemon.cpp:406] ["BaseDaemon:Address: NULL pointer."] [thread_id=50]
[2022/11/13 18:59:22.741 +08:00] [ERROR] [BaseDaemon.cpp:414] ["BaseDaemon:Access: read."] [thread_id=50]
[2022/11/13 18:59:22.741 +08:00] [ERROR] [BaseDaemon.cpp:426] ["BaseDaemon:Unknown si_code."] [thread_id=50]
[2022/11/13 18:59:22.741 +08:00] [ERROR] [BaseDaemon.cpp:570] ["BaseDaemon:\n 0x1ed2681\tfaultSignalHandler(int, siginfo_t*, void*) [tiflash+32319105]\n \tlibs/libdaemon/src/BaseDaemon.cpp:221\n 0x7f49c2baf5d0\t [libpthread.so.0+62928]\n 0x7f49c63d982f\tgrpc_chttp2_data_parser_parse(void*, grpc_chttp2_transport*, grpc_chttp2_stream*, grpc_slice const&, int) [libtiflash_proxy.so+49346607]\n 0x7f49c63ee990\tparse_frame_slice(grpc_chttp2_transport*, grpc_slice const&, int) [libtiflash_proxy.so+49432976]\n 0x7f49c63ee7d1\tgrpc_chttp2_perform_read(grpc_chttp2_transport*, grpc_slice const&) [libtiflash_proxy.so+49432529]\n 0x7f49c63d2123\tread_action_locked(void*, grpc_error*) [libtiflash_proxy.so+49316131]\n 0x7f49c6408ec3\tgrpc_combiner_continue_exec_ctx() [libtiflash_proxy.so+49540803]\n 0x7f49c640bb4d\tgrpc_core::ExecCtx::Flush() [libtiflash_proxy.so+49552205]\n 0x7f49c64117ac\tpollset_work(grpc_pollset*, grpc_pollset_worker**, long) [libtiflash_proxy.so+49575852]\n 0x7f49c6448550\tcq_next(grpc_completion_queue*, gpr_timespec, void*) [libtiflash_proxy.so+49800528]\n 0x7f49c4461db0\tstd::sys_common::backtrace::__rust_begin_short_backtrace::h7ae75534452a0898 [libtiflash_proxy.so+16350640]\n 0x7f49c4450b3e\tcore::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::ha650122bff0c6cc4 [libtiflash_proxy.so+16280382]\n 0x7f49c555dfca\tstd::sys::unix::thread::Thread::new::thread_start::hd39c5f08bdcda277 [libtiflash_proxy.so+34160586]\n 0x7f49c2ba7dd5\tstart_thread [libpthread.so.0+32213]"] [thread_id=50]
[2022/11/13 18:59:23.683 +08:00] [ERROR] [BaseDaemon.cpp:377] [BaseDaemon:########################################] [thread_id=50]
[2022/11/13 18:59:23.684 +08:00] [ERROR] [BaseDaemon.cpp:378] ["BaseDaemon:(from thread 32) Received signal Segmentation fault(11)."] [thread_id=50]
[2022/11/13 18:59:23.684 +08:00] [ERROR] [BaseDaemon.cpp:406] ["BaseDaemon:Address: NULL pointer."] [thread_id=50]
[2022/11/13 18:59:23.684 +08:00] [ERROR] [BaseDaemon.cpp:414] ["BaseDaemon:Access: read."] [thread_id=50]
[2022/11/13 18:59:23.684 +08:00] [ERROR] [BaseDaemon.cpp:426] ["BaseDaemon:Unknown si_code."] [thread_id=50]
[2022/11/13 18:59:23.690 +08:00] [ERROR] [BaseDaemon.cpp:570] ["BaseDaemon:\n 0x1ed2681\tfaultSignalHandler(int, siginfo_t*, void*) [tiflash+32319105]\n \tlibs/libdaemon/src/BaseDaemon.cpp:221\n 0x7f49c2baf5d0\t [libpthread.so.0+62928]\n 0x7953692\tstd::__1::__tree<DB::PS::V2::PageFile, DB::PS::V2::PageFile::Comparator, std::__1::allocatorDB::PS::V2::PageFile >::destroy(std::__1::__tree_node<DB::PS::V2::PageFile, void*>) [tiflash+127219346]\n \t/usr/local/bin/../include/c++/v1/__tree:0\n 0x795369a\tstd::__1::__tree<DB::PS::V2::PageFile, DB::PS::V2::PageFile::Comparator, std::__1::allocatorDB::PS::V2::PageFile >::destroy(std::__1::__tree_node<DB::PS::V2::PageFile, void>) [tiflash+127219354]\n \t/usr/local/bin/../include/c++/v1/__tree:1800\n 0x795369a\tstd::__1::__tree<DB::PS::V2::PageFile, DB::PS::V2::PageFile::Comparator, std::__1::allocatorDB::PS::V2::PageFile >::destroy(std::__1::__tree_node<DB::PS::V2::PageFile, void>) [tiflash+127219354]\n \t/usr/local/bin/../include/c++/v1/__tree:1800\n 0x795369a\tstd::__1::__tree<DB::PS::V2::PageFile, DB::PS::V2::PageFile::Comparator, std::__1::allocatorDB::PS::V2::PageFile >::destroy(std::__1::__tree_node<DB::PS::V2::PageFile, void>) [tiflash+127219354]\n \t/usr/local/bin/../include/c++/v1/__tree:1800\n 0x795369a\tstd::__1::__tree<DB::PS::V2::PageFile, DB::PS::V2::PageFile::Comparator, std::__1::allocatorDB::PS::V2::PageFile >::destroy(std::__1::__tree_node<DB::PS::V2::PageFile, void>) [tiflash+127219354]\n \t/usr/local/bin/../include/c++/v1/__tree:1800\n 0x795369a\tstd::__1::__tree<DB::PS::V2::PageFile, DB::PS::V2::PageFile::Comparator, std::__1::allocatorDB::PS::V2::PageFile >::destroy(std::__1::__tree_node<DB::PS::V2::PageFile, void>) [tiflash+127219354]\n \t/usr/local/bin/../include/c++/v1/__tree:1800\n 0x795369a\tstd::__1::__tree<DB::PS::V2::PageFile, DB::PS::V2::PageFile::Comparator, std::__1::allocatorDB::PS::V2::PageFile >::destroy(std::__1::__tree_node<DB::PS::V2::PageFile, void>) [tiflash+127219354]\n 
\t/usr/local/bin/../include/c++/v1/__tree:1800\n 0x7a2fe4b\tDB::PS::V2::LegacyCompactor::tryCompact(std::__1::set<DB::PS::V2::PageFile, DB::PS::V2::PageFile::Comparator, std::__1::allocatorDB::PS::V2::PageFile >&&, DB::PS::V2::PageStorage::WritingFilesSnapshot const&) [tiflash+128122443]\n \tdbms/src/Storages/Page/V2/gc/LegacyCompactor.cpp:129\n 0x7a0e918\tDB::PS::V2::PageStorage::gcImpl(bool, std::__1::shared_ptrDB::WriteLimiter const&, std::__1::shared_ptrDB::ReadLimiter const&) [tiflash+127985944]\n \tdbms/src/Storages/Page/V2/PageStorage.cpp:1166\n 0x78205c5\tDB::DM::StoragePool::doV2Gc(DB::Settings const&) [tiflash+125961669]\n \tdbms/src/Storages/DeltaMerge/StoragePool.cpp:570\n 0x76d16e1\tDB::BackgroundProcessingPool::threadFunction() [tiflash+124589793]\n \tdbms/src/Storages/BackgroundProcessingPool.cpp:233\n 0x76d2061\tvoid std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_deletestd::__1::__thread_struct >, DB::BackgroundProcessingPool::BackgroundProcessingPool(int)::$_2> >(void*) [tiflash+124592225]\n \t/usr/local/bin/../include/c++/v1/thread:291\n 0x7f49c2ba7dd5\tstart_thread [libpthread.so.0+32213]"] [thread_id=50]
[2022/11/13 18:59:24.805 +08:00] [ERROR] [BaseDaemon.cpp:377] [BaseDaemon:########################################] [thread_id=50]
[2022/11/13 18:59:24.805 +08:00] [ERROR] [BaseDaemon.cpp:378] ["BaseDaemon:(from thread 41) Received signal Segmentation fault(11)."] [thread_id=50]
[2022/11/13 18:59:24.805 +08:00] [ERROR] [BaseDaemon.cpp:406] ["BaseDaemon:Address: NULL pointer."] [thread_id=50]
[2022/11/13 18:59:24.805 +08:00] [ERROR] [BaseDaemon.cpp:414] ["BaseDaemon:Access: read."] [thread_id=50]
[2022/11/13 18:59:24.805 +08:00] [ERROR] [BaseDaemon.cpp:426] ["BaseDaemon:Unknown si_code."] [thread_id=50]
[2022/11/13 18:59:24.806 +08:00] [ERROR] [BaseDaemon.cpp:570] ["BaseDaemon:\n 0x1ed2681\tfaultSignalHandler(int, siginfo_t*, void*) [tiflash+32319105]\n \tlibs/libdaemon/src/BaseDaemon.cpp:221\n 0x7f49c2baf5d0\t [libpthread.so.0+62928]\n 0x7a1c63f\tDB::PS::V2::PageEntriesForDelta::merge(DB::PS::V2::PageEntriesForDelta&) [tiflash+128042559]\n \tdbms/src/Storages/Page/V2/PageEntries.h:623\n 0x7a1ba20\tDB::PS::V2::PageEntriesForDelta::compactDeltaAndBase(std::__1::shared_ptrDB::PS::V2::PageEntriesForDelta const&, std::__1::shared_ptrDB::PS::V2::PageEntriesForDelta const&) [tiflash+128039456]\n \tdbms/src/Storages/Page/V2/PageEntries.h:580\n 0x7a1946b\tDB::PS::V2::PageEntriesVersionSetWithDelta::compactOnDeltaRelease(std::__1::shared_ptrDB::PS::V2::PageEntriesForDelta) [tiflash+128029803]\n \tdbms/src/Storages/Page/V2/VersionSet/PageEntriesVersionSetWithDelta.cpp:214\n 0x7a1f4da\tDB::PS::V2::PageEntriesVersionSetWithDelta::Snapshot::~Snapshot() [tiflash+128054490]\n \tdbms/src/Storages/Page/V2/VersionSet/PageEntriesVersionSetWithDelta.h:143\n 0x7a8f887\tDB::PageReaderImplNormal::~PageReaderImplNormal() [tiflash+128514183]\n \tdbms/src/Storages/Page/PageStorage.cpp:75\n 0x780b82f\tstd::__1::__shared_ptr_emplace<DB::DM::ColumnFileSetSnapshot, std::__1::allocatorDB::DM::ColumnFileSetSnapshot >::__on_zero_shared() [tiflash+125876271]\n \t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:315\n 0x780b491\tDB::DM::DeltaValueSnapshot::~DeltaValueSnapshot() [tiflash+125875345]\n \tdbms/src/Storages/DeltaMerge/Delta/DeltaValueSpace.h:297\n 0x77e90b9\tstd::__1::__shared_ptr_emplace<DB::DM::SegmentSnapshot, std::__1::allocatorDB::DM::SegmentSnapshot >::__on_zero_shared() [tiflash+125735097]\n \t/usr/local/bin/../include/c++/v1/__memory/shared_ptr.h:315\n 0x7799cf7\tDB::DM::DeltaMergeStore::handleBackgroundTask(bool) [tiflash+125410551]\n \tdbms/src/Storages/DeltaMerge/DeltaMergeStore.cpp:1600\n 0x76d16e1\tDB::BackgroundProcessingPool::threadFunction() [tiflash+124589793]\n 
\tdbms/src/Storages/BackgroundProcessingPool.cpp:233\n 0x76d2061\tvoid* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_deletestd::__1::__thread_struct >, DB::BackgroundProcessingPool::BackgroundProcessingPool(int)::$_2> >(void*) [tiflash+124592225]\n \t/usr/local/bin/../include/c++/v1/thread:291\n 0x7f49c2ba7dd5\tstart_thread [libpthread.so.0+32213]"] [thread_id=50]
[2022/11/13 18:59:30.001 +08:00] [ERROR] [BaseDaemon.cpp:377] [BaseDaemon:########################################] [thread_id=50]
[2022/11/13 18:59:30.001 +08:00] [ERROR] [BaseDaemon.cpp:378] ["BaseDaemon:(from thread 25) Received signal Segmentation fault(11)."] [thread_id=50]
[2022/11/13 18:59:30.001 +08:00] [ERROR] [BaseDaemon.cpp:406] ["BaseDaemon:Address: NULL pointer."] [thread_id=50]
[2022/11/13 18:59:30.001 +08:00] [ERROR] [BaseDaemon.cpp:414] ["BaseDaemon:Access: read."] [thread_id=50]
[2022/11/13 18:59:30.001 +08:00] [ERROR] [BaseDaemon.cpp:426] ["BaseDaemon:Unknown si_code."] [thread_id=50]
[2022/11/13 18:59:30.009 +08:00] [ERROR] [BaseDaemon.cpp:570] ["BaseDaemon:\n 0x1ed2681\tfaultSignalHandler(int, siginfo_t*, void*) [tiflash+32319105]\n \tlibs/libdaemon/src/BaseDaemon.cpp:221\n 0x7f49c2baf5d0\t [libpthread.so.0+62928]\n 0x788b109\tDB::DM::ColumnFile::tryToTinyFile() [tiflash+126398729]\n \tdbms/src/Storages/DeltaMerge/ColumnFile/ColumnFile.cpp:86\n 0x78b1741\tDB::DM::ColumnFilePersistedSet::getTotalCacheRows() const [tiflash+126555969]\n \tdbms/src/Storages/DeltaMerge/Delta/ColumnFilePersistedSet.cpp:248\n 0x78aa5cd\tDB::DM::DeltaValueSpace::getTotalCacheRows() const [tiflash+126526925]\n \tdbms/src/Storages/DeltaMerge/Delta/DeltaValueSpace.cpp:82\n 0x77a1eec\tDB::DM::DeltaMergeStore::getStat() [tiflash+125443820]\n \tdbms/src/Storages/DeltaMerge/DeltaMergeStore.cpp:2428\n 0x719b1bc\tDB::AsynchronousMetrics::update() [tiflash+119124412]\n \tdbms/src/Interpreters/AsynchronousMetrics.cpp:190\n 0x719abba\tDB::AsynchronousMetrics::run() [tiflash+119122874]\n \tdbms/src/Interpreters/AsynchronousMetrics.cpp:107\n 0x1d6fff1\tvoid* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_deletestd::__1::__thread_struct >, DB::AsynchronousMetrics::AsynchronousMetrics(DB::Context&)::'lambda'()> >(void*) [tiflash+30867441]\n \t/usr/local/bin/../include/c++/v1/thread:291\n 0x7f49c2ba7dd5\tstart_thread [libpthread.so.0+32213]"] [thread_id=50]
[2022/11/13 20:56:05.890 +08:00] [ERROR] [Exception.cpp:85] ["DB::EngineStoreApplyRes DB::HandleWriteRa


The logfile contains many other coredump stacks, but I do not think any of them is the root cause.
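
To see how many separate crashes a logfile like this contains, the "Received signal" headers can be tallied with a small script. This is only a minimal sketch and not part of TiFlash: `SIGNAL_RE` and `summarize_signals` are hypothetical names, and the regex assumes the BaseDaemon header format exactly as it appears in the logs quoted in this issue.

```python
import re
from collections import Counter

# Matches the "Received signal" header written for each crash, in the
# format seen in the logs quoted in this issue (an assumption, not a spec).
SIGNAL_RE = re.compile(
    r'^\[(?P<ts>[^\]]+)\] \[ERROR\] \[BaseDaemon\.cpp:\d+\] '
    r'\["BaseDaemon:\(from thread (?P<thread>\d+)\) '
    r'Received signal (?P<name>.+?)\((?P<signo>\d+)\)\."\]'
)


def summarize_signals(lines):
    """Return (timestamp, thread, signal name, signal number) per crash header."""
    events = []
    for line in lines:
        m = SIGNAL_RE.match(line)
        if m:
            events.append((m["ts"], int(m["thread"]), m["name"], int(m["signo"])))
    return events


# Three lines copied from the logs above; only the first two are crash headers.
sample = [
    '[2022/11/13 18:42:10.427 +08:00] [ERROR] [BaseDaemon.cpp:378] ["BaseDaemon:(from thread 34) Received signal Segmentation fault(11)."] [thread_id=57]',
    '[2022/11/13 18:42:12.721 +08:00] [ERROR] [BaseDaemon.cpp:378] ["BaseDaemon:(from thread 19) Received signal Aborted(6)."] [thread_id=57]',
    '[2022/11/13 18:42:10.427 +08:00] [ERROR] [BaseDaemon.cpp:414] ["BaseDaemon:Access: read."] [thread_id=57]',
]

events = summarize_signals(sample)
counts = Counter(name for _, _, name, _ in events)

for ts, thread, name, signo in events:
    print(ts, thread, name, signo)
print(dict(counts))
```

Grouping by faulting thread and timestamp this way may help separate the first crash from follow-up faults in other threads, though it says nothing about the root cause by itself.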

@flowbehappy
Contributor

This issue has only occurred in a user's environment and is likely caused by the process being OOM-killed (see #6407). Since it is a corner case and very difficult to reproduce, I suggest changing the severity to severity/major.

@flowbehappy
Contributor

Closing because it cannot be reproduced.
