Drop the Space when Rebuilding Index, resulting in storaged crash #3170

Closed
@Donald-Su

Description

Describe the bug (must be provided)

  • All cluster nodes run on the same machine (on different ports), and the cluster works normally.
  • Create a space with replica_factor set to 3.
  • Dropping the space while an index rebuild is in progress crashes storaged.

Your Environments (must be provided)

  • OS:
    • CentOS release 6.6
    • CentOS Linux release 7.3.1611
  • nebula version: v2.5.1

How To Reproduce (must be provided)

Steps to reproduce the behavior:

  1. Step 1: Create the space, then rebuild the index. The job status looks as follows:
    image

  2. Step 2: Drop the space. The job status still shows QUEUE and RUNNING:
    image

  3. Step 3: Some time later, one of the storaged processes in the cluster crashes, and SHOW HOSTS reports it as "OFFLINE":
    image
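
Step 2 suggests that the rebuild jobs are not stopped when the space is dropped. A minimal sketch of cooperative cancellation follows, assuming a hypothetical `RebuildJob` type with a `canceled` flag that a drop-space handler would set. None of these names come from the Nebula code base; this only illustrates the idea of checking for cancellation between parts.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a per-space rebuild job; not Nebula's real class.
struct RebuildJob {
    // Set to true by whoever drops the space (assumed hook, not a real API).
    std::atomic<bool> canceled{false};
    // Number of parts that were fully processed.
    std::size_t processed = 0;

    // Runs `work` on each part in order and stops before the next part
    // as soon as cancellation is observed. Returns true only if every
    // part was processed.
    bool run(const std::vector<int>& parts,
             const std::function<void(int)>& work) {
        for (int part : parts) {
            if (canceled.load(std::memory_order_acquire)) {
                return false;  // space is going away; do not touch storage
            }
            work(part);
            ++processed;
        }
        return true;
    }
};
```

With a check like this, a job whose space is dropped would finish the part it is working on and then report that it stopped, rather than staying in QUEUE or RUNNING.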

Additional context

  • Repeated many times with the same result: the core dump is always in RebuildEdgeIndexTask or RebuildTagIndexTask, as shown below.
    image

  • Debug info (backtrace) from the storaged process:

#0 0x00007f9fd86c79d9 in raise () from /lib64/libc.so.6
#1 0x00007f9fd86c90e8 in abort () from /lib64/libc.so.6
#2 0x00000000028316ad in rocksdb::port::PthreadCall(char const*, int) [clone .part.1] ()
#3 0x0000000003df24b0 in rocksdb::port::Mutex::Lock() ()
#4 0x0000000003b8290b in rocksdb::(anonymous namespace)::CleanupIteratorState(void*, void*) ()
#5 0x0000000003d6c8ec in rocksdb::Cleanable::~Cleanable() ()
#6 0x0000000003bfff52 in rocksdb::DBIter::~DBIter() ()
#7 0x0000000003e107cc in rocksdb::ArenaWrappedDBIter::~ArenaWrappedDBIter() ()
#8 0x0000000002b5298b in std::default_delete<rocksdb::Iterator>::operator() (this=0x7f9fd5e08ea8, __ptr=0x7f9fd5e35a00)
at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/bits/unique_ptr.h:78
#9 0x0000000002b516d3 in std::unique_ptr<rocksdb::Iterator, std::default_delete<rocksdb::Iterator> >::~unique_ptr (this=0x7f9fd5e08ea8, __in_chrg=<optimized out>)
at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/bits/unique_ptr.h:263
#10 0x0000000002b565ff in nebula::kvstore::RocksPrefixIter::~RocksPrefixIter (this=0x7f9fd5e08ea0, __in_chrg=<optimized out>)
at /home/00-NG-Code/V2.5.1/nebula-storage-2.5.1/src/kvstore/RocksEngine.h:59
#11 0x0000000002b56628 in nebula::kvstore::RocksPrefixIter::~RocksPrefixIter (this=0x7f9fd5e08ea0, __in_chrg=<optimized out>)
at /home/00-NG-Code/V2.5.1/nebula-storage-2.5.1/src/kvstore/RocksEngine.h:59
#12 0x00000000028fe0c7 in std::default_delete<nebula::kvstore::KVIterator>::operator() (this=0x7f9f58fef278, __ptr=0x7f9fd5e08ea0)
at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/bits/unique_ptr.h:78
#13 0x00000000028fdbfb in std::unique_ptr<nebula::kvstore::KVIterator, std::default_delete<nebula::kvstore::KVIterator> >::~unique_ptr (this=0x7f9f58fef278,
__in_chrg=<optimized out>) at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/bits/unique_ptr.h:263
#14 0x0000000002a28acc in nebula::storage::RebuildEdgeIndexTask::buildIndexGlobal (this=0x7f9f6bda3100, space=1, part=18, items=...)
at /home/00-NG-Code/V2.5.1/nebula-storage-2.5.1/src/storage/admin/RebuildEdgeIndexTask.cpp:46

#15 0x0000000002a184d7 in nebula::storage::RebuildIndexTask::invoke (this=0x7f9f6bda3100, space=1, part=18, items=...)
at /home/00-NG-Code/V2.5.1/nebula-storage-2.5.1/src/storage/admin/RebuildIndexTask.cpp:84
#16 0x0000000002a24a6c in std::__invoke_impl<nebula::cpp2::ErrorCode, nebula::cpp2::ErrorCode (nebula::storage::RebuildIndexTask::*&)(int, int, std::vector<std::shared_ptr<nebula::meta::cpp2::IndexItem>, std::allocator<std::shared_ptr<nebula::meta::cpp2::IndexItem> > > const&), nebula::storage::RebuildIndexTask*&, int&, int&, std::vector<std::shared_ptr<nebula::meta::cpp2::IndexItem>, std::allocator<std::shared_ptr<nebula::meta::cpp2::IndexItem> > >&> (__f=
@0x7f9f91016c00: (nebula::cpp2::ErrorCode (nebula::storage::RebuildIndexTask::*)(nebula::storage::RebuildIndexTask * const, int, int, const std::vector<std::shared_ptr<nebula::meta::cpp2::IndexItem>, std::allocator<std::shared_ptr<nebula::meta::cpp2::IndexItem> > > &)) 0x2a181de <nebula::storage::RebuildIndexTask::invoke(int, int, std::vector<std::shared_ptr<nebula::meta::cpp2::IndexItem>, std::allocator<std::shared_ptr<nebula::meta::cpp2::IndexItem> > > const&)>, __t=@0x7f9f91016c30: 0x7f9f6bda3100)
at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/bits/invoke.h:73
#17 0x0000000002a22c0f in std::__invoke<nebula::cpp2::ErrorCode (nebula::storage::RebuildIndexTask::*&)(int, int, std::vector<std::shared_ptr<nebula::meta::cpp2::IndexItem>, std::allocator<std::shared_ptr<nebula::meta::cpp2::IndexItem> > > const&), nebula::storage::RebuildIndexTask*&, int&, int&, std::vector<std::shared_ptr<nebula::meta::cpp2::IndexItem>, std::allocator<std::shared_ptr<nebula::meta::cpp2::IndexItem> > >&> (__fn=
@0x7f9f91016c00: (nebula::cpp2::ErrorCode (nebula::storage::RebuildIndexTask::*)(nebula::storage::RebuildIndexTask * const, int, int, const std::vector<std::shared_ptr<nebula::meta::cpp2::IndexItem>, std::allocator<std::shared_ptr<nebula::meta::cpp2::IndexItem> > > &)) 0x2a181de <nebula::storage::RebuildIndexTask::invoke(int, int, std::vector<std::shared_ptr<nebula::meta::cpp2::IndexItem>, std::allocator<std::shared_ptr<nebula::meta::cpp2::IndexItem> > > const&)>)
at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/bits/invoke.h:96
#18 0x0000000002a2145d in std::_Bind<nebula::cpp2::ErrorCode (nebula::storage::RebuildIndexTask::*(nebula::storage::RebuildIndexTask*, int, int, std::vector<std::shared_ptr<nebula::meta::cpp2::IndexItem>, std::allocator<std::shared_ptr<nebula::meta::cpp2::IndexItem> > >))(int, int, std::vector<std::shared_ptr<nebula::meta::cpp2::IndexItem>, std::allocator<std::shared_ptr<nebula::meta::cpp2::IndexItem> > > const&)>::__call<nebula::cpp2::ErrorCode, , 0ul, 1ul, 2ul, 3ul>(std::tuple<>&&, std::_Index_tuple<0ul, 1ul, 2ul, 3ul>) (this=0x7f9f91016c00, __args=<unknown type in /apps/svr/nebula-2.5.1-R0924-glog40-mod/bin/nebula-storaged, CU 0x4c6d96b, DIE 0x4f61e3f>)
at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/functional:469
#19 0x0000000002a1f7e1 in std::_Bind<nebula::cpp2::ErrorCode (nebula::storage::RebuildIndexTask::*(nebula::storage::RebuildIndexTask*, int, int, std::vector<std::shared_ptr<nebula::meta::cpp2::IndexItem>, std::allocator<std::shared_ptr<nebula::meta::cpp2::IndexItem> > >))(int, int, std::vector<std::shared_ptr<nebula::meta::cpp2::IndexItem>, std::allocator<std::shared_ptr<nebula::meta::cpp2::IndexItem> > > const&)>::operator()<, nebula::cpp2::ErrorCode>() (this=0x7f9f91016c00)
at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/functional:551
#20 0x0000000002a1d498 in std::_Function_handler<nebula::cpp2::ErrorCode (), std::_Bind<nebula::cpp2::ErrorCode (nebula::storage::RebuildIndexTask::*(nebula::storage::RebuildIndexTask*, int, int, std::vector<std::shared_ptr<nebula::meta::cpp2::IndexItem>, std::allocator<std::shared_ptr<nebula::meta::cpp2::IndexItem> > >))(int, int, std::vector<std::shared_ptr<nebula::meta::cpp2::IndexItem>, std::allocator<std::shared_ptr<nebula::meta::cpp2::IndexItem> > > const&)> >::_M_invoke(std::_Any_data const&) (
__functor=...) at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/bits/std_function.h:302
#21 0x00000000029f642e in std::function<nebula::cpp2::ErrorCode ()>::operator()() const (this=0x7f9f58fefba0)
at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/bits/std_function.h:706
#22 0x00000000029f3862 in nebula::storage::AdminSubTask::invoke (this=0x7f9f58fefba0) at /home/00-NG-Code/V2.5.1/nebula-storage-2.5.1/src/storage/admin/AdminTask.h:30
#23 0x00000000029f5a4d in nebula::storage::AdminTaskManager::runSubTask (this=0x5686280 <nebula::storage::AdminTaskManager::instance()::sAdminTaskManager>, handle=...)
at /home/00-NG-Code/V2.5.1/nebula-storage-2.5.1/src/storage/admin/AdminTaskManager.cpp:162
#24 0x0000000002a0772a in std::__invoke_impl<void, void (nebula::storage::AdminTaskManager::*&)(std::pair<int, int>), nebula::storage::AdminTaskManager*&, std::pair<int, int>&> (__f=
@0x7f9f91000160: (void (nebula::storage::AdminTaskManager::*)(nebula::storage::AdminTaskManager * const, std::pair<int, int>)) 0x29f5844 <nebula::storage::AdminTaskManager::runSubTask(std::pair<int, int>)>, __t=@0x7f9f91000178: 0x5686280 <nebula::storage::AdminTaskManager::instance()::sAdminTaskManager>)
at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/bits/invoke.h:73
#25 0x0000000002a064ad in std::__invoke<void (nebula::storage::AdminTaskManager::*&)(std::pair<int, int>), nebula::storage::AdminTaskManager*&, std::pair<int, int>&> (__fn=
@0x7f9f91000160: (void (nebula::storage::AdminTaskManager::*)(nebula::storage::AdminTaskManager * const, std::pair<int, int>)) 0x29f5844 <nebula::storage::AdminTaskManager::runSubTask(std::pair<int, int>)>) at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/bits/invoke.h:95
#26 0x0000000002a04177 in std::_Bind<void (nebula::storage::AdminTaskManager::*(nebula::storage::AdminTaskManager*, std::pair<int, int>))(std::pair<int, int>)>::__call<void, , 0ul, 1ul>(std::tuple<>&&, std::_Index_tuple<0ul, 1ul>) (this=0x7f9f91000160,
__args=<unknown type in /apps/svr/nebula-2.5.1-R0924-glog40-mod/bin/nebula-storaged, CU 0x403b630, DIE 0x434615c>)
at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/functional:467
#27 0x0000000002a00a9f in std::_Bind<void (nebula::storage::AdminTaskManager::*(nebula::storage::AdminTaskManager*, std::pair<int, int>))(std::pair<int, int>)>::operator()<, void>() (this=0x7f9f91000160) at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/functional:551
#28 0x00000000029fde11 in folly::detail::function::FunctionTraits<void ()>::callBig<std::_Bind<void (nebula::storage::AdminTaskManager::*(nebula::storage::AdminTaskManager*, std::pair<int, int>))(std::pair<int, int>)> >(folly::detail::function::Data&) (p=...) at /opt/vesoft/third-party/2.0/include/folly/Function.h:385
#29 0x0000000004289706 in folly::ThreadPoolExecutor::runTask(std::shared_ptr<folly::ThreadPoolExecutor::Thread> const&, folly::ThreadPoolExecutor::Task&&) ()
#30 0x000000000427c7cc in ?? ()
#31 0x00000000042f58bb in bool folly::AtomicNotificationQueue<folly::Function<void ()> >::drive<folly::EventBase::FuncRunner&>(folly::EventBase::FuncRunner&) ()
#32 0x00000000042f6a2d in non-virtual thunk to folly::EventBaseAtomicNotificationQueue<folly::Function<void ()>, folly::EventBase::FuncRunner>::handlerReady(unsigned short) ()
#33 0x00000000043c782f in ?? ()
#34 0x00000000043c814f in event_base_loop ()
#35 0x00000000042f177e in folly::EventBase::loopBody(int, bool) ()
#36 0x00000000042f1c26 in folly::EventBase::loop() ()
#37 0x00000000042f36d6 in folly::EventBase::loopForever() ()
#38 0x000000000427d309 in folly::IOThreadPoolExecutor::threadRun(std::shared_ptr<folly::ThreadPoolExecutor::Thread>) ()
#39 0x000000000428aed9 in void folly::detail::function::FunctionTraits<void ()>::callBig<std::_Bind<void (folly::ThreadPoolExecutor::*(folly::ThreadPoolExecutor*, std::shared_ptr<folly::ThreadPoolExecutor::Thread>))(std::shared_ptr<folly::ThreadPoolExecutor::Thread>)> >(folly::detail::function::Data&) ()
#40 0x0000000002860578 in folly::detail::function::FunctionTraits<void ()>::operator()() (this=0x7f9f91030320) at /opt/vesoft/third-party/2.0/include/folly/Function.h:400
#41 0x000000000286ffe4 in folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}::operator()() (__closure=0x7f9f91030320)
at /opt/vesoft/third-party/2.0/include/folly/executors/thread_factory/NamedThreadFactory.h:40
#42 0x000000000287f6a2 in std::__invoke_impl<void, folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}>(std::__invoke_other, folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}&&) (__f=<unknown type in /apps/svr/nebula-2.5.1-R0924-glog40-mod/bin/nebula-storaged, CU 0x34da9c, DIE 0x6ee021>)
at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/bits/invoke.h:60
#43 0x000000000287a258 in std::__invoke<folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}>(std::__invoke_result&&, (folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}&&)...) (__fn=<unknown type in /apps/svr/nebula-2.5.1-R0924-glog40-mod/bin/nebula-storaged, CU 0x34da9c, DIE 0x6f981e>)
at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/bits/invoke.h:95
#44 0x00000000028b4dde in std::thread::_Invoker<std::tuple<folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) (this=0x7f9f91030320) at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/thread:234
#45 0x00000000028b46c3 in std::thread::_Invoker<std::tuple<folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}> >::operator()() (
this=0x7f9f91030320) at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/thread:243
#46 0x00000000028b1de6 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<folly::NamedThreadFactory::newThread(folly::Function<void ()>&&)::{lambda()#1}> > >::_M_run() (this=0x7f9f91030310) at /opt/vesoft/toolset/gcc/7.5.0/include/c++/7.5.0/thread:186
#47 0x000000000477c03f in execute_native_thread_routine ()
#48 0x00007f9fd8a5adf3 in start_thread () from /lib64/libpthread.so.0
#49 0x00007f9fd87882cd in clone () from /lib64/libc.so.6
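
The top frames show the abort raised from rocksdb::port::Mutex::Lock while an iterator owned by RebuildEdgeIndexTask::buildIndexGlobal is being destroyed. One possible reading, which this report does not confirm, is that the space's engine was torn down by the drop while the rebuild sub-task still held an iterator on it. The sketch below models that lifetime relationship with toy types (`Engine`, `Iter`, `SpaceManager` are all hypothetical, not RocksDB or Nebula classes) and shows one way to avoid it: the iterator shares ownership of the engine, so the engine's mutex outlives every iterator.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <utility>

// Toy per-space engine; hypothetical, not the RocksDB or Nebula API.
class Engine {
public:
    explicit Engine(bool* destroyedFlag) : destroyed_(destroyedFlag) {}
    ~Engine() { *destroyed_ = true; }

    void registerIterator() {
        std::lock_guard<std::mutex> guard(mutex_);
        ++liveIterators_;
    }
    // Iterator cleanup takes an engine-owned mutex, which is only safe
    // while the engine is still alive.
    void releaseIterator() {
        std::lock_guard<std::mutex> guard(mutex_);
        --liveIterators_;
    }

private:
    std::mutex mutex_;
    int liveIterators_ = 0;
    bool* destroyed_;  // lets the caller observe when the engine is destroyed
};

// Toy iterator that shares ownership of its engine.
class Iter {
public:
    explicit Iter(std::shared_ptr<Engine> engine) : engine_(std::move(engine)) {
        engine_->registerIterator();
    }
    ~Iter() { engine_->releaseIterator(); }
    Iter(const Iter&) = delete;
    Iter& operator=(const Iter&) = delete;

private:
    std::shared_ptr<Engine> engine_;
};

// Toy space registry; single-threaded for brevity.
class SpaceManager {
public:
    void addSpace(int spaceId, bool* destroyedFlag) {
        spaces_[spaceId] = std::make_shared<Engine>(destroyedFlag);
    }
    // Returns nullptr if the space does not exist (for example, already dropped).
    std::unique_ptr<Iter> prefix(int spaceId) {
        auto it = spaces_.find(spaceId);
        if (it == spaces_.end()) return nullptr;
        return std::make_unique<Iter>(it->second);
    }
    // Drops the registry's reference; the engine is destroyed only after
    // the last outstanding iterator is released.
    void dropSpace(int spaceId) { spaces_.erase(spaceId); }

private:
    std::map<int, std::shared_ptr<Engine>> spaces_;
};
```

In this model, dropping the space while an iterator is alive only defers engine destruction. Whether the real fix should defer the close, cancel the rebuild job first, or reject the drop while jobs are running is a design decision this sketch does not settle.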

Labels

type/bug (Type: something is unexpected)
