forked from ceph/ceph
    
        
        - 
                Notifications
    You must be signed in to change notification settings 
- Fork 1
wip #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
          
     Merged
      
      
            rzarzynski
  merged 1 commit into
  rzarzynski:wip-denc-container_base
from
tchaikov:wip-24636
  
      
      
   
  Aug 16, 2019 
      
    
                
     Merged
            
            wip #1
                    rzarzynski
  merged 1 commit into
  rzarzynski:wip-denc-container_base
from
tchaikov:wip-24636
  
      
      
   
  Aug 16, 2019 
              
            Conversation
  
    
      This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
      Learn more about bidirectional Unicode characters
    
  
  
    
    
              
                    rzarzynski
  
              
              approved these changes
              
                  
                    Aug 16, 2019 
                  
              
              
            
            
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the commit, @tchaikov! I will alter the commit title manually and rebase.
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Jan 16, 2020 
    
    
      
  
    
      
    
  
otherwise ODR is violated:
==449025==ERROR: AddressSanitizer: odr-violation (0x000000f03700):
  [1] size=8 'g_ceph_context' ../src/global/global_context.cc:24:14
  [2] size=8 'g_ceph_context' ../src/global/global_context.cc:24:14
These globals were registered at these points:
  [1]:
    #0 0x4779bd in __asan_register_globals (/var/ssd/ceph/clang-build/bin/ceph-conf+0x4779bd)
    #1 0x56e9cb in asan.module_ctor (/var/ssd/ceph/clang-build/bin/ceph-conf+0x56e9cb)
  [2]:
    #0 0x4779bd in __asan_register_globals (/var/ssd/ceph/clang-build/bin/ceph-conf+0x4779bd)
    #1 0x7fe5fed12aeb in asan.module_ctor (/var/ssd/ceph/clang-build/lib/libceph-common.so.2+0x2f34aeb)
==449025==HINT: if you don't care about these errors you may set ASAN_OPTIONS=detect_odr_violation=0
Signed-off-by: Kefu Chai <kchai@redhat.com>
    
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      Feb 12, 2020 
    
    
      
  
    
      
    
  
Accordingly to cppreference.com [1]: "If multiple threads of execution access the same std::shared_ptr object without synchronization and any of those accesses uses a non-const member function of shared_ptr then a data race will occur (...)" [1]: https://en.cppreference.com/w/cpp/memory/shared_ptr/atomic One of the coredumps showed the `shared_ptr`-typed `OSD::osdmap` with healthy looking content but damaged control block: ``` [Current thread is 1 (Thread 0x7f7dcaf73700 (LWP 205295))] (gdb) bt #0 0x0000559cb81c3ea0 in ?? () #1 0x0000559c97675b27 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148 #2 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148 ceph#3 0x0000559c975ef8aa in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr_base.h:1167 ceph#4 std::__shared_ptr<OSDMap const, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr_base.h:1167 ceph#5 std::shared_ptr<OSDMap const>::~shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr.h:103 ceph#6 OSD::create_context (this=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9053 ceph#7 0x0000559c97655571 in OSD::dequeue_peering_evt (this=0x559ca22ac000, sdata=0x559ca2ef2900, pg=0x559cb4aa3400, evt=std::shared_ptr<PGPeeringEvent> (use count 2, weak count 0) = {...}, handle=...) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9665 ceph#8 0x0000559c97886db6 in ceph::osd::scheduler::PGPeeringItem::run (this=<optimized out>, osd=<optimized out>, sdata=<optimized out>, pg=..., handle=...) at /usr/include/c++/8/ext/atomicity.h:96 ceph#9 0x0000559c9764862f in ceph::osd::scheduler::OpSchedulerItem::run (handle=..., pg=..., sdata=<optimized out>, osd=<optimized out>, this=0x7f7dcaf703f0) at /usr/include/c++/8/bits/unique_ptr.h:342 ceph#10 OSD::ShardedOpWQ::_process (this=<optimized out>, thread_index=<optimized out>, hb=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:10677 ceph#11 0x0000559c97c76094 in ShardedThreadPool::shardedthreadpool_worker (this=0x559ca22aca28, thread_index=14) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/common/WorkQueue.cc:311 ceph#12 0x0000559c97c78cf4 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/common/WorkQueue.h:706 ceph#13 0x00007f7df17852de in start_thread () from /lib64/libpthread.so.0 ceph#14 0x00007f7df052f133 in __libc_ifunc_impl_list () from /lib64/libc.so.6 ceph#15 0x0000000000000000 in ?? () (gdb) frame 7 ceph#7 0x0000559c97655571 in OSD::dequeue_peering_evt (this=0x559ca22ac000, sdata=0x559ca2ef2900, pg=0x559cb4aa3400, evt=std::shared_ptr<PGPeeringEvent> (use count 2, weak count 0) = {...}, handle=...) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9665 9665 in /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc (gdb) print osdmap $24 = std::shared_ptr<const OSDMap> (expired, weak count 0) = {get() = 0x559cba028000} (gdb) print *osdmap # pretty sane OSDMap (gdb) print sizeof(osdmap) $26 = 16 (gdb) x/2a &osdmap 0x559ca22acef0: 0x559cba028000 0x559cba0ec900 (gdb) frame 2 #2 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148 148 /usr/include/c++/8/bits/shared_ptr_base.h: No such file or directory. (gdb) disassemble Dump of assembler code for function std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release(): ... 0x0000559c97675b1e <+62>: mov (%rdi),%rax 0x0000559c97675b21 <+65>: mov %rdi,%rbx 0x0000559c97675b24 <+68>: callq *0x10(%rax) => 0x0000559c97675b27 <+71>: test %rbp,%rbp ... End of assembler dump. (gdb) info registers rdi rbx rax rdi 0x559cba0ec900 94131624790272 rbx 0x559cba0ec900 94131624790272 rax 0x559cba0ec8a0 94131624790176 (gdb) x/a 0x559cba0ec8a0 + 0x10 0x559cba0ec8b0: 0x559cb81c3ea0 (gdb) bt #0 0x0000559cb81c3ea0 in ?? () ... (gdb) p $_siginfo._sifields._sigfault.si_addr $27 = (void *) 0x559cb81c3ea0 ``` Helgrind seems to agree: ``` ==00:00:02:54.519 510301== Possible data race during write of size 8 at 0xF123930 by thread ceph#90 ==00:00:02:54.519 510301== Locks held: 2, at addresses 0xF122A58 0xF1239A8 ==00:00:02:54.519 510301== at 0x7218DD: operator= (shared_ptr_base.h:1078) ==00:00:02:54.519 510301== by 0x7218DD: operator= (shared_ptr.h:103) ==00:00:02:54.519 510301== by 0x7218DD: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8116) ==00:00:02:54.519 510301== by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678) ==00:00:02:54.519 510301== by 0x72A06C: Context::complete(int) (Context.h:77) ==00:00:02:54.519 510301== by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66) ==00:00:02:54.519 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389) ==00:00:02:54.519 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so) ==00:00:02:54.519 510301== by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so) ==00:00:02:54.519 510301== ==00:00:02:54.519 510301== This conflicts with a previous read of size 8 by thread ceph#117 ==00:00:02:54.519 510301== Locks held: 1, at address 0x2123E9A0 ==00:00:02:54.519 510301== at 0x6B5842: __shared_ptr (shared_ptr_base.h:1165) ==00:00:02:54.519 510301== by 0x6B5842: shared_ptr (shared_ptr.h:129) ==00:00:02:54.519 510301== by 0x6B5842: get_osdmap (OSD.h:1700) ==00:00:02:54.519 510301== by 0x6B5842: OSD::create_context() (OSD.cc:9053) ==00:00:02:54.519 510301== by 0x71B570: OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&) (OSD.cc:9665) ==00:00:02:54.519 510301== by 0x71B997: OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&) (OSD.cc:9701) ==00:00:02:54.519 510301== by 0x70E62E: run (OpSchedulerItem.h:148) ==00:00:02:54.519 510301== by 0x70E62E: OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (OSD.cc:10677) ==00:00:02:54.519 510301== by 0xD3C093: ShardedThreadPool::shardedthreadpool_worker(unsigned int) (WorkQueue.cc:311) ==00:00:02:54.519 510301== by 0xD3ECF3: ShardedThreadPool::WorkThreadSharded::entry() (WorkQueue.h:706) ==00:00:02:54.519 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389) ==00:00:02:54.519 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so) ==00:00:02:54.519 510301== Address 0xf123930 is 3,824 bytes inside a block of size 10,296 alloc'd ==00:00:02:54.519 510301== at 0xA7DC0C3: operator new[](unsigned long) (vg_replace_malloc.c:433) ==00:00:02:54.519 510301== by 0x66F766: main (ceph_osd.cc:688) ==00:00:02:54.519 510301== Block was alloc'd by thread #1 ``` Actually there is plenty of similar issues reported like: ``` ==00:00:05:04.903 510301== Possible data race during read of size 8 at 0x1E3E0588 by thread ceph#119 ==00:00:05:04.903 510301== Locks held: 1, at address 0x1EAD41D0 ==00:00:05:04.903 510301== at 0x753165: clear (hashtable.h:2051) ==00:00:05:04.903 510301== by 0x753165: std::_Hashtable<entity_addr_t, std::pair<entity_addr_t const, utime_t>, mempool::pool_allocator<(mempool::pool_index_t)15, std::pair<entity_addr_t const, utime_t> >, std::__detail::_Select1st, std::equal_to<entity_addr_t>, std::hash<entity_addr_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__deta il::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1369) ==00:00:05:04.903 510301== by 0x75331C: ~unordered_map (unordered_map.h:102) ==00:00:05:04.903 510301== by 0x75331C: OSDMap::~OSDMap() (OSDMap.h:350) ==00:00:05:04.903 510301== by 0x753606: operator() (shared_cache.hpp:100) ==00:00:05:04.903 510301== by 0x753606: std::_Sp_counted_deleter<OSDMap const*, SharedLRU<unsigned int, OSDMap const>::Cleanup, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr _base.h:471) ==00:00:05:04.903 510301== by 0x73BB26: _M_release (shared_ptr_base.h:155) ==00:00:05:04.903 510301== by 0x73BB26: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (shared_ptr_base.h:148) ==00:00:05:04.903 510301== by 0x6B58A9: ~__shared_count (shared_ptr_base.h:728) ==00:00:05:04.903 510301== by 0x6B58A9: ~__shared_ptr (shared_ptr_base.h:1167) ==00:00:05:04.903 510301== by 0x6B58A9: ~shared_ptr (shared_ptr.h:103) ==00:00:05:04.903 510301== by 0x6B58A9: OSD::create_context() (OSD.cc:9053) ==00:00:05:04.903 510301== by 0x71B570: OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&) (OSD.cc:9665) ==00:00:05:04.903 510301== by 0x71B997: OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&) (OSD.cc:9701) ==00:00:05:04.903 510301== by 0x70E62E: run (OpSchedulerItem.h:148) ==00:00:05:04.903 510301== by 0x70E62E: OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (OSD.cc:10677) ==00:00:05:04.903 510301== by 0xD3C093: ShardedThreadPool::shardedthreadpool_worker(unsigned int) (WorkQueue.cc:311) ==00:00:05:04.903 510301== by 0xD3ECF3: ShardedThreadPool::WorkThreadSharded::entry() (WorkQueue.h:706) ==00:00:05:04.903 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389) ==00:00:05:04.903 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so) ==00:00:05:04.903 510301== by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so) ==00:00:05:04.903 510301== ==00:00:05:04.903 510301== This conflicts with a previous write of size 8 by thread ceph#90 ==00:00:05:04.903 510301== Locks held: 2, at addresses 0xF122A58 0xF1239A8 ==00:00:05:04.903 510301== at 0x7531E1: clear (hashtable.h:2054) ==00:00:05:04.903 510301== by 0x7531E1: std::_Hashtable<entity_addr_t, std::pair<entity_addr_t const, utime_t>, mempool::pool_allocator<(mempool::pool_index_t)15, std::pair<entity_addr_t const, utime_t> >, std::__detail::_Select1st, std::equal_to<entity_addr_t>, std::hash<entity_addr_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1369) ==00:00:05:04.903 510301== by 0x75331C: ~unordered_map (unordered_map.h:102) ==00:00:05:04.903 510301== by 0x75331C: OSDMap::~OSDMap() (OSDMap.h:350) ==00:00:05:04.903 510301== by 0x753606: operator() (shared_cache.hpp:100) ==00:00:05:04.903 510301== by 0x753606: std::_Sp_counted_deleter<OSDMap const*, SharedLRU<unsigned int, OSDMap const>::Cleanup, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr_base.h:471) ==00:00:05:04.903 510301== by 0x73BB26: _M_release (shared_ptr_base.h:155) ==00:00:05:04.903 510301== by 0x73BB26: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (shared_ptr_base.h:148) ==00:00:05:04.903 510301== by 0x72191E: operator= (shared_ptr_base.h:747) ==00:00:05:04.903 510301== by 0x72191E: operator= (shared_ptr_base.h:1078) ==00:00:05:04.903 510301== by 0x72191E: operator= (shared_ptr.h:103) ==00:00:05:04.903 510301== by 0x72191E: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8116) ==00:00:05:04.903 510301== by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678) ==00:00:05:04.903 510301== by 0x72A06C: Context::complete(int) (Context.h:77) ==00:00:05:04.903 510301== by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66) ==00:00:05:04.903 510301== Address 0x1e3e0588 is 872 bytes inside a block of size 1,208 alloc'd ==00:00:05:04.903 510301== at 0xA7DC0C3: operator new[](unsigned long) (vg_replace_malloc.c:433) ==00:00:05:04.903 510301== by 0x6C7C0C: OSDService::try_get_map(unsigned int) (OSD.cc:1606) ==00:00:05:04.903 510301== by 0x7213BD: get_map (OSD.h:699) ==00:00:05:04.903 510301== by 0x7213BD: get_map (OSD.h:1732) ==00:00:05:04.903 510301== by 0x7213BD: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8076) ==00:00:05:04.903 510301== by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678) ==00:00:05:04.903 510301== by 0x72A06C: Context::complete(int) (Context.h:77) ==00:00:05:04.903 510301== by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66) ==00:00:05:04.903 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389) ==00:00:05:04.903 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so) ==00:00:05:04.903 510301== by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so) ``` Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      Feb 12, 2020 
    
    
      
  
    
      
    
  
Accordingly to cppreference.com [1]: "If multiple threads of execution access the same std::shared_ptr object without synchronization and any of those accesses uses a non-const member function of shared_ptr then a data race will occur (...)" [1]: https://en.cppreference.com/w/cpp/memory/shared_ptr/atomic One of the coredumps showed the `shared_ptr`-typed `OSD::osdmap` with healthy looking content but damaged control block: ``` [Current thread is 1 (Thread 0x7f7dcaf73700 (LWP 205295))] (gdb) bt #0 0x0000559cb81c3ea0 in ?? () #1 0x0000559c97675b27 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148 #2 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148 ceph#3 0x0000559c975ef8aa in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr_base.h:1167 ceph#4 std::__shared_ptr<OSDMap const, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr_base.h:1167 ceph#5 std::shared_ptr<OSDMap const>::~shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr.h:103 ceph#6 OSD::create_context (this=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9053 ceph#7 0x0000559c97655571 in OSD::dequeue_peering_evt (this=0x559ca22ac000, sdata=0x559ca2ef2900, pg=0x559cb4aa3400, evt=std::shared_ptr<PGPeeringEvent> (use count 2, weak count 0) = {...}, handle=...) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9665 ceph#8 0x0000559c97886db6 in ceph::osd::scheduler::PGPeeringItem::run (this=<optimized out>, osd=<optimized out>, sdata=<optimized out>, pg=..., handle=...) at /usr/include/c++/8/ext/atomicity.h:96 ceph#9 0x0000559c9764862f in ceph::osd::scheduler::OpSchedulerItem::run (handle=..., pg=..., sdata=<optimized out>, osd=<optimized out>, this=0x7f7dcaf703f0) at /usr/include/c++/8/bits/unique_ptr.h:342 ceph#10 OSD::ShardedOpWQ::_process (this=<optimized out>, thread_index=<optimized out>, hb=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:10677 ceph#11 0x0000559c97c76094 in ShardedThreadPool::shardedthreadpool_worker (this=0x559ca22aca28, thread_index=14) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/common/WorkQueue.cc:311 ceph#12 0x0000559c97c78cf4 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/common/WorkQueue.h:706 ceph#13 0x00007f7df17852de in start_thread () from /lib64/libpthread.so.0 ceph#14 0x00007f7df052f133 in __libc_ifunc_impl_list () from /lib64/libc.so.6 ceph#15 0x0000000000000000 in ?? () (gdb) frame 7 ceph#7 0x0000559c97655571 in OSD::dequeue_peering_evt (this=0x559ca22ac000, sdata=0x559ca2ef2900, pg=0x559cb4aa3400, evt=std::shared_ptr<PGPeeringEvent> (use count 2, weak count 0) = {...}, handle=...) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9665 9665 in /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc (gdb) print osdmap $24 = std::shared_ptr<const OSDMap> (expired, weak count 0) = {get() = 0x559cba028000} (gdb) print *osdmap # pretty sane OSDMap (gdb) print sizeof(osdmap) $26 = 16 (gdb) x/2a &osdmap 0x559ca22acef0: 0x559cba028000 0x559cba0ec900 (gdb) frame 2 #2 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148 148 /usr/include/c++/8/bits/shared_ptr_base.h: No such file or directory. (gdb) disassemble Dump of assembler code for function std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release(): ... 0x0000559c97675b1e <+62>: mov (%rdi),%rax 0x0000559c97675b21 <+65>: mov %rdi,%rbx 0x0000559c97675b24 <+68>: callq *0x10(%rax) => 0x0000559c97675b27 <+71>: test %rbp,%rbp ... End of assembler dump. (gdb) info registers rdi rbx rax rdi 0x559cba0ec900 94131624790272 rbx 0x559cba0ec900 94131624790272 rax 0x559cba0ec8a0 94131624790176 (gdb) x/a 0x559cba0ec8a0 + 0x10 0x559cba0ec8b0: 0x559cb81c3ea0 (gdb) bt #0 0x0000559cb81c3ea0 in ?? () ... (gdb) p $_siginfo._sifields._sigfault.si_addr $27 = (void *) 0x559cb81c3ea0 ``` Helgrind seems to agree: ``` ==00:00:02:54.519 510301== Possible data race during write of size 8 at 0xF123930 by thread ceph#90 ==00:00:02:54.519 510301== Locks held: 2, at addresses 0xF122A58 0xF1239A8 ==00:00:02:54.519 510301== at 0x7218DD: operator= (shared_ptr_base.h:1078) ==00:00:02:54.519 510301== by 0x7218DD: operator= (shared_ptr.h:103) ==00:00:02:54.519 510301== by 0x7218DD: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8116) ==00:00:02:54.519 510301== by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678) ==00:00:02:54.519 510301== by 0x72A06C: Context::complete(int) (Context.h:77) ==00:00:02:54.519 510301== by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66) ==00:00:02:54.519 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389) ==00:00:02:54.519 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so) ==00:00:02:54.519 510301== by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so) ==00:00:02:54.519 510301== ==00:00:02:54.519 510301== This conflicts with a previous read of size 8 by thread ceph#117 ==00:00:02:54.519 510301== Locks held: 1, at address 0x2123E9A0 ==00:00:02:54.519 510301== at 0x6B5842: __shared_ptr (shared_ptr_base.h:1165) ==00:00:02:54.519 510301== by 0x6B5842: shared_ptr (shared_ptr.h:129) ==00:00:02:54.519 510301== by 0x6B5842: get_osdmap (OSD.h:1700) ==00:00:02:54.519 510301== by 0x6B5842: OSD::create_context() (OSD.cc:9053) ==00:00:02:54.519 510301== by 0x71B570: OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&) (OSD.cc:9665) ==00:00:02:54.519 510301== by 0x71B997: OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&) (OSD.cc:9701) ==00:00:02:54.519 510301== by 0x70E62E: run (OpSchedulerItem.h:148) ==00:00:02:54.519 510301== by 0x70E62E: OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (OSD.cc:10677) ==00:00:02:54.519 510301== by 0xD3C093: ShardedThreadPool::shardedthreadpool_worker(unsigned int) (WorkQueue.cc:311) ==00:00:02:54.519 510301== by 0xD3ECF3: ShardedThreadPool::WorkThreadSharded::entry() (WorkQueue.h:706) ==00:00:02:54.519 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389) ==00:00:02:54.519 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so) ==00:00:02:54.519 510301== Address 0xf123930 is 3,824 bytes inside a block of size 10,296 alloc'd ==00:00:02:54.519 510301== at 0xA7DC0C3: operator new[](unsigned long) (vg_replace_malloc.c:433) ==00:00:02:54.519 510301== by 0x66F766: main (ceph_osd.cc:688) ==00:00:02:54.519 510301== Block was alloc'd by thread #1 ``` Actually there is plenty of similar issues reported like: ``` ==00:00:05:04.903 510301== Possible data race during read of size 8 at 0x1E3E0588 by thread ceph#119 ==00:00:05:04.903 510301== Locks held: 1, at address 0x1EAD41D0 ==00:00:05:04.903 510301== at 0x753165: clear (hashtable.h:2051) ==00:00:05:04.903 510301== by 0x753165: std::_Hashtable<entity_addr_t, std::pair<entity_addr_t const, utime_t>, mempool::pool_allocator<(mempool::pool_index_t)15, std::pair<entity_addr_t const, utime_t> >, std::__detail::_Select1st, std::equal_to<entity_addr_t>, std::hash<entity_addr_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__deta il::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1369) ==00:00:05:04.903 510301== by 0x75331C: ~unordered_map (unordered_map.h:102) ==00:00:05:04.903 510301== by 0x75331C: OSDMap::~OSDMap() (OSDMap.h:350) ==00:00:05:04.903 510301== by 0x753606: operator() (shared_cache.hpp:100) ==00:00:05:04.903 510301== by 0x753606: std::_Sp_counted_deleter<OSDMap const*, SharedLRU<unsigned int, OSDMap const>::Cleanup, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr _base.h:471) ==00:00:05:04.903 510301== by 0x73BB26: _M_release (shared_ptr_base.h:155) ==00:00:05:04.903 510301== by 0x73BB26: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (shared_ptr_base.h:148) ==00:00:05:04.903 510301== by 0x6B58A9: ~__shared_count (shared_ptr_base.h:728) ==00:00:05:04.903 510301== by 0x6B58A9: ~__shared_ptr (shared_ptr_base.h:1167) ==00:00:05:04.903 510301== by 0x6B58A9: ~shared_ptr (shared_ptr.h:103) ==00:00:05:04.903 510301== by 0x6B58A9: OSD::create_context() (OSD.cc:9053) ==00:00:05:04.903 510301== by 0x71B570: OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&) (OSD.cc:9665) ==00:00:05:04.903 510301== by 0x71B997: OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&) (OSD.cc:9701) ==00:00:05:04.903 510301== by 0x70E62E: run (OpSchedulerItem.h:148) ==00:00:05:04.903 510301== by 0x70E62E: OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (OSD.cc:10677) ==00:00:05:04.903 510301== by 0xD3C093: ShardedThreadPool::shardedthreadpool_worker(unsigned int) (WorkQueue.cc:311) ==00:00:05:04.903 510301== by 0xD3ECF3: ShardedThreadPool::WorkThreadSharded::entry() (WorkQueue.h:706) ==00:00:05:04.903 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389) ==00:00:05:04.903 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so) ==00:00:05:04.903 510301== by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so) ==00:00:05:04.903 510301== ==00:00:05:04.903 510301== This conflicts with a previous write of size 8 by thread ceph#90 ==00:00:05:04.903 510301== Locks held: 2, at addresses 0xF122A58 0xF1239A8 ==00:00:05:04.903 510301== at 0x7531E1: clear (hashtable.h:2054) ==00:00:05:04.903 510301== by 0x7531E1: std::_Hashtable<entity_addr_t, std::pair<entity_addr_t const, utime_t>, mempool::pool_allocator<(mempool::pool_index_t)15, std::pair<entity_addr_t const, utime_t> >, std::__detail::_Select1st, std::equal_to<entity_addr_t>, std::hash<entity_addr_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1369) ==00:00:05:04.903 510301== by 0x75331C: ~unordered_map (unordered_map.h:102) ==00:00:05:04.903 510301== by 0x75331C: OSDMap::~OSDMap() (OSDMap.h:350) ==00:00:05:04.903 510301== by 0x753606: operator() (shared_cache.hpp:100) ==00:00:05:04.903 510301== by 0x753606: std::_Sp_counted_deleter<OSDMap const*, SharedLRU<unsigned int, OSDMap const>::Cleanup, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr_base.h:471) ==00:00:05:04.903 510301== by 0x73BB26: _M_release (shared_ptr_base.h:155) ==00:00:05:04.903 510301== by 0x73BB26: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (shared_ptr_base.h:148) ==00:00:05:04.903 510301== by 0x72191E: operator= (shared_ptr_base.h:747) ==00:00:05:04.903 510301== by 0x72191E: operator= (shared_ptr_base.h:1078) ==00:00:05:04.903 510301== by 0x72191E: operator= (shared_ptr.h:103) ==00:00:05:04.903 510301== by 0x72191E: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8116) ==00:00:05:04.903 510301== by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678) ==00:00:05:04.903 510301== by 0x72A06C: Context::complete(int) (Context.h:77) ==00:00:05:04.903 510301== by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66) ==00:00:05:04.903 510301== Address 0x1e3e0588 is 872 bytes inside a block of size 1,208 alloc'd ==00:00:05:04.903 510301== at 0xA7DC0C3: operator new[](unsigned long) (vg_replace_malloc.c:433) ==00:00:05:04.903 510301== by 0x6C7C0C: OSDService::try_get_map(unsigned int) (OSD.cc:1606) ==00:00:05:04.903 510301== by 0x7213BD: get_map (OSD.h:699) ==00:00:05:04.903 510301== by 0x7213BD: get_map (OSD.h:1732) ==00:00:05:04.903 510301== by 0x7213BD: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8076) ==00:00:05:04.903 510301== by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678) ==00:00:05:04.903 510301== by 0x72A06C: Context::complete(int) (Context.h:77) ==00:00:05:04.903 510301== by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66) ==00:00:05:04.903 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389) ==00:00:05:04.903 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so) ==00:00:05:04.903 510301== by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so) ``` Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      Feb 12, 2020 
    
    
      
  
    
      
    
  
Helgrind complains about: ``` ==00:00:04:55.495 510301== Possible data race during read of size 8 at 0x1EF1B0B0 by thread ceph#96 ==00:00:04:55.495 510301== Locks held: 3, at addresses 0xEE8D650 0xEE8D748 0x1FFEFFEE08 ==00:00:04:55.495 510301== at 0x107F5C5: ~unique_ptr (unique_ptr.h:273) ==00:00:04:55.495 510301== by 0x107F5C5: ~AuthConnectionMeta (Auth.h:163) ==00:00:04:55.495 510301== by 0x107F5C5: std::_Sp_counted_ptr<AuthConnectionMeta*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr_base.h:377) ==00:00:04:55.495 510301== by 0x10C19EE: _M_release (shared_ptr_base.h:155) ==00:00:04:55.495 510301== by 0x10C19EE: _M_release (shared_ptr_base.h:148) ==00:00:04:55.495 510301== by 0x10C19EE: ~__shared_count (shared_ptr_base.h:728) ==00:00:04:55.495 510301== by 0x10C19EE: ~__shared_ptr (shared_ptr_base.h:1167) ==00:00:04:55.495 510301== by 0x10C19EE: ~shared_ptr (shared_ptr.h:103) ==00:00:04:55.495 510301== by 0x10C19EE: Protocol::~Protocol() (Protocol.cc:14) ==00:00:04:55.495 510301== by 0x1081A5C: ProtocolV2::~ProtocolV2() (ProtocolV2.cc:100) ==00:00:04:55.495 510301== by 0x105AA48: operator() (unique_ptr.h:81) ==00:00:04:55.495 510301== by 0x105AA48: ~unique_ptr (unique_ptr.h:274) ==00:00:04:55.495 510301== by 0x105AA48: AsyncConnection::~AsyncConnection() (AsyncConnection.cc:149) ==00:00:04:55.495 510301== by 0x105ABAC: AsyncConnection::~AsyncConnection() (AsyncConnection.cc:154) ==00:00:04:55.495 510301== by 0xD2BAA0: RefCountedObject::put() const (RefCountedObj.cc:27) ==00:00:04:55.495 510301== by 0xEB1E58: intrusive_ptr_release (RefCountedObj.h:193) ==00:00:04:55.495 510301== by 0xEB1E58: ~intrusive_ptr (intrusive_ptr.hpp:98) ==00:00:04:55.495 510301== by 0xEB1E58: ~pair (stl_pair.h:208) ==00:00:04:55.495 510301== by 0xEB1E58: destroy<std::pair<const entity_addrvec_t, boost::intrusive_ptr<AsyncConnection> > > (new_allocator.h:140) ==00:00:04:55.495 510301== by 0xEB1E58: destroy<std::pair<const entity_addrvec_t, boost::intrusive_ptr<AsyncConnection> > > (alloc_traits.h:487) ==00:00:04:55.495 510301== by 0xEB1E58: _M_deallocate_node (hashtable_policy.h:2100) ==00:00:04:55.495 510301== by 0xEB1E58: _M_erase (hashtable.h:1909) ==00:00:04:55.495 510301== by 0xEB1E58: std::_Hashtable<entity_addrvec_t, std::pair<entity_addrvec_t const, boost::intrusive_ptr<AsyncConnection> >, std::allocator<std::pair<entity_addrvec_t const, boost ::intrusive_ptr<AsyncConnection> > >, std::__detail::_Select1st, std::equal_to<entity_addrvec_t>, std::hash<entity_addrvec_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__ detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::erase(std::__detail::_Node_const_iterator<std::pair<entity_addrvec_t const, boost::intrusive_ptr<AsyncConnection> >, fals e, true>) (hashtable.h:1884) ==00:00:04:55.495 510301== by 0xEB24EA: erase (hashtable.h:762) ==00:00:04:55.495 510301== by 0xEB24EA: erase (unordered_map.h:798) ==00:00:04:55.495 510301== by 0xEB24EA: AsyncMessenger::_lookup_conn(entity_addrvec_t const&) (AsyncMessenger.h:312) ==00:00:04:55.495 510301== by 0xEAA08A: AsyncMessenger::connect_to(int, entity_addrvec_t const&, bool, bool) (AsyncMessenger.cc:701) ==00:00:04:55.495 510301== by 0xEF7477: connect_to_mon (Messenger.h:530) ==00:00:04:55.495 510301== by 0xEF7477: MonClient::_add_conn(unsigned int, unsigned long) (MonClient.cc:716) ==00:00:04:55.495 510301== by 0xEF7C30: MonClient::_add_conns(unsigned long) (MonClient.cc:779) ==00:00:04:55.495 510301== by 0xEFDED6: MonClient::_reopen_session(int) (MonClient.cc:691) ==00:00:04:55.495 510301== by 0xF025F2: MonClient::tick() (MonClient.cc:925) ==00:00:04:55.495 510301== by 0x72A06C: Context::complete(int) (Context.h:77) ==00:00:04:55.495 510301== by 0xD35276: SafeTimer::timer_thread() (Timer.cc:96) ==00:00:04:55.495 510301== by 0xD36850: SafeTimerThread::entry() (Timer.cc:30) ==00:00:04:55.495 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389) ==00:00:04:55.495 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so) ==00:00:04:55.495 510301== by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so) ==00:00:04:55.495 510301== Address 0x1ef1b0b0 is 128 bytes inside a block of size 136 alloc'd ==00:00:04:55.495 510301== at 0xA7DC0C3: operator new[](unsigned long) (vg_replace_malloc.c:433) ==00:00:04:55.495 510301== by 0x1082094: ProtocolV2::reset_recv_state()::{lambda()#2}::operator()() const [clone .isra.916] (ProtocolV2.cc:240) ==00:00:04:55.495 510301== by 0x108244B: EventCenter::C_submit_event<ProtocolV2::reset_recv_state()::{lambda()#2}>::do_request(unsigned long) (Event.h:228) ==00:00:04:55.495 510301== by 0xEB7465: EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*) (Event.cc:433) ==00:00:04:55.495 510301== by 0xEBC1BB: operator() (Stack.cc:53) ==00:00:04:55.495 510301== by 0xEBC1BB: std::_Function_handler<void (), NetworkStack::add_thread(unsigned int)::{lambda()#1}>::_M_invoke(std::_Any_data const&) (std_function.h:297) ==00:00:04:55.495 510301== by 0xCF4AA32: ??? (in /usr/lib64/libstdc++.so.6.0.25) ==00:00:04:55.495 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389) ==00:00:04:55.495 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so) ==00:00:04:55.495 510301== by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so) ==00:00:04:55.495 510301== Block was alloc'd by thread ceph#3 ``` Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      Feb 12, 2020 
    
    
      
  
    
      
    
  
Helgrind complains about: ``` ==00:00:04:55.495 510301== Possible data race during read of size 8 at 0x1EF1B0B0 by thread ceph#96 ==00:00:04:55.495 510301== Locks held: 3, at addresses 0xEE8D650 0xEE8D748 0x1FFEFFEE08 ==00:00:04:55.495 510301== at 0x107F5C5: ~unique_ptr (unique_ptr.h:273) ==00:00:04:55.495 510301== by 0x107F5C5: ~AuthConnectionMeta (Auth.h:163) ==00:00:04:55.495 510301== by 0x107F5C5: std::_Sp_counted_ptr<AuthConnectionMeta*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr_base.h:377) ==00:00:04:55.495 510301== by 0x10C19EE: _M_release (shared_ptr_base.h:155) ==00:00:04:55.495 510301== by 0x10C19EE: _M_release (shared_ptr_base.h:148) ==00:00:04:55.495 510301== by 0x10C19EE: ~__shared_count (shared_ptr_base.h:728) ==00:00:04:55.495 510301== by 0x10C19EE: ~__shared_ptr (shared_ptr_base.h:1167) ==00:00:04:55.495 510301== by 0x10C19EE: ~shared_ptr (shared_ptr.h:103) ==00:00:04:55.495 510301== by 0x10C19EE: Protocol::~Protocol() (Protocol.cc:14) ==00:00:04:55.495 510301== by 0x1081A5C: ProtocolV2::~ProtocolV2() (ProtocolV2.cc:100) ==00:00:04:55.495 510301== by 0x105AA48: operator() (unique_ptr.h:81) ==00:00:04:55.495 510301== by 0x105AA48: ~unique_ptr (unique_ptr.h:274) ==00:00:04:55.495 510301== by 0x105AA48: AsyncConnection::~AsyncConnection() (AsyncConnection.cc:149) ==00:00:04:55.495 510301== by 0x105ABAC: AsyncConnection::~AsyncConnection() (AsyncConnection.cc:154) ==00:00:04:55.495 510301== by 0xD2BAA0: RefCountedObject::put() const (RefCountedObj.cc:27) ==00:00:04:55.495 510301== by 0xEB1E58: intrusive_ptr_release (RefCountedObj.h:193) ==00:00:04:55.495 510301== by 0xEB1E58: ~intrusive_ptr (intrusive_ptr.hpp:98) ==00:00:04:55.495 510301== by 0xEB1E58: ~pair (stl_pair.h:208) ==00:00:04:55.495 510301== by 0xEB1E58: destroy<std::pair<const entity_addrvec_t, boost::intrusive_ptr<AsyncConnection> > > (new_allocator.h:140) ==00:00:04:55.495 510301== by 0xEB1E58: destroy<std::pair<const entity_addrvec_t, boost::intrusive_ptr<AsyncConnection> > > (alloc_traits.h:487) ==00:00:04:55.495 510301== by 0xEB1E58: _M_deallocate_node (hashtable_policy.h:2100) ==00:00:04:55.495 510301== by 0xEB1E58: _M_erase (hashtable.h:1909) ==00:00:04:55.495 510301== by 0xEB1E58: std::_Hashtable<entity_addrvec_t, std::pair<entity_addrvec_t const, boost::intrusive_ptr<AsyncConnection> >, std::allocator<std::pair<entity_addrvec_t const, boost ::intrusive_ptr<AsyncConnection> > >, std::__detail::_Select1st, std::equal_to<entity_addrvec_t>, std::hash<entity_addrvec_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__ detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::erase(std::__detail::_Node_const_iterator<std::pair<entity_addrvec_t const, boost::intrusive_ptr<AsyncConnection> >, fals e, true>) (hashtable.h:1884) ==00:00:04:55.495 510301== by 0xEB24EA: erase (hashtable.h:762) ==00:00:04:55.495 510301== by 0xEB24EA: erase (unordered_map.h:798) ==00:00:04:55.495 510301== by 0xEB24EA: AsyncMessenger::_lookup_conn(entity_addrvec_t const&) (AsyncMessenger.h:312) ==00:00:04:55.495 510301== by 0xEAA08A: AsyncMessenger::connect_to(int, entity_addrvec_t const&, bool, bool) (AsyncMessenger.cc:701) ==00:00:04:55.495 510301== by 0xEF7477: connect_to_mon (Messenger.h:530) ==00:00:04:55.495 510301== by 0xEF7477: MonClient::_add_conn(unsigned int, unsigned long) (MonClient.cc:716) ==00:00:04:55.495 510301== by 0xEF7C30: MonClient::_add_conns(unsigned long) (MonClient.cc:779) ==00:00:04:55.495 510301== by 0xEFDED6: MonClient::_reopen_session(int) (MonClient.cc:691) ==00:00:04:55.495 510301== by 0xF025F2: MonClient::tick() (MonClient.cc:925) ==00:00:04:55.495 510301== by 0x72A06C: Context::complete(int) (Context.h:77) ==00:00:04:55.495 510301== by 0xD35276: SafeTimer::timer_thread() (Timer.cc:96) ==00:00:04:55.495 510301== by 0xD36850: SafeTimerThread::entry() (Timer.cc:30) ==00:00:04:55.495 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389) ==00:00:04:55.495 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so) ==00:00:04:55.495 510301== by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so) ==00:00:04:55.495 510301== Address 0x1ef1b0b0 is 128 bytes inside a block of size 136 alloc'd ==00:00:04:55.495 510301== at 0xA7DC0C3: operator new[](unsigned long) (vg_replace_malloc.c:433) ==00:00:04:55.495 510301== by 0x1082094: ProtocolV2::reset_recv_state()::{lambda()#2}::operator()() const [clone .isra.916] (ProtocolV2.cc:240) ==00:00:04:55.495 510301== by 0x108244B: EventCenter::C_submit_event<ProtocolV2::reset_recv_state()::{lambda()#2}>::do_request(unsigned long) (Event.h:228) ==00:00:04:55.495 510301== by 0xEB7465: EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*) (Event.cc:433) ==00:00:04:55.495 510301== by 0xEBC1BB: operator() (Stack.cc:53) ==00:00:04:55.495 510301== by 0xEBC1BB: std::_Function_handler<void (), NetworkStack::add_thread(unsigned int)::{lambda()#1}>::_M_invoke(std::_Any_data const&) (std_function.h:297) ==00:00:04:55.495 510301== by 0xCF4AA32: ??? (in /usr/lib64/libstdc++.so.6.0.25) ==00:00:04:55.495 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389) ==00:00:04:55.495 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so) ==00:00:04:55.495 510301== by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so) ==00:00:04:55.495 510301== Block was alloc'd by thread ceph#3 ``` Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      Feb 14, 2020 
    
    
      
  
    
      
    
  
Accordingly to cppreference.com [1]: "If multiple threads of execution access the same std::shared_ptr object without synchronization and any of those accesses uses a non-const member function of shared_ptr then a data race will occur (...)" [1]: https://en.cppreference.com/w/cpp/memory/shared_ptr/atomic One of the coredumps showed the `shared_ptr`-typed `OSD::osdmap` with healthy looking content but damaged control block: ``` [Current thread is 1 (Thread 0x7f7dcaf73700 (LWP 205295))] (gdb) bt #0 0x0000559cb81c3ea0 in ?? () #1 0x0000559c97675b27 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148 #2 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148 ceph#3 0x0000559c975ef8aa in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr_base.h:1167 ceph#4 std::__shared_ptr<OSDMap const, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr_base.h:1167 ceph#5 std::shared_ptr<OSDMap const>::~shared_ptr (this=<optimized out>, __in_chrg=<optimized out>) at /usr/include/c++/8/bits/shared_ptr.h:103 ceph#6 OSD::create_context (this=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9053 ceph#7 0x0000559c97655571 in OSD::dequeue_peering_evt (this=0x559ca22ac000, sdata=0x559ca2ef2900, pg=0x559cb4aa3400, evt=std::shared_ptr<PGPeeringEvent> (use count 2, weak count 0) = {...}, handle=...) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9665 ceph#8 0x0000559c97886db6 in ceph::osd::scheduler::PGPeeringItem::run (this=<optimized out>, osd=<optimized out>, sdata=<optimized out>, pg=..., handle=...) at /usr/include/c++/8/ext/atomicity.h:96 ceph#9 0x0000559c9764862f in ceph::osd::scheduler::OpSchedulerItem::run (handle=..., pg=..., sdata=<optimized out>, osd=<optimized out>, this=0x7f7dcaf703f0) at /usr/include/c++/8/bits/unique_ptr.h:342 ceph#10 OSD::ShardedOpWQ::_process (this=<optimized out>, thread_index=<optimized out>, hb=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:10677 ceph#11 0x0000559c97c76094 in ShardedThreadPool::shardedthreadpool_worker (this=0x559ca22aca28, thread_index=14) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/common/WorkQueue.cc:311 ceph#12 0x0000559c97c78cf4 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/common/WorkQueue.h:706 ceph#13 0x00007f7df17852de in start_thread () from /lib64/libpthread.so.0 ceph#14 0x00007f7df052f133 in __libc_ifunc_impl_list () from /lib64/libc.so.6 ceph#15 0x0000000000000000 in ?? () (gdb) frame 7 ceph#7 0x0000559c97655571 in OSD::dequeue_peering_evt (this=0x559ca22ac000, sdata=0x559ca2ef2900, pg=0x559cb4aa3400, evt=std::shared_ptr<PGPeeringEvent> (use count 2, weak count 0) = {...}, handle=...) at /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc:9665 9665 in /usr/src/debug/ceph-15.0.0-10071.g5b5a3a3.el8.x86_64/src/osd/OSD.cc (gdb) print osdmap $24 = std::shared_ptr<const OSDMap> (expired, weak count 0) = {get() = 0x559cba028000} (gdb) print *osdmap # pretty sane OSDMap (gdb) print sizeof(osdmap) $26 = 16 (gdb) x/2a &osdmap 0x559ca22acef0: 0x559cba028000 0x559cba0ec900 (gdb) frame 2 #2 std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x559cba0ec900) at /usr/include/c++/8/bits/shared_ptr_base.h:148 148 /usr/include/c++/8/bits/shared_ptr_base.h: No such file or directory. (gdb) disassemble Dump of assembler code for function std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release(): ... 0x0000559c97675b1e <+62>: mov (%rdi),%rax 0x0000559c97675b21 <+65>: mov %rdi,%rbx 0x0000559c97675b24 <+68>: callq *0x10(%rax) => 0x0000559c97675b27 <+71>: test %rbp,%rbp ... End of assembler dump. (gdb) info registers rdi rbx rax rdi 0x559cba0ec900 94131624790272 rbx 0x559cba0ec900 94131624790272 rax 0x559cba0ec8a0 94131624790176 (gdb) x/a 0x559cba0ec8a0 + 0x10 0x559cba0ec8b0: 0x559cb81c3ea0 (gdb) bt #0 0x0000559cb81c3ea0 in ?? () ... (gdb) p $_siginfo._sifields._sigfault.si_addr $27 = (void *) 0x559cb81c3ea0 ``` Helgrind seems to agree: ``` ==00:00:02:54.519 510301== Possible data race during write of size 8 at 0xF123930 by thread ceph#90 ==00:00:02:54.519 510301== Locks held: 2, at addresses 0xF122A58 0xF1239A8 ==00:00:02:54.519 510301== at 0x7218DD: operator= (shared_ptr_base.h:1078) ==00:00:02:54.519 510301== by 0x7218DD: operator= (shared_ptr.h:103) ==00:00:02:54.519 510301== by 0x7218DD: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8116) ==00:00:02:54.519 510301== by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678) ==00:00:02:54.519 510301== by 0x72A06C: Context::complete(int) (Context.h:77) ==00:00:02:54.519 510301== by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66) ==00:00:02:54.519 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389) ==00:00:02:54.519 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so) ==00:00:02:54.519 510301== by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so) ==00:00:02:54.519 510301== ==00:00:02:54.519 510301== This conflicts with a previous read of size 8 by thread ceph#117 ==00:00:02:54.519 510301== Locks held: 1, at address 0x2123E9A0 ==00:00:02:54.519 510301== at 0x6B5842: __shared_ptr (shared_ptr_base.h:1165) ==00:00:02:54.519 510301== by 0x6B5842: shared_ptr (shared_ptr.h:129) ==00:00:02:54.519 510301== by 0x6B5842: get_osdmap (OSD.h:1700) ==00:00:02:54.519 510301== by 0x6B5842: OSD::create_context() (OSD.cc:9053) ==00:00:02:54.519 510301== by 0x71B570: OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&) (OSD.cc:9665) ==00:00:02:54.519 510301== by 0x71B997: OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&) (OSD.cc:9701) ==00:00:02:54.519 510301== by 0x70E62E: run (OpSchedulerItem.h:148) ==00:00:02:54.519 510301== by 0x70E62E: OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (OSD.cc:10677) ==00:00:02:54.519 510301== by 0xD3C093: ShardedThreadPool::shardedthreadpool_worker(unsigned int) (WorkQueue.cc:311) ==00:00:02:54.519 510301== by 0xD3ECF3: ShardedThreadPool::WorkThreadSharded::entry() (WorkQueue.h:706) ==00:00:02:54.519 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389) ==00:00:02:54.519 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so) ==00:00:02:54.519 510301== Address 0xf123930 is 3,824 bytes inside a block of size 10,296 alloc'd ==00:00:02:54.519 510301== at 0xA7DC0C3: operator new[](unsigned long) (vg_replace_malloc.c:433) ==00:00:02:54.519 510301== by 0x66F766: main (ceph_osd.cc:688) ==00:00:02:54.519 510301== Block was alloc'd by thread #1 ``` Actually there is plenty of similar issues reported like: ``` ==00:00:05:04.903 510301== Possible data race during read of size 8 at 0x1E3E0588 by thread ceph#119 ==00:00:05:04.903 510301== Locks held: 1, at address 0x1EAD41D0 ==00:00:05:04.903 510301== at 0x753165: clear (hashtable.h:2051) ==00:00:05:04.903 510301== by 0x753165: std::_Hashtable<entity_addr_t, std::pair<entity_addr_t const, utime_t>, mempool::pool_allocator<(mempool::pool_index_t)15, std::pair<entity_addr_t const, utime_t> >, std::__detail::_Select1st, std::equal_to<entity_addr_t>, std::hash<entity_addr_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__deta il::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1369) ==00:00:05:04.903 510301== by 0x75331C: ~unordered_map (unordered_map.h:102) ==00:00:05:04.903 510301== by 0x75331C: OSDMap::~OSDMap() (OSDMap.h:350) ==00:00:05:04.903 510301== by 0x753606: operator() (shared_cache.hpp:100) ==00:00:05:04.903 510301== by 0x753606: std::_Sp_counted_deleter<OSDMap const*, SharedLRU<unsigned int, OSDMap const>::Cleanup, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr _base.h:471) ==00:00:05:04.903 510301== by 0x73BB26: _M_release (shared_ptr_base.h:155) ==00:00:05:04.903 510301== by 0x73BB26: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (shared_ptr_base.h:148) ==00:00:05:04.903 510301== by 0x6B58A9: ~__shared_count (shared_ptr_base.h:728) ==00:00:05:04.903 510301== by 0x6B58A9: ~__shared_ptr (shared_ptr_base.h:1167) ==00:00:05:04.903 510301== by 0x6B58A9: ~shared_ptr (shared_ptr.h:103) ==00:00:05:04.903 510301== by 0x6B58A9: OSD::create_context() (OSD.cc:9053) ==00:00:05:04.903 510301== by 0x71B570: OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&) (OSD.cc:9665) ==00:00:05:04.903 510301== by 0x71B997: OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&) (OSD.cc:9701) ==00:00:05:04.903 510301== by 0x70E62E: run (OpSchedulerItem.h:148) ==00:00:05:04.903 510301== by 0x70E62E: OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (OSD.cc:10677) ==00:00:05:04.903 510301== by 0xD3C093: ShardedThreadPool::shardedthreadpool_worker(unsigned int) (WorkQueue.cc:311) ==00:00:05:04.903 510301== by 0xD3ECF3: ShardedThreadPool::WorkThreadSharded::entry() (WorkQueue.h:706) ==00:00:05:04.903 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389) ==00:00:05:04.903 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so) ==00:00:05:04.903 510301== by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so) ==00:00:05:04.903 510301== ==00:00:05:04.903 510301== This conflicts with a previous write of size 8 by thread ceph#90 ==00:00:05:04.903 510301== Locks held: 2, at addresses 0xF122A58 0xF1239A8 ==00:00:05:04.903 510301== at 0x7531E1: clear (hashtable.h:2054) ==00:00:05:04.903 510301== by 0x7531E1: std::_Hashtable<entity_addr_t, std::pair<entity_addr_t const, utime_t>, mempool::pool_allocator<(mempool::pool_index_t)15, std::pair<entity_addr_t const, utime_t> >, std::__detail::_Select1st, std::equal_to<entity_addr_t>, std::hash<entity_addr_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1369) ==00:00:05:04.903 510301== by 0x75331C: ~unordered_map (unordered_map.h:102) ==00:00:05:04.903 510301== by 0x75331C: OSDMap::~OSDMap() (OSDMap.h:350) ==00:00:05:04.903 510301== by 0x753606: operator() (shared_cache.hpp:100) ==00:00:05:04.903 510301== by 0x753606: std::_Sp_counted_deleter<OSDMap const*, SharedLRU<unsigned int, OSDMap const>::Cleanup, std::allocator<void>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr_base.h:471) ==00:00:05:04.903 510301== by 0x73BB26: _M_release (shared_ptr_base.h:155) ==00:00:05:04.903 510301== by 0x73BB26: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (shared_ptr_base.h:148) ==00:00:05:04.903 510301== by 0x72191E: operator= (shared_ptr_base.h:747) ==00:00:05:04.903 510301== by 0x72191E: operator= (shared_ptr_base.h:1078) ==00:00:05:04.903 510301== by 0x72191E: operator= (shared_ptr.h:103) ==00:00:05:04.903 510301== by 0x72191E: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8116) ==00:00:05:04.903 510301== by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678) ==00:00:05:04.903 510301== by 0x72A06C: Context::complete(int) (Context.h:77) ==00:00:05:04.903 510301== by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66) ==00:00:05:04.903 510301== Address 0x1e3e0588 is 872 bytes inside a block of size 1,208 alloc'd ==00:00:05:04.903 510301== at 0xA7DC0C3: operator new[](unsigned long) (vg_replace_malloc.c:433) ==00:00:05:04.903 510301== by 0x6C7C0C: OSDService::try_get_map(unsigned int) (OSD.cc:1606) ==00:00:05:04.903 510301== by 0x7213BD: get_map (OSD.h:699) ==00:00:05:04.903 510301== by 0x7213BD: get_map (OSD.h:1732) ==00:00:05:04.903 510301== by 0x7213BD: OSD::_committed_osd_maps(unsigned int, unsigned int, MOSDMap*) (OSD.cc:8076) ==00:00:05:04.903 510301== by 0x7752CA: C_OnMapCommit::finish(int) (OSD.cc:7678) ==00:00:05:04.903 510301== by 0x72A06C: Context::complete(int) (Context.h:77) ==00:00:05:04.903 510301== by 0xD07F14: Finisher::finisher_thread_entry() (Finisher.cc:66) ==00:00:05:04.903 510301== by 0xA7E1203: mythread_wrapper (hg_intercepts.c:389) ==00:00:05:04.903 510301== by 0xC6182DD: start_thread (in /usr/lib64/libpthread-2.28.so) ==00:00:05:04.903 510301== by 0xD8B34B2: clone (in /usr/lib64/libc-2.28.so) ``` Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Mar 27, 2020 
    
    
      
  
    
      
    
  
* no need to discard_result(). as `output_stream::close()` returns an
  empty future<> already
* free the connected socket after the background task finishes, because:
we should not free the connected socket before the promise referencing it is fulfilled.
otherwise we have error messages from ASan, like
==287182==ERROR: AddressSanitizer: heap-use-after-free on address 0x611000019aa0 at pc 0x55e2ae2de882 bp 0x7fff7e2bf080 sp 0x7fff7e2bf078
READ of size 8 at 0x611000019aa0 thread T0
    #0 0x55e2ae2de881 in seastar::reactor_backend_aio::await_events(int, __sigset_t const*) ../src/seastar/src/core/reactor_backend.cc:396
    #1 0x55e2ae2dfb59 in seastar::reactor_backend_aio::reap_kernel_completions() ../src/seastar/src/core/reactor_backend.cc:428
    #2 0x55e2adbea397 in seastar::reactor::reap_kernel_completions_pollfn::poll() (/var/ssd/ceph/build/bin/crimson-osd+0x155e9397)
    ceph#3 0x55e2adaec6d0 in seastar::reactor::poll_once() ../src/seastar/src/core/reactor.cc:2789
    ceph#4 0x55e2adae7cf7 in operator() ../src/seastar/src/core/reactor.cc:2687
    ceph#5 0x55e2adb7c595 in __invoke_impl<bool, seastar::reactor::run()::<lambda()>&> /usr/include/c++/10/bits/invoke.h:60
    ceph#6 0x55e2adb699b0 in __invoke_r<bool, seastar::reactor::run()::<lambda()>&> /usr/include/c++/10/bits/invoke.h:113
    ceph#7 0x55e2adb50222 in _M_invoke /usr/include/c++/10/bits/std_function.h:291
    ceph#8 0x55e2adc2ba00 in std::function<bool ()>::operator()() const /usr/include/c++/10/bits/std_function.h:622
    ceph#9 0x55e2adaea491 in seastar::reactor::run() ../src/seastar/src/core/reactor.cc:2713
    ceph#10 0x55e2ad98f1c7 in seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) ../src/seastar/src/core/app-template.cc:199
    ceph#11 0x55e2a9e57538 in main ../src/crimson/osd/main.cc:148
    ceph#12 0x7fae7f20de0a in __libc_start_main ../csu/libc-start.c:308
    ceph#13 0x55e2a9d431e9 in _start (/var/ssd/ceph/build/bin/crimson-osd+0x117421e9)
0x611000019aa0 is located 96 bytes inside of 240-byte region [0x611000019a40,0x611000019b30)
freed by thread T0 here:
    #0 0x7fae80a4e487 in operator delete(void*, unsigned long) (/usr/lib/x86_64-linux-gnu/libasan.so.6+0xac487)
    #1 0x55e2ae302a0a in seastar::aio_pollable_fd_state::~aio_pollable_fd_state() ../src/seastar/src/core/reactor_backend.cc:458
    #2 0x55e2ae2e1059 in seastar::reactor_backend_aio::forget(seastar::pollable_fd_state&) ../src/seastar/src/core/reactor_backend.cc:524
    ceph#3 0x55e2adab9b9a in seastar::pollable_fd_state::forget() ../src/seastar/src/core/reactor.cc:1396
    ceph#4 0x55e2adab9d05 in seastar::intrusive_ptr_release(seastar::pollable_fd_state*) ../src/seastar/src/core/reactor.cc:1401
    ceph#5 0x55e2ace1b72b in boost::intrusive_ptr<seastar::pollable_fd_state>::~intrusive_ptr() /opt/ceph/include/boost/smart_ptr/intrusive_ptr.hpp:98
    ceph#6 0x55e2ace115a5 in seastar::pollable_fd::~pollable_fd() ../src/seastar/include/seastar/core/internal/pollable_fd.hh:109
    ceph#7 0x55e2ae0ed35c in seastar::net::posix_server_socket_impl::~posix_server_socket_impl() ../src/seastar/include/seastar/net/posix-stack.hh:161
    ceph#8 0x55e2ae0ed3cf in seastar::net::posix_server_socket_impl::~posix_server_socket_impl() ../src/seastar/include/seastar/net/posix-stack.hh:161
    ceph#9 0x55e2ae0ed943 in std::default_delete<seastar::net::api_v2::server_socket_impl>::operator()(seastar::net::api_v2::server_socket_impl*) const /usr/include/c++/10/bits/unique_ptr.h:81
    ceph#10 0x55e2ae0db357 in std::unique_ptr<seastar::net::api_v2::server_socket_impl, std::default_delete<seastar::net::api_v2::server_socket_impl> >::~unique_ptr()
	/usr/include/c++/10/bits/unique_ptr.h:357    ceph#11 0x55e2ae1438b7 in seastar::api_v2::server_socket::~server_socket() ../src/seastar/src/net/stack.cc:195
    ceph#12 0x55e2aa1c7656 in std::_Optional_payload_base<seastar::api_v2::server_socket>::_M_destroy() /usr/include/c++/10/optional:260
    ceph#13 0x55e2aa16c84b in std::_Optional_payload_base<seastar::api_v2::server_socket>::_M_reset() /usr/include/c++/10/optional:280
    ceph#14 0x55e2ac24b2b7 in std::_Optional_base_impl<seastar::api_v2::server_socket, std::_Optional_base<seastar::api_v2::server_socket, false, false> >::_M_reset() /usr/include/c++/10/optional:432
    ceph#15 0x55e2ac23f37b in std::optional<seastar::api_v2::server_socket>::reset() /usr/include/c++/10/optional:975
    ceph#16 0x55e2ac21a2e7 in crimson::admin::AdminSocket::stop() ../src/crimson/admin/admin_socket.cc:265
    ceph#17 0x55e2aa099825 in operator() ../src/crimson/osd/osd.cc:450
    ceph#18 0x55e2aa0d4e3e in apply ../src/seastar/include/seastar/core/apply.hh:36
Signed-off-by: Kefu Chai <kchai@redhat.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Apr 28, 2020 
    
    
      
  
    
      
    
  
This fixes the the following selinux error when using ceph-iscsi's
rbd-target-api daemon (rbd-target-gw has the same issue). They are
a result of the a python library, rtslib, which the daemons use.
Additional Information:
Source Context                system_u:system_r:ceph_t:s0
Target Context                system_u:object_r:configfs_t:s0
Target Objects
/sys/kernel/config/target/iscsi/iqn.2003-01.com.re
                              dhat:ceph-iscsi/tpgt_1/attrib/authentication
[
                              file ]
Source                        rbd-target-api
Source Path                   /usr/libexec/platform-python3.6
Port                          <Unknown>
Host                          ans8
Source RPM Packages           platform-python-3.6.8-15.1.el8.x86_64
Target RPM Packages
Policy RPM                    selinux-policy-3.14.3-20.el8.noarch
Selinux Enabled               True
Policy Type                   targeted
Enforcing Mode                Enforcing
Host Name                     ans8
Platform                      Linux ans8 4.18.0-147.el8.x86_64 #1 SMP
Thu Sep 26
                              15:52:44 UTC 2019 x86_64 x86_64
Alert Count                   1
First Seen                    2020-01-08 18:39:47 EST
Last Seen                     2020-01-08 18:39:47 EST
Local ID                      6f8c3415-7a50-4dc8-b3d2-2621e1d00ca3
Raw Audit Messages
type=AVC msg=audit(1578526787.577:68): avc:  denied  { ioctl } for
pid=995 comm="rbd-target-api"
path="/sys/kernel/config/target/iscsi/iqn.2003-01.com.redhat:ceph-iscsi/tpgt_1/attrib/authentication"
dev="configfs" ino=25703 ioctlcmd=0x5401
scontext=system_u:system_r:ceph_t:s0
tcontext=system_u:object_r:configfs_t:s0 tclass=file permissive=1
type=SYSCALL msg=audit(1578526787.577:68): arch=x86_64 syscall=ioctl
success=no exit=ENOTTY a0=34 a1=5401 a2=7ffd4f8f1f60 a3=3052cd2d95839b96
items=0 ppid=1 pid=995 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0
egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm=rbd-target-api
exe=/usr/libexec/platform-python3.6 subj=system_u:system_r:ceph_t:s0
key=(null)
Hash: rbd-target-api,ceph_t,configfs_t,file,ioctl
Signed-off-by: Mike Christie <mchristi@redhat.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Apr 28, 2020 
    
    
      
  
    
      
    
  
This fixes the selinux errors like this for /etc/target ----------------------------------- Additional Information: Source Context system_u:system_r:ceph_t:s0 Target Context system_u:object_r:targetd_etc_rw_t:s0 Target Objects target [ dir ] Source rbd-target-api Source Path rbd-target-api Port <Unknown> Host ans8 Source RPM Packages Target RPM Packages Policy RPM selinux-policy-3.14.3-20.el8.noarch Selinux Enabled True Policy Type targeted Enforcing Mode Enforcing Host Name ans8 Platform Linux ans8 4.18.0-147.el8.x86_64 #1 SMP Thu Sep 26 15:52:44 UTC 2019 x86_64 x86_64 Alert Count 1 First Seen 2020-01-08 18:39:48 EST Last Seen 2020-01-08 18:39:48 EST Local ID 9a13ee18-eaf2-4f2a-872f-2809ee4928f6 Raw Audit Messages type=AVC msg=audit(1578526788.148:69): avc: denied { search } for pid=995 comm="rbd-target-api" name="target" dev="sda1" ino=52198 scontext=system_u:system_r:ceph_t:s0 tcontext=system_u:object_r:targetd_etc_rw_t:s0 tclass=dir permissive=1 Hash: rbd-target-api,ceph_t,targetd_etc_rw_t,dir,search which are a result of the rtslib library the ceph-iscsi daemons use accessing /etc/target to read/write a file which stores meta data the target uses. Signed-off-by: Mike Christie <mchristi@redhat.com>
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Sep 11, 2020 
    
    
      
  
    
      
    
  
Changes addressing comments in PR - commit to be squashed prior to merge Signed-off-by: Paul Cuzner <pcuzner@redhat.com>
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      May 7, 2021 
    
    
      
  
    
      
    
  
Otherwise, if we assert, we'll hang here: Thread 1 (Thread 0x7f74eba79580 (LWP 1688617)): #0 0x00007f74eb2aa529 in futex_wait (private=<optimized out>, expected=132, futex_word=0x7ffd642b4b54) at ../sysdeps/unix/sysv/linux/futex-internal.h:61 #1 futex_wait_simple (private=<optimized out>, expected=132, futex_word=0x7ffd642b4b54) at ../sysdeps/nptl/futex-internal.h:135 #2 __pthread_cond_destroy (cond=0x7ffd642b4b30) at pthread_cond_destroy.c:54 ceph#3 0x0000563ff2e5a891 in LibRadosService_StatusFormat_Test::TestBody (this=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:78 ceph#4 0x0000563ff2e9dc3a in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (location=0x563ff2ea72e4 "the test body", method=<optimized out>, object=0x563ff422a6d0) at ./src/googletest/googletest/src/gtest.cc:2605 ceph#5 testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=object@entry=0x563ff422a6d0, method=<optimized out>, location=location@entry=0x563ff2ea72e4 "the test body") at ./src/googletest/googletest/src/gtest.cc:2641 ceph#6 0x0000563ff2e908c3 in testing::Test::Run (this=0x563ff422a6d0) at ./src/googletest/googletest/src/gtest.cc:2680 ceph#7 0x0000563ff2e90a25 in testing::TestInfo::Run (this=0x563ff41a3b70) at ./src/googletest/googletest/src/gtest.cc:2858 ceph#8 0x0000563ff2e90ec1 in testing::TestSuite::Run (this=0x563ff41b6230) at ./src/googletest/googletest/src/gtest.cc:3012 ceph#9 0x0000563ff2e92bdc in testing::internal::UnitTestImpl::RunAllTests (this=<optimized out>) at ./src/googletest/googletest/src/gtest.cc:5723 ceph#10 0x0000563ff2e9e14a in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (location=0x563ff2ea8728 "auxiliary test code (environments or event listeners)", method=<optimized out>, object=0x563ff41a2d10) at ./src/googletest/googletest/src/gtest.cc:2605 ceph#11 testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x563ff41a2d10, method=<optimized out>, location=location@entry=0x563ff2ea8728 "auxiliary test code (environments or event listeners)") at ./src/googletest/googletest/src/gtest.cc:2641 ceph#12 0x0000563ff2e90ae8 in testing::UnitTest::Run (this=0x563ff30c0660 <testing::UnitTest::GetInstance()::instance>) at ./src/googletest/googletest/src/gtest.cc:5306 Signed-off-by: Sage Weil <sage@newdream.net>
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      May 7, 2021 
    
    
      
  
    
      
    
  
An attempt to `Connection::do_auth()` may finish in one of three states:
_success_, _failure_ and _cancelation_. Unfortunately, its callers were
missing the third treating cancelation like a failure. This was the root
cause of the following failure at Sepia:
```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-06_22:08:43-rados-master-distro-basic-smithi/6102605$ less ./remote/smithi204/log/ceph-osd.3.log.gz
...
WARN  2021-05-06 22:35:40,464 [shard 0] osd - ms_handle_reset
...
INFO  2021-05-06 22:35:40,465 [shard 0] monc - do_auth_single: connection closed
INFO  2021-05-06 22:35:40,465 [shard 0] ms - [osd.3(client) v2:172.21.15.204:6808/31418@57568 >> mon.? v2:172.21.15.204:3300/0] execute_connecting(): protocol aborted at CLOSING -- std::system_error (error crimson::net:6, protocol aborted)
...
ERROR 2021-05-06 22:35:40,465 [shard 0] osd - mon.osd.3 dispatch() ms_handle_reset caught exception: std::system_error (error crimson::net:3, negotiation failure)
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3909-g81233a18/rpm/el8/BUILD/ceph-17.0.0-3909-g81233a18/src/crimson/common/gated.h:36: crimson::common::Gated::dispatch(const char*, T&, Func&&) [with Func = crimson::mon::Client::ms_handle_reset(crimson::net::ConnectionRef, bool)::<lambda()>&; T = crimson::mon::Client]::<lambda(std::__exception_ptr::exception_ptr)>: Assertion `*eptr.__cxa_exception_type() == typeid(seastar::gate_closed_exception)' failed.
Aborting on shard 0.
Backtrace:
 0# 0x00005618C973932F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007F7BB592EB20 in /lib64/libpthread.so.0
 4# gsignal in /lib64/libc.so.6
 5# abort in /lib64/libc.so.6
 6# 0x00007F7BB3F29B09 in /lib64/libc.so.6
 7# 0x00007F7BB3F37DE6 in /lib64/libc.so.6
 8# 0x00005618C9FF295C in ceph-osd
 9# 0x00005618C3907313 in ceph-osd
10# 0x00005618CCA2F84F in ceph-osd
11# 0x00005618CCA34D90 in ceph-osd
12# 0x00005618CCBEC9BB in ceph-osd
13# 0x00005618CC744E9A in ceph-osd
14# main in ceph-osd
15# __libc_start_main in /lib64/libc.so.6
16# _start in ceph-osd
daemon-helper: command crashed with signal 6
```
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      May 7, 2021 
    
    
      
  
    
      
    
  
An attempt to `Connection::do_auth()` may finish in one of three states:
_success_, _failure_ and _cancellation_. Unfortunately, its callers were
missing the third treating cancellation like a failure. This was the root
cause of the following failure at Sepia:
```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-06_22:08:43-rados-master-distro-basic-smithi/6102605$ less ./remote/smithi204/log/ceph-osd.3.log.gz
...
WARN  2021-05-06 22:35:40,464 [shard 0] osd - ms_handle_reset
...
INFO  2021-05-06 22:35:40,465 [shard 0] monc - do_auth_single: connection closed
INFO  2021-05-06 22:35:40,465 [shard 0] ms - [osd.3(client) v2:172.21.15.204:6808/31418@57568 >> mon.? v2:172.21.15.204:3300/0] execute_connecting(): protocol aborted at CLOSING -- std::system_error (error crimson::net:6, protocol aborted)
...
ERROR 2021-05-06 22:35:40,465 [shard 0] osd - mon.osd.3 dispatch() ms_handle_reset caught exception: std::system_error (error crimson::net:3, negotiation failure)
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3909-g81233a18/rpm/el8/BUILD/ceph-17.0.0-3909-g81233a18/src/crimson/common/gated.h:36: crimson::common::Gated::dispatch(const char*, T&, Func&&) [with Func = crimson::mon::Client::ms_handle_reset(crimson::net::ConnectionRef, bool)::<lambda()>&; T = crimson::mon::Client]::<lambda(std::__exception_ptr::exception_ptr)>: Assertion `*eptr.__cxa_exception_type() == typeid(seastar::gate_closed_exception)' failed.
Aborting on shard 0.
Backtrace:
 0# 0x00005618C973932F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007F7BB592EB20 in /lib64/libpthread.so.0
 4# gsignal in /lib64/libc.so.6
 5# abort in /lib64/libc.so.6
 6# 0x00007F7BB3F29B09 in /lib64/libc.so.6
 7# 0x00007F7BB3F37DE6 in /lib64/libc.so.6
 8# 0x00005618C9FF295C in ceph-osd
 9# 0x00005618C3907313 in ceph-osd
10# 0x00005618CCA2F84F in ceph-osd
11# 0x00005618CCA34D90 in ceph-osd
12# 0x00005618CCBEC9BB in ceph-osd
13# 0x00005618CC744E9A in ceph-osd
14# main in ceph-osd
15# __libc_start_main in /lib64/libc.so.6
16# _start in ceph-osd
daemon-helper: command crashed with signal 6
```
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      May 7, 2021 
    
    
      
  
    
      
    
  
…dling.
In `crimson/osd/main.cc` we instruct Seastar to handle `SIGHUP`.
```
        // just ignore SIGHUP, we don't reread settings
        seastar::engine().handle_signal(SIGHUP, [] {})
```
This happens using the Seastar's signal handling infrastructure
which is incompliant with the alien world.
```
void
reactor::signals::handle_signal(int signo, noncopyable_function<void ()>&& handler) {
    // ...
    struct sigaction sa;
    sa.sa_sigaction = [](int sig, siginfo_t *info, void *p) {
        engine()._backend->signal_received(sig, info, p);
    };
    // ...
}
```
```
 extern __thread reactor* local_engine;
extern __thread size_t task_quota;
inline reactor& engine() {
    return *local_engine;
}
```
The low-level signal handler above assumes `local_engine._backend`
is not null which stays true only for threads from the S*'s world.
Unfortunately, as we don't block the `SIGHUP` for alien threads,
kernel is perfectly authorized to pick up one them to run the handler
leading to weirdly-looking segfaults like this one:
```
INFO  2021-04-23 07:06:57,807 [shard 0] bluestore - stat
DEBUG 2021-04-23 07:06:58,753 [shard 0] ms - [osd.1(client) v2:172.21.15.100:6802/30478@51064 >> mgr.4105 v2:172.21.15.109:6800/29891] --> ceph#7 === pg_stats(0 pgs seq 55834574872 v 0) v2 (87)
...
INFO  2021-04-23 07:06:58,813 [shard 0] bluestore - stat
DEBUG 2021-04-23 07:06:59,753 [shard 0] osd - AdminSocket::handle_client: incoming asok string: {"prefix": "get_command_descriptions"}
INFO  2021-04-23 07:06:59,753 [shard 0] osd - asok response length: 2947
INFO  2021-04-23 07:06:59,817 [shard 0] bluestore - stat
DEBUG 2021-04-23 07:06:59,865 [shard 0] osd - AdminSocket::handle_client: incoming asok string: {"prefix": "get_command_descriptions"}
INFO  2021-04-23 07:06:59,866 [shard 0] osd - asok response length: 2947
DEBUG 2021-04-23 07:07:00,020 [shard 0] osd - AdminSocket::handle_client: incoming asok string: {"prefix": "get_command_descriptions"}
INFO  2021-04-23 07:07:00,020 [shard 0] osd - asok response length: 2947
INFO  2021-04-23 07:07:00,820 [shard 0] bluestore - stat
...
Backtrace:
 0# 0x00005600CD0D6AAF in ceph-osd
 1# FatalSignal::signaled(int) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007F5877C7EB20 in /lib64/libpthread.so.0
 4# 0x00005600CD830B81 in ceph-osd
 5# 0x00007F5877C7EB20 in /lib64/libpthread.so.0
 6# pthread_cond_timedwait in /lib64/libpthread.so.0
 7# crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) in ceph-osd
 8# 0x00007F5877999BA3 in /lib64/libstdc++.so.6
 9# 0x00007F5877C7414A in /lib64/libpthread.so.0
10# clone in /lib64/libc.so.6
daemon-helper: command crashed with signal 11
```
Ultimately, it turned out the thread came out from a syscall (`futex`)
and started crunching the `SIGHUP` handler's code in which a nullptr
dereference happened.
This patch blocks `SIGHUP` for all threads spawned by `AlienStore`.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      May 7, 2021 
    
    
      
  
    
      
    
  
An attempt to `Connection::do_auth()` may finish in one of three states:
_success_, _failure_ and _cancellation_. Unfortunately, its callers were
missing the third treating cancellation like a failure. This was the root
cause of the following failure at Sepia:
```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-06_22:08:43-rados-master-distro-basic-smithi/6102605$ less ./remote/smithi204/log/ceph-osd.3.log.gz
...
WARN  2021-05-06 22:35:40,464 [shard 0] osd - ms_handle_reset
...
INFO  2021-05-06 22:35:40,465 [shard 0] monc - do_auth_single: connection closed
INFO  2021-05-06 22:35:40,465 [shard 0] ms - [osd.3(client) v2:172.21.15.204:6808/31418@57568 >> mon.? v2:172.21.15.204:3300/0] execute_connecting(): protocol aborted at CLOSING -- std::system_error (error crimson::net:6, protocol aborted)
...
ERROR 2021-05-06 22:35:40,465 [shard 0] osd - mon.osd.3 dispatch() ms_handle_reset caught exception: std::system_error (error crimson::net:3, negotiation failure)
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3909-g81233a18/rpm/el8/BUILD/ceph-17.0.0-3909-g81233a18/src/crimson/common/gated.h:36: crimson::common::Gated::dispatch(const char*, T&, Func&&) [with Func = crimson::mon::Client::ms_handle_reset(crimson::net::ConnectionRef, bool)::<lambda()>&; T = crimson::mon::Client]::<lambda(std::__exception_ptr::exception_ptr)>: Assertion `*eptr.__cxa_exception_type() == typeid(seastar::gate_closed_exception)' failed.
Aborting on shard 0.
Backtrace:
 0# 0x00005618C973932F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007F7BB592EB20 in /lib64/libpthread.so.0
 4# gsignal in /lib64/libc.so.6
 5# abort in /lib64/libc.so.6
 6# 0x00007F7BB3F29B09 in /lib64/libc.so.6
 7# 0x00007F7BB3F37DE6 in /lib64/libc.so.6
 8# 0x00005618C9FF295C in ceph-osd
 9# 0x00005618C3907313 in ceph-osd
10# 0x00005618CCA2F84F in ceph-osd
11# 0x00005618CCA34D90 in ceph-osd
12# 0x00005618CCBEC9BB in ceph-osd
13# 0x00005618CC744E9A in ceph-osd
14# main in ceph-osd
15# __libc_start_main in /lib64/libc.so.6
16# _start in ceph-osd
daemon-helper: command crashed with signal 6
```
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      May 17, 2021 
    
    
      
  
    
      
    
  
The `send_message()` method is a high-level facility for
communicating with a monitor. If there is an active conn
available, it sends the message immediatelly; otherwise
the message is queued. This method assumes the queue is
already drained if the connection is available.
`active_con` is managed by `reopen_session()` where it's
first cleared and then reset after finding new alive mon.
This is followed by draining the `pending_messages` queue
which happens in `on_session_opened()` after the `MAuth`
exchange is finished.
Unfortunately, the path from the `active_con` clearing
to draining the queue is long and divided into multiple
continuations which results in lack of atomicity. When
e.g. `run_command()` interleaves the stages, following
crash happens:
```
INFO  2021-05-07 08:13:43,914 [shard 0] monc - do_auth_single: mon v2:172.21.15.82:6805/34166 => v2:172.21.15.82:3300/0 returns auth_reply(proto 2 0 (0) Success) v1: 0
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3910-g1b18e076/rpm/el8/BUILD/ceph-17.0.0-3910-g1b18e076/src/crimson/mon/MonClient.cc:1034: seastar::future<> crimson::mon::Client::send_message(MessageRef): Assertion `pending_messages.empty()' failed.
Aborting on shard 0.
Backtrace:
 0# 0x000055CDE6DB532F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007FC1BF20BB20 in /lib64/libpthread.so.0
 4# gsignal in /lib64/libc.so.6
 5# abort in /lib64/libc.so.6
 6# 0x00007FC1BD806B09 in /lib64/libc.so.6
 7# 0x00007FC1BD814DE6 in /lib64/libc.so.6
 8# crimson::mon::Client::send_message(boost::intrusive_ptr<Message>) in ceph-osd
 9# crimson::mon::Client::renew_subs() in ceph-osd
10# 0x000055CDE764FB0B in ceph-osd
11# 0x000055CDE10457F0 in ceph-osd
12# 0x000055CDEA0AB88F in ceph-osd
13# 0x000055CDEA0B0DD0 in ceph-osd
14# 0x000055CDEA2689FB in ceph-osd
15# 0x000055CDE9DC0EDA in ceph-osd
16# main in ceph-osd
17# __libc_start_main in /lib64/libc.so.6
18# _start in ceph-osd
```
The problem caused following failure at Sepia:
http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-07_07:41:02-rados-master-distro-basic-smithi/6104549
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      May 17, 2021 
    
    
      
  
    
      
    
  
The `send_message()` method is a high-level facility for
communicating with a monitor. If there is an active conn
available, it sends the message immediately; otherwise
the message is queued. This method assumes the queue is
already drained if the connection is available.
`active_con` is managed by `reopen_session()` where it's
first cleared and then reset after finding new alive mon.
This is followed by draining the `pending_messages` queue
which happens in `on_session_opened()` after the `MAuth`
exchange is finished.
Unfortunately, the path from the `active_con` clearing
to draining the queue is long and divided into multiple
continuations which results in lack of atomicity. When
e.g. `run_command()` interleaves the stages, following
crash happens:
```
INFO  2021-05-07 08:13:43,914 [shard 0] monc - do_auth_single: mon v2:172.21.15.82:6805/34166 => v2:172.21.15.82:3300/0 returns auth_reply(proto 2 0 (0) Success) v1: 0
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3910-g1b18e076/rpm/el8/BUILD/ceph-17.0.0-3910-g1b18e076/src/crimson/mon/MonClient.cc:1034: seastar::future<> crimson::mon::Client::send_message(MessageRef): Assertion `pending_messages.empty()' failed.
Aborting on shard 0.
Backtrace:
 0# 0x000055CDE6DB532F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007FC1BF20BB20 in /lib64/libpthread.so.0
 4# gsignal in /lib64/libc.so.6
 5# abort in /lib64/libc.so.6
 6# 0x00007FC1BD806B09 in /lib64/libc.so.6
 7# 0x00007FC1BD814DE6 in /lib64/libc.so.6
 8# crimson::mon::Client::send_message(boost::intrusive_ptr<Message>) in ceph-osd
 9# crimson::mon::Client::renew_subs() in ceph-osd
10# 0x000055CDE764FB0B in ceph-osd
11# 0x000055CDE10457F0 in ceph-osd
12# 0x000055CDEA0AB88F in ceph-osd
13# 0x000055CDEA0B0DD0 in ceph-osd
14# 0x000055CDEA2689FB in ceph-osd
15# 0x000055CDE9DC0EDA in ceph-osd
16# main in ceph-osd
17# __libc_start_main in /lib64/libc.so.6
18# _start in ceph-osd
```
The problem caused following failure at Sepia:
http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-07_07:41:02-rados-master-distro-basic-smithi/6104549
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      May 24, 2021 
    
    
      
  
    
      
    
  
`OpSequencer` assumes that ID of a previous client request
is always lower than ID of current one. This is reflected
by the assertion in `OpSequencer::start_op()`. It trigerred
the following failure [1] in Teuthology:
```
DEBUG 2021-05-07 08:01:41,227 [shard 0] osd - client_request(id=1, detail=osd_op(client.4171.0:1 2.2 2.7c339972 (undecoded) ondisk+retry+read+known_if_redirected e29) v8) same_interval_since: 31
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3910-g1b18e076/rpm/el8/BUILD/ceph-
17.0.0-3910-g1b18e076/src/crimson/osd/osd_operation_sequencer.h:38: seastar::futurize_t<Result> crimson::osd::OpSequencer::start_op(HandleT&, uint64_t, uint64_t, FuncT&&) [with HandleT = crimson::PipelineHa
ndle; FuncT = crimson::interruptible::interruptor<InterruptCond>::wrap_function(Func&&) [with Func = crimson::osd::ClientRequest::start()::<lambda()> mutable::<lambda(Ref<crimson::osd::PG>)> mutable::<lambd
a()> mutable::<lambda()>; InterruptCond = crimson::osd::IOInterruptCondition]::<lambda()>; Result = crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, seastar::future<>
>; seastar::futurize_t<Result> = crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, seastar::future<> >; uint64_t = long unsigned int]: Assertion `prev_op < this_op' fai
led.
Aborting on shard 0.
Backtrace:
Segmentation fault.
Backtrace:
 0# 0x00005592B028932F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007F57B72E7B20 in /lib64/libpthread.so.0
 4# gsignal in /lib64/libc.so.6
 5# abort in /lib64/libc.so.6
 6# 0x00007F57B58E2B09 in /lib64/libc.so.6
 7# 0x00007F57B58F0DE6 in /lib64/libc.so.6
 8# 0x00005592ABB8484D in ceph-osd
 9# 0x00005592ABB8ACB3 in ceph-osd
10# seastar::continuation<seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >, seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>, seastar::future<boost::intrusive_ptr<crimson::osd::PG> >::then_impl_nrvo<seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>, seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > >(seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >&&, seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>&, seastar::future_state<boost::intrusive_ptr<crimson::osd::PG> >&&)#1}, boost::intrusive_ptr<crimson::osd::PG> >::run_and_dispose() in ceph-osd
11# 0x00005592B357F88F in ceph-osd
12# 0x00005592B3584DD0 in ceph-osd
```
[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-07_07:41:02-rados-master-distro-basic-smithi/6104530
Crash analysis resulted in two observations:
1. during the request execution the acting set got
   changed, the request was interrupted and a try
   to re-execute it emerged;
2. the interrupted request was the very first client
   request the OSD has ever seen.
Code analysis showed a problem in how `ClientRequest`
establishes `prev_op_id`: although supposed to be performed
only once for a request, it can get executed twice but only
for the very first request `OpSequencer` saw.
```cpp
void ClientRequest::may_set_prev_op()
{
  // set prev_op_id if it's not set yet
  if (__builtin_expect(prev_op_id == 0, true)) {
    prev_op_id = sequencer.get_last_issued();
  }
}
```
Unfortunately, `0` isn't a distincted value that cannot
be returned by `get_last_issued()`:
```cpp
class OpSequencer {
  // ...
  uint64_t get_last_issued() const {
    return last_issued;
  }
  // ...
  // the id of last op which is issued
  uint64_t last_issued = 0;
```
As a result, `OpSequencer` returnted on the second call
a new value (actually `this_op`) violating the assertion.
The commit fixes the problem by switching from a designated
value to `std::optional`.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      May 24, 2021 
    
    
      
  
    
      
    
  
`OpSequencer` assumes that ID of a previous client request
is always lower than ID of current one. This is reflected
by the assertion in `OpSequencer::start_op()`. It triggered
the following failure [1] in Teuthology:
```
DEBUG 2021-05-07 08:01:41,227 [shard 0] osd - client_request(id=1, detail=osd_op(client.4171.0:1 2.2 2.7c339972 (undecoded) ondisk+retry+read+known_if_redirected e29) v8) same_interval_since: 31
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3910-g1b18e076/rpm/el8/BUILD/ceph-
17.0.0-3910-g1b18e076/src/crimson/osd/osd_operation_sequencer.h:38: seastar::futurize_t<Result> crimson::osd::OpSequencer::start_op(HandleT&, uint64_t, uint64_t, FuncT&&) [with HandleT = crimson::PipelineHa
ndle; FuncT = crimson::interruptible::interruptor<InterruptCond>::wrap_function(Func&&) [with Func = crimson::osd::ClientRequest::start()::<lambda()> mutable::<lambda(Ref<crimson::osd::PG>)> mutable::<lambd
a()> mutable::<lambda()>; InterruptCond = crimson::osd::IOInterruptCondition]::<lambda()>; Result = crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, seastar::future<>
>; seastar::futurize_t<Result> = crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, seastar::future<> >; uint64_t = long unsigned int]: Assertion `prev_op < this_op' fai
led.
Aborting on shard 0.
Backtrace:
Segmentation fault.
Backtrace:
 0# 0x00005592B028932F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007F57B72E7B20 in /lib64/libpthread.so.0
 4# gsignal in /lib64/libc.so.6
 5# abort in /lib64/libc.so.6
 6# 0x00007F57B58E2B09 in /lib64/libc.so.6
 7# 0x00007F57B58F0DE6 in /lib64/libc.so.6
 8# 0x00005592ABB8484D in ceph-osd
 9# 0x00005592ABB8ACB3 in ceph-osd
10# seastar::continuation<seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >, seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>, seastar::future<boost::intrusive_ptr<crimson::osd::PG> >::then_impl_nrvo<seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>, seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > >(seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >&&, seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>&, seastar::future_state<boost::intrusive_ptr<crimson::osd::PG> >&&)#1}, boost::intrusive_ptr<crimson::osd::PG> >::run_and_dispose() in ceph-osd
11# 0x00005592B357F88F in ceph-osd
12# 0x00005592B3584DD0 in ceph-osd
```
[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-07_07:41:02-rados-master-distro-basic-smithi/6104530
Crash analysis resulted in two observations:
1. during the request execution the acting set got
   changed, the request was interrupted and a try
   to re-execute it emerged;
2. the interrupted request was the very first client
   request the OSD has ever seen.
Code analysis showed a problem in how `ClientRequest`
establishes `prev_op_id`: although supposed to be performed
only once for a request, it can get executed twice but only
for the very first request `OpSequencer` saw.
```cpp
void ClientRequest::may_set_prev_op()
{
  // set prev_op_id if it's not set yet
  if (__builtin_expect(prev_op_id == 0, true)) {
    prev_op_id = sequencer.get_last_issued();
  }
}
```
Unfortunately, `0` isn't a distincted value that cannot
be returned by `get_last_issued()`:
```cpp
class OpSequencer {
  // ...
  uint64_t get_last_issued() const {
    return last_issued;
  }
  // ...
  // the id of last op which is issued
  uint64_t last_issued = 0;
```
As a result, `OpSequencer` returned on the second call
a new value (actually `this_op`) violating the assertion.
The commit fixes the problem by switching from a designated
value to `std::optional`.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      May 24, 2021 
    
    
      
  
    
      
    
  
`OpSequencer` assumes that ID of a previous client request
is always lower than ID of current one. This is reflected
by the assertion in `OpSequencer::start_op()`. It triggered
the following failure [1] in Teuthology:
```
DEBUG 2021-05-07 08:01:41,227 [shard 0] osd - client_request(id=1, detail=osd_op(client.4171.0:1 2.2 2.7c339972 (undecoded) ondisk+retry+read+known_if_redirected e29) v8) same_interval_since: 31
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-3910-g1b18e076/rpm/el8/BUILD/ceph-
17.0.0-3910-g1b18e076/src/crimson/osd/osd_operation_sequencer.h:38: seastar::futurize_t<Result> crimson::osd::OpSequencer::start_op(HandleT&, uint64_t, uint64_t, FuncT&&) [with HandleT = crimson::PipelineHa
ndle; FuncT = crimson::interruptible::interruptor<InterruptCond>::wrap_function(Func&&) [with Func = crimson::osd::ClientRequest::start()::<lambda()> mutable::<lambda(Ref<crimson::osd::PG>)> mutable::<lambd
a()> mutable::<lambda()>; InterruptCond = crimson::osd::IOInterruptCondition]::<lambda()>; Result = crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, seastar::future<>
>; seastar::futurize_t<Result> = crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, seastar::future<> >; uint64_t = long unsigned int]: Assertion `prev_op < this_op' fai
led.
Aborting on shard 0.
Backtrace:
Segmentation fault.
Backtrace:
 0# 0x00005592B028932F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<6>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007F57B72E7B20 in /lib64/libpthread.so.0
 4# gsignal in /lib64/libc.so.6
 5# abort in /lib64/libc.so.6
 6# 0x00007F57B58E2B09 in /lib64/libc.so.6
 7# 0x00007F57B58F0DE6 in /lib64/libc.so.6
 8# 0x00005592ABB8484D in ceph-osd
 9# 0x00005592ABB8ACB3 in ceph-osd
10# seastar::continuation<seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >, seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>, seastar::future<boost::intrusive_ptr<crimson::osd::PG> >::then_impl_nrvo<seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>, seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > >(seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>&&)::{lambda(seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >&&, seastar::noncopyable_function<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> > (boost::intrusive_ptr<crimson::osd::PG>&&)>&, seastar::future_state<boost::intrusive_ptr<crimson::osd::PG> >&&)#1}, boost::intrusive_ptr<crimson::osd::PG> >::run_and_dispose() in ceph-osd
11# 0x00005592B357F88F in ceph-osd
12# 0x00005592B3584DD0 in ceph-osd
```
[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-07_07:41:02-rados-master-distro-basic-smithi/6104530
Crash analysis resulted in two observations:
1. during the request execution the acting set got
   changed, the request was interrupted and a try
   to re-execute it emerged;
2. the interrupted request was the very first client
   request the OSD has ever seen.
Code analysis showed a problem in how `ClientRequest`
establishes `prev_op_id`: although supposed to be performed
only once for a request, it can get executed twice but only
for the very first request `OpSequencer` saw.
```cpp
void ClientRequest::may_set_prev_op()
{
  // set prev_op_id if it's not set yet
  if (__builtin_expect(prev_op_id == 0, true)) {
    prev_op_id = sequencer.get_last_issued();
  }
}
```
Unfortunately, `0` isn't a distincted value that cannot
be returned by `get_last_issued()`:
```cpp
class OpSequencer {
  // ...
  uint64_t get_last_issued() const {
    return last_issued;
  }
  // ...
  // the id of last op which is issued
  uint64_t last_issued = 0;
```
As a result, `OpSequencer` returned on the second call
a new value (actually `this_op`) violating the assertion.
The commit fixes the problem by switching from a designated
value to `std::optional`.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      May 25, 2021 
    
    
      
  
    
      
    
  
f7181ab has optimized the client parallelism. To achieve that `PG::do_osd_ops()` were converted to return basically future pair of futures. Unfortunately, the life- time management of `OpsExecuter` was kept intact. In the result, the object was valid only till fullfying the outer future while, due to the `rollbacker` instances, it should be available till `all_completed` becomes available. This issue can explain the following problem has been observed in a Teuthology job [1]. ``` DEBUG 2021-05-20 08:03:22,617 [shard 0] osd - do_op_call: method returned ret=-17, outdata.length()=0 while num_read=1, num_write=0 DEBUG 2021-05-20 08:03:22,617 [shard 0] osd - rollback_obc_if_modified: object 19:e17d4708:test-rados-api-smithi095-38404-2::foo:head got erro r generic:17, need_rollback=false ================================================================= ==33626==ERROR: AddressSanitizer: heap-use-after-free on address 0x60d0000b9320 at pc 0x560f486b8222 bp 0x7fffc467a1e0 sp 0x7fffc467a1d0 READ of size 4 at 0x60d0000b9320 thread T0 #0 0x560f486b8221 (/usr/bin/ceph-osd+0x2c610221) #1 0x560f4880c6b1 in seastar::continuation<seastar::internal::promise_base_with_type<boost::intrusive_ptr<MOSDOpReply> >, seastar::noncopy able_function<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_ wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson::errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > ()>, seastar::future<void>::then_impl_nrvo<seastar::noncopyable_function<crimson::interruptible::interruptible_future_detail<crimson::osd:: IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson: :errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > ()>, crimson::interruptible::interruptible_future_detail<crimson::osd::IOInte rruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson::error ated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > >(seastar::noncopyable_function<crimson::interruptible::interruptible_future_detail <crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_ future<crimson::errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > ()>&&)::{lambda(seastar::internal::promise_base_with_type<boos t::intrusive_ptr<MOSDOpReply> >&&, seastar::noncopyable_function<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson::errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > ()>&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>::run_and_dispose() (/usr/bin/ceph-osd+0x2c7646b1) #2 0x560f5352c3ae (/usr/bin/ceph-osd+0x374843ae) ceph#3 0x560f535318ef (/usr/bin/ceph-osd+0x374898ef) ceph#4 0x560f536e395a (/usr/bin/ceph-osd+0x3763b95a) ceph#5 0x560f532413d9 (/usr/bin/ceph-osd+0x371993d9) ceph#6 0x560f476af95a in main (/usr/bin/ceph-osd+0x2b60795a) ceph#7 0x7f7aa0af97b2 in __libc_start_main (/lib64/libc.so.6+0x237b2) ceph#8 0x560f477d2e8d in _start (/usr/bin/ceph-osd+0x2b72ae8d) ``` [1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-20_07:28:16-rados-master-distro-basic-smithi/6124735/ The commit deals with the problem by repacking the outer future. An alternative could be in switching from `std::unique_ptr` to `seastar::shared_ptr` for managing `OpsExecuter`. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      May 25, 2021 
    
    
      
  
    
      
    
  
f7181ab has optimized the client parallelism. To achieve that `PG::do_osd_ops()` were converted to return basically future pair of futures. Unfortunately, the life- time management of `OpsExecuter` was kept intact. In the result, the object was valid only till fullfying the outer future while, due to the `rollbacker` instances, it should be available till `all_completed` becomes available. This issue can explain the following problem has been observed in a Teuthology job [1]. ``` DEBUG 2021-05-20 08:03:22,617 [shard 0] osd - do_op_call: method returned ret=-17, outdata.length()=0 while num_read=1, num_write=0 DEBUG 2021-05-20 08:03:22,617 [shard 0] osd - rollback_obc_if_modified: object 19:e17d4708:test-rados-api-smithi095-38404-2::foo:head got erro r generic:17, need_rollback=false ================================================================= ==33626==ERROR: AddressSanitizer: heap-use-after-free on address 0x60d0000b9320 at pc 0x560f486b8222 bp 0x7fffc467a1e0 sp 0x7fffc467a1d0 READ of size 4 at 0x60d0000b9320 thread T0 #0 0x560f486b8221 (/usr/bin/ceph-osd+0x2c610221) #1 0x560f4880c6b1 in seastar::continuation<seastar::internal::promise_base_with_type<boost::intrusive_ptr<MOSDOpReply> >, seastar::noncopy able_function<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_ wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson::errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > ()>, seastar::future<void>::then_impl_nrvo<seastar::noncopyable_function<crimson::interruptible::interruptible_future_detail<crimson::osd:: IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson: :errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > ()>, crimson::interruptible::interruptible_future_detail<crimson::osd::IOInte rruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson::error ated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > >(seastar::noncopyable_function<crimson::interruptible::interruptible_future_detail <crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_ future<crimson::errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > ()>&&)::{lambda(seastar::internal::promise_base_with_type<boos t::intrusive_ptr<MOSDOpReply> >&&, seastar::noncopyable_function<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<(std::errc)11> > >::_future<crimson::errorated_future_marker<boost::intrusive_ptr<MOSDOpReply> > > > ()>&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>::run_and_dispose() (/usr/bin/ceph-osd+0x2c7646b1) #2 0x560f5352c3ae (/usr/bin/ceph-osd+0x374843ae) ceph#3 0x560f535318ef (/usr/bin/ceph-osd+0x374898ef) ceph#4 0x560f536e395a (/usr/bin/ceph-osd+0x3763b95a) ceph#5 0x560f532413d9 (/usr/bin/ceph-osd+0x371993d9) ceph#6 0x560f476af95a in main (/usr/bin/ceph-osd+0x2b60795a) ceph#7 0x7f7aa0af97b2 in __libc_start_main (/lib64/libc.so.6+0x237b2) ceph#8 0x560f477d2e8d in _start (/usr/bin/ceph-osd+0x2b72ae8d) ``` [1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-20_07:28:16-rados-master-distro-basic-smithi/6124735/ The commit deals with the problem by repacking the outer future. An alternative could be in switching from `std::unique_ptr` to `seastar::shared_ptr` for managing `OpsExecuter`. Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      May 27, 2021 
    
    
      
  
    
      
    
  
…ore.
```
INFO  2021-05-26 20:16:32,872 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] ProtocolV2::start_accept(): targ
et_addr=172.21.15.119:55220/0
DEBUG 2021-05-26 20:16:32,872 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] TRIGGER ACCEPTING, was NONE
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] SEND(26) banner: len_payload=16,
 supported=1, required=0, banner="ceph v2
"
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] RECV(10) banner: "ceph v2
"
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] GOT banner: payload_len=16
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] RECV(16) banner features: supported=1 required=0
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] WRITE HelloFrame: my_type=osd, peer_addr=172.21.15.119:55220/0
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] GOT HelloFrame: my_type=client peer_addr=v2:172.21.15.119:6803/31733
INFO  2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> client.? -@55220] UPDATE: peer_type=client, policy(lossy=true server=true standby=false resetcheck=false)
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> client.? -@55220] GOT AuthRequestFrame: method=2, preferred_modes={1, 2}, payload_len=174
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-4622-gaa1dc559/rpm/el8/BUILD/ceph-17.0.0-4622-gaa1dc559/src/crimson/mon/MonClient.cc:399:10: runtime error: member access within null pointer of type 'struct Connection'
Segmentation fault on shard 0.
Backtrace:
 0# 0x000055E84CF44C1F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007F2BC88C0B20 in /lib64/libpthread.so.0
 4# crimson::mon::Connection::get_conn() in ceph-osd
 5# crimson::mon::Client::handle_auth_request(seastar::shared_ptr<crimson::net::Connection>, seastar::lw_shared_ptr<AuthConnectionMeta>, bool, unsigned int, ceph::buffer::v15_2_0::list const&, ceph::buffer::v15_2_0::list*) in ceph-osd
 6# crimson::net::ProtocolV2::_handle_auth_request(ceph::buffer::v15_2_0::list&, bool) in ceph-osd
 7# 0x000055E84DF67669 in ceph-osd
 8# 0x000055E84DF68775 in ceph-osd
 9# 0x000055E846F47F60 in ceph-osd
10# 0x000055E85296770F in ceph-osd
11# 0x000055E85296CC50 in ceph-osd
12# 0x000055E852B1ECBB in ceph-osd
13# 0x000055E85267C73A in ceph-osd
14# main in ceph-osd
15# __libc_start_main in /lib64/libc.so.6
16# _start in ceph-osd
Fault at location: 0x98
```
When the `handle_auth_request()` happens, there is no guarantee
`active_con` is being available. This is reflected in the classical
implementation:
```cpp
int MonClient::handle_auth_request(
  Connection *con,
  // ...
  ceph::buffer::list *reply)
{
  // ...
  bool isvalid = ah->verify_authorizer(
    cct,
    *rotating_secrets,
    payload,
    auth_meta->get_connection_secret_length(),
    reply,
    &con->peer_name,
    &con->peer_global_id,
    &con->peer_caps_info,
    &auth_meta->session_key,
    &auth_meta->connection_secret,
    ac);
```
The patch transplate the same logic to crimson.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      May 27, 2021 
    
    
      
  
    
      
    
  
Following crash occured at Sepia [1]:
```
INFO  2021-05-26 20:16:32,872 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] ProtocolV2::start_accept(): targ
et_addr=172.21.15.119:55220/0
DEBUG 2021-05-26 20:16:32,872 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] TRIGGER ACCEPTING, was NONE
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] SEND(26) banner: len_payload=16,
 supported=1, required=0, banner="ceph v2
"
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] RECV(10) banner: "ceph v2
"
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] GOT banner: payload_len=16
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] RECV(16) banner features: supported=1 required=0
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] WRITE HelloFrame: my_type=osd, peer_addr=172.21.15.119:55220/0
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> unknown.? -@55220] GOT HelloFrame: my_type=client peer_addr=v2:172.21.15.119:6803/31733
INFO  2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> client.? -@55220] UPDATE: peer_type=client, policy(lossy=true server=true standby=false resetcheck=false)
DEBUG 2021-05-26 20:16:32,873 [shard 0] ms - [osd.0(client) v2:172.21.15.119:6803/31733 >> client.? -@55220] GOT AuthRequestFrame: method=2, preferred_modes={1, 2}, payload_len=174
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-4622-gaa1dc559/rpm/el8/BUILD/ceph-17.0.0-4622-gaa1dc559/src/crimson/mon/MonClient.cc:399:10: runtime error: member access within null pointer of type 'struct Connection'
Segmentation fault on shard 0.
Backtrace:
 0# 0x000055E84CF44C1F in ceph-osd
 1# FatalSignal::signaled(int, siginfo_t const*) in ceph-osd
 2# FatalSignal::install_oneshot_signal_handler<11>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) in ceph-osd
 3# 0x00007F2BC88C0B20 in /lib64/libpthread.so.0
 4# crimson::mon::Connection::get_conn() in ceph-osd
 5# crimson::mon::Client::handle_auth_request(seastar::shared_ptr<crimson::net::Connection>, seastar::lw_shared_ptr<AuthConnectionMeta>, bool, unsigned int, ceph::buffer::v15_2_0::list const&, ceph::buffer::v15_2_0::list*) in ceph-osd
 6# crimson::net::ProtocolV2::_handle_auth_request(ceph::buffer::v15_2_0::list&, bool) in ceph-osd
 7# 0x000055E84DF67669 in ceph-osd
 8# 0x000055E84DF68775 in ceph-osd
 9# 0x000055E846F47F60 in ceph-osd
10# 0x000055E85296770F in ceph-osd
11# 0x000055E85296CC50 in ceph-osd
12# 0x000055E852B1ECBB in ceph-osd
13# 0x000055E85267C73A in ceph-osd
14# main in ceph-osd
15# __libc_start_main in /lib64/libc.so.6
16# _start in ceph-osd
Fault at location: 0x98
```
[1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136907
When the `handle_auth_request()` happens, there is no guarantee
`active_con` is being available. This is reflected in the classical
implementation:
```cpp
int MonClient::handle_auth_request(
  Connection *con,
  // ...
  ceph::buffer::list *reply)
{
  // ...
  bool isvalid = ah->verify_authorizer(
    cct,
    *rotating_secrets,
    payload,
    auth_meta->get_connection_secret_length(),
    reply,
    &con->peer_name,
    &con->peer_global_id,
    &con->peer_caps_info,
    &auth_meta->session_key,
    &auth_meta->connection_secret,
    ac);
```
The patch transplate the same logic to crimson.
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      May 31, 2021 
    
    
      
  
    
      
    
  
The `FuturizedStore` interface imposes the `get_attr()`
takes the `name` parameter as `std::string_view`, and
thus burdens implementations with extending the life-
time of the data the instance refers to.
Unfortunately, `AlienStore` is unaware that prolonging
the life of a `std::string_view` instance doesn't prolong
the data memory it points to. This problem has manifested
in the following use-after-free detected at Sepia:
```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136929$ less ./remote/smithi194/log/ceph-osd.7.log.gz
...
DEBUG 2021-05-26 20:24:54,077 [shard 0] osd - do_osd_ops_execute: object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head - handling op
call
DEBUG 2021-05-26 20:24:54,077 [shard 0] osd - handling op call on object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - calling method lock.lock, num_read=0, num_write=0
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - handling op getxattr on object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - getxattr on obj=14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head for attr=_lock.TestLockPP1
DEBUG 2021-05-26 20:24:54,078 [shard 0] bluestore - get_attr
=================================================================
==34068==ERROR: AddressSanitizer: heap-use-after-free on address 0x6030001851d0 at pc 0x7f824d6a5b27 bp 0x7f822b4201c0 sp 0x7f822b41f968
READ of size 17 at 0x6030001851d0 thread T28 (alien-store-tp)
...
    #0 0x7f824d6a5b26  (/lib64/libasan.so.5+0x40b26)
    #1 0x55e2cbb2e00b  (/usr/bin/ceph-osd+0x2b6dc00b)
    #2 0x55e2d31f086e  (/usr/bin/ceph-osd+0x32d9e86e)
    ceph#3 0x55e2d3467607 in crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) (/usr/bin/ceph-osd+0x33015607)
    ceph#4 0x55e2d346b14a  (/usr/bin/ceph-osd+0x3301914a)
    ceph#5 0x7f8249d32ba2  (/lib64/libstdc++.so.6+0xc2ba2)
    ceph#6 0x7f824a00d149 in start_thread (/lib64/libpthread.so.0+0x8149)
    ceph#7 0x7f82486edf22 in clone (/lib64/libc.so.6+0xfcf22)
0x6030001851d0 is located 0 bytes inside of 31-byte region [0x6030001851d0,0x6030001851ef)
freed by thread T0 here:
    #0 0x7f824d757688 in operator delete(void*) (/lib64/libasan.so.5+0xf2688)
previously allocated by thread T0 here:
    #0 0x7f824d7567b0 in operator new(unsigned long) (/lib64/libasan.so.5+0xf17b0)
Thread T28 (alien-store-tp) created by T0 here:
    #0 0x7f824d6b7ea3 in __interceptor_pthread_create (/lib64/libasan.so.5+0x52ea3)
SUMMARY: AddressSanitizer: heap-use-after-free (/lib64/libasan.so.5+0x40b26)
Shadow bytes around the buggy address:
  0x0c06800289e0: fd fd fd fa fa fa fd fd fd fa fa fa 00 00 00 fa
  0x0c06800289f0: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
  0x0c0680028a00: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
  0x0c0680028a10: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
  0x0c0680028a20: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
=>0x0c0680028a30: fd fd fa fa fd fd fd fd fa fa[fd]fd fd fd fa fa
  0x0c0680028a40: fd fd fd fd fa fa fd fd fd fd fa fa 00 00 00 07
  0x0c0680028a50: fa fa 00 00 00 fa fa fa 00 00 00 fa fa fa fd fd
  0x0c0680028a60: fd fd fa fa fd fd fd fd fa fa fd fd fd fd fa fa
  0x0c0680028a70: 00 00 00 00 fa fa fd fd fd fd fa fa fd fd fd fd
  0x0c0680028a80: fa fa fd fd fd fd fa fa fd fd fd fd fa fa fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==34068==ABORTING
```
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
    
  rzarzynski 
      added a commit
      that referenced
      this pull request
    
      May 31, 2021 
    
    
      
  
    
      
    
  
The `FuturizedStore` interface imposes the `get_attr()`
takes the `name` parameter as `std::string_view`, and
thus burdens implementations with extending the life-
time of the data the instance refers to.
Unfortunately, `AlienStore` is unaware that prolonging
the life of a `std::string_view` instance doesn't prolong
the data memory it points to. This problem has manifested
in the following use-after-free detected at Sepia:
```
rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136929$ less ./remote/smithi194/log/ceph-osd.7.log.gz
...
DEBUG 2021-05-26 20:24:54,077 [shard 0] osd - do_osd_ops_execute: object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head - handling op
call
DEBUG 2021-05-26 20:24:54,077 [shard 0] osd - handling op call on object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - calling method lock.lock, num_read=0, num_write=0
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - handling op getxattr on object 14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head
DEBUG 2021-05-26 20:24:54,078 [shard 0] osd - getxattr on obj=14:55e1a5b4:test-rados-api-smithi067-38889-2::foo:head for attr=_lock.TestLockPP1
DEBUG 2021-05-26 20:24:54,078 [shard 0] bluestore - get_attr
=================================================================
==34068==ERROR: AddressSanitizer: heap-use-after-free on address 0x6030001851d0 at pc 0x7f824d6a5b27 bp 0x7f822b4201c0 sp 0x7f822b41f968
READ of size 17 at 0x6030001851d0 thread T28 (alien-store-tp)
...
    #0 0x7f824d6a5b26  (/lib64/libasan.so.5+0x40b26)
    #1 0x55e2cbb2e00b  (/usr/bin/ceph-osd+0x2b6dc00b)
    #2 0x55e2d31f086e  (/usr/bin/ceph-osd+0x32d9e86e)
    ceph#3 0x55e2d3467607 in crimson::os::ThreadPool::loop(std::chrono::duration<long, std::ratio<1l, 1000l> >, unsigned long) (/usr/bin/ceph-osd+0x33015607)
    ceph#4 0x55e2d346b14a  (/usr/bin/ceph-osd+0x3301914a)
    ceph#5 0x7f8249d32ba2  (/lib64/libstdc++.so.6+0xc2ba2)
    ceph#6 0x7f824a00d149 in start_thread (/lib64/libpthread.so.0+0x8149)
    ceph#7 0x7f82486edf22 in clone (/lib64/libc.so.6+0xfcf22)
0x6030001851d0 is located 0 bytes inside of 31-byte region [0x6030001851d0,0x6030001851ef)
freed by thread T0 here:
    #0 0x7f824d757688 in operator delete(void*) (/lib64/libasan.so.5+0xf2688)
previously allocated by thread T0 here:
    #0 0x7f824d7567b0 in operator new(unsigned long) (/lib64/libasan.so.5+0xf17b0)
Thread T28 (alien-store-tp) created by T0 here:
    #0 0x7f824d6b7ea3 in __interceptor_pthread_create (/lib64/libasan.so.5+0x52ea3)
SUMMARY: AddressSanitizer: heap-use-after-free (/lib64/libasan.so.5+0x40b26)
Shadow bytes around the buggy address:
  0x0c06800289e0: fd fd fd fa fa fa fd fd fd fa fa fa 00 00 00 fa
  0x0c06800289f0: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
  0x0c0680028a00: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
  0x0c0680028a10: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
  0x0c0680028a20: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
=>0x0c0680028a30: fd fd fa fa fd fd fd fd fa fa[fd]fd fd fd fa fa
  0x0c0680028a40: fd fd fd fd fa fa fd fd fd fd fa fa 00 00 00 07
  0x0c0680028a50: fa fa 00 00 00 fa fa fa 00 00 00 fa fa fa fd fd
  0x0c0680028a60: fd fd fa fa fd fd fd fd fa fa fd fd fd fd fa fa
  0x0c0680028a70: 00 00 00 00 fa fa fd fd fd fd fa fa fd fd fd fd
  0x0c0680028a80: fa fa fd fd fd fd fa fa fd fd fd fd fa fa fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==34068==ABORTING
```
Signed-off-by: Radoslaw Zarzynski <rzarzyns@redhat.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Sep 25, 2024 
    
    
      
  
    
      
    
  
Since we now co_await mut_func, we should not pass it by rvalue ref.
```
DEBUG 2024-09-01 15:54:46,212 [shard 0:main] osd - do_osd_ops_execute: object 2:c4c92e5a:::rbd_trash:head submitting txn
=================================================================
==17416==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f590008a430 at pc 0x0000040a367a bp 0x7ffc0b1d5ff0 sp 0x7ffc0b1d5fe0
Address 0x7f590008a430 is located in stack of thread T0 at offset 48 in frame
    #0 0x40b0a2b in crimson::osd::PG::do_osd_ops_execute ... lambda(std::error_code const&)#1}&&)::{lambda()#1}::operator()() const (/usr/bin/ceph-osd+0x40b0a2b)
```
Co-authored-by: Xuehan Xu <xuxuehan@qianxin.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Sep 25, 2024 
    
    
      
  
    
      
    
  
…n async task
test_messenger_thrash UT shows,
```
==461141==ERROR: AddressSanitizer: stack-use-after-return on address 0xffffb0b37c20 at pc 0xaaaad7239508 bp 0xffffeb113c50 sp 0xffffeb113c48
READ of size 4 at 0xffffb0b37c20 thread T0
    #0 0xaaaad7239504 in (anonymous namespace)::SyntheticWorkload::wait_for_done()::'lambda0'()::operator()() const /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/test/crimson/test_messenger_thrash.cc:455:13
    #1 0xaaaad723a1c0 in seastar::internal::do_until_state<(anonymous namespace)::SyntheticWorkload::wait_for_done()::'lambda'(), (anonymous namespace)::SyntheticWorkload::wait_for_done()::'lambda0'()>::run_and_dispose() /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/seastar/include/seastar/core/loop.hh:303:26
    #2 0xaaaadacfb790 in seastar::reactor::run_tasks(seastar::reactor::task_queue&) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/seastar/src/core/reactor.cc:2653:14
    ceph#3 0xaaaadad04288 in seastar::reactor::run_some_tasks() /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/seastar/src/core/reactor.cc:3123:9
    ceph#4 0xaaaadad07cd0 in seastar::reactor::do_run() /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/seastar/src/core/reactor.cc:3291:9
    ceph#5 0xaaaadad05d60 in seastar::reactor::run() /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/seastar/src/core/reactor.cc:3181:16
    ceph#6 0xaaaadaa860d8 in seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/seastar/src/core/app-template.cc:276:31
    ceph#7 0xaaaadaa83fb0 in seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/seastar/src/core/app-template.cc:167:12
    ceph#8 0xaaaad7203d88 in main /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/test/crimson/test_messenger_thrash.cc:669:14
    ceph#9 0xffffb32773f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ceph#10 0xffffb32774c8 in __libc_start_main csu/../csu/libc-start.c:392:3
    ceph#11 0xaaaad712546c in _start (/home/jenkins-build/build/workspace/ceph-pull-requests-arm64/build/bin/unittest-seastar-messenger-thrash+0x3b5546c) (BuildId: b0048d750e057d178816f94b3ce0459971785191)
```
Address 0xffffb0b37c20 is located in stack of thread T0 at offset 32 in frame
    #0 0xaaaad7493ed8 in ceph::buffer::v15_2_0::list::buffers_t::clear_and_dispose() /home/jenkins-build/build/workspace/ceph-pull-requests-arm64/src/include/buffer.h:596
Signed-off-by: arm7star <arm7star@qq.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Dec 17, 2024 
    
    
      
  
    
      
    
  
https://vote.heliosvoting.org/helios/elections/e03494ce-e04c-41d0-bb05-ec5ccc632ce4/view Question #1 Update election requirements for Ceph Executive Council Elections? Remove "ranked-choice" requirement 13 Keep "ranked-choice" requirement (no change) 16 Question #2 Require periodic elections in governance charter? No (no change) 8 Annual 15 Semi-annual 3 Quarterly 2 Question ceph#3 Update the Ceph Executive Council term length? Change to 3 years 14 Keep 2 years (no change) 14 Question ceph#4 Amend governance document to require a supermajority of votes for amendments to the governance model? The current requirement is a simple majority. Require a supermajority 20 Require a simple majority (no change) 9 Question ceph#5 Clarify "supermajority" and "majority" election requirements? Of members voting on a given question (abstaining does not bias the vote) 18 Of members voting on the election (abstaining is an implicit "no") 6 Of members in the CSC 3 Signed-off-by: Patrick Donnelly <pdonnell@ibm.com>
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Jan 14, 2025 
    
    
      
  
    
      
    
  
When sanitizer is enabled, unittest_rgw_amqp shows,
```
=================================================================
==1429129==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 416 byte(s) in 1 object(s) allocated from:
    #0 0xaaaab56a0008 in operator new(unsigned long) (/root/ceph/build/bin/unittest_rgw_amqp+0x1c0008) (BuildId: a20c317434e8d5f2ec33bbb71a69d81eb751c494)
    #1 0xaaaab57eecfc in amqp_new_connection /root/ceph/src/test/rgw/amqp_mock.cc:110:12
    #2 0xaaaab58095d8 in rgw::amqp::new_state(rgw::amqp::connection_t*, rgw::amqp::connection_id_t const&) /root/ceph/src/rgw/rgw_amqp.cc:373:16
    ceph#3 0xaaaab5813c70 in rgw::amqp::Manager::run() /root/ceph/src/rgw/rgw_amqp.cc:684:18
    ceph#4 0xaaaab5849e50 in void std::__invoke_impl<void, void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(std::__invoke_memfun_deref, void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74:14
    ceph#5 0xaaaab5849b48 in std::__invoke_result<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>::type std::__invoke<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    ceph#6 0xaaaab5849978 in void std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:259:13
    ceph#7 0xaaaab584979c in std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::operator()() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:266:11
    ceph#8 0xaaaab5849420 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> > >::_M_run() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13
    ceph#9 0xffffb0fb31f8  (/lib/aarch64-linux-gnu/libstdc++.so.6+0xd31f8) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#10 0xffffb0d7d5c4 in start_thread nptl/./nptl/pthread_create.c:442:8
    ceph#11 0xffffb0de5ed8  misc/../sysdeps/unix/sysv/linux/aarch64/clone.S:79
Direct leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0xaaaab5669bb8 in posix_memalign (/root/ceph/build/bin/unittest_rgw_amqp+0x189bb8) (BuildId: a20c317434e8d5f2ec33bbb71a69d81eb751c494)
    #1 0xaaaab57f5294 in boost::alignment::aligned_alloc(unsigned long, unsigned long) /root/ceph/build/boost/include/boost/align/detail/aligned_alloc_posix.hpp:26:9
    #2 0xaaaab57f4d88 in boost::alignment::aligned_allocator<boost::lockfree::queue<amqp_basic_ack_t_>::node, 64ul>::allocate(unsigned long, void const*) /root/ceph/build/boost/include/boost/align/aligned_allocator.hpp:70:19
    ceph#3 0xaaaab57f4204 in boost::lockfree::detail::freelist_stack<boost::lockfree::queue<amqp_basic_ack_t_>::node, boost::alignment::aligned_allocator<boost::lockfree::queue<amqp_basic_ack_t_>::node, 64ul> >::freelist_stack<boost::alignment::aligned_allocator<boost::lockfree::queue<amqp_basic_ack_t_>::node, 64ul> >(boost::alignment::aligned_allocator<boost::lockfree::queue<amqp_basic_ack_t_>::node, 64ul> const&, unsigned long) /root/ceph/build/boost/include/boost/lockfree/detail/freelist.hpp:62:31
    ceph#4 0xaaaab57f3728 in boost::lockfree::queue<amqp_basic_ack_t_>::queue(unsigned long) /root/ceph/build/boost/include/boost/lockfree/queue.hpp:234:9
    ceph#5 0xaaaab57f2ea8 in amqp_connection_state_t_::amqp_connection_state_t_() /root/ceph/src/test/rgw/amqp_mock.cc:90:5
    ceph#6 0xaaaab57eed04 in amqp_new_connection /root/ceph/src/test/rgw/amqp_mock.cc:110:16
    ceph#7 0xaaaab58095d8 in rgw::amqp::new_state(rgw::amqp::connection_t*, rgw::amqp::connection_id_t const&) /root/ceph/src/rgw/rgw_amqp.cc:373:16
    ceph#8 0xaaaab5813c70 in rgw::amqp::Manager::run() /root/ceph/src/rgw/rgw_amqp.cc:684:18
    ceph#9 0xaaaab5849e50 in void std::__invoke_impl<void, void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(std::__invoke_memfun_deref, void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74:14
    ceph#10 0xaaaab5849b48 in std::__invoke_result<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>::type std::__invoke<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    ceph#11 0xaaaab5849978 in void std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:259:13
    ceph#12 0xaaaab584979c in std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::operator()() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:266:11
    ceph#13 0xaaaab5849420 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> > >::_M_run() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13
    ceph#14 0xffffb0fb31f8  (/lib/aarch64-linux-gnu/libstdc++.so.6+0xd31f8) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#15 0xffffb0d7d5c4 in start_thread nptl/./nptl/pthread_create.c:442:8
    ceph#16 0xffffb0de5ed8  misc/../sysdeps/unix/sysv/linux/aarch64/clone.S:79
Direct leak of 64 byte(s) in 1 object(s) allocated from:
    #0 0xaaaab5669bb8 in posix_memalign (/root/ceph/build/bin/unittest_rgw_amqp+0x189bb8) (BuildId: a20c317434e8d5f2ec33bbb71a69d81eb751c494)
    #1 0xaaaab57f5294 in boost::alignment::aligned_alloc(unsigned long, unsigned long) /root/ceph/build/boost/include/boost/align/detail/aligned_alloc_posix.hpp:26:9
    #2 0xaaaab57f90bc in boost::alignment::aligned_allocator<boost::lockfree::queue<amqp_basic_nack_t_>::node, 64ul>::allocate(unsigned long, void const*) /root/ceph/build/boost/include/boost/align/aligned_allocator.hpp:70:19
    ceph#3 0xaaaab57f8538 in boost::lockfree::detail::freelist_stack<boost::lockfree::queue<amqp_basic_nack_t_>::node, boost::alignment::aligned_allocator<boost::lockfree::queue<amqp_basic_nack_t_>::node, 64ul> >::freelist_stack<boost::alignment::aligned_allocator<boost::lockfree::queue<amqp_basic_nack_t_>::node, 64ul> >(boost::alignment::aligned_allocator<boost::lockfree::queue<amqp_basic_nack_t_>::node, 64ul> const&, unsigned long) /root/ceph/build/boost/include/boost/lockfree/detail/freelist.hpp:62:31
    ceph#4 0xaaaab57f3a6c in boost::lockfree::queue<amqp_basic_nack_t_>::queue(unsigned long) /root/ceph/build/boost/include/boost/lockfree/queue.hpp:234:9
    ceph#5 0xaaaab57f2eb8 in amqp_connection_state_t_::amqp_connection_state_t_() /root/ceph/src/test/rgw/amqp_mock.cc:91:5
    ceph#6 0xaaaab57eed04 in amqp_new_connection /root/ceph/src/test/rgw/amqp_mock.cc:110:16
    ceph#7 0xaaaab58095d8 in rgw::amqp::new_state(rgw::amqp::connection_t*, rgw::amqp::connection_id_t const&) /root/ceph/src/rgw/rgw_amqp.cc:373:16
    ceph#8 0xaaaab5813c70 in rgw::amqp::Manager::run() /root/ceph/src/rgw/rgw_amqp.cc:684:18
    ceph#9 0xaaaab5849e50 in void std::__invoke_impl<void, void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(std::__invoke_memfun_deref, void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74:14
    ceph#10 0xaaaab5849b48 in std::__invoke_result<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>::type std::__invoke<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    ceph#11 0xaaaab5849978 in void std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:259:13
    ceph#12 0xaaaab584979c in std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::operator()() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:266:11
    ceph#13 0xaaaab5849420 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> > >::_M_run() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13
    ceph#14 0xffffb0fb31f8  (/lib/aarch64-linux-gnu/libstdc++.so.6+0xd31f8) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#15 0xffffb0d7d5c4 in start_thread nptl/./nptl/pthread_create.c:442:8
    ceph#16 0xffffb0de5ed8  misc/../sysdeps/unix/sysv/linux/aarch64/clone.S:79
Direct leak of 9 byte(s) in 1 object(s) allocated from:
    #0 0xaaaab56690a0 in malloc (/root/ceph/build/bin/unittest_rgw_amqp+0x1890a0) (BuildId: a20c317434e8d5f2ec33bbb71a69d81eb751c494)
    #1 0xaaaab57f2754 in amqp_bytes_malloc_dup /root/ceph/src/test/rgw/amqp_mock.cc:384:18
    #2 0xaaaab580b4b4 in rgw::amqp::new_state(rgw::amqp::connection_t*, rgw::amqp::connection_id_t const&) /root/ceph/src/rgw/rgw_amqp.cc:509:28
    ceph#3 0xaaaab5813c70 in rgw::amqp::Manager::run() /root/ceph/src/rgw/rgw_amqp.cc:684:18
    ceph#4 0xaaaab5849e50 in void std::__invoke_impl<void, void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(std::__invoke_memfun_deref, void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74:14
    ceph#5 0xaaaab5849b48 in std::__invoke_result<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>::type std::__invoke<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    ceph#6 0xaaaab5849978 in void std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:259:13
    ceph#7 0xaaaab584979c in std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::operator()() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:266:11
    ceph#8 0xaaaab5849420 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> > >::_M_run() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13
    ceph#9 0xffffb0fb31f8  (/lib/aarch64-linux-gnu/libstdc++.so.6+0xd31f8) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#10 0xffffb0d7d5c4 in start_thread nptl/./nptl/pthread_create.c:442:8
    ceph#11 0xffffb0de5ed8  misc/../sysdeps/unix/sysv/linux/aarch64/clone.S:79
Indirect leak of 65536 byte(s) in 1024 object(s) allocated from:
    #0 0xaaaab5669bb8 in posix_memalign (/root/ceph/build/bin/unittest_rgw_amqp+0x189bb8) (BuildId: a20c317434e8d5f2ec33bbb71a69d81eb751c494)
    #1 0xaaaab57f5294 in boost::alignment::aligned_alloc(unsigned long, unsigned long) /root/ceph/build/boost/include/boost/align/detail/aligned_alloc_posix.hpp:26:9
    #2 0xaaaab57f4d88 in boost::alignment::aligned_allocator<boost::lockfree::queue<amqp_basic_ack_t_>::node, 64ul>::allocate(unsigned long, void const*) /root/ceph/build/boost/include/boost/align/aligned_allocator.hpp:70:19
    ceph#3 0xaaaab57f4204 in boost::lockfree::detail::freelist_stack<boost::lockfree::queue<amqp_basic_ack_t_>::node, boost::alignment::aligned_allocator<boost::lockfree::queue<amqp_basic_ack_t_>::node, 64ul> >::freelist_stack<boost::alignment::aligned_allocator<boost::lockfree::queue<amqp_basic_ack_t_>::node, 64ul> >(boost::alignment::aligned_allocator<boost::lockfree::queue<amqp_basic_ack_t_>::node, 64ul> const&, unsigned long) /root/ceph/build/boost/include/boost/lockfree/detail/freelist.hpp:62:31
    ceph#4 0xaaaab57f3728 in boost::lockfree::queue<amqp_basic_ack_t_>::queue(unsigned long) /root/ceph/build/boost/include/boost/lockfree/queue.hpp:234:9
    ceph#5 0xaaaab57f2ea8 in amqp_connection_state_t_::amqp_connection_state_t_() /root/ceph/src/test/rgw/amqp_mock.cc:90:5
    ceph#6 0xaaaab57eed04 in amqp_new_connection /root/ceph/src/test/rgw/amqp_mock.cc:110:16
    ceph#7 0xaaaab58095d8 in rgw::amqp::new_state(rgw::amqp::connection_t*, rgw::amqp::connection_id_t const&) /root/ceph/src/rgw/rgw_amqp.cc:373:16
    ceph#8 0xaaaab5813c70 in rgw::amqp::Manager::run() /root/ceph/src/rgw/rgw_amqp.cc:684:18
    ceph#9 0xaaaab5849e50 in void std::__invoke_impl<void, void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(std::__invoke_memfun_deref, void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74:14
    ceph#10 0xaaaab5849b48 in std::__invoke_result<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>::type std::__invoke<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    ceph#11 0xaaaab5849978 in void std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:259:13
    ceph#12 0xaaaab584979c in std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::operator()() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:266:11
    ceph#13 0xaaaab5849420 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> > >::_M_run() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13
    ceph#14 0xffffb0fb31f8  (/lib/aarch64-linux-gnu/libstdc++.so.6+0xd31f8) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#15 0xffffb0d7d5c4 in start_thread nptl/./nptl/pthread_create.c:442:8
    ceph#16 0xffffb0de5ed8  misc/../sysdeps/unix/sysv/linux/aarch64/clone.S:79
Indirect leak of 65536 byte(s) in 1024 object(s) allocated from:
    #0 0xaaaab5669bb8 in posix_memalign (/root/ceph/build/bin/unittest_rgw_amqp+0x189bb8) (BuildId: a20c317434e8d5f2ec33bbb71a69d81eb751c494)
    #1 0xaaaab57f5294 in boost::alignment::aligned_alloc(unsigned long, unsigned long) /root/ceph/build/boost/include/boost/align/detail/aligned_alloc_posix.hpp:26:9
    #2 0xaaaab57f90bc in boost::alignment::aligned_allocator<boost::lockfree::queue<amqp_basic_nack_t_>::node, 64ul>::allocate(unsigned long, void const*) /root/ceph/build/boost/include/boost/align/aligned_allocator.hpp:70:19
    ceph#3 0xaaaab57f8538 in boost::lockfree::detail::freelist_stack<boost::lockfree::queue<amqp_basic_nack_t_>::node, boost::alignment::aligned_allocator<boost::lockfree::queue<amqp_basic_nack_t_>::node, 64ul> >::freelist_stack<boost::alignment::aligned_allocator<boost::lockfree::queue<amqp_basic_nack_t_>::node, 64ul> >(boost::alignment::aligned_allocator<boost::lockfree::queue<amqp_basic_nack_t_>::node, 64ul> const&, unsigned long) /root/ceph/build/boost/include/boost/lockfree/detail/freelist.hpp:62:31
    ceph#4 0xaaaab57f3a6c in boost::lockfree::queue<amqp_basic_nack_t_>::queue(unsigned long) /root/ceph/build/boost/include/boost/lockfree/queue.hpp:234:9
    ceph#5 0xaaaab57f2eb8 in amqp_connection_state_t_::amqp_connection_state_t_() /root/ceph/src/test/rgw/amqp_mock.cc:91:5
    ceph#6 0xaaaab57eed04 in amqp_new_connection /root/ceph/src/test/rgw/amqp_mock.cc:110:16
    ceph#7 0xaaaab58095d8 in rgw::amqp::new_state(rgw::amqp::connection_t*, rgw::amqp::connection_id_t const&) /root/ceph/src/rgw/rgw_amqp.cc:373:16
    ceph#8 0xaaaab5813c70 in rgw::amqp::Manager::run() /root/ceph/src/rgw/rgw_amqp.cc:684:18
    ceph#9 0xaaaab5849e50 in void std::__invoke_impl<void, void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(std::__invoke_memfun_deref, void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74:14
    ceph#10 0xaaaab5849b48 in std::__invoke_result<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>::type std::__invoke<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    ceph#11 0xaaaab5849978 in void std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:259:13
    ceph#12 0xaaaab584979c in std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::operator()() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:266:11
    ceph#13 0xaaaab5849420 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> > >::_M_run() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13
    ceph#14 0xffffb0fb31f8  (/lib/aarch64-linux-gnu/libstdc++.so.6+0xd31f8) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#15 0xffffb0d7d5c4 in start_thread nptl/./nptl/pthread_create.c:442:8
    ceph#16 0xffffb0de5ed8  misc/../sysdeps/unix/sysv/linux/aarch64/clone.S:79
Indirect leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0xaaaab56a0008 in operator new(unsigned long) (/root/ceph/build/bin/unittest_rgw_amqp+0x1c0008) (BuildId: a20c317434e8d5f2ec33bbb71a69d81eb751c494)
    #1 0xaaaab57eefb0 in amqp_tcp_socket_new /root/ceph/src/test/rgw/amqp_mock.cc:127:19
    #2 0xaaaab5809740 in rgw::amqp::new_state(rgw::amqp::connection_t*, rgw::amqp::connection_id_t const&) /root/ceph/src/rgw/rgw_amqp.cc:401:14
    ceph#3 0xaaaab5813c70 in rgw::amqp::Manager::run() /root/ceph/src/rgw/rgw_amqp.cc:684:18
    ceph#4 0xaaaab5849e50 in void std::__invoke_impl<void, void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(std::__invoke_memfun_deref, void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74:14
    ceph#5 0xaaaab5849b48 in std::__invoke_result<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>::type std::__invoke<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    ceph#6 0xaaaab5849978 in void std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:259:13
    ceph#7 0xaaaab584979c in std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::operator()() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:266:11
    ceph#8 0xaaaab5849420 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> > >::_M_run() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13
    ceph#9 0xffffb0fb31f8  (/lib/aarch64-linux-gnu/libstdc++.so.6+0xd31f8) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#10 0xffffb0d7d5c4 in start_thread nptl/./nptl/pthread_create.c:442:8
    ceph#11 0xffffb0de5ed8  misc/../sysdeps/unix/sysv/linux/aarch64/clone.S:79
Indirect leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0xaaaab56a0008 in operator new(unsigned long) (/root/ceph/build/bin/unittest_rgw_amqp+0x1c0008) (BuildId: a20c317434e8d5f2ec33bbb71a69d81eb751c494)
    #1 0xaaaab57f102c in amqp_queue_declare /root/ceph/src/test/rgw/amqp_mock.cc:283:18
    #2 0xaaaab580ad14 in rgw::amqp::new_state(rgw::amqp::connection_t*, rgw::amqp::connection_id_t const&) /root/ceph/src/rgw/rgw_amqp.cc:480:27
    ceph#3 0xaaaab5813c70 in rgw::amqp::Manager::run() /root/ceph/src/rgw/rgw_amqp.cc:684:18
    ceph#4 0xaaaab5849e50 in void std::__invoke_impl<void, void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(std::__invoke_memfun_deref, void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74:14
    ceph#5 0xaaaab5849b48 in std::__invoke_result<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>::type std::__invoke<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    ceph#6 0xaaaab5849978 in void std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:259:13
    ceph#7 0xaaaab584979c in std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::operator()() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:266:11
    ceph#8 0xaaaab5849420 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> > >::_M_run() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13
    ceph#9 0xffffb0fb31f8  (/lib/aarch64-linux-gnu/libstdc++.so.6+0xd31f8) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#10 0xffffb0d7d5c4 in start_thread nptl/./nptl/pthread_create.c:442:8
    ceph#11 0xffffb0de5ed8  misc/../sysdeps/unix/sysv/linux/aarch64/clone.S:79
Indirect leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0xaaaab56a0008 in operator new(unsigned long) (/root/ceph/build/bin/unittest_rgw_amqp+0x1c0008) (BuildId: a20c317434e8d5f2ec33bbb71a69d81eb751c494)
    #1 0xaaaab57f2280 in amqp_basic_consume /root/ceph/src/test/rgw/amqp_mock.cc:359:20
    #2 0xaaaab580b124 in rgw::amqp::new_state(rgw::amqp::connection_t*, rgw::amqp::connection_id_t const&) /root/ceph/src/rgw/rgw_amqp.cc:493:29
    ceph#3 0xaaaab5813c70 in rgw::amqp::Manager::run() /root/ceph/src/rgw/rgw_amqp.cc:684:18
    ceph#4 0xaaaab5849e50 in void std::__invoke_impl<void, void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(std::__invoke_memfun_deref, void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74:14
    ceph#5 0xaaaab5849b48 in std::__invoke_result<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>::type std::__invoke<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    ceph#6 0xaaaab5849978 in void std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:259:13
    ceph#7 0xaaaab584979c in std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::operator()() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:266:11
    ceph#8 0xaaaab5849420 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> > >::_M_run() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13
    ceph#9 0xffffb0fb31f8  (/lib/aarch64-linux-gnu/libstdc++.so.6+0xd31f8) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#10 0xffffb0d7d5c4 in start_thread nptl/./nptl/pthread_create.c:442:8
    ceph#11 0xffffb0de5ed8  misc/../sysdeps/unix/sysv/linux/aarch64/clone.S:79
Indirect leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0xaaaab56a0008 in operator new(unsigned long) (/root/ceph/build/bin/unittest_rgw_amqp+0x1c0008) (BuildId: a20c317434e8d5f2ec33bbb71a69d81eb751c494)
    #1 0xaaaab57f0214 in amqp_channel_open /root/ceph/src/test/rgw/amqp_mock.cc:213:23
    #2 0xaaaab5809e78 in rgw::amqp::new_state(rgw::amqp::connection_t*, rgw::amqp::connection_id_t const&) /root/ceph/src/rgw/rgw_amqp.cc:448:21
    ceph#3 0xaaaab5813c70 in rgw::amqp::Manager::run() /root/ceph/src/rgw/rgw_amqp.cc:684:18
    ceph#4 0xaaaab5849e50 in void std::__invoke_impl<void, void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(std::__invoke_memfun_deref, void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74:14
    ceph#5 0xaaaab5849b48 in std::__invoke_result<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>::type std::__invoke<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    ceph#6 0xaaaab5849978 in void std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:259:13
    ceph#7 0xaaaab584979c in std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::operator()() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:266:11
    ceph#8 0xaaaab5849420 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> > >::_M_run() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13
    ceph#9 0xffffb0fb31f8  (/lib/aarch64-linux-gnu/libstdc++.so.6+0xd31f8) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#10 0xffffb0d7d5c4 in start_thread nptl/./nptl/pthread_create.c:442:8
    ceph#11 0xffffb0de5ed8  misc/../sysdeps/unix/sysv/linux/aarch64/clone.S:79
Indirect leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0xaaaab56a0008 in operator new(unsigned long) (/root/ceph/build/bin/unittest_rgw_amqp+0x1c0008) (BuildId: a20c317434e8d5f2ec33bbb71a69d81eb751c494)
    #1 0xaaaab57f0294 in amqp_channel_open /root/ceph/src/test/rgw/amqp_mock.cc:217:21
    #2 0xaaaab580a188 in rgw::amqp::new_state(rgw::amqp::connection_t*, rgw::amqp::connection_id_t const&) /root/ceph/src/rgw/rgw_amqp.cc:453:21
    ceph#3 0xaaaab5813c70 in rgw::amqp::Manager::run() /root/ceph/src/rgw/rgw_amqp.cc:684:18
    ceph#4 0xaaaab5849e50 in void std::__invoke_impl<void, void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(std::__invoke_memfun_deref, void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74:14
    ceph#5 0xaaaab5849b48 in std::__invoke_result<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>::type std::__invoke<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    ceph#6 0xaaaab5849978 in void std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:259:13
    ceph#7 0xaaaab584979c in std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::operator()() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:266:11
    ceph#8 0xaaaab5849420 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> > >::_M_run() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13
    ceph#9 0xffffb0fb31f8  (/lib/aarch64-linux-gnu/libstdc++.so.6+0xd31f8) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#10 0xffffb0d7d5c4 in start_thread nptl/./nptl/pthread_create.c:442:8
    ceph#11 0xffffb0de5ed8  misc/../sysdeps/unix/sysv/linux/aarch64/clone.S:79
Indirect leak of 1 byte(s) in 1 object(s) allocated from:
    #0 0xaaaab56a0008 in operator new(unsigned long) (/root/ceph/build/bin/unittest_rgw_amqp+0x1c0008) (BuildId: a20c317434e8d5f2ec33bbb71a69d81eb751c494)
    #1 0xaaaab57f1454 in amqp_confirm_select /root/ceph/src/test/rgw/amqp_mock.cc:291:20
    #2 0xaaaab580a49c in rgw::amqp::new_state(rgw::amqp::connection_t*, rgw::amqp::connection_id_t const&) /root/ceph/src/rgw/rgw_amqp.cc:458:21
    ceph#3 0xaaaab5813c70 in rgw::amqp::Manager::run() /root/ceph/src/rgw/rgw_amqp.cc:684:18
    ceph#4 0xaaaab5849e50 in void std::__invoke_impl<void, void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(std::__invoke_memfun_deref, void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74:14
    ceph#5 0xaaaab5849b48 in std::__invoke_result<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>::type std::__invoke<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    ceph#6 0xaaaab5849978 in void std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:259:13
    ceph#7 0xaaaab584979c in std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::operator()() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:266:11
    ceph#8 0xaaaab5849420 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> > >::_M_run() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13
    ceph#9 0xffffb0fb31f8  (/lib/aarch64-linux-gnu/libstdc++.so.6+0xd31f8) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#10 0xffffb0d7d5c4 in start_thread nptl/./nptl/pthread_create.c:442:8
    ceph#11 0xffffb0de5ed8  misc/../sysdeps/unix/sysv/linux/aarch64/clone.S:79
Indirect leak of 1 byte(s) in 1 object(s) allocated from:
    #0 0xaaaab56a0008 in operator new(unsigned long) (/root/ceph/build/bin/unittest_rgw_amqp+0x1c0008) (BuildId: a20c317434e8d5f2ec33bbb71a69d81eb751c494)
    #1 0xaaaab57f0548 in amqp_exchange_declare /root/ceph/src/test/rgw/amqp_mock.cc:231:21
    #2 0xaaaab580a93c in rgw::amqp::new_state(rgw::amqp::connection_t*, rgw::amqp::connection_id_t const&) /root/ceph/src/rgw/rgw_amqp.cc:466:21
    ceph#3 0xaaaab5813c70 in rgw::amqp::Manager::run() /root/ceph/src/rgw/rgw_amqp.cc:684:18
    ceph#4 0xaaaab5849e50 in void std::__invoke_impl<void, void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(std::__invoke_memfun_deref, void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74:14
    ceph#5 0xaaaab5849b48 in std::__invoke_result<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>::type std::__invoke<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*>(void (rgw::amqp::Manager::*&&)() noexcept, rgw::amqp::Manager*&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    ceph#6 0xaaaab5849978 in void std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::_M_invoke<0ul, 1ul>(std::_Index_tuple<0ul, 1ul>) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:259:13
    ceph#7 0xaaaab584979c in std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> >::operator()() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:266:11
    ceph#8 0xaaaab5849420 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (rgw::amqp::Manager::*)() noexcept, rgw::amqp::Manager*> > >::_M_run() /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211:13
    ceph#9 0xffffb0fb31f8  (/lib/aarch64-linux-gnu/libstdc++.so.6+0xd31f8) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#10 0xffffb0d7d5c4 in start_thread nptl/./nptl/pthread_create.c:442:8
    ceph#11 0xffffb0de5ed8  misc/../sysdeps/unix/sysv/linux/aarch64/clone.S:79
SUMMARY: AddressSanitizer: 131723 byte(s) leaked in 2059 allocation(s).
```
So to prevent multiple threads from operating the same element at the same time, add the lock only to erase and ensure fully create connection before emplace.
Fixes: https://tracker.ceph.com/issues/66266
Signed-off-by: Rongqi Sun <sunrongqi@huawei.com>
(cherry picked from commit 958ecba)
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Feb 12, 2025 
    
    
      
  
    
      
    
  
…lable When we have a socket failure or connection issue, we may send a mon command and never check if it completed. If we resend the command to another monitor, the resent command may complete before the first sent command. This can cause users to send the command twice, which can lead to issues in automated environments. For example: We have 2 monitors: mon.a and mon.b 1. Send command to delete pool - monclient targets mon.a 2. A socket failure occurs, and mon.a has a delay in response 3. Monclient hunts for another monitor to resend the delete pool command and finds mon.b 4. Mon.b removes the pool and sends an acknowledgment 5. The user script now sends a create pool command, but mon.a now sends the acknowledgment for the pool delete from step #1 We end up without a pool, as mon.a deleted it. The mon_client_hunt_on_resent configuration was added to control the behavior of retrying commands on monitor connection failures. By default, this option is enabled to prevent situations where a command is retried on the same monitor, potentially missing better monitor candidates. Clients experiencing specific conditions that require retrying on the same monitor can disable this feature by setting the configuration to false. Fixes: https://tracker.ceph.com/issues/63789 Signed-off-by: Nitzan Mordechai <nmordec@redhat.com>
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Mar 18, 2025 
    
    
      
  
    
      
    
  
We could also use here seastar::coroutine::parallel_for_each. However,
an `interruptible` overload must be used. Instead, use the coroutine
which wrapper is simpler this time.
```
kernel callstack:
    #0 0x56f6e7a in seastar::lw_shared_ptr<std::map<hobject_t, eversion_t, std::less<hobject_t>, std::allocator<std::pair<hobject_t const, eversion_t> > > >::operator->() const /home/matan/ceph/src/seastar/include/seastar/core/shared_ptr.hh:347
    #1 0x56f6e7a in operator() /home/matan/ceph/src/crimson/osd/recovery_backend.cc:245
    #2 0x5286c62 in std::__n4861::coroutine_handle<crimson::internal::promise_base<crimson::interruptible::interruptor<crimson::osd::IOInterruptCondition>, void, void> >::resume() const /opt/rh/gcc-toolset-13/root/usr/include/c++/13/coroutine:24
SUMMARY: AddressSanitizer: stack-use-after-return /home/matan/ceph/src/seastar/include/seastar/core/shared_ptr.hh:347 in seastar::lw_shared_ptr<std::map<hobject_t, eversion_t, std::less<hobject_t>, std::allocator<std::pair<hobject_t const, eversion_t> >
```
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Apr 4, 2025 
    
    
      
  
    
      
    
  
before this change, we allocate coefficients table with malloc() in ErasureCodeIsaDefault::prepare(), but free them using delete. so, in this change, we use new [] and delete [] to allocate and free them, to be more consistent, and to silence this warning. this issue was identified by AddressSanitizer: ``` 2025-03-22T04:27:58.210 INFO:tasks.ceph.osd.3.smithi102.stderr:==42899==ERROR: AddressSanitizer: alloc-dealloc-mismatch (operator new [] vs free) on 0x611000570d40 2025-03-22T04:27:58.253 DEBUG:teuthology.orchestra.run.smithi102:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 env ASAN_OPTIONS=detect_leaks=0,detect_odr_violation=0,alloc_dealloc_mismatch=0 LD_PRELOAD=/lib64/libasan.so.6 ceph --cluster ceph osd pool set cephfs_data_ec allow_ec_overwrites true 2025-03-22T04:27:58.442 INFO:tasks.ceph.osd.3.smithi102.stderr: #0 0x7efdc30b46b7 in free (/lib64/libasan.so.6+0xb46b7) 2025-03-22T04:27:58.442 INFO:tasks.ceph.osd.3.smithi102.stderr: #1 0x7efd9e2e5f26 in ErasureCodeIsaTableCache::setEncodingTable(int, int, int, unsigned char*) (/usr/lib64/ceph/erasure-code/libec_isa.so+0x3bf26) 2025-03-22T04:27:58.442 INFO:tasks.ceph.osd.3.smithi102.stderr: #2 0x7efd9e2f0fae in ErasureCodeIsaDefault::prepare() (/usr/lib64/ceph/erasure-code/libec_isa.so+0x46fae) 2025-03-22T04:27:58.442 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#3 0x7efd9e2bfa3e in ErasureCodeIsa::init(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&, std::ostream*) (/usr/lib64/ceph/erasure-code/libec_isa.so+0x15a3e) 2025-03-22T04:27:58.442 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#4 0x7efd9e2ff510 in ErasureCodePluginIsa::factory(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&, std::shared_ptr<ceph::ErasureCodeInterface>*, std::ostream*) (/usr/lib64/ceph/erasure-code/libec_isa.so+0x55510) 2025-03-22T04:27:58.442 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#5 0x55a7ef19b395 in ceph::ErasureCodePluginRegistry::factory(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&, std::shared_ptr<ceph::ErasureCodeInterface>*, std::ostream*) (/usr/bin/ceph-osd+0x4dcb395) 2025-03-22T04:27:58.442 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#6 0x55a7eb9089d8 in PGBackend::build_pg_backend(pg_pool_t const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, PGBackend::Listener*, coll_t, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, ceph::common::CephContext*) (/usr/bin/ceph-osd+0x15389d8) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#7 0x55a7eb4b63ec in PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr<OSDMap const>, PGPool const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, spg_t) (/usr/bin/ceph-osd+0x10e63ec) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#8 0x55a7eaef1cc7 in OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t) (/usr/bin/ceph-osd+0xb21cc7) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#9 0x55a7eaef8d69 in OSD::handle_pg_create_info(std::shared_ptr<OSDMap const> const&, PGCreateInfo const*) (/usr/bin/ceph-osd+0xb28d69) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#10 0x55a7eafeb9ed in OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (/usr/bin/ceph-osd+0xc1b9ed) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#11 0x55a7ed04c9aa in ShardedThreadPool::shardedthreadpool_worker(unsigned int) (/usr/bin/ceph-osd+0x2c7c9aa) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#12 0x55a7ed04d338 in ShardedThreadPool::WorkThreadSharded::entry() (/usr/bin/ceph-osd+0x2c7d338) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#13 0x55a7ecf92b46 in Thread::entry_wrapper() (/usr/bin/ceph-osd+0x2bc2b46) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#14 0x55a7ecf92b82 in Thread::_entry_func(void*) (/usr/bin/ceph-osd+0x2bc2b82) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#15 0x7efdc1e8a3b1 in start_thread (/lib64/libc.so.6+0x8a3b1) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#16 0x7efdc1f0f42f in __GI___clone3 (/lib64/libc.so.6+0x10f42f) 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr: 2025-03-22T04:27:58.443 INFO:tasks.ceph.osd.3.smithi102.stderr:0x611000570d40 is located 0 bytes inside of 256-byte region [0x611000570d40,0x611000570e40) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr:allocated by thread T817 here: 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: #0 0x7efdc30b64d7 in operator new[](unsigned long) (/lib64/libasan.so.6+0xb64d7) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: #1 0x7efd9e2f0e32 in ErasureCodeIsaDefault::prepare() (/usr/lib64/ceph/erasure-code/libec_isa.so+0x46e32) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: #2 0x7efd9e2bfa3e in ErasureCodeIsa::init(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&, std::ostream*) (/usr/lib64/ceph/erasure-code/libec_isa.so+0x15a3e) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#3 0x7efd9e2ff510 in ErasureCodePluginIsa::factory(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&, std::shared_ptr<ceph::ErasureCodeInterface>*, std::ostream*) (/usr/lib64/ceph/erasure-code/libec_isa.so+0x55510) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#4 0x55a7ef19b395 in ceph::ErasureCodePluginRegistry::factory(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&, std::shared_ptr<ceph::ErasureCodeInterface>*, std::ostream*) (/usr/bin/ceph-osd+0x4dcb395) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#5 0x55a7eb9089d8 in PGBackend::build_pg_backend(pg_pool_t const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, PGBackend::Listener*, coll_t, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, ceph::common::CephContext*) (/usr/bin/ceph-osd+0x15389d8) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#6 0x55a7eb4b63ec in PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr<OSDMap const>, PGPool const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const&, spg_t) (/usr/bin/ceph-osd+0x10e63ec) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#7 0x55a7eaef1cc7 in OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t) (/usr/bin/ceph-osd+0xb21cc7) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#8 0x55a7eaef8d69 in OSD::handle_pg_create_info(std::shared_ptr<OSDMap const> const&, PGCreateInfo const*) (/usr/bin/ceph-osd+0xb28d69) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#9 0x55a7eafeb9ed in OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*) (/usr/bin/ceph-osd+0xc1b9ed) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#10 0x55a7ed04c9aa in ShardedThreadPool::shardedthreadpool_worker(unsigned int) (/usr/bin/ceph-osd+0x2c7c9aa) 2025-03-22T04:27:58.444 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#11 0x55a7ed04d338 in ShardedThreadPool::WorkThreadSharded::entry() (/usr/bin/ceph-osd+0x2c7d338) 2025-03-22T04:27:58.445 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#12 0x55a7ecf92b46 in Thread::entry_wrapper() (/usr/bin/ceph-osd+0x2bc2b46) 2025-03-22T04:27:58.445 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#13 0x55a7ecf92b82 in Thread::_entry_func(void*) (/usr/bin/ceph-osd+0x2bc2b82) 2025-03-22T04:27:58.445 INFO:tasks.ceph.osd.3.smithi102.stderr: ceph#14 0x7efdc1e8a3b1 in start_thread (/lib64/libc.so.6+0x8a3b1) 2025-03-22T04:27:58.445 INFO:tasks.ceph.osd.3.smithi102.stderr: ``` Fixes: https://tracker.ceph.com/issues/70619 Signed-off-by: Kefu Chai <tchaikov@gmail.com>
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Apr 15, 2025 
    
    
      
  
    
      
    
  
… overflow()
When sanitizer is enabled, unittest_log fails as following
```
[ RUN      ] Log.StderrPipeBig
=================================================================
==3302372==ERROR: AddressSanitizer: heap-use-after-free on address 0xffff96e01d00 at pc 0xaaaadd3db754 bp 0xffffd9ebffa0 sp 0xffffd9ebf790
READ of size 4096 at 0xffff96e01d00 thread T0
    #0 0xaaaadd3db750 in __asan_memmove (/root/ceph-19.0.0/build/bin/unittest_log+0x3fb750) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
    #1 0xffffafc23734 in char const* boost::container::dtl::memmove_n_source<char const*, char*>(char const*, unsigned long, char*) /root/ceph-19.0.0/build/boost/include/boost/container/detail/copy_move_algo.hpp:261:10
    #2 0xffffafc23734 in boost::container::dtl::enable_if_memtransfer_copy_constructible<char const*, char*, char const*>::type boost::container::uninitialized_copy_alloc_n_source<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*, char*>(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, char const*, unsigned long, char*) /root/ceph-19.0.0/build/boost/include/boost/container/detail/copy_move_algo.hpp:600:11
    ceph#3 0xffffafc23734 in void boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*>::uninitialized_copy_n_and_update<char*>(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/detail/advanced_insert_int.hpp:85:22
    ceph#4 0xffffafc23734 in void boost::container::expand_forward_and_insert_alloc<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char*, boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*> >(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, char*, char*, unsigned long, boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*>) /root/ceph-19.0.0/build/boost/include/boost/container/detail/copy_move_algo.hpp:1469:23
    ceph#5 0xffffafc23734 in void boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range_expand_forward<boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*> >(char*, unsigned long, boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*>, boost::move_detail::integral_constant<bool, false>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:3058:7
    ceph#6 0xffffafc23734 in boost::container::vec_iterator<char*, false> boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range<boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*> >(char* const&, unsigned long, boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2890:16
    ceph#7 0xffffafc23734 in boost::container::vec_iterator<char*, false> boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::insert<char const*>(boost::container::vec_iterator<char*, true>, char const*, char const*, boost::move_detail::disable_if_or<void, boost::move_detail::is_convertible<char const*, unsigned long>, boost::container::dtl::is_input_iterator<char const*, has_iterator_category<char const*>::value>, boost::move_detail::bool_<false>, boost::move_detail::bool_<false> >::type*) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2088:20
    ceph#8 0xffffafc23734 in ceph::logging::ConcreteEntry::ConcreteEntry(ceph::logging::Entry const&) /root/ceph-19.0.0/src/log/Entry.h:84:9
    ceph#9 0xffffafc21a88 in decltype(new ((void*)(0))ceph::logging::ConcreteEntry(std::declval<ceph::logging::Entry>())) std::construct_at<ceph::logging::ConcreteEntry, ceph::logging::Entry>(ceph::logging::ConcreteEntry*, ceph::logging::Entry&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
    ceph#10 0xffffafc21198 in void std::allocator_traits<std::allocator<ceph::logging::ConcreteEntry> >::construct<ceph::logging::ConcreteEntry, ceph::logging::Entry>(std::allocator<ceph::logging::ConcreteEntry>&, ceph::logging::ConcreteEntry*, ceph::logging::Entry&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
    ceph#11 0xffffafc16464 in ceph::logging::ConcreteEntry& std::vector<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::emplace_back<ceph::logging::Entry>(ceph::logging::Entry&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:115:6
    ceph#12 0xffffafc0dcbc in ceph::logging::Log::submit_entry(ceph::logging::Entry&&) /root/ceph-19.0.0/src/log/Log.cc:265:9
    ceph#13 0xaaaadd41a404 in Log_StderrPipeBig_Test::TestBody() /root/ceph-19.0.0/src/log/test.cc:280:9
    ceph#14 0xaaaade0b4338 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
    ceph#15 0xaaaade061244 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
    ceph#16 0xaaaade012680 in testing::Test::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2680:5
    ceph#17 0xaaaade0145c4 in testing::TestInfo::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2858:11
    ceph#18 0xaaaade015bc4 in testing::TestSuite::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:3012:28
    ceph#19 0xaaaade031988 in testing::internal::UnitTestImpl::RunAllTests() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5723:44
    ceph#20 0xaaaade0be24c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
    ceph#21 0xaaaade0687dc in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
    ceph#22 0xaaaade030e00 in testing::UnitTest::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5306:10
    ceph#23 0xaaaadd425c48 in RUN_ALL_TESTS() /root/ceph-19.0.0/src/googletest/googletest/include/gtest/gtest.h:2486:46
    ceph#24 0xaaaadd4207a0 in main /root/ceph-19.0.0/src/log/test.cc:503:10
    ceph#25 0xffffac3473f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ceph#26 0xffffac3474c8 in __libc_start_main csu/../csu/libc-start.c:392:3
    ceph#27 0xaaaadd364d6c in _start (/root/ceph-19.0.0/build/bin/unittest_log+0x384d6c) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
0xffff96e01d00 is located 0 bytes inside of 6553-byte region [0xffff96e01d00,0xffff96e03699)
freed by thread T0 here:
    #0 0xaaaadd4136f0 in operator delete(void*) (/root/ceph-19.0.0/build/bin/unittest_log+0x4336f0) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
    #1 0xaaaadd434968 in boost::container::new_allocator<char>::deallocate(char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/new_allocator.hpp:171:7
    #2 0xaaaadd434934 in boost::container::allocator_traits<boost::container::new_allocator<char> >::deallocate(boost::container::new_allocator<char>&, char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:308:9
    ceph#3 0xaaaadd434934 in boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>::deallocate(char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/small_vector.hpp:255:10
    ceph#4 0xaaaadd43911c in boost::container::allocator_traits<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void> >::deallocate(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:308:9
    ceph#5 0xaaaadd43911c in boost::container::vector_alloc_holder<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, unsigned long, boost::move_detail::integral_constant<unsigned int, 1u> >::deallocate(char* const&, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:487:7
    ceph#6 0xaaaadd43911c in void boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range_new_allocation<boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&> >(char*, unsigned long, char*, unsigned long, boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:3080:25
    ceph#7 0xaaaadd438aec in boost::container::vec_iterator<char*, false> boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range_no_capacity<boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&> >(char*, unsigned long, boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&>, boost::move_detail::integral_constant<unsigned int, 1u>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2830:13
    ceph#8 0xaaaadd4328bc in char& boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::emplace_back<char const&>(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:1888:24
    ceph#9 0xaaaadd4328bc in void boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_push_back<char const&>(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2746:13
    ceph#10 0xaaaadd4328bc in boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::push_back(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:1996:4
    ceph#11 0xaaaadd4328bc in StackStringBuf<4096ul>::overflow(int) /root/ceph-19.0.0/src/common/StackStringStream.h:79:11
    ceph#12 0xffffac6d3dac in std::ostream::put(char) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x133dac) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#13 0xffffac6d4aac in std::basic_ostream<char, std::char_traits<char> >& std::operator<<<std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x134aac) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#14 0xaaaadd41a3c8 in Log_StderrPipeBig_Test::TestBody() /root/ceph-19.0.0/src/log/test.cc:278:9
    ceph#15 0xaaaade0b4338 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
    ceph#16 0xaaaade061244 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
    ceph#17 0xaaaade012680 in testing::Test::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2680:5
    ceph#18 0xaaaade0145c4 in testing::TestInfo::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2858:11
    ceph#19 0xaaaade015bc4 in testing::TestSuite::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:3012:28
    ceph#20 0xaaaade031988 in testing::internal::UnitTestImpl::RunAllTests() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5723:44
    ceph#21 0xaaaade0be24c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
    ceph#22 0xaaaade0687dc in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
    ceph#23 0xaaaade030e00 in testing::UnitTest::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5306:10
    ceph#24 0xaaaadd425c48 in RUN_ALL_TESTS() /root/ceph-19.0.0/src/googletest/googletest/include/gtest/gtest.h:2486:46
    ceph#25 0xaaaadd4207a0 in main /root/ceph-19.0.0/src/log/test.cc:503:10
    ceph#26 0xffffac3473f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ceph#27 0xffffac3474c8 in __libc_start_main csu/../csu/libc-start.c:392:3
    ceph#28 0xaaaadd364d6c in _start (/root/ceph-19.0.0/build/bin/unittest_log+0x384d6c) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
previously allocated by thread T0 here:
    #0 0xaaaadd412e88 in operator new(unsigned long) (/root/ceph-19.0.0/build/bin/unittest_log+0x432e88) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
    #1 0xaaaadd433ec0 in boost::container::new_allocator<char>::allocate(unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/new_allocator.hpp:160:30
    #2 0xaaaadd438a68 in boost::container::allocator_traits<boost::container::new_allocator<char> >::priv_allocate(boost::move_detail::integral_constant<bool, false>, boost::container::new_allocator<char>&, unsigned long, void const*) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:395:16
    ceph#3 0xaaaadd438a68 in boost::container::allocator_traits<boost::container::new_allocator<char> >::allocate(boost::container::new_allocator<char>&, unsigned long, void const*) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:318:14
    ceph#4 0xaaaadd438a68 in boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>::allocate(unsigned long, void const*) /root/ceph-19.0.0/build/boost/include/boost/container/small_vector.hpp:248:14
    ceph#5 0xaaaadd438a68 in boost::container::allocator_traits<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void> >::allocate(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:302:16
    ceph#6 0xaaaadd438a68 in boost::container::vector_alloc_holder<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, unsigned long, boost::move_detail::integral_constant<unsigned int, 1u> >::allocate(unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:482:14
    ceph#7 0xaaaadd438a68 in boost::container::vec_iterator<char*, false> boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range_no_capacity<boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&> >(char*, unsigned long, boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&>, boost::move_detail::integral_constant<unsigned int, 1u>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2826:73
    ceph#8 0xaaaadd4328bc in char& boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::emplace_back<char const&>(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:1888:24
    ceph#9 0xaaaadd4328bc in void boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_push_back<char const&>(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2746:13
    ceph#10 0xaaaadd4328bc in boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::push_back(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:1996:4
    ceph#11 0xaaaadd4328bc in StackStringBuf<4096ul>::overflow(int) /root/ceph-19.0.0/src/common/StackStringStream.h:79:11
    ceph#12 0xffffac6d3dac in std::ostream::put(char) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x133dac) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#13 0xffffac6d4aac in std::basic_ostream<char, std::char_traits<char> >& std::operator<<<std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x134aac) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#14 0xaaaadd41a3c8 in Log_StderrPipeBig_Test::TestBody() /root/ceph-19.0.0/src/log/test.cc:278:9
    ceph#15 0xaaaade0b4338 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
    ceph#16 0xaaaade061244 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
    ceph#17 0xaaaade012680 in testing::Test::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2680:5
    ceph#18 0xaaaade0145c4 in testing::TestInfo::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2858:11
    ceph#19 0xaaaade015bc4 in testing::TestSuite::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:3012:28
    ceph#20 0xaaaade031988 in testing::internal::UnitTestImpl::RunAllTests() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5723:44
    ceph#21 0xaaaade0be24c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
    ceph#22 0xaaaade0687dc in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
    ceph#23 0xaaaade030e00 in testing::UnitTest::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5306:10
    ceph#24 0xaaaadd425c48 in RUN_ALL_TESTS() /root/ceph-19.0.0/src/googletest/googletest/include/gtest/gtest.h:2486:46
    ceph#25 0xaaaadd4207a0 in main /root/ceph-19.0.0/src/log/test.cc:503:10
    ceph#26 0xffffac3473f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ceph#27 0xffffac3474c8 in __libc_start_main csu/../csu/libc-start.c:392:3
    ceph#28 0xaaaadd364d6c in _start (/root/ceph-19.0.0/build/bin/unittest_log+0x384d6c) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
SUMMARY: AddressSanitizer: heap-use-after-free (/root/ceph-19.0.0/build/bin/unittest_log+0x3fb750) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409) in __asan_memmove
Shadow bytes around the buggy address:
  0x200ff2dc0350: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x200ff2dc0360: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x200ff2dc0370: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x200ff2dc0380: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x200ff2dc0390: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x200ff2dc03a0:[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x200ff2dc03b0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x200ff2dc03c0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x200ff2dc03d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x200ff2dc03e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x200ff2dc03f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==3302372==ABORTING
```
vec.push_back(str) will allocate memory and release the old one once
there is insufficient memory which causing the old one to be invalid. So
streambuf's data pointer and insertion position should be updated to
newly allocated memory's address in vec.
Fixes: https://tracker.ceph.com/issues/65805
Signed-off-by: Rongqi Sun <sunrongqi@huawei.com>
(cherry picked from commit c8d51b9)
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      May 25, 2025 
    
    
      
  
    
      
    
  
…ms in destructor
Fix memory leak in test_not_before_queue identified by AddressSanitizer.
Previously, the test was terminating without properly dequeueing all elements,
causing resource leaks during teardown.
This change implements a proper cleanup mechanism that:
1. Advances the queue's time until all elements become eligible for processing
2. Dequeues all remaining elements to ensure proper destruction
3. Guarantees clean teardown even for elements with future timestamps
The fix eliminates ASan-reported leaks occurring in the not_before_queue_t::enqueue
method when test cases were torn down prematurely.
A sample of the error from ASan:
```
Direct leak of 1800 byte(s) in 15 object(s) allocated from:
    #0 0x7f71b1f1ab5b in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:86
    #1 0x55fb4c977058 in void not_before_queue_t<tv_t, test_time_t>::enqueue<tv_t const&>(tv_t const&) /home/kefu/dev/ceph/src/common/not_before_queue.h:164
    #2 0x55fb4c97748b in NotBeforeTest::load_test_data(std::vector<tv_t, std::allocator<tv_t> > const&) /home/kefu/dev/ceph/src/test/test_not_before_queue.cc:67
    ceph#3 0x55fb4c961112 in NotBeforeTest_RemoveIfByClass_no_cond_Test::TestBody() /home/kefu/dev/ceph/src/test/test_not_before_queue.cc:213
    ceph#4 0x55fb4ca05f67 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
    ceph#5 0x55fb4ca1c4f7 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689
    ceph#6 0x55fb4c9e1104 in testing::Test::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2728
    ceph#7 0x55fb4c9e16e2 in testing::TestInfo::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2874
    ceph#8 0x55fb4c9e73b4 in testing::TestSuite::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:3052
    ceph#9 0x55fb4c9f059b in testing::internal::UnitTestImpl::RunAllTests() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:6004
    ceph#10 0x55fb4ca064ff in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
    ceph#11 0x55fb4ca1d1bf in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689
    ceph#12 0x55fb4c9e124d in testing::UnitTest::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:5583
    ceph#13 0x55fb4c97a0b6 in RUN_ALL_TESTS() /home/kefu/dev/ceph/src/googletest/googletest/include/gtest/gtest.h:2334
    ceph#14 0x55fb4c979ffc in main /home/kefu/dev/ceph/src/googletest/googlemock/src/gmock_main.cc:71
    ceph#15 0x7f71ae833ca7 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
SUMMARY: AddressSanitizer: 1800 byte(s) leaked in 15 allocation(s).
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      May 25, 2025 
    
    
      
  
    
      
    
  
Previously, in __ceph_abort and related abort handlers, we allocated
ClibBackTrace instances using raw pointers without proper cleanup. Since
these handlers terminate execution, the leaks didn't affect production
systems but were correctly flagged by ASan during testing:
```
Direct leak of 288 byte(s) in 1 object(s) allocated from:
    #0 0x55aefe8cb65d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_ceph_assert+0x1f465d) (BuildId: a4faeddac80b0d81062bd53ede3388c0c10680bc)
    #1 0x7f3b84da988d in ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...) /home/jenkins-build/build/workspace/ceph-pull-requests/src/common/assert.cc:157:21
    #2 0x55aefe8cf04b in supressed_assertf_line22() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/ceph_assert.cc:22:3
    ceph#3 0x55aefe8ce4e6 in CephAssertDeathTest_ceph_assert_supresssions_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/ceph_assert.cc:31:3
    ceph#4 0x55aefe99135d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2653:10
    ceph#5 0x55aefe94f015 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2689:14
...
```
This commit resolves the issue by using std::unique_ptr to manage the
lifecycle of backtrace objects, ensuring proper cleanup even in
non-returning functions. While these leaks had no practical impact in
production (as the process terminates anyway), fixing them improves code
quality and eliminates false positives in memory analysis tools.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Jun 26, 2025 
    
    
      
  
    
      
    
  
Replace raw pointers with unique_ptr for AioCompletion instances in
test_librados_completion to prevent memory leaks. Previously, each test
case allocated AioCompletion objects but never freed them, causing
AddressSanitizer to report leaks.
The unique_ptr automatically manages the object lifecycle, ensuring
cleanup when the pointer goes out of scope.
Fixes ASan error:
```
`==1199357==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x5602d9f0eaad in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_librados_completion+0x1f2aad) (BuildId: ef5b4b8f0a479e21b6a2686519ff4c3ef71b9f94)
    #1 0x7f3ac9b776f4 in librados::v14_2_0::Rados::aio_create_completion() /home/jenkins-build/build/workspace/ceph-pull-requests/src/librados/librados_cxx.cc:2892:10
    #2 0x5602d9f11a0a in CoroExcept_AioComplete_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/common/test_librados_completion.cc:54:14
    ceph#3 0x5602da01d69d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2653:10
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Jun 26, 2025 
    
    
      
  
    
      
    
  
Previously, we had memory leak in the test_bluestore_types.cc tests where
`BufferCacheShard` and `OnodeCacheShard` objects were allocated with
raw pointers but never freed, causing leaks detected by AddressSanitizer.
ASan rightly pointed this out:
```
Direct leak of 224 byte(s) in 1 object(s) allocated from:
    #0 0x55a7432a079d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_bluestore_types+0xf2e79d) (BuildId: c3bec647afa97df6bb147bc82eac937531fc6272)
    #1 0x55a743523340 in BlueStore::BufferCacheShard::create(BlueStore*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, ceph::common::PerfCounters*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/os/bluestore/Bl
ueStore.cc:1678:9
    #2 0x55a74330b617 in ExtentMap_seek_lextent_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/objectstore/test_bluestore_types.cc:1077:7
    ceph#3 0x55a7434f2b2d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.
cc:2653:10
    ceph#4 0x55a7434b5775 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:
2689:14
    ceph#5 0x55a74347005d in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2728:5
```
```
Direct leak of 9928 byte(s) in 1 object(s) allocated from:
    #0 0x7ff249d21a2d in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:86
    #1 0x6048ed878b76 in BlueStore::OnodeCacheShard::create(ceph::common::CephContext*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::common::PerfCounters*) /home/kefu/dev/ceph/src/os/bluestore/BlueStore.cc:1219
    #2 0x6048ed66d4f9 in GarbageCollector_BasicTest_Test::TestBody() /home/kefu/dev/ceph/src/test/objectstore/test_bluestore_types.cc:2662
    ceph#3 0x6048ed820555 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
    ceph#4 0x6048ed80c78a in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689
    ceph#5 0x6048ed7b8bfa in testing::Test::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2728
```
In this change, we replace raw pointer allocation with unique_ptr to
ensure automatic cleanup when the objects go out of scope.
`
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Jun 26, 2025 
    
    
      
  
    
      
    
  
Memory leak was detected by AddressSanitizer in dbstore tests. Objects
inserted into a static objectmap were not being properly freed when
tests completed without explicitly deleting buckets.
The leak occurred because:
- Objects were preserved in a static map after insertion
- Objects were only freed by DB::objectmapDelete() when deleting the
  corresponding bucket
- In tests, buckets weren't always deleted after insertion, leaving
  objects allocated
ASan rightly pointed this out:
```
Direct leak of 200 byte(s) in 1 object(s) allocated from:
    #0 0x55c420f5274d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_dbstore_tests+0x4df74d) (BuildId: c19ed2d2b1ead306fdc59fc311f546a287300010)
    #1 0x55c42143cfaf in SQLGetBucket::Execute(DoutPrefixProvider const*, rgw::store::DBOpParams*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/sqlite/sqliteDB.cc:1689:11
    #2 0x55c42102751c in rgw::store::DB::ProcessOp(DoutPrefixProvider const*, std::basic_string_view<char, std::char_traits<char>>, rgw::store::DBOpParams*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/common/dbstore.cc:251:16
    ceph#3 0x55c4210328c9 in rgw::store::DB::get_bucket_info(DoutPrefixProvider const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, RGWBucketInfo&, std::map<std::__cxx11
::basic_string<char, std::char_traits<char>, std::allocator<char>>, ceph::buffer::v15_2_0::list, std::less<std::__cxx11::basic_string<char, std::c
har_traits<char>, std::allocator<char>>>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, ceph::buffer::v15_2_0::list>>>*, std::chrono::time_point<ceph::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l>>>*, obj_version*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/common/dbstore.cc:450:9
    ceph#4 0x55c421034357 in rgw::store::DB::create_bucket(DoutPrefixProvider const*, std::variant<rgw_user, rgw_account_id> const&, rgw_bucket const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&, rgw_placement_rule const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, ceph::buffer::v15_2_0::list, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const, ceph::buffer::v15_2_0::list>>> const&, std::optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>> const&, std::optional<RGWQuotaInfo> const&, std::optional<std::chrono::time_point<ceph::real_clock, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l>>>>, obj_version*, RGWBucketInfo&, optional_yield) /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/common/dbstore.cc:504:9
    ceph#5 0x55c420f6c3ed in DBStoreTest_CreateBucket_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/rgw/driver/dbstore/tests/dbstore_tests.cc:495:13
    ceph#6 0x55c42174fa0d in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2653:10
    ceph#7 0x55c421717f25 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2689:14
    ceph#8 0x55c4216d4bbd in testing::Test::Run() /home/jenkins-build/build/workspace/ceph-pull-requests/src/googletest/googletest/src/gtest.cc:2728:5
```
In this change, we:
- Free objectmap entries in DB::Destroy() to ensure cleanup on shutdown
- Call DB::Destroy() in DBStoreTest::TearDown() to guarantee cleanup after each test
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Jun 26, 2025 
    
    
      
  
    
      
    
  
Previously, we had Resource leak in Directory::for_each() where
directory streams opened with fdopendir() were never properly closed,
causing a 32KB leak per unclosed directory handle.
In this change, we add closedir() call to properly release directory stream
resources after enumeration completes.
ASan report:
```
Direct leak of 32816 byte(s) in 1 object(s) allocated from:
    #0 0x738387320e15 in malloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:67
    #1 0x738381383514  (/usr/lib/libc.so.6+0xe3514) (BuildId: 468e3585c794491a48ea75fceb9e4d6b1464fc35)
    #2 0x738381383418 in fdopendir (/usr/lib/libc.so.6+0xe3418) (BuildId: 468e3585c794491a48ea75fceb9e4d6b1464fc35)
    ceph#3 0x6433e1aefac4 in for_each<rgw::sal::Directory::fill_cache(const DoutPrefixProvider*, optional_yield, rgw::sal::fill_cache_cb_t&)::<lambda(char const*)> > /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:874
    ceph#4 0x6433e1aa10ab in rgw::sal::Directory::fill_cache(DoutPrefixProvider const*, optional_yield, fu2::abi_310::detail::function<fu2::abi_310::detail::config<true, false, 16ul>, fu2::abi_310::detail::property<true, false, int (DoutPrefixProvider const*, rgw_bucket_dir_ent
ry&) const> > const&) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:1077
    ceph#5 0x6433e1aa8974 in rgw::sal::MPDirectory::fill_cache(DoutPrefixProvider const*, optional_yield, fu2::abi_310::detail::function<fu2::abi_310::detail::config<true, false, 16ul>, fu2::abi_310::detail::property<true, false, int (DoutPrefixProvider const*, rgw_bucket_dir_e
ntry&) const> > const&) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:1384
    ceph#6 0x6433e1abf1ee in rgw::sal::POSIXBucket::fill_cache(DoutPrefixProvider const*, optional_yield, fu2::abi_310::detail::function<fu2::abi_310::detail::config<true, false, 16ul>, fu2::abi_310::detail::property<true, false, int (DoutPrefixProvider const*, rgw_bucket_dir_e
ntry&) const> > const&) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:2236
    ceph#7 0x6433e1c43956 in file::listing::BucketCache<rgw::sal::POSIXDriver, rgw::sal::POSIXBucket>::fill(DoutPrefixProvider const*, file::listing::BucketCacheEntry<rgw::sal::POSIXDriver, rgw::sal::POSIXBucket>*, rgw::sal::POSIXBucket*, unsigned int, optional_yield) /home/kef
u/dev/ceph/src/rgw/driver/posix/bucket_cache.h:368
    ceph#8 0x6433e1c1c71c in file::listing::BucketCache<rgw::sal::POSIXDriver, rgw::sal::POSIXBucket>::list_bucket(DoutPrefixProvider const*, optional_yield, rgw::sal::POSIXBucket*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, fu2::abi_310::d
etail::function<fu2::abi_310::detail::config<true, false, 16ul>, fu2::abi_310::detail::property<true, false, bool (rgw_bucket_dir_entry const&) const> >) /home/kefu/dev/ceph/src/rgw/driver/posix/bucket_cache.h:410
    ceph#9 0x6433e1ac1829 in rgw::sal::POSIXBucket::list(DoutPrefixProvider const*, rgw::sal::Bucket::ListParams&, int, rgw::sal::Bucket::ListResults&, optional_yield) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:2256
    ceph#10 0x6433e1ae1302 in rgw::sal::POSIXMultipartUpload::list_parts(DoutPrefixProvider const*, ceph::common::CephContext*, int, int, int*, bool*, optional_yield, bool) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:3692
    ceph#11 0x6433e1ae2f3b in rgw::sal::POSIXMultipartUpload::complete(DoutPrefixProvider const*, optional_yield, ceph::common::CephContext*, std::map<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<int>, std::allocator<std::pair<
int const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&, std::__cxx11::list<rgw_obj_index_key, std::allocator<rgw_obj_index_key> >&, unsigned long&, bool&, RGWCompressionInfo&, long&, std::__cxx11::basic_string<char, std::char_trait
s<char>, std::allocator<char> >&, ACLOwner&, unsigned long, rgw::sal::Object*, boost::container::flat_map<unsigned int, boost::container::flat_set<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std
::char_traits<char>, std::allocator<char> > >, void>, std::less<unsigned int>, void>&) /home/kefu/dev/ceph/src/rgw/driver/posix/rgw_sal_posix.cc:3770
    ceph#12 0x6433e0fffb73 in POSIXMPObjectTest::create_MPObj(rgw::sal::MultipartUpload*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) /home/kefu/dev/ceph/src/test/rgw/test_rgw_posix_driver.cc:1717
    ceph#13 0x6433e0e95ab9 in POSIXMPObjectTest_BucketList_Test::TestBody() /home/kefu/dev/ceph/src/test/rgw/test_rgw_posix_driver.cc:1863
    ceph#14 0x6433e1308d17 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
```
Fixes https://tracker.ceph.com/issues/71505
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Jun 26, 2025 
    
    
      
  
    
      
    
  
Previously, error messages passed to luaL_error() were formatted using
std::string concatenation. Since luaL_error() never returns (it throws
a Lua exception via longjmp), the allocated std::string memory was
leaked, as detected by AddressSanitizer:
```
Direct leak of 105 byte(s) in 1 object(s) allocated from:
    #0 0x7fc5f1921a2d in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:86
    #1 0x563bd89144c7 in std::__new_allocator<char>::allocate(unsigned long, void const*) /usr/include/c++/15.1.1/bits/new_allocator.h:151
    #2 0x563bd89144c7 in std::allocator<char>::allocate(unsigned long) /usr/include/c++/15.1.1/bits/allocator.h:203
    ceph#3 0x563bd89144c7 in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/alloc_traits.h:614
    ceph#4 0x563bd89144c7 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.h:142
    ceph#5 0x563bd89144c7 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:164
    ceph#6 0x563bd896ae1b in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:363
    ceph#7 0x563bd896b256 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:455
    ceph#8 0x563bd896b2bb in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::append(char const*) /usr/include/c++/15.1.1/bits/basic_string.h:1585
    ceph#9 0x563bd943c2f2 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, char const*) /usr/include/c++/15.1.1/bits/basic_string.h:3977
    ceph#10 0x563bd943c2f2 in rgw::lua::lua_state_guard::runtime_hook(lua_State*, lua_Debug*) /home/kefu/dev/ceph/src/rgw/rgw_lua_utils.cc:245
    ceph#11 0x7fc5f139f8ef  (/usr/lib/liblua.so.5.4+0xe8ef) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    ceph#12 0x7fc5f139fbfe  (/usr/lib/liblua.so.5.4+0xebfe) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    ceph#13 0x7fc5f13b26fe  (/usr/lib/liblua.so.5.4+0x216fe) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    ceph#14 0x7fc5f139f581  (/usr/lib/liblua.so.5.4+0xe581) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    ceph#15 0x7fc5f139b735  (/usr/lib/liblua.so.5.4+0xa735) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    ceph#16 0x7fc5f139ba8f  (/usr/lib/liblua.so.5.4+0xaa8f) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    ceph#17 0x7fc5f139f696 in lua_pcallk (/usr/lib/liblua.so.5.4+0xe696) (BuildId: b7533e2973d4b0d82e10fc29973ec5a8d355d2b8)
    ceph#18 0x563bd8a925ef in rgw::lua::request::execute(rgw::sal::Driver*, RGWREST*, OpsLogSink*, req_state*, RGWOp*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/kefu/dev/ceph/src/rgw/rgw_lua_request.cc:824
    ceph#19 0x563bd8952e3d in TestRGWLua_LuaRuntimeLimit_Test::TestBody() /home/kefu/dev/ceph/src/test/rgw/test_rgw_lua.cc:1628
    ceph#20 0x563bd8a37817 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
    ceph#21 0x563bd8a509b5 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (/home/kefu/dev/ceph/build/bin/unittest_rgw_lua+0x11199b5) (BuildId: b2628caba5290d882d25f7bea166f058b682bc85)`
```
This change replaces std::string formatting with stack-allocated buffer
and std::to_chars() to eliminate the memory leak.
Note: We cannot format int64_t directly through luaL_error() because
lua_pushfstring() does not support long long or int64_t format specifiers,
even in Lua 5.4 (see https://www.lua.org/manual/5.4/manual.html#lua_pushfstring).
Since libstdc++ uses int64_t for std::chrono::milliseconds::rep, we use
std::to_chars() for safe, efficient conversion without heap allocation.
The maximum runtime limit was a configuration introduced by 3e3cb15.
Fixes: https://tracker.ceph.com/issues/71595
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Jun 26, 2025 
    
    
      
  
    
      
    
  
…write
Fix 4KB memory leak in ErasureCodePlugin_parity_delta_write_Test caused by
unmanaged raw buffer allocation. The test was allocating a 4096-byte raw
buffer to replace shard 4 for delta encoding validation, but the buffer::ptr
constructed from the raw pointer did not manage the buffer's lifecycle.
Detected by AddressSanitizer:
```
Direct leak of 4096 byte(s) in 1 object(s) allocated from:
    #0 0x7fb73a720e15 in malloc /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_malloc_linux.cpp:67
    #1 0x5562f4062ccc in ErasureCodePlugin_parity_delta_write_Test::TestBody() /home/kefu/dev/ceph/src/test/erasure-code/TestErasureCodePluginJerasure.cc:122
    #2 0x5562f41081a1 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
    ceph#3 0x5562f40f3004 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689
    ceph#4 0x5562f409cbba in testing::Test::Run() /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2728```
```
In this change, we replace raw pointer allocation with
create_bufferptr() to ensure proper memory management by buffer::ptr.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Jun 26, 2025 
    
    
      
  
    
      
    
  
Previously, SyncPoint allocated two C_Gather instances tracked by raw
pointers but failed to properly clean them up when only a single sync
point existed, causing memory leaks detected by AddressSanitizer.
This change fixes the leak by modifying AbstractWriteLog::shut_down()
to check for prior sync points in the chain. When the current sync point
is the only one present, we now activate the m_prior_log_entries_persisted
context to ensure:
- The onfinish callback executes and releases the captured strong
  reference to the enclosing SyncPoint
- The parent m_sync_point_persist context completes and gets properly
  released
This ensures all allocated contexts are cleaned up correctly during
shutdown, eliminating the memory leak.
The ASan report:
```
Indirect leak of 2064 byte(s) in 1 object(s) allocated from:
    #0 0x56440919ae2d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_librbd+0x2f3de2d) (BuildId: 6a04677c6ee5235f1a41815df807f97c5b96d4cd)
    #1 0x56440bd67751 in __gnu_cxx::new_allocator<Context*>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
    #2 0x56440bd676e0 in std::allocator<Context*>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
    ceph#3 0x56440bd676e0 in std::allocator_traits<std::allocator<Context*>>::allocate(std::allocator<Context*>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
    ceph#4 0x56440bd6730b in std::_Vector_base<Context*, std::allocator<Context*>>::_M_allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:346:20
    ceph#5 0x7fd33e00e8d1 in std::vector<Context*, std::allocator<Context*>>::reserve(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:78:22
    ceph#6 0x7fd33e00c51c in librbd::cache::pwl::SyncPoint::SyncPoint(unsigned long, ceph::common::CephContext*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/SyncPoint.cc:20:27
    ceph#7 0x56440bd65f26 in decltype(::new((void*)(0)) librbd::cache::pwl::SyncPoint(std::declval<unsigned long&>(), std::declval<ceph::common::CephContext*&>())) std::construct_at<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
    ceph#8 0x56440bd65b98 in void std::allocator_traits<std::allocator<librbd::cache::pwl::SyncPoint>>::construct<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>&, librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
    ceph#9 0x56440bd657d3 in std::_Sp_counted_ptr_inplace<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:519:4
    ceph#10 0x56440bd65371 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*&, std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:651:6
    ceph#11 0x56440bd65163 in std::__shared_ptr<librbd::cache::pwl::SyncPoint, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1342:14
    ceph#12 0x56440bd650e6 in std::shared_ptr<librbd::cache::pwl::SyncPoint>::shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:409:4
    ceph#13 0x56440bd65057 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::allocate_shared<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint> const&, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:862:14
    ceph#14 0x56440bca97e7 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::make_shared<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:878:14
    ceph#15 0x56440bd443c8 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::new_sync_point(librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1905:20
    ceph#16 0x56440bd42e4c in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1951:3
    ceph#17 0x56440bd9cbf2 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point_if_needed(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1990:5
    ceph#18 0x56440bd9c636 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&)::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2152:9
    ceph#19 0x56440bd9b9b4 in boost::detail::function::void_function_obj_invoker<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&), void, librbd::cache::pwl::GuardedRequestFunctionContext&>::invoke(boost::detail::function::function_buffer&, librbd::cache::pwl::GuardedRequestFunctionContext&) /opt/ceph/include/boost/function/function_template.hpp:100:11
    ceph#20 0x56440bd29321 in boost::function_n<void, librbd::cache::pwl::GuardedRequestFunctionContext&>::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /opt/ceph/include/boost/function/function_template.hpp:789:14
    ceph#21 0x56440bd28d85 in librbd::cache::pwl::GuardedRequestFunctionContext::finish(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/Request.h:335:5
    ceph#22 0x5644091e0fe0 in Context::complete(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/include/Context.h:102:5
    ceph#23 0x56440bd9b378 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::detain_guarded_request(librbd::cache::pwl::C_BlockIORequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::GuardedRequestFunctionContext*, bool) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1202:20
    ceph#24 0x56440bd96c50 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2154:3
    ceph#25 0x56440bd1e4b5 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::shut_down(Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:703:3
    ceph#26 0x56440bdb9022 in librbd::cache::pwl::TestMockCacheSSDWriteLog_compare_and_write_compare_matched_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/librbd/cache/pwl/test_mock_SSDWriteLog.cc:403:7
```
Fixes: https://tracker.ceph.com/issues/71335
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Jul 4, 2025 
    
    
      
  
    
      
    
  
…_list with array
Previously, we used std::array<std::initializer_list<int>, 27> to store
a multi-dimensional array. However, initializer_list objects only hold
pointers to their underlying data, not the data itself. When initialized
with brace-enclosed lists like {0,1,2,3}, the temporary arrays created
by these literals are destroyed after the initialization expression
completes, leaving the initializer_list objects pointing to deallocated
memory.
This caused AddressSanitizer to detect stack-use-after-scope errors when
getint() attempted to iterate over the initializer_list contents:
```
==2085499==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7f5fe9803580 at pc 0x55d851bea586 bp 0x7ffc9816a5b0 sp 0x7ffc9816a5a8
READ of size 4 at 0x7f5fe9803580 thread T0
    #0 0x55d851bea585 in getint(std::initializer_list<int>) /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/erasure-code/TestErasureCodeShec_arguments.cc:46:21
    #1 0x55d851bf0258 in int std::__invoke_impl<int, int (*&)(std::initializer_list<int>), std::initializer_list<int>&>(std::__invoke_other, int (*&)(std::initializer_list<int>), std::initializer_list<int>&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14
...
Address 0x7f5fe9803580 is located in stack of thread T0 at offset 1408 in frame
    #0 0x55d851bdd07f in create_table_shec432() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/erasure-code/TestErasureCodeShec_arguments.cc:52
```
Fix this by using std::array<std::array<int, 4>, 27> instead, which
actually owns and stores the data rather than just pointing to it.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Jul 8, 2025 
    
    
      
  
    
      
    
  
Previously, SyncPoint allocated two C_Gather instances tracked by raw
pointers but failed to properly clean them up when only a single sync
point existed, causing memory leaks detected by AddressSanitizer.
This change fixes the leak by modifying AbstractWriteLog::shut_down()
to check for prior sync points in the chain. When the current sync point
is the only one present, we now activate the m_prior_log_entries_persisted
context to ensure:
- The onfinish callback executes and releases the captured strong
  reference to the enclosing SyncPoint
- The parent m_sync_point_persist context completes and gets properly
  released
This ensures all allocated contexts are cleaned up correctly during
shutdown, eliminating the memory leak.
The ASan report:
```
Indirect leak of 2064 byte(s) in 1 object(s) allocated from:
    #0 0x56440919ae2d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_librbd+0x2f3de2d) (BuildId: 6a04677c6ee5235f1a41815df807f97c5b96d4cd)
    #1 0x56440bd67751 in __gnu_cxx::new_allocator<Context*>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
    #2 0x56440bd676e0 in std::allocator<Context*>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
    ceph#3 0x56440bd676e0 in std::allocator_traits<std::allocator<Context*>>::allocate(std::allocator<Context*>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
    ceph#4 0x56440bd6730b in std::_Vector_base<Context*, std::allocator<Context*>>::_M_allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:346:20
    ceph#5 0x7fd33e00e8d1 in std::vector<Context*, std::allocator<Context*>>::reserve(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:78:22
    ceph#6 0x7fd33e00c51c in librbd::cache::pwl::SyncPoint::SyncPoint(unsigned long, ceph::common::CephContext*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/SyncPoint.cc:20:27
    ceph#7 0x56440bd65f26 in decltype(::new((void*)(0)) librbd::cache::pwl::SyncPoint(std::declval<unsigned long&>(), std::declval<ceph::common::CephContext*&>())) std::construct_at<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
    ceph#8 0x56440bd65b98 in void std::allocator_traits<std::allocator<librbd::cache::pwl::SyncPoint>>::construct<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>&, librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
    ceph#9 0x56440bd657d3 in std::_Sp_counted_ptr_inplace<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:519:4
    ceph#10 0x56440bd65371 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*&, std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:651:6
    ceph#11 0x56440bd65163 in std::__shared_ptr<librbd::cache::pwl::SyncPoint, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1342:14
    ceph#12 0x56440bd650e6 in std::shared_ptr<librbd::cache::pwl::SyncPoint>::shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:409:4
    ceph#13 0x56440bd65057 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::allocate_shared<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint> const&, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:862:14
    ceph#14 0x56440bca97e7 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::make_shared<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:878:14
    ceph#15 0x56440bd443c8 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::new_sync_point(librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1905:20
    ceph#16 0x56440bd42e4c in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1951:3
    ceph#17 0x56440bd9cbf2 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point_if_needed(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1990:5
    ceph#18 0x56440bd9c636 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&)::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2152:9
    ceph#19 0x56440bd9b9b4 in boost::detail::function::void_function_obj_invoker<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&), void, librbd::cache::pwl::GuardedRequestFunctionContext&>::invoke(boost::detail::function::function_buffer&, librbd::cache::pwl::GuardedRequestFunctionContext&) /opt/ceph/include/boost/function/function_template.hpp:100:11
    ceph#20 0x56440bd29321 in boost::function_n<void, librbd::cache::pwl::GuardedRequestFunctionContext&>::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /opt/ceph/include/boost/function/function_template.hpp:789:14
    ceph#21 0x56440bd28d85 in librbd::cache::pwl::GuardedRequestFunctionContext::finish(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/Request.h:335:5
    ceph#22 0x5644091e0fe0 in Context::complete(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/include/Context.h:102:5
    ceph#23 0x56440bd9b378 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::detain_guarded_request(librbd::cache::pwl::C_BlockIORequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::GuardedRequestFunctionContext*, bool) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1202:20
    ceph#24 0x56440bd96c50 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2154:3
    ceph#25 0x56440bd1e4b5 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::shut_down(Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:703:3
    ceph#26 0x56440bdb9022 in librbd::cache::pwl::TestMockCacheSSDWriteLog_compare_and_write_compare_matched_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/librbd/cache/pwl/test_mock_SSDWriteLog.cc:403:7
```
Fixes: https://tracker.ceph.com/issues/71335
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 05fd6f9)
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Jul 17, 2025 
    
    
      
  
    
      
    
  
Fix AddressSanitizer ODR (One Definition Rule) violation caused by
osdc/error_code.cc being compiled into both the osdc library and
ceph-common library.
The violation occurred because osdc_error_category was defined in
both the rbd binary (via osdc) and libceph-common.so, creating
duplicate symbols at runtime.
Since all targets that link against osdc also link against
ceph-common, removing osdc/error_code.cc from osdc doesn't break the
build while eliminating the duplicate definition.
ASan error sample:
```
  ==857433==ERROR: AddressSanitizer: odr-violation (0x5612ad665760):
    [1] size=22 'typeinfo name for osdc_error_category' /home/jenkins-build/build/workspace/ceph-pull-requests/src/osdc/error_code.cc in /home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/rbd
    [2] size=22 'typeinfo name for osdc_error_category' /home/jenkins-build/build/workspace/ceph-pull-requests/src/osdc/error_code.cc in /home/jenkins-build/build/workspace/ceph-pull-requests/build/lib/libceph-common.so.2
  These globals were registered at these points:
    [1]:
      #0 0x5612acd62c88 in __asan_register_globals (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/rbd+0x815c88) (BuildId: 62a02cbbf3426e5470e16372e3b53a18cb18ce0f)
      #1 0x5612acd63d59 in __asan_register_elf_globals (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/rbd+0x816d59) (BuildId: 62a02cbbf3426e5470e16372e3b53a18cb18ce0f)
      #2 0x7f28d3b02eba in call_init csu/../csu/libc-start.c:145:3
      ceph#3 0x7f28d3b02eba in __libc_start_main csu/../csu/libc-start.c:379:5
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Aug 13, 2025 
    
    
      
  
    
      
    
  
Removing vxattr 'ceph.dir.subvolume' on a directory without it being set causes the mds to crash. This is because the snaprealm would be null for the directory and the null check is missing. Setting the vxattr, creates the snaprealm for the directory as part of it. Hence, mds doesn't crash when the vxattr is set and then removed. This patch fixes the same. Reproducer: $mkdir /mnt/dir1 $setfattr -x "ceph.dir.subvolume" /mnt/dir1 Traceback: ------- Core was generated by `./ceph/build/bin/ceph-mds -i a -c ./ceph/build/ceph.conf'. Program terminated with signal SIGSEGV, Segmentation fault. (gdb) bt #0 0x00007f33f1aa8034 in __pthread_kill_implementation () from /lib64/libc.so.6 #1 0x00007f33f1a4ef1e in raise () from /lib64/libc.so.6 #2 0x0000562b148a6fd0 in reraise_fatal (signum=signum@entry=11) at /ceph/src/global/signal_handler.cc:88 ceph#3 0x0000562b148a83d9 in handle_oneshot_fatal_signal (signum=11) at /ceph/src/global/signal_handler.cc:367 ceph#4 <signal handler called> ceph#5 Server::handle_client_setvxattr (this=0x562b4ee3f800, mdr=..., cur=0x562b4ef9cc00) at /ceph/src/mds/Server.cc:6406 ceph#6 0x0000562b145fadc2 in Server::handle_client_removexattr (this=0x562b4ee3f800, mdr=...) at /ceph/src/mds/Server.cc:7022 ceph#7 0x0000562b145fbff0 in Server::dispatch_client_request (this=0x562b4ee3f800, mdr=...) at /ceph/src/mds/Server.cc:2825 ceph#8 0x0000562b145fcfa2 in Server::handle_client_request (this=0x562b4ee3f800, req=...) at /ceph/src/mds/Server.cc:2676 ceph#9 0x0000562b1460063c in Server::dispatch (this=0x562b4ee3f800, m=...) at /ceph/src/mds/Server.cc:382 ceph#10 0x0000562b1450eb22 in MDSRank::handle_message (this=this@entry=0x562b4ef42008, m=...) at /ceph/src/mds/MDSRank.cc:1222 ceph#11 0x0000562b14510c93 in MDSRank::_dispatch (this=this@entry=0x562b4ef42008, m=..., new_msg=new_msg@entry=true) at /ceph/src/mds/MDSRank.cc:1045 ceph#12 0x0000562b14511620 in MDSRankDispatcher::ms_dispatch (this=this@entry=0x562b4ef42000, m=...) at /ceph/src/mds/MDSRank.cc:1019 ceph#13 0x0000562b144ff117 in MDSDaemon::ms_dispatch2 (this=0x562b4ee64000, m=...) at /ceph/src/common/RefCountedObj.h:56 ceph#14 0x00007f33f2f4974a in Messenger::ms_deliver_dispatch (this=0x562b4ee70000, m=...) at /ceph/src/msg/Messenger.h:746 ceph#15 0x00007f33f2f467e2 in DispatchQueue::entry (this=0x562b4ee703b8) at /ceph/src/msg/DispatchQueue.cc:202 ceph#16 0x00007f33f2ff61cb in DispatchQueue::DispatchThread::entry (this=<optimized out>) at /ceph/src/msg/DispatchQueue.h:101 ceph#17 0x00007f33f2df4b5d in Thread::entry_wrapper (this=0x562b4ee70518) at /ceph/src/common/Thread.cc:87 ceph#18 0x00007f33f2df4b6f in Thread::_entry_func (arg=<optimized out>) at /ceph/src/common/Thread.cc:74 ceph#19 0x00007f33f1aa6088 in start_thread () from /lib64/libc.so.6 ceph#20 0x00007f33f1b29f8c in clone3 () from /lib64/libc.so.6 --------- Fixes: https://tracker.ceph.com/issues/70794 Signed-off-by: Kotresh HR <khiremat@redhat.com>
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Aug 13, 2025 
    
    
      
  
    
      
    
  
Fix a memory leak where HybridAllocator's embedded BitmapAllocator
(bmap_alloc) was not properly cleaned up during destruction. The issue
occurred because virtual function calls in destructors don't dispatch
to derived class implementations - when AvlAllocator::~AvlAllocator()
calls shutdown(), it invokes AvlAllocator::shutdown() rather than
HybridAllocatorBase::shutdown(), leaving bmap_alloc unfreed.
This issue only affects unit tests and does not impact production,
as BlueFS always calls shutdown() explicitly when stopping allocators
in BlueFS::_stop_alloc()
This change has minimal impact on existing code that already calls
shutdown() explicitly, since shutdown() has idempotent behavior and
can be safely called multiple times. Additionally, destructors are
only invoked during system teardown, so the potential double shutdown()
call does not affect runtime performance.
Changes:
- Replace raw pointer bmap_alloc with unique_ptr for automatic cleanup
- Add explicit shutdown() call in BitmapAllocator destructor for RAII
- Replace manual bmap_alloc->shutdown() with bmap_alloc.reset()
- Add explicit default destructor to HybridAllocatorBase for clarity
This eliminates the need for manual shutdown() calls and ensures
proper resource cleanup through RAII principles.
Fixes ASan-reported indirect leak of BitmapAllocator resources.
Please see the leak reported by ASan for more details:
```
Indirect leak of 23 byte(s) in 1 object(s) allocated from:
    #0 0x7fe30b121a2d in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:86
    #1 0x5614e057ee11 in std::__new_allocator<char>::allocate(unsigned long, void const*) /usr/include/c++/15.1.1/bits/new_allocator.h:151
    #2 0x5614e057d279 in std::allocator<char>::allocate(unsigned long) /usr/include/c++/15.1.1/bits/allocator.h:203
    ceph#3 0x5614e057d279 in std::allocator_traits<std::allocator<char> >::allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/alloc_traits.h:614
    ceph#4 0x5614e057d279 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_allocate(std::allocator<char>&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.h:142
    ceph#5 0x5614e057cf27 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_create(unsigned long&, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:164
    ceph#6 0x5614e05fe91b in void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<true>(char const*, unsigned long) /usr/include/c++/15.1.1/bits/basic_string.tcc:291
    ceph#7 0x5614e05f3cd4 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /usr/include/c++/15.1.
1/bits/basic_string.h:617
    ceph#8 0x5614e069b892 in ceph::mutex_debug_detail::mutex_debug_impl<false>::mutex_debug_impl(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool) /home/kefu/dev/ceph/src/common/mutex_debug.
h:103
    ceph#9 0x5614e06b8586 in ceph::mutex_debug_detail::mutex_debug_impl<false> ceph::make_mutex<char const (&) [23]>(char const (&) [23]) /home/kefu/dev/ceph/src/common/ceph_mutex.h:118
    ceph#10 0x5614e06b8350 in AllocatorLevel02<AllocatorLevel01Loose>::AllocatorLevel02() /home/kefu/dev/ceph/src/os/bluestore/fastbmap_allocator_impl.h:526
    ceph#11 0x5614e06b393b in BitmapAllocator::BitmapAllocator(ceph::common::CephContext*, long, long, std::basic_string_view<char, std::char_traits<char> >) /home/kefu/dev/ceph/src/os/bluestore/BitmapAllocator.cc:16
    ceph#12 0x5614e072755f in HybridAllocatorBase<AvlAllocator>::_spillover_range(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/HybridAllocator_impl.h:123
    ceph#13 0x5614e06c9045 in AvlAllocator::_try_insert_range(unsigned long, unsigned long, boost::intrusive::tree_iterator<boost::intrusive::mhtraits<range_seg_t, boost::intrusive::avl_set_member_hook<>, &range_seg_t::offset_hook>,
false>*) /home/kefu/dev/ceph/src/os/bluestore/AvlAllocator.h:210
    ceph#14 0x5614e06c0104 in AvlAllocator::_add_to_tree(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/AvlAllocator.cc:136
    ceph#15 0x5614e0586e5f in HybridAllocatorBase<AvlAllocator>::_add_to_tree(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/HybridAllocator.h:110
    ceph#16 0x5614e06c6559 in AvlAllocator::init_add_free(unsigned long, unsigned long) /home/kefu/dev/ceph/src/os/bluestore/AvlAllocator.cc:494
    ceph#17 0x5614e0582943 in HybridAllocator_fragmentation_Test::TestBody() /home/kefu/dev/ceph/src/test/objectstore/hybrid_allocator_test.cc:285
    ceph#18 0x5614e061bee3 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2653
    ceph#19 0x5614e0606562 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/kefu/dev/ceph/src/googletest/googletest/src/gtest.cc:2689`
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Aug 13, 2025 
    
    
      
  
    
      
    
  
Fix a memory leak in ErasureCodePluginExample when plugin registration
fails. The allocated ErasureCodePluginExample instance was not being
freed if ErasureCodePluginRegistry::add() failed, which occurs in tests
that intentionally register duplicate plugins.
ASan detected the leak:
```
Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7f4501321a2d in operator new(unsigned long) /usr/src/debug/gcc/gcc/libsanitizer/asan/asan_new_delete.cpp:86
    #1 0x7f4501a5914d in __erasure_code_init /home/kefu/dev/ceph/src/test/erasure-code/ErasureCodePluginExample.cc:44
    #2 0x5589985be68d in ceph::ErasureCodePluginRegistry::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>
> const&, ceph::ErasureCodePlugin**, std::ostream*) /home/kefu/dev/ceph/src/erasure-code/ErasureCodePlugin.cc:149
    ceph#3 0x5589984984ee in ErasureCodePluginRegistryTest_all_Test::TestBody() /home/kefu/dev/ceph/src/test/erasure-code/TestErasureCodePlugin.cc:116
```
Use unique_ptr to manage the plugin instance lifecycle, following the
pattern used by other erasure code plugins. The instance is now
automatically destroyed if registry addition fails.
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Aug 13, 2025 
    
    
      
  
    
      
    
  
Previously, run-cli-tests ignored all environment variables from the parent
process to ensure a clean test environment. However, this also dropped
sanitizer settings (ASAN_OPTIONS and LSAN_OPTIONS) needed when AddressSanitizer
is enabled.
This causes test failures with TCMalloc due to false-positive leak reports
from TCMalloc's internal objects, which is a known issue documented in
Google's C++ style guide. While recent gperftools releases have fixed this,
Ubuntu Jammy still ships with an older version that triggers these warnings.
This change preserves sanitizer environment variables while maintaining
the clean test environment for other variables.
Note: Once we upgrade to newer gperftools, we can remove the related
suppression rule in qa/lsan.supp.
The test failure with TCMalloc looks like:
```
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/ceph-kvstore-tool/help.t: failed
--- /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/ceph-kvstore-tool/help.t
+++ /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/cli/ceph-kvstore-tool/help.t.err
@@ -21,3 +21,19 @@
     stats
     histogram [prefix]
+
+  =================================================================
+  ==87908==ERROR: LeakSanitizer: detected memory leaks
+
+  Direct leak of 45 byte(s) in 1 object(s) allocated from:
+      #0 0x562fd797265d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/ceph-kvstore-tool+0xe5e65d) (BuildId: 7eb56077b615aeb3c7aedafa0818ad89fdf3702d)
+      #1 0x562fd79815c8 in std::__new_allocator<char>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/new_allocator.h:137:27
+      #2 0x562fd7981520 in std::allocator<char>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/allocator.h:188:32
+      ceph#3 0x562fd7981520 in std::allocator_traits<std::allocator<char>>::allocate(std::allocator<char>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/alloc_traits.h:464:20
+      ceph#4 0x562fd798115a in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_create(unsigned long&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/basic_string.tcc:155:14
+      ceph#5 0x562fd798787f in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/basic_string.tcc:328:21
+      ceph#6 0x562fd79876a7 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>::_M_append(char const*, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/basic_string.tcc:420:8
+      ceph#7 0x7fa1aa0286f0 in MallocExtension::Initialize() (/lib/x86_64-linux-gnu/libtcmalloc.so.4+0x2a6f0) (BuildId: eeef3d1257388a806e122398dbce3157ee568ef4)
+
+  SUMMARY: AddressSanitizer: 45 byte(s) leaked in 1 allocation(s).
```
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Aug 27, 2025 
    
    
      
  
    
      
    
  
```
=================================================================
==598386==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x7fc37fefd578 in operator new(unsigned long) (/lib64/libasan.so.8+0xfd578) (BuildId: 8843146064a37d00e17ca27fd774b31ebc6d40e2)
    #1 0x7fc37fa2fb71 in MallocExtension::Register(MallocExtension*) (/lib64/libtcmalloc.so.4+0x2fb71) (BuildId: 5fec9a5a81e329ba0c333f14020599650d07af6f)
SUMMARY: AddressSanitizer: 8 byte(s) leaked in 1 allocation(s).
```
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Sep 10, 2025 
    
    
      
  
    
      
    
  
Previously, SyncPoint allocated two C_Gather instances tracked by raw
pointers but failed to properly clean them up when only a single sync
point existed, causing memory leaks detected by AddressSanitizer.
This change fixes the leak by modifying AbstractWriteLog::shut_down()
to check for prior sync points in the chain. When the current sync point
is the only one present, we now activate the m_prior_log_entries_persisted
context to ensure:
- The onfinish callback executes and releases the captured strong
  reference to the enclosing SyncPoint
- The parent m_sync_point_persist context completes and gets properly
  released
This ensures all allocated contexts are cleaned up correctly during
shutdown, eliminating the memory leak.
The ASan report:
```
Indirect leak of 2064 byte(s) in 1 object(s) allocated from:
    #0 0x56440919ae2d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_librbd+0x2f3de2d) (BuildId: 6a04677c6ee5235f1a41815df807f97c5b96d4cd)
    #1 0x56440bd67751 in __gnu_cxx::new_allocator<Context*>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
    #2 0x56440bd676e0 in std::allocator<Context*>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
    ceph#3 0x56440bd676e0 in std::allocator_traits<std::allocator<Context*>>::allocate(std::allocator<Context*>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
    ceph#4 0x56440bd6730b in std::_Vector_base<Context*, std::allocator<Context*>>::_M_allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:346:20
    ceph#5 0x7fd33e00e8d1 in std::vector<Context*, std::allocator<Context*>>::reserve(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:78:22
    ceph#6 0x7fd33e00c51c in librbd::cache::pwl::SyncPoint::SyncPoint(unsigned long, ceph::common::CephContext*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/SyncPoint.cc:20:27
    ceph#7 0x56440bd65f26 in decltype(::new((void*)(0)) librbd::cache::pwl::SyncPoint(std::declval<unsigned long&>(), std::declval<ceph::common::CephContext*&>())) std::construct_at<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
    ceph#8 0x56440bd65b98 in void std::allocator_traits<std::allocator<librbd::cache::pwl::SyncPoint>>::construct<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>&, librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
    ceph#9 0x56440bd657d3 in std::_Sp_counted_ptr_inplace<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:519:4
    ceph#10 0x56440bd65371 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*&, std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:651:6
    ceph#11 0x56440bd65163 in std::__shared_ptr<librbd::cache::pwl::SyncPoint, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1342:14
    ceph#12 0x56440bd650e6 in std::shared_ptr<librbd::cache::pwl::SyncPoint>::shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:409:4
    ceph#13 0x56440bd65057 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::allocate_shared<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint> const&, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:862:14
    ceph#14 0x56440bca97e7 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::make_shared<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:878:14
    ceph#15 0x56440bd443c8 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::new_sync_point(librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1905:20
    ceph#16 0x56440bd42e4c in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1951:3
    ceph#17 0x56440bd9cbf2 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point_if_needed(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1990:5
    ceph#18 0x56440bd9c636 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&)::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2152:9
    ceph#19 0x56440bd9b9b4 in boost::detail::function::void_function_obj_invoker<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&), void, librbd::cache::pwl::GuardedRequestFunctionContext&>::invoke(boost::detail::function::function_buffer&, librbd::cache::pwl::GuardedRequestFunctionContext&) /opt/ceph/include/boost/function/function_template.hpp:100:11
    ceph#20 0x56440bd29321 in boost::function_n<void, librbd::cache::pwl::GuardedRequestFunctionContext&>::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /opt/ceph/include/boost/function/function_template.hpp:789:14
    ceph#21 0x56440bd28d85 in librbd::cache::pwl::GuardedRequestFunctionContext::finish(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/Request.h:335:5
    ceph#22 0x5644091e0fe0 in Context::complete(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/include/Context.h:102:5
    ceph#23 0x56440bd9b378 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::detain_guarded_request(librbd::cache::pwl::C_BlockIORequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::GuardedRequestFunctionContext*, bool) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1202:20
    ceph#24 0x56440bd96c50 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2154:3
    ceph#25 0x56440bd1e4b5 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::shut_down(Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:703:3
    ceph#26 0x56440bdb9022 in librbd::cache::pwl::TestMockCacheSSDWriteLog_compare_and_write_compare_matched_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/librbd/cache/pwl/test_mock_SSDWriteLog.cc:403:7
```
Fixes: https://tracker.ceph.com/issues/71335
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 05fd6f9)
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Sep 17, 2025 
    
    
      
  
    
      
    
  
Replace unsafe string construction with bufferlist::length() to avoid reading beyond buffer boundaries. In commit 92ccbff, we introduced a bug when checking if a digest was empty by constructing a std::string from bufferlist: ``` std::string(digest.second.c_str()).empty() ``` This is unsafe because bufferlist data is not guaranteed to be null- terminated. The std::string constructor searches for a null terminator and may read beyond the bufferlist's allocated memory, causing a heap-buffer-overflow detected by AddressSanitizer: ``` ==66092==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7e0c65215004 at pc 0x7fbc6e27c597 bp 0x7ffe29fb6100 sp 0x7ffe29fb58b8 READ of size 5 at 0x7e0c65215004 thread T0 #0 0x7fbc6e27c596 in strlen /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:425 #1 0x562c75fad91a in std::char_traits<char>::length(char const*) /usr/include/c++/15.2.1/bits/char_traits.h:393 #2 0x562c75fb4222 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string<std::allocator<char> >(char const*, std::allocator<char> const&) /usr/include/c++/15.2.1/bits/b asic_string.h:713 ceph#3 0x562c761b81ae in operator() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:1300 ceph#4 0x562c761d7d53 in operator()<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false> > /usr/include/c++/15.2.1/bits/predefined_ops.h:318 ceph#5 0x562c761d789c in __find_if<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, __gnu_cxx::__ops::_Iter_pred<ScrubBackend::match_in_shards(const hobject_t&, auth_selection_ t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > > /usr/include/c++/15.2.1/bits/stl_algobase.h:2095 ceph#6 0x562c761d72b2 in find_if<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:3921 ceph#7 0x562c761d5f6f in none_of<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:431 ceph#8 0x562c761d4a50 in any_of<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, s td::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:450 ceph#9 0x562c761bb84b in ScrubBackend::match_in_shards(hobject_t const&, auth_selection_t&, inconsistent_obj_wrapper&, std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >&) /home/k efu/dev/ceph/src/osd/scrubber/scrub_backend.cc:1297 ceph#10 0x562c761b4282 in ScrubBackend::compare_obj_in_maps[abi:cxx11](hobject_t const&) /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:941 ceph#11 0x562c761d44af in operator()<hobject_t> /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:887 ceph#12 0x562c761d4836 in for_each<std::_Rb_tree_const_iterator<hobject_t>, ScrubBackend::compare_smaps()::<lambda(const auto:422&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:3798 ceph#13 0x562c761b3259 in ScrubBackend::compare_smaps() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:884 ceph#14 0x562c761a478d in ScrubBackend::update_authoritative() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:315` ``` Fix by using bufferlist::length() which tells if the given buffer is empty instead of converting the buffer content to a string. Signed-off-by: Kefu Chai <tchaikov@gmail.com> (cherry picked from commit 100c20b)
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Sep 18, 2025 
    
    
      
  
    
      
    
  
… overflow()
When sanitizer is enabled, unittest_log fails as following
```
[ RUN      ] Log.StderrPipeBig
=================================================================
==3302372==ERROR: AddressSanitizer: heap-use-after-free on address 0xffff96e01d00 at pc 0xaaaadd3db754 bp 0xffffd9ebffa0 sp 0xffffd9ebf790
READ of size 4096 at 0xffff96e01d00 thread T0
    #0 0xaaaadd3db750 in __asan_memmove (/root/ceph-19.0.0/build/bin/unittest_log+0x3fb750) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
    #1 0xffffafc23734 in char const* boost::container::dtl::memmove_n_source<char const*, char*>(char const*, unsigned long, char*) /root/ceph-19.0.0/build/boost/include/boost/container/detail/copy_move_algo.hpp:261:10
    #2 0xffffafc23734 in boost::container::dtl::enable_if_memtransfer_copy_constructible<char const*, char*, char const*>::type boost::container::uninitialized_copy_alloc_n_source<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*, char*>(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, char const*, unsigned long, char*) /root/ceph-19.0.0/build/boost/include/boost/container/detail/copy_move_algo.hpp:600:11
    ceph#3 0xffffafc23734 in void boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*>::uninitialized_copy_n_and_update<char*>(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/detail/advanced_insert_int.hpp:85:22
    ceph#4 0xffffafc23734 in void boost::container::expand_forward_and_insert_alloc<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char*, boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*> >(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, char*, char*, unsigned long, boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*>) /root/ceph-19.0.0/build/boost/include/boost/container/detail/copy_move_algo.hpp:1469:23
    ceph#5 0xffffafc23734 in void boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range_expand_forward<boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*> >(char*, unsigned long, boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*>, boost::move_detail::integral_constant<bool, false>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:3058:7
    ceph#6 0xffffafc23734 in boost::container::vec_iterator<char*, false> boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range<boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*> >(char* const&, unsigned long, boost::container::dtl::insert_range_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const*>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2890:16
    ceph#7 0xffffafc23734 in boost::container::vec_iterator<char*, false> boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::insert<char const*>(boost::container::vec_iterator<char*, true>, char const*, char const*, boost::move_detail::disable_if_or<void, boost::move_detail::is_convertible<char const*, unsigned long>, boost::container::dtl::is_input_iterator<char const*, has_iterator_category<char const*>::value>, boost::move_detail::bool_<false>, boost::move_detail::bool_<false> >::type*) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2088:20
    ceph#8 0xffffafc23734 in ceph::logging::ConcreteEntry::ConcreteEntry(ceph::logging::Entry const&) /root/ceph-19.0.0/src/log/Entry.h:84:9
    ceph#9 0xffffafc21a88 in decltype(new ((void*)(0))ceph::logging::ConcreteEntry(std::declval<ceph::logging::Entry>())) std::construct_at<ceph::logging::ConcreteEntry, ceph::logging::Entry>(ceph::logging::ConcreteEntry*, ceph::logging::Entry&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
    ceph#10 0xffffafc21198 in void std::allocator_traits<std::allocator<ceph::logging::ConcreteEntry> >::construct<ceph::logging::ConcreteEntry, ceph::logging::Entry>(std::allocator<ceph::logging::ConcreteEntry>&, ceph::logging::ConcreteEntry*, ceph::logging::Entry&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
    ceph#11 0xffffafc16464 in ceph::logging::ConcreteEntry& std::vector<ceph::logging::ConcreteEntry, std::allocator<ceph::logging::ConcreteEntry> >::emplace_back<ceph::logging::Entry>(ceph::logging::Entry&&) /usr/bin/../lib/gcc/aarch64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:115:6
    ceph#12 0xffffafc0dcbc in ceph::logging::Log::submit_entry(ceph::logging::Entry&&) /root/ceph-19.0.0/src/log/Log.cc:265:9
    ceph#13 0xaaaadd41a404 in Log_StderrPipeBig_Test::TestBody() /root/ceph-19.0.0/src/log/test.cc:280:9
    ceph#14 0xaaaade0b4338 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
    ceph#15 0xaaaade061244 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
    ceph#16 0xaaaade012680 in testing::Test::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2680:5
    ceph#17 0xaaaade0145c4 in testing::TestInfo::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2858:11
    ceph#18 0xaaaade015bc4 in testing::TestSuite::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:3012:28
    ceph#19 0xaaaade031988 in testing::internal::UnitTestImpl::RunAllTests() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5723:44
    ceph#20 0xaaaade0be24c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
    ceph#21 0xaaaade0687dc in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
    ceph#22 0xaaaade030e00 in testing::UnitTest::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5306:10
    ceph#23 0xaaaadd425c48 in RUN_ALL_TESTS() /root/ceph-19.0.0/src/googletest/googletest/include/gtest/gtest.h:2486:46
    ceph#24 0xaaaadd4207a0 in main /root/ceph-19.0.0/src/log/test.cc:503:10
    ceph#25 0xffffac3473f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ceph#26 0xffffac3474c8 in __libc_start_main csu/../csu/libc-start.c:392:3
    ceph#27 0xaaaadd364d6c in _start (/root/ceph-19.0.0/build/bin/unittest_log+0x384d6c) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
0xffff96e01d00 is located 0 bytes inside of 6553-byte region [0xffff96e01d00,0xffff96e03699)
freed by thread T0 here:
    #0 0xaaaadd4136f0 in operator delete(void*) (/root/ceph-19.0.0/build/bin/unittest_log+0x4336f0) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
    #1 0xaaaadd434968 in boost::container::new_allocator<char>::deallocate(char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/new_allocator.hpp:171:7
    #2 0xaaaadd434934 in boost::container::allocator_traits<boost::container::new_allocator<char> >::deallocate(boost::container::new_allocator<char>&, char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:308:9
    ceph#3 0xaaaadd434934 in boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>::deallocate(char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/small_vector.hpp:255:10
    ceph#4 0xaaaadd43911c in boost::container::allocator_traits<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void> >::deallocate(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, char*, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:308:9
    ceph#5 0xaaaadd43911c in boost::container::vector_alloc_holder<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, unsigned long, boost::move_detail::integral_constant<unsigned int, 1u> >::deallocate(char* const&, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:487:7
    ceph#6 0xaaaadd43911c in void boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range_new_allocation<boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&> >(char*, unsigned long, char*, unsigned long, boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:3080:25
    ceph#7 0xaaaadd438aec in boost::container::vec_iterator<char*, false> boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range_no_capacity<boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&> >(char*, unsigned long, boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&>, boost::move_detail::integral_constant<unsigned int, 1u>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2830:13
    ceph#8 0xaaaadd4328bc in char& boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::emplace_back<char const&>(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:1888:24
    ceph#9 0xaaaadd4328bc in void boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_push_back<char const&>(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2746:13
    ceph#10 0xaaaadd4328bc in boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::push_back(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:1996:4
    ceph#11 0xaaaadd4328bc in StackStringBuf<4096ul>::overflow(int) /root/ceph-19.0.0/src/common/StackStringStream.h:79:11
    ceph#12 0xffffac6d3dac in std::ostream::put(char) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x133dac) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#13 0xffffac6d4aac in std::basic_ostream<char, std::char_traits<char> >& std::operator<<<std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x134aac) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#14 0xaaaadd41a3c8 in Log_StderrPipeBig_Test::TestBody() /root/ceph-19.0.0/src/log/test.cc:278:9
    ceph#15 0xaaaade0b4338 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
    ceph#16 0xaaaade061244 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
    ceph#17 0xaaaade012680 in testing::Test::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2680:5
    ceph#18 0xaaaade0145c4 in testing::TestInfo::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2858:11
    ceph#19 0xaaaade015bc4 in testing::TestSuite::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:3012:28
    ceph#20 0xaaaade031988 in testing::internal::UnitTestImpl::RunAllTests() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5723:44
    ceph#21 0xaaaade0be24c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
    ceph#22 0xaaaade0687dc in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
    ceph#23 0xaaaade030e00 in testing::UnitTest::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5306:10
    ceph#24 0xaaaadd425c48 in RUN_ALL_TESTS() /root/ceph-19.0.0/src/googletest/googletest/include/gtest/gtest.h:2486:46
    ceph#25 0xaaaadd4207a0 in main /root/ceph-19.0.0/src/log/test.cc:503:10
    ceph#26 0xffffac3473f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ceph#27 0xffffac3474c8 in __libc_start_main csu/../csu/libc-start.c:392:3
    ceph#28 0xaaaadd364d6c in _start (/root/ceph-19.0.0/build/bin/unittest_log+0x384d6c) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
previously allocated by thread T0 here:
    #0 0xaaaadd412e88 in operator new(unsigned long) (/root/ceph-19.0.0/build/bin/unittest_log+0x432e88) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
    #1 0xaaaadd433ec0 in boost::container::new_allocator<char>::allocate(unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/new_allocator.hpp:160:30
    #2 0xaaaadd438a68 in boost::container::allocator_traits<boost::container::new_allocator<char> >::priv_allocate(boost::move_detail::integral_constant<bool, false>, boost::container::new_allocator<char>&, unsigned long, void const*) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:395:16
    ceph#3 0xaaaadd438a68 in boost::container::allocator_traits<boost::container::new_allocator<char> >::allocate(boost::container::new_allocator<char>&, unsigned long, void const*) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:318:14
    ceph#4 0xaaaadd438a68 in boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>::allocate(unsigned long, void const*) /root/ceph-19.0.0/build/boost/include/boost/container/small_vector.hpp:248:14
    ceph#5 0xaaaadd438a68 in boost::container::allocator_traits<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void> >::allocate(boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>&, unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/allocator_traits.hpp:302:16
    ceph#6 0xaaaadd438a68 in boost::container::vector_alloc_holder<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, unsigned long, boost::move_detail::integral_constant<unsigned int, 1u> >::allocate(unsigned long) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:482:14
    ceph#7 0xaaaadd438a68 in boost::container::vec_iterator<char*, false> boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_insert_forward_range_no_capacity<boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&> >(char*, unsigned long, boost::container::dtl::insert_emplace_proxy<boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, char const&>, boost::move_detail::integral_constant<unsigned int, 1u>) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2826:73
    ceph#8 0xaaaadd4328bc in char& boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::emplace_back<char const&>(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:1888:24
    ceph#9 0xaaaadd4328bc in void boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::priv_push_back<char const&>(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:2746:13
    ceph#10 0xaaaadd4328bc in boost::container::vector<char, boost::container::small_vector_allocator<char, boost::container::new_allocator<void>, void>, void>::push_back(char const&) /root/ceph-19.0.0/build/boost/include/boost/container/vector.hpp:1996:4
    ceph#11 0xaaaadd4328bc in StackStringBuf<4096ul>::overflow(int) /root/ceph-19.0.0/src/common/StackStringStream.h:79:11
    ceph#12 0xffffac6d3dac in std::ostream::put(char) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x133dac) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#13 0xffffac6d4aac in std::basic_ostream<char, std::char_traits<char> >& std::operator<<<std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char) (/lib/aarch64-linux-gnu/libstdc++.so.6+0x134aac) (BuildId: a012b2bb77110e84b266cd7425b50e57427abb02)
    ceph#14 0xaaaadd41a3c8 in Log_StderrPipeBig_Test::TestBody() /root/ceph-19.0.0/src/log/test.cc:278:9
    ceph#15 0xaaaade0b4338 in void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
    ceph#16 0xaaaade061244 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
    ceph#17 0xaaaade012680 in testing::Test::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2680:5
    ceph#18 0xaaaade0145c4 in testing::TestInfo::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2858:11
    ceph#19 0xaaaade015bc4 in testing::TestSuite::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:3012:28
    ceph#20 0xaaaade031988 in testing::internal::UnitTestImpl::RunAllTests() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5723:44
    ceph#21 0xaaaade0be24c in bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2605:10
    ceph#22 0xaaaade0687dc in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:2641:14
    ceph#23 0xaaaade030e00 in testing::UnitTest::Run() /root/ceph-19.0.0/src/googletest/googletest/src/gtest.cc:5306:10
    ceph#24 0xaaaadd425c48 in RUN_ALL_TESTS() /root/ceph-19.0.0/src/googletest/googletest/include/gtest/gtest.h:2486:46
    ceph#25 0xaaaadd4207a0 in main /root/ceph-19.0.0/src/log/test.cc:503:10
    ceph#26 0xffffac3473f8 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    ceph#27 0xffffac3474c8 in __libc_start_main csu/../csu/libc-start.c:392:3
    ceph#28 0xaaaadd364d6c in _start (/root/ceph-19.0.0/build/bin/unittest_log+0x384d6c) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409)
SUMMARY: AddressSanitizer: heap-use-after-free (/root/ceph-19.0.0/build/bin/unittest_log+0x3fb750) (BuildId: 6fd965435d12fd345de38dddc8723053b9877409) in __asan_memmove
Shadow bytes around the buggy address:
  0x200ff2dc0350: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x200ff2dc0360: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x200ff2dc0370: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x200ff2dc0380: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x200ff2dc0390: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x200ff2dc03a0:[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x200ff2dc03b0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x200ff2dc03c0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x200ff2dc03d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x200ff2dc03e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x200ff2dc03f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==3302372==ABORTING
```
vec.push_back(str) will allocate memory and release the old one once
there is insufficient memory which causing the old one to be invalid. So
streambuf's data pointer and insertion position should be updated to
newly allocated memory's address in vec.
Fixes: https://tracker.ceph.com/issues/65805
Signed-off-by: Rongqi Sun <sunrongqi@huawei.com>
(cherry picked from commit c8d51b9)
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Sep 18, 2025 
    
    
      
  
    
      
    
  
Previously, SyncPoint allocated two C_Gather instances tracked by raw
pointers but failed to properly clean them up when only a single sync
point existed, causing memory leaks detected by AddressSanitizer.
This change fixes the leak by modifying AbstractWriteLog::shut_down()
to check for prior sync points in the chain. When the current sync point
is the only one present, we now activate the m_prior_log_entries_persisted
context to ensure:
- The onfinish callback executes and releases the captured strong
  reference to the enclosing SyncPoint
- The parent m_sync_point_persist context completes and gets properly
  released
This ensures all allocated contexts are cleaned up correctly during
shutdown, eliminating the memory leak.
The ASan report:
```
Indirect leak of 2064 byte(s) in 1 object(s) allocated from:
    #0 0x56440919ae2d in operator new(unsigned long) (/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/unittest_librbd+0x2f3de2d) (BuildId: 6a04677c6ee5235f1a41815df807f97c5b96d4cd)
    #1 0x56440bd67751 in __gnu_cxx::new_allocator<Context*>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/ext/new_allocator.h:127:27
    #2 0x56440bd676e0 in std::allocator<Context*>::allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/allocator.h:185:32
    ceph#3 0x56440bd676e0 in std::allocator_traits<std::allocator<Context*>>::allocate(std::allocator<Context*>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:464:20
    ceph#4 0x56440bd6730b in std::_Vector_base<Context*, std::allocator<Context*>>::_M_allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_vector.h:346:20
    ceph#5 0x7fd33e00e8d1 in std::vector<Context*, std::allocator<Context*>>::reserve(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/vector.tcc:78:22
    ceph#6 0x7fd33e00c51c in librbd::cache::pwl::SyncPoint::SyncPoint(unsigned long, ceph::common::CephContext*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/SyncPoint.cc:20:27
    ceph#7 0x56440bd65f26 in decltype(::new((void*)(0)) librbd::cache::pwl::SyncPoint(std::declval<unsigned long&>(), std::declval<ceph::common::CephContext*&>())) std::construct_at<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:97:39
    ceph#8 0x56440bd65b98 in void std::allocator_traits<std::allocator<librbd::cache::pwl::SyncPoint>>::construct<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>&, librbd::cache::pwl::SyncPoint*, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:518:4
    ceph#9 0x56440bd657d3 in std::_Sp_counted_ptr_inplace<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:519:4
    ceph#10 0x56440bd65371 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(librbd::cache::pwl::SyncPoint*&, std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:651:6
    ceph#11 0x56440bd65163 in std::__shared_ptr<librbd::cache::pwl::SyncPoint, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:1342:14
    ceph#12 0x56440bd650e6 in std::shared_ptr<librbd::cache::pwl::SyncPoint>::shared_ptr<std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::_Sp_alloc_shared_tag<std::allocator<librbd::cache::pwl::SyncPoint>>, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:409:4
    ceph#13 0x56440bd65057 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::allocate_shared<librbd::cache::pwl::SyncPoint, std::allocator<librbd::cache::pwl::SyncPoint>, unsigned long&, ceph::common::CephContext*&>(std::allocator<librbd::cache::pwl::SyncPoint> const&, unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:862:14
    ceph#14 0x56440bca97e7 in std::shared_ptr<librbd::cache::pwl::SyncPoint> std::make_shared<librbd::cache::pwl::SyncPoint, unsigned long&, ceph::common::CephContext*&>(unsigned long&, ceph::common::CephContext*&) /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr.h:878:14
    ceph#15 0x56440bd443c8 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::new_sync_point(librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1905:20
    ceph#16 0x56440bd42e4c in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1951:3
    ceph#17 0x56440bd9cbf2 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::flush_new_sync_point_if_needed(librbd::cache::pwl::C_FlushRequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::DeferredContexts&) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1990:5
    ceph#18 0x56440bd9c636 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&)::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2152:9
    ceph#19 0x56440bd9b9b4 in boost::detail::function::void_function_obj_invoker<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*)::'lambda'(librbd::cache::pwl::GuardedRequestFunctionContext&), void, librbd::cache::pwl::GuardedRequestFunctionContext&>::invoke(boost::detail::function::function_buffer&, librbd::cache::pwl::GuardedRequestFunctionContext&) /opt/ceph/include/boost/function/function_template.hpp:100:11
    ceph#20 0x56440bd29321 in boost::function_n<void, librbd::cache::pwl::GuardedRequestFunctionContext&>::operator()(librbd::cache::pwl::GuardedRequestFunctionContext&) const /opt/ceph/include/boost/function/function_template.hpp:789:14
    ceph#21 0x56440bd28d85 in librbd::cache::pwl::GuardedRequestFunctionContext::finish(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/Request.h:335:5
    ceph#22 0x5644091e0fe0 in Context::complete(int) /home/jenkins-build/build/workspace/ceph-pull-requests/src/include/Context.h:102:5
    ceph#23 0x56440bd9b378 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::detain_guarded_request(librbd::cache::pwl::C_BlockIORequest<librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>>*, librbd::cache::pwl::GuardedRequestFunctionContext*, bool) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:1202:20
    ceph#24 0x56440bd96c50 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::internal_flush(bool, Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:2154:3
    ceph#25 0x56440bd1e4b5 in librbd::cache::pwl::AbstractWriteLog<librbd::MockImageCtx>::shut_down(Context*) /home/jenkins-build/build/workspace/ceph-pull-requests/src/librbd/cache/pwl/AbstractWriteLog.cc:703:3
    ceph#26 0x56440bdb9022 in librbd::cache::pwl::TestMockCacheSSDWriteLog_compare_and_write_compare_matched_Test::TestBody() /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/librbd/cache/pwl/test_mock_SSDWriteLog.cc:403:7
```
Fixes: https://tracker.ceph.com/issues/71335
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
(cherry picked from commit 05fd6f9)
    
    
  rzarzynski 
      pushed a commit
      that referenced
      this pull request
    
      Oct 13, 2025 
    
    
      
  
    
      
    
  
Replace unsafe string construction with bufferlist::length() to avoid reading beyond buffer boundaries. In commit 92ccbff, we introduced a bug when checking if a digest was empty by constructing a std::string from bufferlist: ``` std::string(digest.second.c_str()).empty() ``` This is unsafe because bufferlist data is not guaranteed to be null- terminated. The std::string constructor searches for a null terminator and may read beyond the bufferlist's allocated memory, causing a heap-buffer-overflow detected by AddressSanitizer: ``` ==66092==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7e0c65215004 at pc 0x7fbc6e27c597 bp 0x7ffe29fb6100 sp 0x7ffe29fb58b8 READ of size 5 at 0x7e0c65215004 thread T0 #0 0x7fbc6e27c596 in strlen /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:425 #1 0x562c75fad91a in std::char_traits<char>::length(char const*) /usr/include/c++/15.2.1/bits/char_traits.h:393 #2 0x562c75fb4222 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string<std::allocator<char> >(char const*, std::allocator<char> const&) /usr/include/c++/15.2.1/bits/b asic_string.h:713 ceph#3 0x562c761b81ae in operator() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:1300 ceph#4 0x562c761d7d53 in operator()<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false> > /usr/include/c++/15.2.1/bits/predefined_ops.h:318 ceph#5 0x562c761d789c in __find_if<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, __gnu_cxx::__ops::_Iter_pred<ScrubBackend::match_in_shards(const hobject_t&, auth_selection_ t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > > /usr/include/c++/15.2.1/bits/stl_algobase.h:2095 ceph#6 0x562c761d72b2 in find_if<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:3921 ceph#7 0x562c761d5f6f in none_of<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, std::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:431 ceph#8 0x562c761d4a50 in any_of<mini_flat_map<shard_id_t, ceph::buffer::v15_2_0::list, signed char>::_iterator<false>, ScrubBackend::match_in_shards(const hobject_t&, auth_selection_t&, inconsistent_obj_wrapper&, s td::stringstream&)::<lambda(const std::pair<const shard_id_t, ceph::buffer::v15_2_0::list&>&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:450 ceph#9 0x562c761bb84b in ScrubBackend::match_in_shards(hobject_t const&, auth_selection_t&, inconsistent_obj_wrapper&, std::__cxx11::basic_stringstream<char, std::char_traits<char>, std::allocator<char> >&) /home/k efu/dev/ceph/src/osd/scrubber/scrub_backend.cc:1297 ceph#10 0x562c761b4282 in ScrubBackend::compare_obj_in_maps[abi:cxx11](hobject_t const&) /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:941 ceph#11 0x562c761d44af in operator()<hobject_t> /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:887 ceph#12 0x562c761d4836 in for_each<std::_Rb_tree_const_iterator<hobject_t>, ScrubBackend::compare_smaps()::<lambda(const auto:422&)> > /usr/include/c++/15.2.1/bits/stl_algo.h:3798 ceph#13 0x562c761b3259 in ScrubBackend::compare_smaps() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:884 ceph#14 0x562c761a478d in ScrubBackend::update_authoritative() /home/kefu/dev/ceph/src/osd/scrubber/scrub_backend.cc:315` ``` Fix by using bufferlist::length() which tells if the given buffer is empty instead of converting the buffer content to a string. Signed-off-by: Kefu Chai <tchaikov@gmail.com>
  
    Sign up for free
    to join this conversation on GitHub.
    Already have an account?
    Sign in to comment
  
      
  Add this suggestion to a batch that can be applied as a single commit.
  This suggestion is invalid because no changes were made to the code.
  Suggestions cannot be applied while the pull request is closed.
  Suggestions cannot be applied while viewing a subset of changes.
  Only one suggestion per line can be applied in a batch.
  Add this suggestion to a batch that can be applied as a single commit.
  Applying suggestions on deleted lines is not supported.
  You must change the existing code in this line in order to create a valid suggestion.
  Outdated suggestions cannot be applied.
  This suggestion has been applied or marked resolved.
  Suggestions cannot be applied from pending reviews.
  Suggestions cannot be applied on multi-line comments.
  Suggestions cannot be applied while the pull request is queued to merge.
  Suggestion cannot be applied right now. Please check back later.
  
    
  
    
Signed-off-by: Kefu Chai kchai@redhat.com